May 18, 2006

[SiN] Bug Postmortem #2

People seem surprised that I’m actually being open about bugs on this blog. I can understand why my being forthright about these issues would cause some confusion. In this industry, “bugs” is a four-letter word in more than one way.

But one of the big things that I have been an advocate of is information sharing. I believe that the more lessons we can learn from others, the less likely we are to repeat their mistakes. My goal throughout this process has been to make sure that everything I do is as transparent as possible. With that disclosure out of the way, let’s postmortem another bug.

The evening we released, we started hearing reports that the in-engine cinematic at the end of the second U4 Labs level wouldn’t progress for some people. This was extremely odd, because the only time we had seen this happen was back before Radek was able to walk. The bug hadn’t surfaced in over seven months of testing, and never surfaced in any of our playtests.

While many things could have caused this bug, we narrowed it down to three major causes, two of which were related. To talk about the causes, we need to first discuss abusive players.

Every game out there has abusive players. These are guys who will do everything they can to try to exploit the game; to try to beat the design of the game rather than the game itself. It’s like a guy playing volleyball who intentionally spikes the ball at the head of the slowest person on the other team in an attempt to get it to bounce off the guy’s head back into his area so he can try to keep the round going. It’s completely legal and within the rules, but a pretty piss-poor way to play.

Back to the game. During testing, we set aside time to essentially act like assholes towards the levels and do everything possible to break and/or exploit them. We had exploits ranging from being able to club mission-critical NPCs to death using a cardboard box to being able to jump on top of the lighthouse to being able to flip the car up into the upper stories of Highrise and have it still be usable.

The vast majority of the exploits were easily fixable, but for the rest, we sat down and looked at the amount of effort that was required to trigger the exploit. Oh, this one can be done in two minutes easily? Must fix. This one requires ten minutes of split-second jumping and exploiting the physics push of four primed grenades? Fix if we have time. This one requires seventy-five minutes of preparation to break? No.
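A minimal sketch of that effort-based triage, in Python. The function name, the enum, and the 5- and 30-minute cutoffs are assumptions for illustration only; the only data points from the text are the two-, ten-, and seventy-five-minute examples.

```python
from enum import Enum


class Triage(Enum):
    MUST_FIX = "must fix"
    IF_TIME = "fix if we have time"
    WONT_FIX = "no"


def triage_exploit(minutes_to_trigger: float,
                   must_fix_max: float = 5.0,
                   if_time_max: float = 30.0) -> Triage:
    """Bucket an exploit by how long a player must work to trigger it.

    The default cutoffs are hypothetical; they are chosen only so that
    the three examples from the text land in the buckets described.
    """
    if minutes_to_trigger < 0:
        raise ValueError("minutes_to_trigger must be non-negative")
    if minutes_to_trigger <= must_fix_max:
        return Triage.MUST_FIX
    if minutes_to_trigger <= if_time_max:
        return Triage.IF_TIME
    return Triage.WONT_FIX


if __name__ == "__main__":
    # The three examples from the paragraph above.
    for minutes in (2, 10, 75):
        print(minutes, "->", triage_exploit(minutes).value)
```

Note that a rule like this only measures deliberate effort. It says nothing about how likely an innocent player is to hit the same state by accident.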

These last ones were the worst ones. It is possible to break interactive cinematics in pretty much every single game that has been released ever, especially if you really really really really really really try to. We decided that if someone was intentionally trying to break the game by doing a bunch of stupid shit, we’d let them. After all, this part was a single-player game, and if someone really wanted to trap themselves, why should we stop them? All they’d be affecting is their own experience. If someone wants to smash themselves in the genitals with a pickaxe while singing “O Come All Ye Faithful,” who are we to decide what is fun for them?

This led to cause number one: abusive players. We got a lot of save games from people related to this issue, and the number one cause that we saw was people taking tons of U4 storage containers up into the crane control room and dropping them there. Radek and his grunts had to be able to go through the room. If he couldn’t, the scene couldn’t progress. We timed it, and it took us about 20 minutes to get every single U4 container in that area into the room.

Unfortunately, we missed cause number two: physics. We received one save game where a cardboard box from inside the room shot over, collapsed, but collapsed in such a way that the door itself was jammed shut. We received another where one of the larger propane tanks had flown in and jammed the door. In both cases, the player wasn’t trying to break the sequence, but a confluence of one-in-a-billion events had caused them to be stuck in an 8x10 room unless they replayed the submarine area. Not good.

The third cause was odd, to say the least. For some people, Radek would never leave. This was a bug we had never seen. We had no idea what was causing it. We had two reviewers get stuck with it as well, which was discouraging. The only commonality we were able to find was that all of the people who were affected by the issue had preloaded the game.

There was a drastic solution that worked, but wasn’t really practical: deleting the ClientSettings.blob file and all “SiN Episodes” GCFs, then redownloading the game. So, as part of the map recompile, we updated the nodegraph again and sent it down as well so everyone would get the update. Much better to send down a few kilobytes than force the redownload of several gigabytes.

Our lessons learned from this one: 1) If an abusive player can exploit it, it can also hammer an innocent. 2) There are some bugs that you will never see until the game gets out into the real world, and you have to go out of your way to diagnose and fix those.

1 comment:

crashmstr said...

As a software developer for over 13 years, I know how difficult it can be to track down and fix bugs.

I've only played a bit of Emergence (bought a 360 and got distracted), but it is good to know that your team is responsive and already has a patch out to fix the problems you have found. I've seen way too many games that never get patches, let alone any acknowledgement that there are any problems.

Keep up the good work!