What does it take to write software that lives depend on and that sends rockets to space? Fast Company wrote a great article about the software engineers who delivered that software for the Space Shuttle. Particularly noteworthy are the observations of Quinn Larson, 34, who had worked on shuttle software for seven years when he left to work for Micron Technology, automating the saws that cut finished chip wafers to the right size. “It was up to me to decide what to do,” says Larson. “There were no meetings, there was no record-keeping.” He had freedom; it was a real kick. But to Larson’s way of thinking, the culture didn’t focus on, well, the right stuff. “Speed there was the biggest thing,” he says. Larson eventually went back to the shuttle group. “The people here are just of the highest caliber,” he said on his first day back in Clear Lake.
After interviewing the shuttle team, Fast Company boiled the group's process down to four key principles that set it apart from other software teams:
1. The product is only as good as the plan for the product. At the on-board shuttle group, about one-third of the process of writing software happens before anyone writes a line of code. NASA and the Lockheed Martin group agree in the most minute detail about everything the new code is supposed to do — and they commit that understanding to paper, with the kind of specificity and precision usually found in blueprints. Nothing in the specs is changed without agreement and understanding from both sides. And no coder changes a single line of code without specs carefully outlining the change.
2. The software team is split into two separate groups: the coders and the verifiers. The two outfits report to separate bosses and function under opposing marching orders. The development group is supposed to deliver completely error-free code, so perfect that the testers find no flaws at all. The testing group is supposed to pummel away at the code with flight scenarios and simulations that reveal as many flaws as possible.
3. The software consists of the code and two enormous databases. There is the software. And then there are the databases beneath the software, two enormous databases, encyclopedic in their comprehensiveness. One is the history of the code itself — with every line annotated, showing every time it was changed, why it was changed, when it was changed, what the purpose of the change was, what specifications documents detail the change. Everything that happens to the program is recorded in its master history.
The other database — the error database — stands as a kind of monument to the way the on-board shuttle group goes about its work. Here is recorded every single error ever made while writing or working on the software, going back almost 20 years. For every one of those errors, the database records when the error was discovered; what set of commands revealed the error; who discovered it; what activity was going on when it was discovered — testing, training, or flight. It tracks how the error was introduced into the program; how the error managed to slip past the filters set up at every stage to catch errors — why wasn’t it caught during design? during development inspections? during verification? Finally, the database records how the error was corrected, and whether similar errors might have slipped through the same holes.
4. Don’t just fix the mistakes — fix whatever permitted the mistake. Importantly, the group avoids blaming people for errors. The process assumes blame – and it’s the process that is analyzed to discover why and how an error got through. At the same time, accountability is a team concept: no one person is ever solely responsible for writing or inspecting code. “You don’t get punished for making errors,” says Marjorie Seiter, a senior member of the technical staff. “If I make a mistake, and others reviewed my work, then I’m not alone. I’m not being blamed for this.”
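To make principles 3 and 4 a little more concrete, here is a minimal Python sketch of what a record in each of the two databases might look like, plus a small helper that asks the process-level question "which filter did errors slip past?" rather than "who made the error?". The article does not describe the actual schema or tooling, so every class, field, and function name below (`ChangeRecord`, `ErrorRecord`, `escapes_by_stage`, and so on) is an assumption made for illustration, and the sample entries are invented placeholders, not real shuttle data.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List


class Activity(Enum):
    """What was going on when an error was discovered."""
    TESTING = "testing"
    TRAINING = "training"
    FLIGHT = "flight"


class Stage(Enum):
    """Filters an error can slip past before it is caught."""
    DESIGN = "design"
    DEVELOPMENT_INSPECTION = "development inspection"
    VERIFICATION = "verification"


@dataclass
class ChangeRecord:
    """One entry in the code-history database (hypothetical schema)."""
    line_id: str          # which line of code changed
    changed_on: date      # when it was changed
    reason: str           # why it was changed
    spec_document: str    # specification that details the change


@dataclass
class ErrorRecord:
    """One entry in the error database (hypothetical schema)."""
    discovered_on: date
    discovered_by: str
    revealing_commands: str        # what set of commands revealed the error
    activity: Activity
    how_introduced: str
    escaped_stages: List[Stage] = field(default_factory=list)
    correction: str = ""
    similar_errors_possible: bool = False


def escapes_by_stage(errors: List[ErrorRecord]) -> Counter:
    """Count how many errors slipped past each stage's filter."""
    counts: Counter = Counter()
    for err in errors:
        counts.update(err.escaped_stages)
    return counts


# Invented placeholder entries, for illustration only.
sample_errors = [
    ErrorRecord(
        discovered_on=date(1996, 1, 15),
        discovered_by="verifier A",
        revealing_commands="placeholder command sequence",
        activity=Activity.TESTING,
        how_introduced="ambiguous wording in a spec",
        escaped_stages=[Stage.DESIGN, Stage.DEVELOPMENT_INSPECTION],
    ),
    ErrorRecord(
        discovered_on=date(1996, 3, 2),
        discovered_by="trainer B",
        revealing_commands="placeholder command sequence",
        activity=Activity.TRAINING,
        how_introduced="off-by-one in a table lookup",
        escaped_stages=[Stage.DEVELOPMENT_INSPECTION, Stage.VERIFICATION],
    ),
]

if __name__ == "__main__":
    for stage, count in escapes_by_stage(sample_errors).most_common():
        print(f"{stage.value}: {count} escaped error(s)")
```

The point of the sketch is the shape of the analysis: because each record stores the stages an error escaped rather than a person to blame, the natural aggregate is a per-stage count that shows which part of the process needs fixing.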