The current plight of Toyota points out a critical aspect of software development that is frequently overlooked (though I would suggest that Alistair Cockburn has been trying to raise awareness of this in his book, Crystal Clear). Software comes in many shapes and sizes, and quality is always important. But when we couple software quality with circumstances where life can be lost, our attention to detail has to be significantly sharper. Did the various production departments in Toyota realize that, as software becomes more and more ubiquitous in the operation of their vehicles, more and more of it becomes “mission-critical”?
We’ve spent many years in the fledgling software development industry getting products to market by shouting, pushing, cutting corners, working overtime, and shorting our testing. At what point, however, does the mission-criticality of software begin to override contemporary methods of corner cutting? Clearly, Toyota missed that key point when software became a critical element in their accelerator and brake control.
In Agile Development, specifically in Scrum, we focus on DONEness definitions, constant collaboration, constant integration and testing, and frequent software reviews to ensure that our software quality is high and our customers are satisfied with the software functionality they receive. While I may be repeating what Mr. Cockburn has already suggested, it clearly bears repeating.
DONEness definitions and automated testing are only as good as we choose to make them. If your DONEness definition is kept short because it’s too much trouble to get it all done during the Sprint, and your tests don’t always work or you run them only every three or four days because it takes too much time to run them, you’ll end up with poor quality in your software (and you’ll have even less time to get it right next time while you’re fixing more defects). On the other hand, if we load our DONEness definition with everything we could possibly think of, and set very high bars of code and function coverage for our tests, we can significantly improve software quality, but with a significant increase in overall cost and duration: our software becomes too expensive for the market. In other words, one size of DONEness, one size of quality, does not fit all.
We need to start looking closely at the software capabilities we are creating and determine when those functions are life-threatening (i.e., a failure of the software could result in injuries or fatalities). The US FDA (Food and Drug Administration) has recognized this for years, maintaining a classification standard that has governed the development of (mostly) hardware medical devices. Class 1 devices have a minimal potential to cause harm and are thus produced and regulated under very simple rules.
Class 2 medical devices, however, operate under more strict controls. These devices are not invasive (they are not used in a patient) but could cause harm if used improperly or if they fail. Class 3 medical devices are invasive devices, like pacemakers or valve replacements, that can cause serious injury or death should they fail.
The FDA has been primarily focused on hardware in the past. However, its focus has increasingly been turning to software as well. Whether or not this strict regulatory control should be extended to more kinds of software is a matter for debate. What should not be a matter of debate, however, is whether we, as developers, should extend this rigour to our own development efforts.
The concept needn’t be complicated or extensive — it could be modeled after the FDA’s approach:
- Class 1 features – failure doesn’t result in injury or death, nor will it result in more than minor financial loss. Our DONEness criteria and testing approach should address common failure and boundary conditions. Code coverage should be in the neighborhood of 85%, all features should be reasonably well tested with automated tests, and test-gap analysis should be conducted against any critical defects to ensure, if possible, that errors don’t recur.
- Class 2 features – failure or misuse could result in injury or some degree of financial loss. Our DONEness criteria and testing approach should address common failure and boundary conditions as well as specific performance criteria and risk conditions that may occur if the software is misused by the user. Code coverage should be in the neighborhood of 95%, all features should be rigorously tested with automated and manual tests, and test-gap analysis should be conducted against any defects to ensure that, if possible, errors cannot recur.
- Class 3 features – failure or misuse has a high likelihood of causing injury, death, or significant financial loss. Our DONEness criteria and testing approach should address common failure and boundary conditions as well as stringent performance criteria, risk conditions, and failure modes (how the software undergoes controlled degradation under failure conditions). Code coverage should be in the neighborhood of 100%, and all features (including failure modes) should be rigorously tested with automated and manual tests.
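To make the three classes concrete, here is a minimal Python sketch of how a team might encode them as a DONEness gate. Everything named here (`FeatureClass`, `DonenessPolicy`, `unmet_criteria`) is hypothetical and invented for illustration. The coverage figures are the "neighborhood" numbers from the list above, treated as hard cutoffs only to keep the example short.

```python
from dataclasses import dataclass
from enum import IntEnum


class FeatureClass(IntEnum):
    """Hypothetical feature classes, loosely modeled on the list above."""
    CLASS_1 = 1  # failure causes at most minor financial loss
    CLASS_2 = 2  # failure or misuse could cause injury or some financial loss
    CLASS_3 = 3  # failure or misuse is likely to cause injury, death, or major loss


@dataclass(frozen=True)
class DonenessPolicy:
    min_coverage: float        # approximate coverage target, 0.0 to 1.0
    manual_tests: bool         # manual testing required in addition to automated
    failure_mode_tests: bool   # controlled-degradation tests required


# Assumption: the "in the neighborhood of" figures are used as strict thresholds.
POLICIES = {
    FeatureClass.CLASS_1: DonenessPolicy(0.85, False, False),
    FeatureClass.CLASS_2: DonenessPolicy(0.95, True, False),
    FeatureClass.CLASS_3: DonenessPolicy(1.00, True, True),
}


def unmet_criteria(feature_class, coverage, has_manual_tests, has_failure_mode_tests):
    """Return a list of DONEness criteria the feature does not yet meet."""
    policy = POLICIES[feature_class]
    unmet = []
    if coverage < policy.min_coverage:
        unmet.append(
            f"coverage {coverage:.0%} is below the {policy.min_coverage:.0%} target"
        )
    if policy.manual_tests and not has_manual_tests:
        unmet.append("manual tests are required")
    if policy.failure_mode_tests and not has_failure_mode_tests:
        unmet.append("failure-mode tests are required")
    return unmet


if __name__ == "__main__":
    # A web-page feature at 90% coverage passes as Class 1 ...
    print(unmet_criteria(FeatureClass.CLASS_1, 0.90, False, False))
    # ... but the same evidence falls short for a Class 3 feature.
    print(unmet_criteria(FeatureClass.CLASS_3, 0.90, True, False))
```

The point of the sketch is only that the class of a feature, not the team's mood during the Sprint, decides how much evidence is needed before the feature counts as done.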
In Toyota’s situation, an evaluation should have shown that the software features that interfaced with the acceleration and braking systems were Class 3 features. That classification would have called for failure-mode testing, which I suspect wasn’t done for the kind of fault that Southern Illinois University professor Dave Gilbert demonstrated when he introduced a short circuit in the acceleration system. The short circuit kept the computer from identifying an error condition, which resulted in uncontrolled acceleration.
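As a rough illustration of what failure-mode testing means for a feature like this, consider the Python sketch below. It is not Toyota’s design, and it does not reproduce Professor Gilbert’s experiment. The two-sensor arrangement, the voltage ranges, the offset, and every name in it are assumptions made up for the example. It shows a controller that falls back to a safe state when its inputs are implausible, and checks that deliberately inject faults to confirm that it does.

```python
IDLE = 0.0  # hypothetical fail-safe throttle demand


def throttle_demand(sensor_a, sensor_b,
                    expected_offset=0.8, tolerance=0.2,
                    v_min=0.5, v_max=4.5):
    """Return throttle demand in [0, 1], or IDLE if the readings are implausible.

    Assumption: two redundant pedal sensors report voltages, and a healthy
    sensor_b always reads about `expected_offset` volts above sensor_a.
    """
    # Failure mode 1: a reading outside the valid range (open circuit,
    # short to ground, short to supply).
    for volts in (sensor_a, sensor_b):
        if not (v_min <= volts <= v_max):
            return IDLE
    # Failure mode 2: the sensors disagree with their expected relationship
    # (for example, the two signal lines shorted together read the same value).
    if abs((sensor_b - sensor_a) - expected_offset) > tolerance:
        return IDLE
    # Normal operation: scale sensor_a's usable range onto [0, 1].
    span = (v_max - expected_offset) - v_min
    return min(1.0, max(0.0, (sensor_a - v_min) / span))


def run_failure_mode_checks():
    """Inject faults on purpose and confirm controlled degradation to IDLE."""
    assert throttle_demand(0.0, 0.8) == IDLE   # sensor A open or grounded
    assert throttle_demand(2.0, 5.0) == IDLE   # sensor B shorted to supply
    assert throttle_demand(2.5, 2.5) == IDLE   # signal lines shorted together
    # Sanity check: a healthy half-pedal reading still works.
    assert abs(throttle_demand(2.1, 2.9) - 0.5) < 1e-9


if __name__ == "__main__":
    run_failure_mode_checks()
    print("all failure-mode checks passed")
```

Real faults can be subtler than these three cases, which is exactly why a Class 3 feature deserves a deliberate inventory of failure modes rather than a handful of happy-path tests.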
When you don’t detect an error condition reading a customer record for display on a web site, people get annoyed, but they don’t die. This is an example of a class 1 feature. Miss an error condition on the acceleration system, however…
We must globally incorporate mission criticality into our feature design and testing. Toyota may already be on the hook for substantial government penalties (which will be followed by multiple wrongful death suits — perhaps even one or more class action suits).
Who wants to be next?