At work, you've created a service that half the company uses, and on the weekend you go camping the service crashes and nobody knows how to restart it. Half of the company is now in a failure mode. A failure mode is a degradation of quality, but it is not yet a catastrophe. In your personal life and in your work, you should always think about what kind of quality you'll be limping along with if some component or assumption were to fail. If you find that quality is unpalatable, then it's time to go back to the drawing board and try again.

Fault recovery and insurance policies

Back at home--cold, shivering, and with the neighbors watching--you grope around under the doormat and find the spare key you've hidden there. You've now recovered from the fault and exited the failure mode. And at the office, your service is being monitored by some kind of watchdog process that detects the crash and restarts the service.

The spare key and the watchdog process are insurance policies, but they should not be confused with the quality of your failure mode. You enter failure mode when a component fails, but insurance policies are just another kind of component. Maybe the spare key wasn't there this time, or maybe the watchdog itself crashed. Either way, you're still in a condition of degraded quality. Having an insurance policy doesn't absolve you from engineering as much quality as possible into each level of failure mode. All that you've done is create more possible failure modes.

Engineering quality into each failure mode

Having learned your lesson, you now either get dressed and put on a coat before going outside to fetch the paper, or you switch to an online subscription, or you defer reading the newspaper until after you're dressed and on the train for work. You've now either eliminated or changed the quality of the failure mode to something more acceptable. At work it's not likely to be so simple, but consider some of the general techniques below as a starting point.
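
As one concrete illustration of the watchdog idea described above, here is a minimal sketch in Python. It is not taken from any particular system: the function name `supervise` and the parameters `max_restarts` and `delay` are hypothetical, chosen only for this example. The sketch restarts a crashed process a limited number of times and then reports that it has given up, which makes the point that the insurance policy can itself run out, leaving you in the failure mode.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.0):
    """Run cmd and restart it after each crash, up to max_restarts times.

    Returns (status, starts). status is "clean_exit" if the process
    eventually exits with code 0, or "gave_up" once the restart budget
    is spent. starts counts every launch, including the first one.
    (Hypothetical helper for illustration only.)
    """
    starts = 0
    while True:
        starts += 1
        code = subprocess.run(cmd).returncode
        if code == 0:
            return "clean_exit", starts
        if starts > max_restarts:
            # The insurance policy is exhausted: the service is still down,
            # so the caller must decide what degraded quality looks like.
            return "gave_up", starts
        time.sleep(delay)  # brief pause before restarting


if __name__ == "__main__":
    # A stand-in "service" that always crashes with exit code 1.
    always_crashes = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(always_crashes, max_restarts=2))
```

Note that the `"gave_up"` result is returned rather than hidden. Whatever calls the watchdog still has to choose an acceptable degraded behavior, such as paging someone or serving stale data, because the watchdog is only one more component that can fail.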
Every system has failure modes, all the way from the trivialities of your personal life to the global economy, and the truth is that we are always operating in at least one failure mode all of the time. My car's suspension needs work; we just lost an employee who walked out with a lot of unwritten knowledge; access to credit has dried up and the economy is shrinking. And yet my car still runs, the company is still in business, and we can still buy and sell things. We just cope with the degradation of quality and work to improve it somehow.