Recovery Oriented Computing (ROC)
Recovery Oriented Computing (ROC) takes the perspective that hardware faults, software bugs, and operator errors are facts to be coped with, not problems to be solved. By concentrating on Mean Time to Repair (MTTR) rather than Mean Time to Failure (MTTF), ROC reduces recovery time and thus offers higher availability. Since a large portion of system administration consists of dealing with failures, ROC may also reduce total cost of ownership. ROC principles include design for fast recovery, extensive error detection and diagnosis, systematic error insertion to test emergency systems, and recovery benchmarks to measure progress.
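The link between repair time and availability can be made concrete with the standard steady-state definition, availability = MTTF / (MTTF + MTTR). The sketch below is illustrative only: the function names and the hour figures are assumptions chosen for the example, not measurements from any real system. It suggests that cutting repair time by a factor of ten buys the same availability as making failures ten times rarer.

```python
import math

HOURS_PER_YEAR = 8760.0  # 365 days; leap years ignored for simplicity


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


def annual_downtime_hours(mttf_hours: float, mttr_hours: float) -> float:
    """Expected hours of downtime per year at the given MTTF and MTTR."""
    return (1.0 - availability(mttf_hours, mttr_hours)) * HOURS_PER_YEAR


# Hypothetical figures, chosen only to illustrate the trade-off.
baseline = availability(mttf_hours=1000.0, mttr_hours=10.0)
longer_mttf = availability(mttf_hours=10000.0, mttr_hours=10.0)  # 10x fewer failures
faster_repair = availability(mttf_hours=1000.0, mttr_hours=1.0)  # 10x faster recovery

if __name__ == "__main__":
    print(f"baseline:      {baseline:.6f}")
    print(f"longer MTTF:   {longer_mttf:.6f}")
    print(f"faster repair: {faster_repair:.6f}")
    # The two improvements give the same availability up to rounding.
    print(math.isclose(longer_mttf, faster_repair))
```

Under these assumed numbers both improvements raise availability by the same amount, which is the sense in which a focus on repair time can substitute for a focus on failure avoidance.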
If we embrace availability and maintainability, systems of the future may compete on recovery performance rather than just processor performance, and on total cost of ownership rather than just system price. Such a change may restore our pride in the systems we craft.