Review of "Improving the Reliability of Commodity Operating Systems"

From: Song Xue (speedsx_at_hotmail.com)
Date: Tue Jan 20 2004 - 21:07:33 PST

  • Next message: Steve Arnold: "Review: Reliabilty of Commodity OSes"

    The paper "Improving the Reliability of Commodity Operating Systems" by Swift et al. describes Nooks, a reliability subsystem that aims to greatly improve OS reliability by isolating the kernel from driver failures. To reduce the threat of extension failures, Nooks executes each extension in a lightweight kernel protection domain: a privileged kernel-mode environment with restricted write access to kernel memory.
    Three observations motivated the Nooks research: computer system reliability remains a crucial but unsolved problem; OS extensions have become increasingly prevalent in commodity systems such as Linux and Windows; and extensions are a leading cause of operating system failure. The Nooks architecture is based on two core principles: design for fault resistance, not fault tolerance; and design for mistakes, not abuse. It pursues three major goals: isolation, recovery, and backward compatibility.
    Implementation-wise, Nooks creates a new operating system reliability layer that is inserted between the extensions and the OS kernel. The reliability layer intercepts all interactions between the extensions and the kernel to facilitate isolation and recovery. A crucial property of this layer is transparency, i.e., to meet the compatibility goal, it must be largely invisible to existing components.
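    The interposition idea can be sketched in a few lines of Python. This is a toy user-space illustration, not Nooks code: the names ExtensionFault, ReliabilityLayer, register, call, and flaky_driver_read are all hypothetical, and a Python exception stands in for a fault detected inside an extension.

```python
class ExtensionFault(Exception):
    """Stands in for a fault detected inside an extension (assumption)."""


class ReliabilityLayer:
    """Toy layer sitting between a 'kernel' caller and its extensions."""

    def __init__(self):
        self._extensions = {}   # name -> callable entry point
        self.recoveries = []    # names of extensions whose faults were contained

    def register(self, name, entry_point):
        self._extensions[name] = entry_point

    def call(self, name, *args):
        """Route a kernel-to-extension call through the layer.

        Returns (ok, result). A fault is contained here instead of
        propagating to the caller.
        """
        try:
            return True, self._extensions[name](*args)
        except ExtensionFault:
            # Isolation: the fault stops at the layer boundary.
            # Recovery: a real system would unload and restart the
            # extension; this sketch only records the event.
            self.recoveries.append(name)
            return False, None


def flaky_driver_read(block):
    """Hypothetical driver entry point that faults on negative blocks."""
    if block < 0:
        raise ExtensionFault("bad block number")
    return f"data@{block}"


layer = ReliabilityLayer()
layer.register("disk", flaky_driver_read)
ok_result = layer.call("disk", 7)     # normal call passes through
bad_result = layer.call("disk", -1)   # fault is contained by the layer
```

    Unlike this sketch, where the caller invokes layer.call explicitly, the review notes that the real layer must be largely invisible to existing components.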
    To test the reliability of Nooks, the authors ran synthetic fault injection, which rapidly inserts faults into Linux kernel extensions, against three types of extensions: device drivers, a kernel subsystem, and an application-specific kernel extension. Nooks eliminated 99% of the system crashes that occurred with native Linux. In addition, Nooks eliminated nearly 60% of the non-fatal extension failures caused by the fault injection trials.
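    For readers who want to see how an "eliminated N% of crashes" figure is computed, a minimal sketch follows. The function name fraction_eliminated is hypothetical, and the counts in the example are made up for illustration; they are not the paper's data.

```python
def fraction_eliminated(baseline_failures, remaining_failures):
    """Fraction of baseline failures that no longer occur with isolation.

    baseline_failures:  failures observed on the unprotected system
    remaining_failures: failures still observed with isolation enabled
    """
    if baseline_failures <= 0:
        raise ValueError("baseline_failures must be positive")
    if not 0 <= remaining_failures <= baseline_failures:
        raise ValueError("remaining_failures must be within [0, baseline]")
    return (baseline_failures - remaining_failures) / baseline_failures


# Hypothetical counts, NOT taken from the paper: 200 crashes on the
# unprotected system, 2 of which still occur with isolation enabled.
example = fraction_eliminated(200, 2)
```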
    The performance impact of Nooks varies with the type of extension tested. For sound and Ethernet drivers it is minimal; overhead is higher for other extensions. Overall, Nooks provides a substantial reliability improvement at a cost that depends on the extensions being isolated. The reliability/performance tradeoff can thus be made on a case-by-case basis. For many computing environments, the authors believe it is worth it.


