Review of "Improving the Reliability of Commodity Operating Systems"

From: Jeff Duzak (jduzak_at_exchange.microsoft.com)
Date: Tue Jan 20 2004 - 22:32:38 PST

    In this paper, Swift et al. describe Nooks, a system for improving the reliability of existing operating systems by isolating kernel extensions and recovering from their failures. The paper covers the motivation for the system, its design and implementation, and the results of both reliability and performance testing.
     
    The motivation for the system is clear: a huge proportion of system crashes in commercial operating systems such as Windows and Linux are caused by bugs in kernel extensions such as device drivers. These extensions are often written by external parties who lack detailed knowledge of the inner workings of the kernel, and are therefore more prone to bugs. Further, in order to interact with the hardware, this code has to execute in kernel mode. Therefore, if it crashes, the system crashes.
     
    The Nooks system tries to prevent system crashes by creating protected domains in which to execute kernel extensions. Unlike traditional protected subsystems, these protected domains do not enforce any formal protection mechanisms such as capabilities or ACLs, but instead essentially keep track of an extension's interaction with the kernel and with data contained within the kernel. Nooks tries to detect faults in the extension's interaction with the kernel. Further, Nooks tries to gracefully handle occasions in which an exception is thrown from within the extension. In either case, Nooks tries to release data held by the extension, unload the extension, and then reload and restart the extension.
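     
    To make the recovery sequence concrete, here is a toy user-space sketch in Python. It is not how Nooks is implemented (Nooks works inside the kernel); it only mirrors the steps described above: track what the extension has been handed, catch a fault raised inside it, release the tracked data, then reload and restart it. The names ProtectionDomain, Buffer, and FlakyDriver are all hypothetical and exist only for this illustration.

```python
class Buffer:
    """Hypothetical stand-in for kernel data handed to an extension."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


class ProtectionDomain:
    """Toy model of a protected domain (an assumption, not Nooks' real API).

    It tracks resources given to the extension and, when a call into the
    extension raises, releases them and reloads a fresh extension instance.
    """

    def __init__(self, load_extension):
        self._load = load_extension      # factory returning a fresh extension
        self.extension = load_extension()
        self.tracked = []                # resources the extension holds
        self.restarts = 0

    def track(self, resource):
        """Record a resource handed to the extension so it can be reclaimed."""
        self.tracked.append(resource)
        return resource

    def call(self, name, *args):
        """Invoke a method on the extension; recover if it faults."""
        try:
            return getattr(self.extension, name)(*args)
        except Exception:
            self._recover()
            return None                  # the failed request is not retried

    def _recover(self):
        # Release everything the extension held, newest first.
        for resource in reversed(self.tracked):
            resource.release()
        self.tracked.clear()
        # Unload the faulty instance by replacing it with a fresh one.
        self.extension = self._load()
        self.restarts += 1


class FlakyDriver:
    """Hypothetical driver whose first loaded instance always faults."""

    instances = 0

    def __init__(self):
        FlakyDriver.instances += 1
        self.generation = FlakyDriver.instances

    def read(self):
        if self.generation == 1:
            raise RuntimeError("simulated driver bug")
        return "data"
```

    With a ProtectionDomain built around FlakyDriver, the first call("read") returns None, the tracked buffer is released, and the driver is reloaded; the second call("read") returns "data". The caller sees a failed request rather than a crash, which is the behavior the paper is after.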
     
    The authors state in the very beginning that unlike protection systems described in other papers, this system is not intended to be foolproof or tamperproof. It is designed to handle as many cases as it can without affecting performance too much. The system is designed to be very practical, and unlike the other systems we have read about, this one seems as though it could very easily be marketed commercially.
     
    One interesting consideration with this system is that, although it may not require any code changes in the kernel extensions that it wraps, it does require those extensions to be recompiled (for example, the authors mention changing macros and inline functions in the extensions). This might be a problem for kernel extensions whose source code is not freely distributed. For Windows device drivers, it would require every company that writes device drivers to adopt the system.
     
    When discussing the performance of the system, the authors twice refer to the impressive performance of modern processors. This makes me a little wary; it is very similar to the "memory is cheap" excuse.

