Review of Nooks paper - Swift et al...

From: Prasanna Kumar Jayapal (prasak_at_winse.microsoft.com)
Date: Mon Jan 26 2004 - 13:36:22 PST

  • Next message: Slavik Krassovsky: "Mike Swift, Brian Bershad, and Henry Levy. Improving the Reliability of Commodity Operating Systems."

     

    This paper, "Improving the Reliability of Commodity Operating Systems", describes the architecture, implementation and performance of Nooks, a new OS subsystem that isolates the OS from failures in drivers and other kernel extensions. Nooks aims to significantly reduce the OS crashes caused by extension failures. The authors clearly motivate the need for Nooks, describe its design and implementation, and close with a summary of their testing and performance numbers.

     

    Nooks executes each extension in a lightweight kernel protection domain, a privileged kernel-mode environment with restricted write access to kernel memory. It also inserts a reliability layer between the OS kernel and the extensions, which provides isolation, interposition, object tracking and recovery. Extensions (device drivers, for example) communicate with the kernel through the Nooks layer using XPC (Extension Procedure Call), which is somewhat similar to LRPC and PPC. The Nooks layer is largely transparent to the extensions, so they need little or no code change. Object tracking records the kernel objects each extension uses, so that its changes can be rolled back if the extension fails.
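
    To make the interplay of interposition, object tracking and recovery concrete for myself, I find a toy model helpful. The Python sketch below is not Nooks code and does not follow the paper's interfaces: ToyKernel, ReliabilityLayer, alloc_for_extension, xpc and the two driver functions are hypothetical names, a Python exception stands in for a driver fault, and nothing here models hardware memory protection.

```python
class ToyKernel:
    """Stand-in for kernel state that an extension can allocate from."""

    def __init__(self):
        self.live_objects = set()
        self._next_id = 0

    def alloc(self):
        self._next_id += 1
        self.live_objects.add(self._next_id)
        return self._next_id

    def free(self, obj_id):
        self.live_objects.discard(obj_id)


class ReliabilityLayer:
    """Hypothetical layer between the toy kernel and one extension."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.tracked = []    # object tracking: what the extension holds
        self.recoveries = 0

    def alloc_for_extension(self):
        # Interposition: the kernel call goes through a wrapper,
        # which records the object before handing it to the extension.
        obj_id = self.kernel.alloc()
        self.tracked.append(obj_id)
        return obj_id

    def xpc(self, extension_fn, *args):
        # Call into the extension; a failure triggers recovery
        # instead of propagating into the "kernel".
        try:
            return extension_fn(self, *args)
        except Exception:
            self.recover()
            return None

    def recover(self):
        # Release every tracked object, newest first.
        while self.tracked:
            self.kernel.free(self.tracked.pop())
        self.recoveries += 1


def good_driver(layer, n):
    return [layer.alloc_for_extension() for _ in range(n)]


def buggy_driver(layer, n):
    for _ in range(n):
        layer.alloc_for_extension()
    raise RuntimeError("simulated driver fault")


if __name__ == "__main__":
    kernel = ToyKernel()
    layer = ReliabilityLayer(kernel)
    layer.xpc(good_driver, 2)
    print(sorted(kernel.live_objects))   # [1, 2]
    layer.xpc(buggy_driver, 3)
    print(sorted(kernel.live_objects))   # []
```

    In this toy, recovery releases everything the extension ever obtained through the wrapper, including objects from earlier successful calls. That is a simplification chosen for the sketch, roughly corresponding to unloading the failed extension as a whole.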

     

    The results presented by the authors are very impressive. Recovering automatically from 99% of the injected faults that would otherwise crash Linux is appealing. The performance penalty of around 10% for common drivers such as sound cards and Ethernet cards also seems acceptable. kHTTPd, where the performance loss is most significant, struck me as an unusual case, since a web server is rarely placed in the kernel.

     

    Although Nooks does not detect all faults and is not a fault-tolerant system, it addresses the reliability problem with little or no change to existing device drivers and only minimal changes to the OS kernel. This makes it very practical and easy to implement.

     

    Overall, this was a thought-provoking and interesting paper to read. I am really curious to see whether Windows can make use of such a system.

     



    This archive was generated by hypermail 2.1.6 : Mon Jan 26 2004 - 13:36:20 PST