Review of Nooks

From: Gail Rahn (grahn_at_cs.washington.edu)
Date: Wed Jan 21 2004 - 15:53:55 PST

    Review of Nooks

    This paper introduces Nooks, an OS subsystem that enhances reliability by
    isolating device drivers from other operating system resources. With Nooks,
    the running operating system is protected from device driver failures: a
    driver crash no longer jeopardizes the running environment. Nooks assumes
    that operating system extensions are generally well-intentioned: they are
    well-behaved but may fail due to errors in design and implementation. Nooks
    does not protect against an extension process directly addressing kernel
    memory. At about 20,000 lines of code, Nooks does not introduce extensive
    complexity into the operating system.

    Device drivers are faulty because they are, generally, written by
    third-party developers inexperienced in operating system design. These
    developers are good at interfacing with a specific piece of hardware, but
    not necessarily experienced at producing software with the reliability
    desired of operating-system components. For a solution to effectively handle
    and recover from device driver crashes, it must run efficiently and
    integrate into existing systems. Nooks is unique because it attempts to
    manage OS extension failures in existing systems, using existing drivers
    and the existing software development environment (C). Previous solutions
    force "newness" somewhere in the cycle - a new architecture, a new
    programming language, etc.

    The architecture of Nooks isolates the kernel from driver failures. Nooks
    implements an isolation manager (NIM) that handles isolation, tracking and
    recovery by intercepting all communication between the kernel and its
    extensions. Nooks implements the Extension Procedure Call (XPC) to manage
    this control transfer. By managing extension/kernel communication in much
    the way an RPC system marshals and transports calls, Nooks is able to
    capture and track the calls made between the OS and the extension. Nooks
    also interposes on the extension's references to kernel data structures.
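
    The interposition idea can be illustrated with a toy user-space model. This
    is only a sketch in Python, not Nooks code (Nooks is written in C inside
    the kernel): the IsolationManager class, its xpc method, the ExtensionFault
    exception and the two driver functions are hypothetical names invented for
    this example. It shows a wrapper that records every cross-boundary call and
    turns any failure inside the extension into one well-defined fault.

```python
# Toy user-space model of call interposition; NOT Nooks code.
# Every name below is hypothetical and exists only for illustration.
from dataclasses import dataclass, field


class ExtensionFault(Exception):
    """Single fault type reported to the caller when an extension call fails."""


@dataclass
class IsolationManager:
    call_log: list = field(default_factory=list)

    def xpc(self, direction, func, *args):
        """Record the call, then transfer control across the boundary."""
        self.call_log.append((direction, func.__name__, args))
        try:
            return func(*args)
        except Exception as exc:
            # Convert any failure into one fault type the caller can handle.
            raise ExtensionFault(f"{func.__name__} failed: {exc!r}") from exc


def good_driver_send(packet):
    return len(packet)


def buggy_driver_send(packet):
    return packet[100]  # out-of-range access standing in for a driver bug


if __name__ == "__main__":
    nim = IsolationManager()
    print(nim.xpc("kernel->ext", good_driver_send, b"abc"))
    try:
        nim.xpc("kernel->ext", buggy_driver_send, b"abc")
    except ExtensionFault as fault:
        print("contained:", fault)
    print(len(nim.call_log), "calls tracked")
```

    Because every call passes through one choke point, the log of calls is
    available to whatever recovery logic runs after a fault.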

    Because Nooks tracks function calls and data structure references between
    the operating system and its extensions, Nooks can attempt to recover
    resources in use by an extension at the time of its failure.
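
    A minimal sketch of why tracking makes recovery possible, again as a toy
    Python model rather than anything taken from Nooks: the ResourceTracker
    class and the resource names are assumptions made for this example. Each
    acquisition is recorded together with a function that undoes it, so that
    after a failure everything the extension still holds can be released.

```python
# Toy model of per-extension resource tracking; NOT Nooks code.
# Class, method and resource names are hypothetical.
class ResourceTracker:
    def __init__(self):
        self._held = {}  # extension name -> list of (resource, release function)

    def acquire(self, extension, resource, release_fn):
        """Record that `extension` holds `resource` and how to give it back."""
        self._held.setdefault(extension, []).append((resource, release_fn))

    def release(self, extension, resource):
        """Normal path: the extension returns one resource itself."""
        entries = self._held.get(extension, [])
        for i, (held, release_fn) in enumerate(entries):
            if held == resource:
                release_fn(held)
                del entries[i]
                return
        raise KeyError(resource)

    def recover(self, extension):
        """Failure path: release everything still held, newest first."""
        entries = self._held.pop(extension, [])
        for resource, release_fn in reversed(entries):
            release_fn(resource)
        return [resource for resource, _ in entries]


if __name__ == "__main__":
    freed = []
    tracker = ResourceTracker()
    tracker.acquire("netdrv", "irq 11", freed.append)
    tracker.acquire("netdrv", "buffer A", freed.append)
    tracker.release("netdrv", "irq 11")       # driver frees this one normally
    print("reclaimed after crash:", tracker.recover("netdrv"))
```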

    An interesting part of Nooks' recovery is its set of configuration files,
    which determine how to recover from a specific extension failure. Nooks
    knows the default recovery policy for an extension, which can include
    notifying the system manager. Nooks can also notice when a driver fails too
    frequently and escalate the recovery policy, up to and including disabling
    the extension.
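
    The escalation behaviour described above might look roughly like the
    following. This is a hedged sketch, not the Nooks configuration format: the
    RecoveryPolicy class, the "restart" and "disable" action names, and the
    threshold and time-window parameters are all assumptions chosen for
    illustration.

```python
# Toy model of an escalating recovery policy; NOT the Nooks config format.
# Action names, the threshold and the time window are hypothetical.
import time
from collections import deque


class RecoveryPolicy:
    def __init__(self, default_action="restart", max_failures=3,
                 window_seconds=60.0, notify=print):
        self.default_action = default_action
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.notify = notify          # stands in for system manager notification
        self.failures = deque()       # timestamps of recent failures
        self.disabled = False

    def on_failure(self, now=None):
        """Return the action to take for a failure observed at time `now`."""
        now = time.monotonic() if now is None else now
        self.failures.append(now)
        # Forget failures that fall outside the sliding window.
        while now - self.failures[0] > self.window_seconds:
            self.failures.popleft()
        # Escalate once the extension has failed too often within the window.
        if self.disabled or len(self.failures) > self.max_failures:
            self.disabled = True
            action = "disable"
        else:
            action = self.default_action
        self.notify(f"extension failed; action={action}")
        return action


if __name__ == "__main__":
    policy = RecoveryPolicy()
    for t in (0.0, 1.0, 2.0, 3.0):
        policy.on_failure(now=t)
```

    With the defaults assumed here, three failures inside a minute are handled
    by the default action and a fourth escalates to disabling the extension.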

    A compelling point about this transparent protection mechanism is how
    widely its benchmark results vary. Nooks has a different performance
    footprint for each protected extension. Using Nooks, the performance of
    some extensions dropped by less than 10%. With others, including a
    kernel-mode web server, the performance penalty was almost 60%. That
    variability implies that the decision to protect a specific kernel
    extension using Nooks should be made on a case-by-case basis and after
    thorough testing. Couldn't this performance penalty be a drawback for
    acceptance? Is there any evidence that the extensions that are expensive to
    protect are precisely the ones that should be protected?

    Gail Rahn
    grahn_at_cs.washington.edu

