From: Sellakumaran (ksella_at_hotmail.com)
Date: Wed Jan 21 2004 - 22:41:35 PST
This paper describes Nooks, a reliability subsystem that attempts to improve
the reliability of Linux by isolating the OS from failures in its
extensions. The approach is simple and effective, and the paper describes it
in simple, readable language. The paper quickly catches the reader's
attention by stating some interesting facts about Windows XP and Linux and
by setting a practical goal for the problem. In today's popular OSes, a
fault in an OS extension crashes the whole system. Nooks isolates such
failures by running extensions in a lightweight kernel protection domain.
The Nooks architecture is based on two core principles:
1) Design for fault resistance, not fault tolerance
2) Design for mistakes, not abuse
With these principles, the Nooks architecture has the following goals:
1) Isolation (of failures)
2) Recovery
3) Backward compatibility
These goals are achieved by creating a reliability layer that separates
kernel extensions from the kernel.
Isolation is achieved by:
1) Lightweight protection domains
2) Extension Procedure Calls (XPC)
3) Copy-in/copy-out
4) Wrappers
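
To make the copy-in/copy-out idea concrete, here is a rough sketch of my own
(it is not code from the paper, and Nooks itself works inside the Linux
kernel, not in Python). The names xpc_call, good_driver and buggy_driver are
hypothetical. The wrapper hands the extension a private copy of a kernel
object and copies the result back only if the call returns normally:

```python
import copy

def xpc_call(extension_fn, kernel_obj):
    """Hypothetical wrapper: run extension_fn on a private copy of kernel_obj."""
    shadow = copy.deepcopy(kernel_obj)        # copy-in
    try:
        result = extension_fn(shadow)         # extension only sees the copy
    except Exception as fault:
        return ("fault", repr(fault))         # kernel_obj is left untouched
    kernel_obj.clear()
    kernel_obj.update(shadow)                 # copy-out on success
    return ("ok", result)

def good_driver(buf):
    buf["len"] = 4
    return 0

def buggy_driver(buf):
    buf["len"] = -1                           # corrupts its copy, then faults
    raise RuntimeError("simulated extension fault")

packet = {"len": 0}
status_ok = xpc_call(good_driver, packet)
len_after_good = packet["len"]
status_bad = xpc_call(buggy_driver, packet)
len_after_bad = packet["len"]
```

After the buggy call the kernel-side packet still holds the value written by
the good driver, which is the point of isolating the extension's writes.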
Recovery in Nooks consists of two parts. After a fault occurs, the recovery
manager releases the resources in use by the extension, and the user-mode
agent coordinates recovery and determines what course of action to take.
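
The two-part split can be sketched as follows. This is only my illustration
under stated assumptions: the class RecoveryManager, the function
user_mode_agent and the restart-or-disable policy are hypothetical, since
the review only says that the agent decides what course of action to take.

```python
class RecoveryManager:
    """Hypothetical kernel-side part: knows what each extension holds."""

    def __init__(self):
        self.held = {}                        # extension name -> resources

    def acquire(self, ext, resource):
        self.held.setdefault(ext, []).append(resource)

    def release_all(self, ext):
        # Give back everything the failed extension was holding.
        return self.held.pop(ext, [])

def user_mode_agent(fault_count, max_restarts=3):
    """Hypothetical policy: restart a few times, then give up."""
    return "restart" if fault_count <= max_restarts else "disable"

def handle_fault(manager, ext, fault_count):
    freed = manager.release_all(ext)          # part 1: release resources
    action = user_mode_agent(fault_count)     # part 2: choose an action
    return freed, action

manager = RecoveryManager()
manager.acquire("sound_driver", "irq 5")
manager.acquire("sound_driver", "dma buffer")
first = handle_fault(manager, "sound_driver", fault_count=1)
later = handle_fault(manager, "sound_driver", fault_count=4)
```

Keeping the policy in a separate function mirrors the idea that the decision
is made outside the part that cleans up.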
Nooks achieves transparency for extensions through:
1) Wrapper stubs for every function call in the extension-kernel
interface
2) Object-tracking code for every object type that passes between the
extension and the kernel
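
As a rough picture of how a wrapper stub and an object tracker fit together,
here is a small sketch of my own. ObjectTracker, wrapper_stub and
register_device are hypothetical names, and the real mechanism operates on
kernel data structures. The stub keeps the original function's name and
arguments, so the caller does not need to change, and it records every
object that crosses the interface by type:

```python
import functools

class ObjectTracker:
    """Hypothetical tracker: remembers objects that cross the interface."""

    def __init__(self):
        self.by_type = {}                     # type name -> objects seen

    def record(self, obj):
        self.by_type.setdefault(type(obj).__name__, []).append(obj)

def wrapper_stub(tracker):
    """Build a stub that tracks arguments, then calls the real function."""
    def decorate(fn):
        @functools.wraps(fn)                  # keep the original name
        def stub(*args):
            for arg in args:
                tracker.record(arg)
            return fn(*args)
        return stub
    return decorate

tracker = ObjectTracker()

@wrapper_stub(tracker)
def register_device(name, irq):
    return f"{name}@{irq}"

result = register_device("eth0", 11)
```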
The paper then describes the test methodology used to demonstrate the
system's reliability, which is excellent: automatic recovery from 99% of
faults due to modules in Linux. The experiments use synthetic fault
injection to rapidly insert faults into Linux kernel extensions.
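
The general shape of such an experiment can be sketched like this. It is a
toy of my own, not the paper's injector: the fault names and the functions
run_with_fault and campaign are hypothetical, and the rate it computes says
nothing about the 99% figure above. Each run injects one kind of fault into
a stand-in extension and the harness counts how many were contained:

```python
def run_with_fault(fault):
    """Hypothetical extension body that misbehaves according to `fault`."""
    if fault == "null_deref":
        return getattr(None, "field")         # raises AttributeError
    if fault == "bad_index":
        return [0][5]                         # raises IndexError
    if fault == "div_zero":
        return 1 // 0                         # raises ZeroDivisionError
    return "ok"

def campaign(faults):
    """Inject each fault once and return the fraction that was contained."""
    contained = 0
    for fault in faults:
        try:
            run_with_fault(fault)
        except Exception:
            contained += 1                    # the harness survived the fault
    return contained / len(faults)

rate = campaign(["null_deref", "bad_index", "div_zero"])
```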
Nooks does incur a performance hit, but considering the increased
reliability I think it does a wonderful job.
Nooks takes a practical approach with modest goals and achieves excellent
reliability.