From: Praveen Rao (psrao_at_windows.microsoft.com)
Date: Wed Jan 21 2004 - 00:50:07 PST
This paper discusses an approach to making commodity operating systems
more reliable by managing the execution environment of kernel
extensions, namely drivers. It notes that driver-related bugs are a
major cause of system failures (85% in the case of XP).
The authors contrast their approach with some of the other approaches
taken to address this issue. These are:
* Hardware-based approaches: protection rings, capability-based
architectures, etc. The latter require hardware changes (currently
popular architectures do not use them), and in general these approaches
don't address recovery.
* Micro-kernel approach to isolation - separate address space for
drivers. These do not address recovery either. In general, perf concerns
keep commodity OSes from adopting this approach despite improvements in
IPC.
* Type-safe languages and runtime: these haven't found acceptance
for system code and require starting over.
In the past, virtual memory techniques have been used to protect
databases and file systems. Nooks, the system discussed in this paper,
attempts to extend that protection to the OS. Nooks takes the approach
of virtualizing the interface between the kernel and extensions.
It is designed for
* Fault resistance, not fault tolerance
* Protecting against mistakes, not malicious code
The authors state the following goals for Nooks:
1. Isolation
2. Recovery
3. Backward compatibility
To achieve these goals, Nooks performs the following functions:
1. Isolation of extensions: isolation of address space, with calls
to the kernel remoted through XPC (extension procedure call)
2. Interposition: integrating existing extensions into the Nooks
environment using interposition
3. Object Tracking: keeping track of data structures that the
extension touches
4. Recovery: on software faults (bad params, excessive resource
usage), hardware faults (processor generates exception)
Isolation:
Isolation has two aspects:
1) memory management - to implement lightweight protection domains with
virtual memory protection
2) XPC - to transfer control safely between extension and the kernel
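To make the two aspects concrete, here is a toy model, not the Nooks implementation. It assumes that any domain may read any page, that an extension may write only to pages it owns, and that the kernel may write anywhere. The `Machine` class and its method names are hypothetical.

```python
class ProtectionFault(Exception):
    """Raised when a domain writes to a page it does not own."""


class Machine:
    """Toy model of protection domains plus XPC (all names hypothetical)."""

    def __init__(self):
        self.owner = {}          # page number -> owning domain name
        self.data = {}           # page number -> stored value
        self.current = "kernel"  # domain that is currently executing

    def map_page(self, page, domain):
        self.owner[page] = domain
        self.data[page] = 0

    def read(self, page):
        # Assumption: reads are allowed from every domain.
        return self.data[page]

    def write(self, page, value):
        # Assumption: the kernel writes anywhere, an extension only
        # to the pages of its own domain.
        if self.current != "kernel" and self.owner[page] != self.current:
            raise ProtectionFault(f"{self.current} wrote to page {page}")
        self.data[page] = value

    def xpc(self, target, func, *args):
        # Switch to the target domain, run the call, and always switch
        # back, even if the callee faults.
        saved = self.current
        self.current = target
        try:
            return func(self, *args)
        finally:
            self.current = saved
```

In a real system the domain switch in `xpc` is where the page-table change, and therefore the TLB cost, would occur.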
Interposition:
* of control transfers - loader modification for kernel and
function pointer fixup for extension
* and of some data references - shadow copy optimization is used
for frequently touched data
* wrappers do the following 3 tasks:
1. parameter validation
2. provide call-by-value semantics
3. facilitate XPC
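The three wrapper tasks can be sketched roughly as follows. This is an illustration under my own assumptions, not code from the paper: `make_wrapper`, `BadParameter`, and the `validate` callback are hypothetical names, and a plain function call stands in for the XPC.

```python
import copy


class BadParameter(Exception):
    """Raised when a wrapper rejects the arguments of a call."""


def make_wrapper(target, validate):
    """Return a wrapper around `target` (hypothetical helper).

    validate(*args) should return True when the arguments are acceptable.
    """
    def wrapper(*args):
        # 1. Parameter validation.
        if not validate(*args):
            raise BadParameter(f"rejected arguments: {args!r}")
        # 2. Call-by-value: the callee gets private copies, so the
        #    caller's objects cannot be modified through the call.
        private = copy.deepcopy(args)
        # 3. Control transfer: a plain call stands in for the XPC here.
        return target(*private)
    return wrapper
```

With this shape, an extension that scribbles on an argument only damages its own copy, and the caller's original stays intact.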
Object Tracking:
* Manages writes to kernel objects done by extensions
* Differentiates between objects used within a single XPC call and
long-term objects. Most long-term objects follow a predictable pattern:
the extension's alloc/dealloc calls are known, and in some cases the
object's semantics are known
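A minimal sketch of this bookkeeping, assuming the tracker is told about every allocation, free, and per-call argument (for instance by the wrappers). The `ObjectTracker` class and its method names are hypothetical.

```python
from collections import defaultdict


class ObjectTracker:
    """Toy tracker for objects an extension holds (names hypothetical)."""

    def __init__(self):
        # extension -> {object name: callback that releases the object}
        self.long_term = defaultdict(dict)
        # extension -> names that are valid only during the current call
        self.per_call = defaultdict(list)

    def on_alloc(self, ext, name, release):
        # Long-term object: lives until the extension frees it.
        self.long_term[ext][name] = release

    def on_free(self, ext, name):
        self.long_term[ext].pop(name, None)

    def on_call_arg(self, ext, name):
        # Object passed for the duration of one XPC call.
        self.per_call[ext].append(name)

    def on_call_return(self, ext):
        # Per-call objects are forgotten as soon as the call returns.
        self.per_call[ext].clear()

    def release_all(self, ext):
        # Used after a failure: release everything the extension still holds.
        self.per_call.pop(ext, None)
        released = []
        for name, release in self.long_term.pop(ext, {}).items():
            release()
            released.append(name)
        return released
```

The split matters because per-call objects need no cleanup after the call returns, while long-term objects must be released explicitly if the extension fails.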
Recovery
The recovery manager releases resources when a failure occurs and
restarts the extension according to policy.
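As an illustration of "release, then restart based on policy", here is a small sketch. The policy shown (restart at most `max_restarts` times, then leave the extension disabled) is my own assumption and not taken from the paper. `RecoveryManager` and `handle_fault` are hypothetical names.

```python
class RecoveryManager:
    """Toy recovery manager with a restart-count policy (assumed policy)."""

    def __init__(self, max_restarts=1):
        self.max_restarts = max_restarts
        self.restarts = {}  # extension -> number of restarts so far

    def handle_fault(self, ext, release_resources, restart):
        # Always release what the failed extension was holding first.
        release_resources()
        count = self.restarts.get(ext, 0)
        if count >= self.max_restarts:
            # Policy decision: give up and leave the extension unloaded.
            return "disabled"
        self.restarts[ext] = count + 1
        restart()
        return "restarted"
```

The `release_resources` callback is where the information gathered by object tracking would be used.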
The paper states that Nooks works well when there is a narrow,
well-defined interface for interaction, which makes extensions a good
fit for Nooks. Extensions also deal with opaque data and often batch
calls, which suits Nooks' implementation further.
Nooks is better at dealing with fatal failures (which are easy to
detect) than with non-fatal failures. The paper argues that in the case
of non-fatal failures (e.g. deadlock or data corruption) the failure is
localized anyway. I am not convinced of this. The extension could
corrupt system structures and hang. Also, a deadlock in an extension
needs to be recovered from, as it might disable significant
functionality of the system (e.g. what if my display driver hangs?), in
which case keeping the system alive may not serve much purpose.
There is a detailed discussion of the perf impact of Nooks (something
that system implementers would want to know if they were to consider
incorporating it). It is argued that most of the perf impact (when there
is any) comes from TLB flushes when XPC calls are frequent.
The paper makes the point that this approach requires no modification to
the extensions, which is very useful given the number of existing
extensions.
I contrast the approach of Nooks with the approach of streamlining
drivers so that minimal work is done in kernel mode and most logic runs
in user mode. In user mode, process isolation can be provided (which is
much easier to achieve given the current state of commodity OSes), and
such an approach would work well for most non-performance-critical
drivers.
Though it is argued that Nooks is easy to implement, I feel that it does
require moderately complex tinkering with the OS.