From: Greg Green (ggreen_at_cs.washington.edu)
Date: Sun Feb 29 2004 - 21:52:49 PST
Denali is a kernel that provides virtual machines. Its goals were
performance, scalability, and simplicity. The simplicity criterion is
there for security reasons, i.e., the smaller the codebase, the easier
it is to verify its correctness. The aim is to support many untrusted
Internet services on a single machine. The web is making more and more
use of dynamic content, and service providers need to accommodate
this. The virtual machine was not intended to fully emulate the
underlying hardware, so existing operating systems will not run on it.
There are four principles behind isolation kernels. First, expose
low-level resources rather than high-level abstractions; this is
similar to exokernel design. Second, prevent direct sharing by
exposing only private namespaces, which means that VMs on the same
machine can only communicate via the network. Third, the architecture
must scale: Internet applications are driven by Zipf distributions,
so most hosted services are nearly idle, and the kernel must
therefore minimize the memory footprint of a machine. The fourth
principle is to modify the virtualized architecture in service of the
first three design criteria.
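To make the scaling argument concrete, here is a rough
back-of-the-envelope illustration of my own (the paper makes the Zipf
observation, but this code is not from it, and the service count and
rank cutoff are made up):

    /* Under a Zipf popularity distribution, the service of rank i
       receives a share of requests proportional to 1/i, so a long
       tail of services is almost idle and must be cheap to keep
       resident. */
    #include <stdio.h>

    int main(void) {
        const int N = 10000;      /* hypothetical number of hosted services */
        double total = 0.0, top100 = 0.0;
        for (int i = 1; i <= N; i++) {
            total += 1.0 / i;
            if (i <= 100)
                top100 += 1.0 / i;
        }
        /* Prints roughly 0.53: the top 1% of services absorb over
           half the requests, leaving ~9900 mostly-idle VMs. */
        printf("share of load on top 100 of %d: %.2f\n", N, top100 / total);
        return 0;
    }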
Denali consists of a small kernel running on x86 hardware. It exposes
a subset of the x86 instruction set and adds two new virtual
instructions: an idle-with-timeout and a halt. Each VM gets its own
32-bit physical address space. The kernel is mapped into this address
space, but the VM cannot access it. Denali uses a software-loaded TLB
instead of mimicking the hardware page tables used by x86. The kernel
exposes several virtual I/O devices, a NIC, disk, keyboard, console,
and timer. These all have simplified interfaces compared to the real
devices, in support of performance and scaling.
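The idle-with-timeout instruction deserves a sketch, since it figures
in the performance results later. This is my guess at how a guest
would use it; the wrapper name and helper functions are invented, not
Denali's actual interface:

    /* A guest idle loop built on idle-with-timeout: yield the physical
       CPU until a virtual interrupt arrives or the timeout expires,
       instead of spinning away the VM's quantum. */
    extern void vm_idle_with_timeout(unsigned int usec); /* virtual instruction wrapper */
    extern int  poll_work(void);    /* hypothetical: packets or timers pending? */
    extern void handle_work(void);  /* hypothetical work handler */

    void idle_loop(void) {
        for (;;) {
            while (poll_work())
                handle_work();
            /* Sleep until an interrupt or 10 ms, whichever comes
               first; a bare halt could sleep indefinitely under
               batched interrupt delivery. */
            vm_idle_with_timeout(10 * 1000);
        }
    }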
Interrupts destined for a non-running VM are collected and handed to
it all at once when it is next scheduled in its normal quantum,
instead of waking it up for each interrupt.
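A minimal sketch of how such batching might look, assuming a per-VM
pending bitmap (the structure and function names are mine, not the
paper's):

    #include <stdint.h>

    struct vm {
        uint32_t pending_irqs;   /* one bit per virtual device */
    };

    /* Called on behalf of a descheduled VM when one of its virtual
       devices has activity: just record the interrupt, no wakeup. */
    void raise_virtual_irq(struct vm *vm, int irq) {
        vm->pending_irqs |= 1u << irq;
    }

    /* Hypothetical upcall that enters the guest's interrupt dispatcher. */
    extern void deliver_to_guest(struct vm *vm, uint32_t irqs);

    /* Called by the scheduler at the start of the VM's quantum: the
       accumulated interrupts are delivered in a single batch. */
    void schedule_in(struct vm *vm) {
        uint32_t batch = vm->pending_irqs;
        vm->pending_irqs = 0;
        if (batch)
            deliver_to_guest(vm, batch);
    }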
Virtual memory is divided into a protected kernel space and a
VM-accessible portion. An MMU is currently in the works but was not
used in the tests discussed in the paper. A fixed swap region is
assigned to each VM when it is created. There are virtual disks that
can be given to a VM, and VMs can share them.
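For concreteness, the per-VM resource bookkeeping implied above might
look roughly like this (entirely my invention, consistent with the
description but not taken from the paper):

    #include <stdint.h>
    #include <stddef.h>

    struct vdisk {
        uint64_t base_block;    /* extent on the physical disk */
        uint64_t nblocks;
        int      refcount;      /* >1 when shared between VMs */
    };

    struct vm_resources {
        uint64_t      swap_base;   /* fixed swap extent, sized at creation */
        uint64_t      swap_len;
        struct vdisk *disks[8];    /* virtual disks mapped in by the supervisor */
        size_t        ndisks;
    };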
There is a supervisor virtual machine, which is used to start other
virtual machines. Since the kernel itself has no network stack, the
supervisor is used whenever the kernel needs remote data. The
supervisor also maps virtual disks to VMs. A library OS, Ilwaco, is
provided to give applications a higher-level OS interface; it is
linked into each application.
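The library-OS arrangement can be sketched like this; the call names
are invented for illustration, and the real Ilwaco interface may
differ:

    /* Because there is no syscall boundary inside a VM, Ilwaco
       implements a familiar-looking API in the application's own
       address space, driving the raw virtual devices underneath. */
    extern int vnic_transmit(const void *frame, unsigned len); /* stand-in for
                                           the virtual NIC's send operation */

    /* What the application sees: a conventional send call. */
    int ilwaco_send(int conn, const void *buf, unsigned len) {
        (void)conn;   /* TCP connection state elided */
        /* A real library OS would run its TCP/IP stack here; the point
           is that the whole stack lives inside the application. */
        return vnic_transmit(buf, len);
    }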
The second half of the paper presents performance measurements. The
important points are that the batched asynchronous interrupt handling
yields a large gain until swap paging dominates (at around 800 VMs
for a web-server workload), and that the idle-with-timeout call
provides a dramatic improvement, roughly a factor of two over
performance without the feature. Source-code complexity was also
compared with that of Linux.
I found this idea interesting, but there was little discussion of any
problems encountered. We know from the exokernel paper that there are
significant problems in coming up with good abstractions in this type
of architecture. I also found the static web server and Quake server
performance measurements weak. Security is not a major issue for
static web servers; it is a problem for dynamic content, so the paper
does not tell us how those kinds of applications would fare. Another
problem that was not discussed is what happens when non-virtualized
instructions are used. It seems possible that this could open a
security hole. I suppose the kernel could scan a binary for these
types of instructions before executing it, as sketched below.
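Here is how a naive version of that scan might look; this is purely my
speculation, not anything the paper proposes, and a real scanner would
have to decode instructions properly, since x86 instructions are
variable-length and data bytes can alias opcodes:

    #include <stddef.h>

    /* A few privileged or unvirtualizable x86 opcode bytes. */
    static int is_disallowed_opcode(unsigned char b) {
        switch (b) {
        case 0xF4:               /* hlt */
        case 0xFA: case 0xFB:    /* cli, sti */
        case 0xE4: case 0xE5:    /* in  al/eax, imm8 */
        case 0xE6: case 0xE7:    /* out imm8, al/eax */
        case 0xEC: case 0xED:    /* in  al/eax, dx */
        case 0xEE: case 0xEF:    /* out dx, al/eax */
            return 1;
        default:
            return 0;
        }
    }

    /* Returns the offset of the first suspicious byte, or -1 if none. */
    long scan_image(const unsigned char *image, size_t len) {
        for (size_t i = 0; i < len; i++)
            if (is_disallowed_opcode(image[i]))
                return (long)i;
        return -1;
    }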
-- Greg Green