From: Brian Milnes (brianmilnes_at_qwest.net)
Date: Mon Mar 01 2004 - 14:32:12 PST
Scale and Performance in the Denali Isolation Kernel - Whitaker, Shaw and
Gribble
	The authors produced a virtual machine monitor designed to isolate,
and to scale to, massive numbers of virtual servers. Instead of exposing the
often complicated native machine interfaces, it exposes simplified NIC, disk
and x86 instruction set interfaces, at the cost of compatibility with
existing operating systems.
	They propose four design principles: expose low level
representations; prevent sharing by exposing only virtualized name spaces;
design for the Zipf-distributed popularity of Internet services; and
simplify interfaces for scale and performance. The last two differentiate
them from systems like Disco.
	Denali exposes an idle-with-timeout instruction and special
registers that give a guest access to information such as the CPU speed and
the size of physical memory. There is no virtual MMU, and the kernel lives
in its own untouchable piece of each 32-bit address space to prevent TLB
flushes on context switches. Denali simplifies devices and swapping, which
in turn simplifies the isolation kernel and VM paging. They provide virtual
disks and a virtual switched Ethernet NIC.
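	As a rough sketch of what programming against this interface might
look like: the names idle_with_timeout() and vreg_read() below are my own
illustration, not the paper's actual binding, and the stubs stand in for
what would really be trapped virtual instructions.

    #include <stdint.h>
    #include <unistd.h>

    enum vreg { VREG_CPU_HZ, VREG_PHYS_MEM_BYTES };

    /* Stub for the virtual register read; a real guest would trap into
       the isolation kernel here. Values are made up. */
    static uint64_t vreg_read(enum vreg r)
    {
        return r == VREG_CPU_HZ ? 1000000000ull : (64ull << 20);
    }

    /* Stub for the idle-with-timeout virtual instruction: yield the
       physical CPU for up to usec microseconds instead of spinning. */
    static void idle_with_timeout(uint64_t usec)
    {
        usleep((useconds_t)usec);
    }

    int main(void)
    {
        uint64_t mem = vreg_read(VREG_PHYS_MEM_BYTES);
        (void)mem;  /* e.g., size buffers to the VM's physical memory */
        for (;;) {
            /* No runnable work: tell the VMM this VM is idle for up
               to 10 ms so the CPU can be multiplexed across VMs. */
            idle_with_timeout(10 * 1000);
        }
    }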
	They simplify the kernel by providing a supervisor VM that can
create and destroy the other VMs and holds the network stack. They provide
an in-library guest operating system, Ilwaco, in the style of the exokernel
libOSes. Ilwaco supports a subset of the POSIX interfaces and has a port of
the FreeBSD TCP/IP stack.
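	Since Ilwaco links the operating system into the application as a
library and supplies a POSIX subset plus the FreeBSD stack, an ordinary
single-threaded server like the following is the kind of code a Denali VM
would run. This is a generic POSIX sketch of my own, not code from the
paper.

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        /* Plain POSIX sockets; under Ilwaco these calls would resolve
           into the in-library FreeBSD TCP/IP stack rather than a
           kernel crossing. */
        int s = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons(80);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        bind(s, (struct sockaddr *)&addr, sizeof addr);
        listen(s, 16);
        for (;;) {
            int c = accept(s, NULL, NULL);
            const char *resp = "HTTP/1.0 200 OK\r\n\r\nhello\r\n";
            write(c, resp, strlen(resp));
            close(c);
        }
        return 0;
    }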
	A worst-case context switch takes 9 microseconds and a best case
1.4 microseconds. The network benchmarks were hampered by poor hardware, but
HTTP throughput is not far off BSD's for both small and large files; BSD has
a system call advantage for medium-sized files and so outperforms Denali by
about 40%. Batched interrupts gain them as much as 30% for a large number of
machines.
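	The batching idea is simple enough to sketch: rather than
delivering each virtual interrupt as it arrives, the kernel records it in a
per-VM pending mask and delivers the accumulated batch when the VM is next
scheduled. The code below is my own illustration of that mechanism, not
Denali's.

    #include <stdint.h>
    #include <stdio.h>

    struct vm {
        uint32_t pending_irqs;  /* one bit per virtual interrupt source */
    };

    /* Device-event time: just record the interrupt, don't switch VMs. */
    static void post_virtual_irq(struct vm *vm, int irq)
    {
        vm->pending_irqs |= 1u << irq;
    }

    /* Schedule time: deliver every pending interrupt in one batch,
       amortizing the world-switch cost. Delivery is stubbed out. */
    static void dispatch_pending(struct vm *vm)
    {
        while (vm->pending_irqs) {
            int irq = __builtin_ctz(vm->pending_irqs); /* lowest set bit */
            vm->pending_irqs &= vm->pending_irqs - 1;  /* clear it */
            printf("deliver irq %d to guest\n", irq);
        }
    }

    int main(void)
    {
        struct vm vm = { 0 };
        post_virtual_irq(&vm, 3);  /* e.g., NIC packet arrival */
        post_virtual_irq(&vm, 1);  /* e.g., disk completion */
        dispatch_pending(&vm);     /* both delivered in one batch */
        return 0;
    }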
	The virtualization is very cheap in memory, with 10,000 VMs
requiring only 81 MB. The benchmarking, however, is very unrealistic. They
should have compared a single server serving a real Zipf page distribution
against many different virtual hosts serving the same distribution. The
in-core versus out-of-core memory regime they test is pretty bogus; what
really matters here is the latency to get a rare site served while still
handling the common ones.
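	To make the suggested experiment concrete, here is a minimal sketch
of generating such a request stream over many virtual hosts, using standard
inverse-CDF Zipf sampling. All parameters (site count, skew s = 1.0) are
illustrative; compile with -lm.

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    /* Draw a site rank (1 = most popular) from Zipf(n, s). The
       normalization constant is recomputed per call here for brevity;
       a real benchmark would precompute it. */
    static int zipf_sample(int n, double s)
    {
        double h = 0.0;
        for (int i = 1; i <= n; i++)
            h += 1.0 / pow(i, s);
        double u = (rand() + 1.0) / (RAND_MAX + 2.0);
        double acc = 0.0;
        for (int i = 1; i <= n; i++) {
            acc += 1.0 / (pow(i, s) * h);
            if (u <= acc)
                return i;
        }
        return n;
    }

    int main(void)
    {
        const int sites = 10000;   /* one virtual host per site */
        for (int r = 0; r < 100; r++)
            printf("request -> vm %d\n", zipf_sample(sites, 1.0));
        return 0;
    }

Feeding this stream to one consolidated server and to 10,000 Denali VMs
would expose exactly the rare-site latency the review is worried about.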