From: Prasanna Kumar Jayapal (prasak_at_winse.microsoft.com)
Date: Mon Mar 01 2004 - 16:57:41 PST
This paper (Disco: Running Commodity OSs on Scalable Multiprocessors)
describes the problems faced in developing system software for
shared-memory multiprocessors and proposes adding a level of
indirection (a "Virtual Machine Monitor", or VMM) between the OS and the
hardware.
The authors resurrect the concept of virtual machine monitors in this
paper. The VMM provides a thin layer of software that supports hardware
virtualization and management of hardware resources. In the case of
Disco the goal is to exploit a multiprocessor architecture and to hide
its specific quirks (NUMA). The VMM allows multiple operating systems to
run in their own virtual machines with their own virtualized view of
hardware resources. The basic assumption is that a VMM is a relatively
small piece of software that can be written in a relatively bug-free
manner.
Virtualization is achieved primarily through the interception of
privileged instructions issued by a VM OS (used also to provide I/O
virtualization) and intelligent TLB remapping. To achieve good
performance, most operations are executed directly on the hardware.
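To make the interception idea concrete, here is a minimal Python sketch
of trap-and-emulate. It is only an illustration under stated
assumptions, not Disco's implementation: the opcode names, the
VirtualCPU and Monitor classes, and the pmap table are all hypothetical.
Unprivileged operations take the direct path, while privileged ones
trap to the monitor, which emulates them against per-VM virtual state.

```python
from dataclasses import dataclass, field

# Hypothetical opcodes that the monitor must intercept.
PRIVILEGED = {"set_status", "tlb_write"}


@dataclass
class VirtualCPU:
    regs: dict = field(default_factory=lambda: {"r0": 0})
    status: int = 0                           # virtual privileged register
    tlb: dict = field(default_factory=dict)   # virtual page -> machine page
    traps: int = 0                            # how often the monitor was entered


class Monitor:
    def __init__(self, pmap):
        # pmap: guest "physical" page -> real machine page (assumed table)
        self.pmap = pmap

    def run(self, vcpu, program):
        for op, *args in program:
            if op in PRIVILEGED:
                vcpu.traps += 1               # privileged op traps to the monitor
                self.emulate(vcpu, op, args)
            else:
                self.direct(vcpu, op, args)   # stands in for direct execution

    def direct(self, vcpu, op, args):
        if op != "add":
            raise ValueError(f"unknown opcode: {op}")
        reg, value = args
        vcpu.regs[reg] += value

    def emulate(self, vcpu, op, args):
        if op == "set_status":
            vcpu.status = args[0]             # update virtual state only
        elif op == "tlb_write":
            vpage, ppage = args
            # Remap the guest's physical page to a machine page before
            # installing the entry, so the guest never sees machine pages.
            vcpu.tlb[vpage] = self.pmap[ppage]
```

Only the two privileged operations in a program enter the monitor; the
rest never leave the direct path, which is why the overhead can stay
small.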
Disco reduces the overhead of running VMs and enhances data sharing
between VMs through a shared buffer cache and copy-on-write disks.
Several other mechanisms also help reduce overhead: a second-level
software TLB that caches virtual-to-machine translations to lower the
cost of TLB misses, page migration and replication to deal with the
non-uniform access times of ccNUMA, and the use of standard NFS and
TCP/IP to let VMs communicate.
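The copy-on-write disk idea can be sketched in a few lines of Python.
This is a simplified model under my own assumptions, not the paper's
code: the CowDisk class and its method names are hypothetical, and
Python objects stand in for machine pages. Reads of an unmodified block
return the same shared object for every VM, and a write creates a copy
private to the writing VM.

```python
class CowDisk:
    """Hypothetical copy-on-write disk shared by several VMs."""

    def __init__(self, blocks):
        self.base = dict(blocks)   # block number -> shared, read-only data
        self.private = {}          # (vm, block) -> that VM's modified copy

    def read(self, vm, block):
        key = (vm, block)
        if key in self.private:
            return self.private[key]   # VM already diverged on this block
        return self.base[block]        # otherwise share the single base copy

    def write(self, vm, block, data):
        # The write is kept private to this VM; the base copy that other
        # VMs share is left untouched.
        self.private[(vm, block)] = data

    def shared_with(self, vm_a, vm_b, block):
        # True when both VMs are backed by the very same object.
        return self.read(vm_a, block) is self.read(vm_b, block)
```

For example, two VMs booted from the same disk image share one copy of
each kernel block until one of them writes to it.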
Minimal modifications to the OS HAL improve performance by reducing
emulation, and they also provide a mechanism for the VM OS to
communicate with the VMM, enabling a VM OS to indicate, for example,
which pages can be freed. Disco also tries to provide UMA-like behavior
on top of a NUMA architecture.
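As a rough illustration of how migration and replication can make a
NUMA machine look more uniform, here is a small Python sketch of a
per-page placement decision. The function name, the sample format, and
the threshold value are my own assumptions and are not taken from the
paper. A hot page used by a single remote node is migrated there, a hot
read-shared page is replicated, and a write-shared page is left alone.

```python
HOT_THRESHOLD = 4  # assumed number of remote misses before acting


def placement_decision(accesses, home_node, hot_threshold=HOT_THRESHOLD):
    """Decide what to do with one page.

    accesses: list of (node, is_write) cache-miss samples for the page.
    Returns ("stay", node), ("migrate", node) or ("replicate", [nodes]).
    """
    remote = [node for node, _ in accesses if node != home_node]
    if len(remote) < hot_threshold:
        return ("stay", home_node)        # not hot enough to be worth moving

    nodes = {node for node, _ in accesses}
    if len(nodes) == 1:
        # Only one (remote) node touches the page: move it there.
        return ("migrate", next(iter(nodes)))

    if not any(is_write for _, is_write in accesses):
        # Read-shared by several nodes: give each remote node a copy.
        return ("replicate", sorted(nodes - {home_node}))

    # Write-shared pages gain nothing from moving or copying.
    return ("stay", home_node)
```

A call such as placement_decision([(1, False)] * 5, home_node=0) yields
a migration to node 1 under these assumptions.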
The experimental results show that overhead is indeed low when using
Disco, so the system achieves its goals. What is interesting in this
paper is that Disco is a compromise between an OS-intensive and an
OS-light approach. In other words, Disco balances the trade-off
between development cost and perfect resource management of the
underlying multiprocessor system.
Overall, I thought this was an interesting paper that describes the
concepts and the design very nicely. One thing I was a little curious
about was hardware failure scenarios; I didn't see the paper address
this issue.