DISCO Review; warning long

From: Brian Milnes (brianmilnes_at_qwest.net)
Date: Mon Mar 01 2004 - 14:33:56 PST

  • Next message: Justin Voskuhl: "Review for "Disco: Running Commodity Operating Systems on Scalable Multiprocessors""

    Disco: Running Commodity Operating Systems on Scalable Multiprocessors -
    Bugnion, Devine and Rosenbloom

    This is an unbelievably meaty paper full of many subtle implementation
    discussions. I'm not sure that I fully understand the details of the virtual
    machine and disk emulation and would like you to walk through some
    examples in class. This also does not produce a short paper review.

                The authors propose using virtual machine monitors to simplify the
    construction of operating systems for large shared memory
    multiprocessors. Their system, DISCO, reduces some of the costs of virtual
    machines by sharing a buffer cache, and lets the virtual machines be coupled
    with TCP/IP or NFS. NFS seems a pretty poor choice as it produces lock ups in
    every kernel that I've used it with. Their uniprocessor overhead is at most
    16%, and they speed up some workloads by 40%, presumably by improving the
    scalability of the operating system and hiding the NUMA-ness of these systems.

                Their approach has a whole host of advantages and disadvantages.
    It allows fault containment, running different operating systems
    simultaneously, running different versions simultaneously, hiding the NUMA
    aspects, and running a lighter weight operating system. The costs are more
    memory, different exception processing, privileged instruction and IO device
    remapping, the lack of a coherent scheduling and memory management policy,
    and additional communications overhead.

                DISCO virtualizes MIPS R10000 chips by emulating instructions,
    the MMU and the trap architecture. They use loads and stores on special
    addresses to optimize frequent kernel operations. They dynamically relocate
    pages, virtualize disks and provide a large packet virtual network between
    machines. DISCO is implemented as a shared memory threaded program with a
    mere 13K lines of cache-careful, low-synchronization code.

    They virtualize CPUs with direct execution and trap into DISCO to
    emulate faults. They run DISCO in a physical memory mode and use the
    hardware TLB directly to avoid emulating it, flushing the TLB whenever a
    different virtual processor is scheduled so that each guest can use the
    TLB's process tags for itself. They minimize the resulting miss overhead
    with a per virtual machine TLB cache.
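
    The per virtual machine TLB cache can be pictured with a toy model. This
    is a sketch under my own assumptions, not DISCO's code: the names Monitor,
    SoftwareTLB, switch_to and translate are hypothetical, the hardware TLB is
    a plain dictionary, and the LRU replacement policy is chosen only for
    illustration. The point is that a translation lost to a TLB flush can be
    refilled from the software cache without repeating the slow path.

```python
from collections import OrderedDict


class SoftwareTLB:
    """Hypothetical per-VM cache of recent virtual-to-machine translations."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # (asid, vpage) -> machine page

    def lookup(self, asid, vpage):
        key = (asid, vpage)
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def insert(self, asid, vpage, mpage):
        self.entries[(asid, vpage)] = mpage
        self.entries.move_to_end((asid, vpage))
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


class Monitor:
    """Toy monitor: one hardware TLB, one software TLB per virtual machine."""

    def __init__(self, guest_page_tables, pmaps, l2_capacity=64):
        # guest_page_tables: vm -> {(asid, vpage): guest-physical page}
        # pmaps:             vm -> {guest-physical page: machine page}
        self.guest_page_tables = guest_page_tables
        self.pmaps = pmaps
        self.l2 = {vm: SoftwareTLB(l2_capacity) for vm in guest_page_tables}
        self.hw_tlb = {}
        self.current_vm = None
        self.slow_misses = 0  # misses that needed the full (expensive) path

    def switch_to(self, vm):
        # Process tags are not shared safely between guests, so flush.
        if vm != self.current_vm:
            self.hw_tlb.clear()
            self.current_vm = vm

    def translate(self, asid, vpage):
        key = (asid, vpage)
        if key in self.hw_tlb:
            return self.hw_tlb[key]
        vm = self.current_vm
        mpage = self.l2[vm].lookup(asid, vpage)
        if mpage is None:
            # Slow path: walk the guest mapping, then physical-to-machine.
            self.slow_misses += 1
            ppage = self.guest_page_tables[vm][key]
            mpage = self.pmaps[vm][ppage]
            self.l2[vm].insert(asid, vpage, mpage)
        self.hw_tlb[key] = mpage
        return mpage
```

    Switching from one virtual machine to another and back empties the
    hardware TLB twice, but in this model the second visit is served from the
    software cache and slow_misses does not increase.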

                DISCO measures the frequency of access to pages and moves hot pages
    to the physical node running the virtual machine that accesses them. They
    support virtual IO devices by requiring a monitor call when reading or
    writing data to a driver and by catching DMA accesses. They partition disks
    for each machine but really use a shared buffer cache implemented with a
    B-tree. They mark shared disk blocks COW and TLB map them into a virtual
    machine to achieve sharing very similar to that of a single operating system.
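
    The copy-on-write sharing of disk blocks can be sketched in a few lines.
    This is a toy under stated assumptions rather than DISCO's implementation:
    SharedBlockCache and its methods are hypothetical names, a Python dict
    stands in for the B-tree, and a "machine page" is just a bytearray. Reads
    of the same sector by two virtual machines map the same page; the first
    write by either one gives that machine a private copy.

```python
class SharedBlockCache:
    """Toy model of a monitor-level disk cache shared copy-on-write."""

    def __init__(self, disk):
        self.disk = disk      # sector number -> bytes (backing store)
        self.cached = {}      # sector -> shared machine page number
        self.pages = []       # machine page number -> bytearray
        self.mappings = {}    # (vm, sector) -> machine page number
        self.disk_reads = 0   # how often the backing store was touched

    def _new_page(self, data):
        self.pages.append(bytearray(data))
        return len(self.pages) - 1

    def read(self, vm, sector):
        """Map the sector into vm (shared, read only) and return its data."""
        if (vm, sector) not in self.mappings:
            if sector not in self.cached:
                self.disk_reads += 1
                self.cached[sector] = self._new_page(self.disk[sector])
            self.mappings[(vm, sector)] = self.cached[sector]
        return bytes(self.pages[self.mappings[(vm, sector)]])

    def write(self, vm, sector, offset, data):
        """Write through vm's mapping, copying the page first if shared."""
        self.read(vm, sector)  # make sure the sector is mapped
        page = self.mappings[(vm, sector)]
        if page == self.cached.get(sector):
            # Copy-on-write fault: give this VM its own private page.
            page = self._new_page(self.pages[page])
            self.mappings[(vm, sector)] = page
        self.pages[page][offset:offset + len(data)] = data
```

    In this model two virtual machines that read the same sector cost one
    disk read and one page of memory until one of them writes to it.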

                They define a virtual MTU-less network driver that they
    implement using memory mapping. They respect the alignment of all data and
    allow any data on a read only page to be mapped between processors. This is
    a very cool idea that produces transfer rates near the TLB miss rate but
    required some alignment changes to IRIX's mbuf mechanism.
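
    A rough sketch of why alignment matters for the memory mapped network,
    under my own assumptions: deliver, PAGE_SIZE and the page table layout are
    hypothetical, and each fragment is assumed to fit inside one page. A
    fragment that covers a whole page on both sides is handed over by
    remapping the sender's page read only into the receiver; anything else
    falls back to a copy.

```python
PAGE_SIZE = 4096


def deliver(memory, receiver_pt, dst_vaddr, src_page, src_off, length):
    """Deliver one fragment to a receiver, remapping instead of copying
    when the fragment is exactly one aligned page.

    memory:      machine page number -> bytearray of PAGE_SIZE bytes
    receiver_pt: virtual page number -> (machine page, writable)
    Returns "remap" or "copy" so the caller can count the cheap cases.
    """
    vpage, dst_off = divmod(dst_vaddr, PAGE_SIZE)
    if src_off == 0 and dst_off == 0 and length == PAGE_SIZE:
        # Whole aligned page: point the receiver at the sender's page,
        # read only, so neither side can modify the shared data.
        receiver_pt[vpage] = (src_page, False)
        return "remap"
    if dst_off + length > PAGE_SIZE or src_off + length > PAGE_SIZE:
        raise ValueError("fragment must fit inside one page")
    dst_page, writable = receiver_pt[vpage]
    if not writable:
        raise PermissionError("destination page is read only")
    # Unaligned or partial fragment: an ordinary byte copy.
    memory[dst_page][dst_off:dst_off + length] = \
        memory[src_page][src_off:src_off + length]
    return "copy"
```

    This is presumably why the mbuf changes were needed: unless the guest
    lays out its network buffers on page boundaries, every fragment takes the
    copy path.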

                They made almost all of their changes at the hardware abstraction
    layer of IRIX. The HAL was adjusted to map traps into accesses to
    privileged memory and to dynamically rewrite trap code to work this way. The
    HAL also provided hints to DISCO on page frees and idle CPUs. I
    wonder what happened when they tried this on Linux, which does not have a
    clear HAL.

                They measured DISCO on a uniprocessor IRIX machine and a
    parallel operating systems simulator. They saw between 3% and 16%
    slowdown on four tasks and note that a restructuring of IRIX could reduce
    some of these overheads. They suffer slightly in memory allocation to
    virtual operating systems but can achieve up to 60% speedup when the virtual
    memory system locks of IRIX are problematic.



    This archive was generated by hypermail 2.1.6 : Mon Mar 01 2004 - 14:33:59 PST