Disco: Running Commodity Operating Systems on Scalable Multiprocessors

From: Greg Green (ggreen_at_cs.washington.edu)
Date: Mon Mar 01 2004 - 16:32:38 PST


    Disco is a virtual machine monitor aimed at running a commodity
    operating system on a NUMA multiprocessor. The goal is to
    substantially increase the performance of the operating system on
    such hardware without it being aware of the NUMA architecture. This
    would allow applications to run and benefit from new architectures
    without having to wait for an operating system optimized for the
    machine to be produced.

    The implementation of Disco is discussed. The R10000 processor is
    virtualized. Instructions that cannot be run directly in a VM are
    emulated in the monitor. Each VM is given an abstraction of real
    memory starting at address 0, and a set of virtual devices: input,
    output, hard disks, and network devices. Guest code runs on the
    actual CPU, and only privileged instructions and direct accesses to
    physical memory and devices are intercepted. Disco runs in kernel
    mode; it puts the commodity operating system in supervisor mode and
    applications in user mode, basically a three-ring protection
    mechanism. A scheduler maps virtual processors onto the real
    processors in round-robin order.
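
    As a rough sketch of how I picture the trap-and-emulate split and
    the round-robin mapping (not code from the paper): the instruction
    names, the VirtualCPU fields and the time-slice model below are all
    made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical instruction names; real MIPS privileged ops are not modeled.
PRIVILEGED = {"tlb_write", "set_status"}


@dataclass
class VirtualCPU:
    regs: dict = field(default_factory=dict)
    status: int = 0    # virtualized copy of a privileged register
    trapped: int = 0   # number of instructions the monitor emulated


def emulate(vcpu, op, arg):
    """Monitor side: update the VM's virtual state, not the hardware."""
    vcpu.trapped += 1
    if op == "set_status":
        vcpu.status = arg
    # "tlb_write" would go to the memory-virtualization code (omitted).


def run(vcpu, program):
    for op, arg in program:
        if op in PRIVILEGED:
            emulate(vcpu, op, arg)   # trap into the monitor
        else:
            vcpu.regs[op] = arg      # stands in for direct execution


def round_robin(vcpus, n_cpus, n_slices):
    """For each time slice, list the vCPU indices placed on the real CPUs."""
    schedule = []
    nxt = 0
    for _ in range(n_slices):
        width = min(n_cpus, len(vcpus))
        slot = [(nxt + i) % len(vcpus) for i in range(width)]
        nxt = (nxt + width) % len(vcpus)
        schedule.append(slot)
    return schedule
```

    Only the two names in PRIVILEGED ever reach emulate(); everything
    else is treated as running directly on the processor.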

    To support the virtualized physical memory, there is a clever
    mechanism that intercepts TLB modifications and replaces the VM's
    physical address with the real machine address. To make this fast,
    a per-VM pmap structure maps each of the VM's physical pages to a
    precomputed TLB entry for the machine page backing it. So when a
    TLB miss is trapped, the monitor can quickly insert the proper TLB
    entry from this table.
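
    A minimal sketch of that translation path, under my own
    simplifications: the Monitor class, the 4 KB PAGE_SIZE and the
    dict-based guest page table are assumptions, and here the monitor
    walks the guest mapping itself rather than letting the guest OS
    handle the miss first.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class Monitor:
    def __init__(self, pmap):
        # pmap: VM "physical" page number -> machine page number
        self.pmap = pmap
        self.tlb = {}  # virtual page number -> machine page number

    def guest_tlb_write(self, vpn, ppn):
        """Intercept a guest TLB write and install the machine page."""
        self.tlb[vpn] = self.pmap[ppn]

    def translate(self, vaddr, guest_page_table):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.tlb:
            # TLB miss: look up the guest's mapping, then refill via pmap.
            self.guest_tlb_write(vpn, guest_page_table[vpn])
        return self.tlb[vpn] * PAGE_SIZE + offset
```

    The guest only ever names its own physical page numbers; the
    machine page number appears solely in the TLB entry the monitor
    installs.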

    NUMA memory management is handled by a dynamic page migration and
    replication scheme, which attempts to keep the memory pages a
    virtual CPU accesses local to that CPU. The hardware provides
    per-processor information on cache misses that the monitor can use
    to decide which pages to move or copy. I/O devices are virtualized
    by using special device drivers that hand the data to the monitor,
    which interacts directly with the devices.
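
    The migrate-or-replicate decision could look something like the
    following. This is my own guess at a policy, not the paper's: the
    function name, the per-node miss dictionaries and the hot_threshold
    value are all assumptions.

```python
def placement_action(home_node, read_misses, write_misses, hot_threshold=64):
    """Decide what to do with one page.

    read_misses / write_misses: dict of node -> cache-miss count for
    this page. Returns ("stay", node), ("migrate", node) or
    ("replicate", [nodes]).
    """
    nodes = set(read_misses) | set(write_misses)
    total = {n: read_misses.get(n, 0) + write_misses.get(n, 0) for n in nodes}
    hot = sorted(n for n, count in total.items() if count >= hot_threshold)

    if not hot:
        return ("stay", home_node)        # not worth moving
    if len(hot) == 1:
        node = hot[0]
        if node == home_node:
            return ("stay", home_node)    # already local to its only heavy user
        return ("migrate", node)          # one heavy user: move the page there
    if sum(write_misses.values()) == 0:
        return ("replicate", hot)         # read-shared: copy to each hot node
    return ("stay", home_node)            # write-shared: copies would go stale
```

    Write-shared pages are left alone in this sketch because a replica
    would have to be invalidated on every write.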

    Disk blocks are shared copy-on-write among all the VMs that need
    them. As long as no VM writes a block, all VMs use the same machine
    page for that block. Memory is shared the same way, by mapping the
    same machine page into each VM. The copy-on-write mechanism only
    works for non-persistent disks; a modified block is kept in main
    memory. Real writable disks can only be accessed by one VM at a
    time, so they don't need to be virtualized. NFS is optimized so
    that when a VM requests a page from a server running on the same
    machine, the page is mapped into its memory without a copy being
    made.
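
    To make the copy-on-write behaviour concrete, here is a small
    sketch. The SharedDisk class and its methods are hypothetical names
    of mine; the only things taken from the paper are that unmodified
    blocks are shared and that modified blocks live in memory rather
    than going back to the disk.

```python
class SharedDisk:
    def __init__(self, blocks):
        # block number -> contents, shared by every VM until written
        self.base = dict(blocks)
        # (vm, block number) -> that VM's modified copy, kept in memory only
        self.private = {}

    def read(self, vm, block):
        # A VM sees its own copy if it has one, else the shared block.
        return self.private.get((vm, block), self.base[block])

    def write(self, vm, block, data):
        # Copy-on-write: the shared block is never touched.
        self.private[(vm, block)] = data

    def is_shared(self, vm, block):
        return (vm, block) not in self.private
```

    After one VM writes a block, the other VMs still read the original
    contents from the shared copy.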

    The changes that needed to be made to IRIX 5.3 for Disco are
    itemized. Some calls to the monitor were inserted so that the
    monitor has a better picture of resource utilization. Most of the
    changes were in the hardware abstraction layer (HAL) of the
    operating system.

    The last third of the paper presents experimental measurements of
    performance on a simulator. The overhead added for 5 separate
    applications is shown and analyzed. The worst overhead, 16%, was
    for pmake, which uses a large number of privileged instructions
    that have to be emulated. The memory footprint is also shown for
    the applications with various numbers of VMs. There is a
    measurement of the performance gains from Disco's NUMA-awareness
    for the Verilog and raytracing applications.

    I was very impressed by this paper. The benefits of the system seem
    quite good for very little cost. There was a good selection of
    applications used for metrics. I think it would be very difficult
    to get OS vendors to put in the hooks that the virtual machines
    require, however. Is there any consensus on that? It was
    interesting how pages can be optimized for the NUMA architecture
    without the OS knowing about it.

    -- 
    Greg Green
    

