Review: Disco

From: Cem Paya 98 (Cem.Paya.98_at_Alum.Dartmouth.ORG)
Date: Mon Mar 01 2004 - 15:18:07 PST



    Paper review: Disco

    CSE 551P, Cem Paya


    Bugnion, Devine, and Rosenblum describe an architecture,
    codenamed Disco, that can run commodity operating systems
    designed for uniprocessors on a scalable multiprocessor
    architecture. The main idea is to virtualize the hardware
    as a collection of virtual processors. Each such processor
    can run its own OS, independently of the others. In other
    words, the same machine can run Linux and Windows NT at the
    same time, with each OS tricked into believing that it is
    executing on a single-processor machine. The main objective
    is to take an unmodified, off-the-shelf OS designed for
    single-processor execution and run it transparently on a
    multiprocessor system. The proposed design supports more
    VMs than there are physical processors, so one could try to
    multiplex a large number of VMs, similar to the Denali
    isolation kernel. But whereas Denali focused on scaling to
    hundreds of VMs on the same (possibly single-processor)
    hardware, the novel aspect of Disco is its approach to
    designing operating systems for scalable multiprocessors.
    In simplified terms, one does not design for SMP or NUMA
    explicitly at all. Instead, multiple OSes (one for each CPU
    or group of processors on the NUMA machine), built without
    awareness of these considerations, are layered on top of a
    virtual machine monitor, which becomes responsible for
    scaling. This paper lays out the blueprint for that VMM,
    and Disco is a prototype implemented on one particular
    experimental NUMA machine, the Stanford FLASH.
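
    To make the "more VMs than physical processors" point
    concrete, here is a minimal sketch of time-slicing virtual
    CPUs over physical ones. It is a plain round-robin toy, not
    Disco's scheduler; the function name and the VM labels are
    made up for illustration.

```python
from collections import deque


def schedule(vcpus, num_pcpus, num_slices):
    """Round-robin virtual CPUs over physical CPUs (hypothetical toy model).

    Returns one dict per time slice, mapping physical CPU index to the
    virtual CPU that runs on it during that slice.
    """
    ready = deque(vcpus)
    timeline = []
    for _ in range(num_slices):
        running = {}
        # Hand each physical CPU the next runnable virtual CPU.
        for pcpu in range(min(num_pcpus, len(ready))):
            running[pcpu] = ready.popleft()
        # At the end of the slice, preempted virtual CPUs rejoin the queue.
        ready.extend(running.values())
        timeline.append(running)
    return timeline
```

    With three single-CPU VMs and two physical processors,
    every VM gets to run within any two consecutive slices,
    and no guest needs to know that it is being multiplexed.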


    The authors make a good point about commodity OS support
    being a constraining factor and an obstacle to innovation
    in hardware design. Without widespread operating system
    support, new hardware functionality and design paradigms
    can’t gain traction with users. The growth of SMP on the
    x86 architecture coincided with support in Windows NT and,
    later, in Linux. More recently Windows has added support
    for NUMA architectures to the kernel, and Linux will likely
    follow suit, enabling a new generation of server products.
    From here it is a natural conclusion that being able to add
    support for such features to an existing OS without major
    effort is important. (The problem is that, historically
    speaking, NT and Linux both have SMP support today, so this
    is a sunk cost. It is also not true that systems overall
    become more complex or buggy as a result: users on
    uniprocessor hardware can be shielded from the changes made
    to support more advanced architectures. For example, NT has
    two versions of the kernel, uniprocessor and
    multiprocessor, and the installation process chooses the
    correct one. Linux likewise can be built in a
    uniprocessor-only configuration. The role of market forces
    can’t be ignored either: if the hardware configuration is
    important (SMP and NUMA are both performance critical),
    there is competitive pressure to squeeze out every last
    optimization. Even if a VMM works as a first approximation
    and shortens development time, the OS will eventually
    migrate towards native support.)


    Disco emulates the raw hardware of the MIPS R10000
    processor, with a few tweaks necessary to support unmapped
    kernel virtual memory. There are some extensions to the
    architecture, but these are optimizations designed for
    Disco-aware operating systems; in principle no changes to
    existing code are required to run on Disco. The R10000 did
    cause some problems because of the above memory access
    issue, which the authors point out is not applicable to x86
    or Alpha. Getting IRIX 5.3 to run on Disco required
    changing some header files and recompiling and relinking
    the kernel. On most other architectures a “commodity” OS
    could run without modification. Any optimizations to run
    more efficiently on Disco could be confined to a hardware
    abstraction layer (HAL), which again is a common design
    feature.


    Disco faced challenges similar to Denali’s, such as the
    need to virtualize I/O devices and to provide “virtual”
    physical memory that the VM believes is the “real” memory
    of the underlying hardware. Unlike Denali, which did not
    provide MMU functionality, Disco has a very sophisticated
    implementation that manipulates the real TLB entries in
    lock-step with the way the guest OS running on the VM
    manipulates what it thinks are the TLB entries. It also
    interposes on disk accesses to implement efficient reads of
    shared data. Such sharing exists only at the implementation
    level, using copy-on-write semantics; the VMs themselves
    cannot communicate with each other except through
    high-level network protocols.
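
    The two mechanisms above, translating the guest's TLB
    entries and sharing disk blocks copy-on-write, can be
    illustrated with a toy model. This is a sketch under
    simplifying assumptions, not Disco's code: the Monitor
    class, its method names, and the dictionaries standing in
    for memory and disk are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Toy VMM that maps each VM's 'physical' pages to machine pages."""
    next_page: int = 0
    pmap: dict = field(default_factory=dict)        # (vm, phys_page) -> machine page
    refcount: dict = field(default_factory=dict)    # machine page -> reference count
    disk_cache: dict = field(default_factory=dict)  # disk block -> machine page
    memory: dict = field(default_factory=dict)      # machine page -> contents

    def _alloc(self, contents):
        page = self.next_page
        self.next_page += 1
        self.memory[page] = contents
        self.refcount[page] = 0
        return page

    def _map(self, vm, phys_page, machine_page):
        old = self.pmap.get((vm, phys_page))
        if old is not None:
            self.refcount[old] -= 1  # drop the mapping being replaced
        self.pmap[(vm, phys_page)] = machine_page
        self.refcount[machine_page] += 1

    def disk_read(self, vm, block, phys_page, disk):
        """Intercepted disk read: map an already-loaded block instead of copying."""
        if block not in self.disk_cache:
            page = self._alloc(disk[block])
            self.refcount[page] += 1  # the cache itself holds one reference
            self.disk_cache[block] = page
        self._map(vm, phys_page, self.disk_cache[block])

    def tlb_insert(self, vm, virt_page, phys_page, writable):
        """Guest installs a TLB entry naming a 'physical' page.

        Returns the entry the monitor would really install, which names a
        machine page. Shared pages are made read-only so writes trap.
        """
        machine_page = self.pmap[(vm, phys_page)]
        shared = self.refcount[machine_page] > 1
        return (virt_page, machine_page, writable and not shared)

    def write_fault(self, vm, phys_page, new_contents):
        """Copy-on-write: give the writing VM a private copy of a shared page."""
        old = self.pmap[(vm, phys_page)]
        if self.refcount[old] > 1:
            self._map(vm, phys_page, self._alloc(self.memory[old]))
        self.memory[self.pmap[(vm, phys_page)]] = new_contents
```

    In this model, two VMs that read the same disk block end up
    mapping one machine page. When one of them writes, it gets
    a private copy, so the other VM and the cached block are
    unaffected. Neither guest can observe the sharing, which is
    the sense in which it exists only at the implementation
    level.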


    The most remarkable aspect of Disco is its ability to
    abstract away the NUMA nature of the system by exposing a
    flat, uniform address space to the VMs while combating
    performance degradation with intelligent memory management.
    Among other tricks, it can move pages around to improve
    locality. With help from the hardware, which keeps track of
    reads, it chooses between migrating a page to the node that
    uses it most frequently and replicating it when there are
    multiple readers.
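
    The migrate-or-replicate decision can be sketched as a
    small policy function over per-node access counts for one
    page. The thresholds, the function name, and the rule of
    leaving write-shared pages in place are assumptions made
    for this example, not values taken from the paper.

```python
def choose_placement(reads, writes, home, hot_threshold=64, min_share=0.1):
    """Pick a placement for one page from per-node access counts.

    reads, writes: dicts mapping node id -> count (hypothetical counters).
    home: node that currently holds the page.
    Returns (action, nodes) where action is "keep", "migrate" or "replicate".
    """
    total = sum(reads.values()) + sum(writes.values())
    if total < hot_threshold:
        return ("keep", [home])  # page is too cold to be worth moving

    # Nodes responsible for a meaningful share of the traffic.
    nodes = set(reads) | set(writes)
    active = sorted(
        n for n in nodes
        if reads.get(n, 0) + writes.get(n, 0) >= min_share * total
    )

    if len(active) == 1:
        # One dominant user: move the page next to it.
        node = active[0]
        return ("keep", [home]) if node == home else ("migrate", [node])
    if sum(writes.values()) == 0:
        # Several readers and no writers: give each remote reader a copy.
        return ("replicate", [n for n in active if n != home])
    # Written by or alongside several nodes: copies would need to be
    # kept consistent, so this sketch leaves the page where it is.
    return ("keep", [home])
```

    For example, a page read 200 times by node 1 alone and
    homed on node 0 is migrated to node 1, while a page read
    equally by nodes 1 and 2 with no writes is replicated to
    both.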


