Review of Exokernel

From: David Coleman (dcoleman_at_cs.washington.edu)
Date: Mon Jan 19 2004 - 21:33:01 PST

  • Next message: Jeff Duzak: "Review of "Application Performance and Flexibility on Exokernel Systems""

    Resource management has traditionally been considered a basic kernel
    function and has resided in the kernel. There are two basic reasons for
    this design: the kernel is a protected environment, which allows it to
    enforce security policy, and the kernel can provide a fair and balanced
    approach to multiplexing resources, in theory preventing individual
    applications from hogging them. Unfortunately, this design does not
    allow for application-specific optimization of resources. An exokernel
    system (Xok, in this paper) is one in which resources are protected by
    the kernel but managed by the applications using them. The kernel
    provides the protection
    system to ensure that unintentional sharing or corruption does not take
    place but leaves the management of resources up to the individual
    applications. An additional service that an operating system / kernel
    typically provides is a set of abstractions, such as files and virtual
    memory, which free applications from the burden of implementing them.
    Unfortunately, in current operating system design, an application that
    could provide this functionality better itself has no access to the
    underlying resources needed to do so. So that applications are not
    forced to implement operating system abstractions they do not wish to,
    libraries (LibOSes) can be built to provide the relevant abstractions.
    The paper presents one such library, ExOS, which provides the UNIX
    operating system abstractions.

    Protected sharing is implemented via four mechanisms: software regions
    (analogous to segments in earlier readings), hierarchically named
    capabilities, wakeup predicates, and critical sections. The
    capabilities are more like UNIX protection mechanisms than traditional
    capabilities. Wakeup predicates are interesting in that they are
    written in a simple Xok-specific language, then downloaded into and
    compiled in the kernel. Thus the kernel can, in theory, ensure that
    they are safe to run. Critical sections are used instead of locks
    because locks require cooperation from every party that shares the
    data. I’m not sure how much thought was given to multiprocessor issues
    when critical sections were chosen over locks.
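
    To make the wakeup predicate idea concrete, here is a minimal Python
    sketch. It is not Xok's actual predicate language: the term format
    (address, operator, value), the operator whitelist, and the names
    validate and should_wake are all assumptions made for illustration.
    The point is only that a loop-free predicate can be checked once when
    it is downloaded and then evaluated cheaply and safely.

```python
import operator

# Hypothetical whitelist; Xok's real predicate language is not reproduced here.
ALLOWED_OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "gt": operator.gt,
}


def validate(predicate, readable):
    """Check a predicate once, at download time, before the kernel accepts it."""
    checked = []
    for addr, op, value in predicate:
        if op not in ALLOWED_OPS:
            raise ValueError(f"operator not allowed: {op}")
        if addr not in readable:
            raise PermissionError(f"address not readable: {addr:#x}")
        checked.append((addr, op, value))
    return tuple(checked)


def should_wake(predicate, memory):
    """Evaluate every term; with no loops or calls, evaluation always terminates."""
    return all(ALLOWED_OPS[op](memory[addr], value) for addr, op, value in predicate)


# A process sleeps until a flag at 0x10 is set and a counter at 0x20 exceeds 5.
memory = {0x10: 0, 0x20: 7}
pred = validate([(0x10, "ne", 0), (0x20, "gt", 5)], readable={0x10, 0x20})
before = should_wake(pred, memory)  # flag still clear
memory[0x10] = 1
after = should_wake(pred, memory)   # flag now set
```

    Restricting predicates to a fixed set of comparisons over readable
    addresses is what lets the checker reject anything unsafe up front.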

    The disk subsystem, XN, is described in some detail. It appears that
    basic disk block access is a two-step process: bind a disk block to a
    memory page via a Buffer Cache Registry and then map that page into your
    address space. I/O to that block is then handled via reads/writes to the
    page, in effect making all disk access memory-mapped. I thought the
    concept of template-based descriptions of metadata in place of a trusted
    kernel file system was an interesting approach to providing protection
    of disk blocks. However, it requires (I believe) that the entire tree
    above the disk block be in-memory and previously parsed. Also, I felt
    the Multics rule of checking rights on every access was more secure
    (albeit significantly slower) than only checking on binding. The rules
    for preventing file system corruption are good rules to always use when
    building a file system. However, it seems exceedingly difficult to
    actually implement these rules. The tainted block approach seems to
    require that virtually the entire directory structure be in memory in
    order to determine whether a block is tainted. Because there seems to
    be a block-level reference count, it is possible that this is used to
    determine tainted blocks. If so, that would significantly simplify
    identifying tainted blocks.
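
    The two-step access path described above (bind, then map) can be
    sketched roughly as follows. This is a toy model in Python, not XN's
    interface: the class names, the may_access callback, and the flush
    method are hypothetical. It also shows the point raised about rights
    being checked only at bind time.

```python
class BufferCacheRegistry:
    """Toy registry mapping disk block numbers to in-memory pages."""

    def __init__(self, disk):
        self.disk = disk      # block number -> bytes on "disk"
        self.pages = {}       # block number -> bytearray acting as a page

    def bind(self, block, may_access):
        """Step 1: bind a block to a page. Rights are checked here and only here."""
        if not may_access(block):
            raise PermissionError(f"no access to block {block}")
        if block not in self.pages:
            self.pages[block] = bytearray(self.disk[block])
        return self.pages[block]

    def flush(self, block):
        """Write the bound page back to the disk."""
        self.disk[block] = bytes(self.pages[block])


class AddressSpace:
    """Toy per-process mapping from virtual page numbers to bound pages."""

    def __init__(self):
        self.mappings = {}

    def map(self, vpage, page):
        """Step 2: map an already-bound page into this address space."""
        self.mappings[vpage] = page
        return page


disk = {3: b"\x00" * 8}
registry = BufferCacheRegistry(disk)
space = AddressSpace()

page = space.map(0x40, registry.bind(3, may_access=lambda b: b == 3))
page[0:2] = b"hi"          # I/O is an ordinary memory write
registry.flush(3)
```

    Once the page is mapped, writes to it involve no further checks in
    this model, which is the trade-off against checking on every access.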

    The concept of a LibOS is also an interesting one. The burden it removes
    from the traditional OSes is protecting multiple processes from one
    another. Because the OS is basically statically linked, it is really
    only supporting a single process. Because of that, the abstractions it
    provides can be much more efficient. In reality, the LibOSes are
    dynamically linked, but by putting the management of the abstractions in
    the private process address space, each process appears to be the only
    process on the machine. This turned out to be a significant performance
    gain.

    Physical resource sharing becomes much more difficult on devices that
    require parameter settings prior to use. Examples of these types of
    devices include serial ports, sound cards (recording/input),
    optical/magneto-optical drives, most other removable media devices, and
    others. These types of devices generally require exclusive access and
    thus cannot be shared.

    A common optimization in file systems is to have separate data areas and
    metadata areas, or, more generally, to optimize the layout of files on
    disk. Because the layout can no longer be guaranteed with multiple file
    systems operating on the same physical disk (the file system can no
    longer assume it has a large, fairly contiguous extent of blocks to work
    with), I suspect pre-allocation of blocks would take place. This would
    be a de facto partitioning of the disk.
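
    A small Python sketch of why pre-allocation amounts to partitioning.
    This is purely illustrative and not taken from the paper: the
    ExtentAllocator class and the file system names are made up. If each
    library file system reserves one contiguous extent up front, the
    extents end up as disjoint, fixed regions of the disk.

```python
class ExtentAllocator:
    """Toy bump allocator handing out contiguous block ranges."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.next_free = 0
        self.extents = {}   # owner name -> range of block numbers

    def reserve(self, owner, nblocks):
        """Pre-allocate nblocks contiguous blocks for one library file system."""
        if nblocks <= 0 or self.next_free + nblocks > self.total_blocks:
            raise ValueError("cannot reserve extent")
        extent = range(self.next_free, self.next_free + nblocks)
        self.next_free += nblocks
        self.extents[owner] = extent
        return extent


alloc = ExtentAllocator(total_blocks=1000)
ffs_like = alloc.reserve("ffs_like", 600)   # hypothetical file system names
log_like = alloc.reserve("log_like", 400)
overlap = set(ffs_like) & set(log_like)     # empty: the extents never intersect
```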

    While I was skeptical of the concept at the beginning of the paper, I
    could see some advantages for limited classes of resources. I can
    think of an extension we implemented in our kernel file system driver
    that would have been better served by being managed by an application.
    Unfortunately, it wouldn’t have fully worked for a variety of reasons,
    but the concept might at least have improved our implementation. The
    performance gains described were significant.
    However, performance often drops, sometimes by a large percentage, when
    a complete implementation is produced. As such, final judgment on the
    performance of the Xok system must be withheld until a more complete
    product is produced.

