Review: Fine-grained mobility in the Emerald System

From: Cem Paya 98 (Cem.Paya.98_at_Alum.Dartmouth.ORG)
Date: Wed Feb 04 2004 - 13:49:46 PST


    Review: Fine-grained mobility in the Emerald system
    Cem Paya, CSE551P

    This paper describes Emerald, a distributed, object-oriented
    programming system with its own compiler, kernel, network
    protocols and even a garbage collector (the last one
    sketched out but not implemented at the time of writing).
    The main distinguishing feature of Emerald is that it
    supports moving objects across nodes in the network at a
    very fine granularity, including objects that have active
    method invocations in progress. The entire process is frozen
    and a minimal snapshot, as small as hundreds of bytes in one
    example rather than a massive memory footprint, is sent to
    another node to continue execution. All of this behavior is
    transparent to programmers (although there are APIs to
    explicitly locate and move objects), and the
    compiler/runtime combination is responsible for optimally
    managing the location and addressing of objects.

    One intriguing idea in the paper is the notion of
    call-by-move and call-by-visit semantics. When objects are
    passed as arguments in a remote method invocation, they are
    virtually guaranteed to cause additional network traffic
    when the callee attempts to access them. Call-by-move
    addresses the scenario where such accesses are frequent and
    it makes more sense to preemptively transport the entire
    object to the remote node. Call-by-visit is call-by-move
    followed by returning the object upon completion. This is
    all transparent to users, owing to a uniform addressing
    scheme for all objects based on universal OIDs.
    The Emerald design also incorporates performance
    considerations, which lead to special-casing local access
    and small data types. For example, there are three
    different object storage modes (global, local and direct),
    each accessed differently. The last one is similar to how
    virtual machines such as the JVM special-case primitives
    (integer, float, etc.) as value types when everything else
    in the object system is a reference type. Local access
    occurs in user mode while global access traps to the
    kernel.
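
    To make the trade-off between the parameter-passing modes
    concrete, here is a minimal sketch in Python (not Emerald
    code). Everything in it is an assumption for illustration:
    the names Network, Obj, read_field, move, remote_invoke,
    summarize and cost are hypothetical, and the cost model
    (two messages per remote field read, one per move, two per
    invocation) is invented rather than taken from the paper.

```python
class Network:
    """Toy cost model: counts messages, nothing is really sent."""
    def __init__(self):
        self.messages = 0


class Obj:
    def __init__(self, node, fields):
        self.node = node            # name of the node where the object lives
        self.fields = dict(fields)


def read_field(net, obj, here, name):
    # Assumed cost: a read from another node is a request plus a reply.
    if obj.node != here:
        net.messages += 2
    return obj.fields[name]


def move(net, obj, dest):
    # Assumed cost: one message per move.
    if obj.node != dest:
        net.messages += 1
        obj.node = dest


def remote_invoke(net, callee_node, arg, body, mode="reference"):
    home = arg.node
    net.messages += 1                       # invocation request
    if mode in ("move", "visit"):
        move(net, arg, callee_node)         # ship the argument before it is used
    result = body(lambda name: read_field(net, arg, callee_node, name))
    if mode == "visit":
        move(net, arg, home)                # call-by-visit sends it back afterwards
    net.messages += 1                       # invocation reply
    return result


def summarize(read):
    # Callee body that touches three fields of its argument.
    return f"{read('sender')}: {read('subject')} ({len(read('body'))} chars)"


def cost(mode):
    """Return (messages used, node where the argument ends up)."""
    net = Network()
    msg = Obj("A", {"sender": "alice", "subject": "hi", "body": "hello"})
    remote_invoke(net, "B", msg, summarize, mode)
    return net.messages, msg.node
```

    Under this toy model, the more fields the callee touches,
    the more the by-reference cost grows, while the move and
    visit costs stay fixed. Only call-by-visit leaves the
    argument on its original node.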

    There is an interesting analog to HTTP redirects (the 30x
    series of status codes) for locating objects. Since objects
    move around, references can become stale. This is solved by
    keeping track of forwarding addresses. The difference from
    HTTP is that the node originally contacted for the
    invocation itself forwards the query to where it believes
    the object currently resides; the final node holding the
    object replies directly to the caller, which updates its
    location table. This involves fewer messages than HTTP,
    where the first node would simply have responded to the
    caller with the forwarding address, but it doesn’t scale as
    well because nodes remain responsible for forwarding
    requests for objects they used to have.
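
    The forwarding scheme described above can be sketched
    roughly as follows. This is an illustrative Python model
    under stated assumptions, not Emerald's kernel code: Node,
    Counter, move_object and call are hypothetical names,
    nodes are plain in-process objects, and the "direct reply"
    is modeled by the final node updating the caller's table.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.objects = {}   # oid -> object currently resident on this node
        self.forward = {}   # oid -> node we last believed to hold the object

    def invoke(self, oid, method, caller, hops=0):
        if oid in self.objects:
            result = getattr(self.objects[oid], method)()
            # The final node answers the caller directly, so the caller
            # learns the object's current location.
            caller.forward[oid] = self
            return result, hops
        # Stale reference: pass the request along the forwarding address.
        return self.forward[oid].invoke(oid, method, caller, hops + 1)


def move_object(oid, src, dst):
    """Move an object and leave a forwarding address behind."""
    dst.objects[oid] = src.objects.pop(oid)
    dst.forward.pop(oid, None)
    src.forward[oid] = dst


def call(caller, oid, method):
    """Start an invocation at wherever the caller thinks the object is."""
    return caller.forward[oid].invoke(oid, method, caller)


class Counter:
    def __init__(self):
        self.n = 0

    def bump(self):
        self.n += 1
        return self.n


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
a.objects["oid1"] = Counter()
d.forward["oid1"] = a            # d's reference points at node a
move_object("oid1", a, b)        # the object moves twice,
move_object("oid1", b, c)        # so d's reference is now stale
first = call(d, "oid1", "bump")  # chases the chain a -> b -> c
second = call(d, "oid1", "bump") # goes straight to c
```

    The first call is forwarded through both former hosts; the
    second goes directly to the current host because the
    caller's table was updated. The old hosts still have to
    keep their forwarding entries, which is the scaling concern
    noted above.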

    The most impressive part of the implementation is the
    mechanism for moving objects around. Since the address
    space isn’t shared, all pointers have to be remapped.
    Tracking pointers also has implications for garbage
    collection. This is where the strongly typed language and
    compiler support come in: templates generated for each type
    keep track of where pointers are in the data section.
    (Contrast with conservative collectors, which do not know
    what is a pointer and must treat every address in memory as
    potentially holding a pointer.) One problem is that
    registers must be identified as well, which means they
    can’t change during a method invocation. On architectures
    with few registers, such as x86, this would probably
    increase register pressure and slow things down. This need
    for explicit awareness of pointers in memory and in the
    register set is one of the weaknesses, and suggests the
    sophisticated object management in Emerald is difficult to
    decouple from the language. For example, using objects in C
    or doing low-level hacks with pointers would be impossible.
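
    As a rough illustration of why per-type templates matter,
    the sketch below remaps only the slots a template marks as
    pointers. It is a simplified Python model under loud
    assumptions: TEMPLATES, marshal and unmarshal are
    hypothetical names, the "Message" layout is invented, and
    every referenced object is assumed to have a known address
    on the destination node.

```python
# Hypothetical compiler-generated template: for each type, the indices
# of the data-area slots that hold object references.
TEMPLATES = {"Message": {"pointer_slots": [1, 2]}}


def marshal(type_name, data, addr_to_oid):
    """Replace local addresses in pointer slots with location-independent OIDs."""
    slots = set(TEMPLATES[type_name]["pointer_slots"])
    return [
        ("oid", addr_to_oid[value]) if index in slots else ("raw", value)
        for index, value in enumerate(data)
    ]


def unmarshal(wire, oid_to_addr):
    """Rebuild the data area using the destination node's addresses."""
    return [oid_to_addr[value] if tag == "oid" else value for tag, value in wire]


# Slot 0 is an integer that happens to equal a valid local address.
source_data = [0x1000, 0x1000, 0x2000]
wire = marshal("Message", source_data, {0x1000: "oid-A", 0x2000: "oid-B"})
dest_data = unmarshal(wire, {"oid-A": 0x9000, "oid-B": 0x9100})
```

    Because the template says slot 0 is not a pointer, the
    integer there is copied unchanged even though it looks like
    an address. A scheme without type information could not
    tell the two apart.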

    The last section describes a novel way of implementing
    email: the message object moves between servers. Instead of
    a separate copy being delivered to each recipient’s inbox,
    there is a single object created by the sender that moves
    around on demand between the nodes hosting different
    mailboxes. Compared to accessing the mail object directly,
    call-by-move semantics ends up reducing execution time by
    about 20% and network traffic by 7%. Missing from the
    comparison is the usual approach where a pointer to the
    message is sent, and recipients download the entire message
    at once. Since email is generally considered a single
    unified document, accessing fields individually as in this
    case may not be a good example.


