Review of Li and Hudak. Memory Coherence in Shared Virtual Memory Systems

From: Cliff Schmidt (cliff_at_bea.com)
Date: Wed Feb 18 2004 - 01:54:51 PST

  • Next message: Jeff Duzak: "Review of "Memory Coherence in Shared Virtual Memory Systems""

    In this paper, Kai Li and Paul Hudak describe their analysis and
    implementation of several possible strategies for dealing with
    shared virtual memory across a multiprocessor system. My
    understanding of this paper was definitely helped by having it
    follow the previous papers and classroom discussion of
    (uniprocessor ) virtual memory.

    Of course, the main issue with multiple processors and parallel
    operation on the same memory is coherency. The paper breaks
    the design choices for this into choice of granularity and
    strategies across page synchronization and page ownership. The
    choice of granularity affects the impact of contention.
    Smaller granularity reduces contention. The authors recommend
    choosing a page as the proper level of granularity. This not
    only allows the system to work within the conventional virtual
    memory system's page faulting mechanisms, but works pretty well
    considering that sequential programs are still the thing that's
    running on each processor, and we already know that sequential
    programs tend to have a high degree of locality.

    I liked the way the authors broke down the strategies into two
    axes of page synchronization and page ownership, and then
    illustrated the possible combinations in a matrix. However, it
    did seem like the writeback method of page synchronization was
    hardly introduced and described before it was rejected. I know
    that we discussed this briefly in class when we talked about
    TLB cache coherency, but I would have liked to have gotten more
    than the three sentences on it if it was going to be mentioned
    at all. Static page ownership is also dismissed pretty quickly
    as not being appropriate for a "loosely-coupled multiprocessor".
    This may be my own ignorance, but I also never completely
    understood how a loosely-coupled multiprocessor would differ
    from a tightly-coupled multiprocessor. I certainly understand
    these terms in other contexts, but I'm a little unclear about
    how they apply to multiprocessors. I wonder if a tightly
    coupled multiprocessor would be one that is part of a master-
    slave relationship or something?

    The paper then describes the Centralized Manager algorithms.
    This was very well written (as was most of the paper), and
    clearly explained the process that a processors goes through
    to find the owner of a page through a central manager (which
    lives on one particular processors), and then is able to write
    to it, after which an invalidation operation takes care of
    the page synchronization issue. The key piece of info in this
    description is Theorem 3.1, the worst case number of messages
    to locate a page is two. There is also an improved version of
    the basic Centralized Manager, which moves management of the
    copy set to the ptable, in each processor; this eliminates the
    need for the confirmation operation.

    The Distributed Manager algorithms are then described. In the
    fixed version of this, there is some deterministic way of
    knowing which processors manages which page (such as a mod
    mapping). Even this is noted to be "substantially superior"
    to the centralized manager whenever there is a high rate of
    page faults. However, the Dynamic Distributed Manager ends
    up being the most interesting and best performing algorithm.
    The idea with this is that each processors manager keeps track
    of a guess of where the manager is for any particular page.
    If this info isn't correct, the incorrect manager can forward
    the requestor to the correct one, or at least another manager
    who will have a better idea. This whole chain of forwarding
    thing reminded me of how Emerald find objects that could be
    located on one of many different servers.

    There's a lot more depth on the Distributed Manager
    algorithms in the paper, but one of the key points is made
    in Theorem 4.5, which basically says that the worst case
    number of messages to locate the owner of a page is actually
    the same as the Centralized Manager, which is good news
    since the Distributed Manager doesn't suffer from
    bottlenecks.

     


  • Next message: Jeff Duzak: "Review of "Memory Coherence in Shared Virtual Memory Systems""

    This archive was generated by hypermail 2.1.6 : Wed Feb 18 2004 - 01:54:54 PST