Review: Schroeder, et al. Experience with Grapevine: The Growth of a Distributed System.

From: Richard Jackson (richja_at_expedia.com)
Date: Sun Feb 01 2004 - 13:13:02 PST


    This 1984 paper by Schroeder, Birrell and Needham describes Grapevine, a
    distributed system that is primarily used for email services, but also
    includes functionality for a general naming service and access control.
    This paper is written from a somewhat high-level operational perspective,
    giving some analysis and general guidance for running a Grapevine
    system. In that respect, it almost seemed like a system administrator's
    guidebook. However, there are also many design details/enhancements
    mentioned in the paper, which could be used to build other systems based
    on the general Grapevine design.
     
    Note that Grapevine was mentioned in the preceding paper[1], as a
    distributed means for storing RPC binding information.
     
    The paper is divided into two main sections: 1) an overview of the
    system, and 2) operational issues. The second part comprises most of
    the paper.
     
    In the first section, Grapevine is introduced. The basic components of
    Grapevine are a messaging service and a registration service, each of
    which runs on every Grapevine server. The former is used to deliver
    arbitrary messages (generally email) and the latter is used to provide a
    generic, hierarchical naming service (with distributed registrars, like
    DNS). At the time of writing, there were 17 servers deployed within
    the Xerox internet. The paper discusses a good example of the naming
    service: the common idea of email users and distribution lists (groups
    of users and/or other distribution lists). Overall, Grapevine is
    described as a "replicated, distributed system that provides message
    delivery, naming, authentication, resource location, and access control
    services to an internet." To me, this description is overwhelming. It
    seems that the system is trying to serve too many functions.
    Thankfully, the paper acknowledges that Grapevine is mainly focused on
    mailing services. While these other services are somewhat related (e.g.,
    naming is required for email delivery), I think that the others are
    beyond the scope of this system. I think that the designers of this
    system were trying to build a generic system that could handle any type
    of data. While this is an admirable goal, I think they would be better
    served by focusing on a specific design domain. The paper hints at
    this in section 9, where they describe an IC manufacturing operation
    that uses Grapevine. Here, the email needs of Grapevine sometimes
    overwhelm the servers, preventing the specialized needs from being met.
    Perhaps Grapevine could be used in this case, but I think the
    specialized system should be isolated from other orthogonal Grapevine
    systems.
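
    To make the distribution-list idea concrete, here is a minimal sketch
    of how a list whose members may be users or other lists could be
    expanded into individual recipients. It is only an illustration, not
    Grapevine's actual algorithm: the `expand` function, the `groups`
    table, and all of the names in it are hypothetical.

```python
def expand(name, groups, seen=None):
    """Return the set of individual recipients reachable from `name`.

    `groups` maps a list name to its members; any name that is not a key
    in `groups` is treated as an individual user (an assumption of this
    sketch).
    """
    if seen is None:
        seen = set()
    if name in seen:
        # Already visited: guards against lists that contain each other.
        return set()
    seen.add(name)
    if name not in groups:
        return {name}
    recipients = set()
    for member in groups[name]:
        recipients |= expand(member, groups, seen)
    return recipients


# Hypothetical two-level names; note the deliberate cycle between lists.
groups = {
    "staff.example": ["alice.example", "devs.example"],
    "devs.example": ["bob.example", "carol.example", "staff.example"],
}
recipients = expand("staff.example", groups)
```

    The `seen` set is what keeps nested lists from looping forever when
    two lists include each other.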
     
    In the next section, many operational issues were discussed. This
    section was extremely thorough and covered all the critical topics, such
    as 1) scaling, 2) configuration of the system, 3) transparency of the
    distributed system (the user sees a single logical service), 4) modifying
    the design to accommodate unexpected load, 5) remote access and operation
    of a distributed system, and 6) reliability issues. One key issue that
    was not resolved is that each server node had limited disk space, which
    careless users could easily exhaust (e.g., by forgetting to delete old
    messages). To me, it seems that a simple pessimistic quota system could
    have prevented this problem, which would solve many of the issues raised
    in this paper. I also did not like their idea of constantly
    re-analyzing the system to find ways to configure it for better
    performance. I think that a better design would eliminate the need for
    this ongoing management and tuning.
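
    As a rough sketch of the kind of quota system I have in mind (this is
    my suggestion, not something the paper describes), a server could
    check a per-user byte limit before accepting a message and refuse
    delivery if the limit would be exceeded. The `QuotaStore` class, its
    methods, and the limit value are all hypothetical.

```python
class QuotaExceeded(Exception):
    """Raised when accepting a message would exceed the user's quota."""


class QuotaStore:
    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used = {}  # user -> bytes currently stored for that user

    def deliver(self, user, message):
        """Accept `message` (bytes) only if it fits within the quota."""
        size = len(message)
        current = self.used.get(user, 0)
        # Pessimistic check: refuse up front instead of cleaning up later.
        if current + size > self.limit_bytes:
            raise QuotaExceeded(f"{user} would exceed {self.limit_bytes} bytes")
        self.used[user] = current + size

    def delete(self, user, size):
        """Release `size` bytes after the user deletes a stored message."""
        self.used[user] = max(0, self.used.get(user, 0) - size)


store = QuotaStore(limit_bytes=10)
store.deliver("alice", b"12345")
store.deliver("alice", b"12345")  # exactly at the limit, still accepted
```

    A further delivery to "alice" would raise QuotaExceeded until she
    deletes something, which puts the pressure on the user rather than
    on the server's disk.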
     
    One weakness of this design is that the authors did not plan to scale
    beyond approximately 10,000 users. This seems to be an arbitrarily low
    number, and simple design changes would have allowed them to scale well
    beyond this. In the conclusion, the authors say that a commercial
    version of this system at Xerox will include the necessary
    changes (changing the naming hierarchy from 2 levels to 3 levels) to
    allow a larger user base.
     
    The main strength of this paper was its incredible thoroughness. These
    people considered so many aspects of the system that it's hard to
    believe that the paper is 20 years old. Many of these problems still
    plague systems of today, while the authors of this paper seemed to have
    developed reasonable solutions in 1984. For example, their discussion
    of remote access, debugging, and the repair of disk corruption via a
    terminal interface is great. How many modern systems allow this level
    of remote administration in cases of partial failure? Also, their
    general remote-monitoring interface seems like a useful addition to
    modern systems, where such monitoring is usually added only as an
    afterthought.
     
    This paper also pointed out two key things that we should keep in mind:
    1) it is hard to make major changes to a system after it has been widely
    adopted, so the initial design must be very good; 2) even the
    experts (designers/developers) of a system slowly lose familiarity with
    it over time, which further hinders ongoing analysis and re-design.
    This underscores the importance of building it right the first time;
    often there is not a good opportunity to go back and rebuild.
     
    Overall, this system seemed to be a mixture of modern internet email
    systems and the DNS naming facility[2]. This pioneering work at Xerox
    surely had a large influence on later designs such as the SMTP and DNS
    standards.
     
     
    [1] Andrew D. Birrell and Bruce Jay Nelson. Implementing Remote
    Procedure Calls.
    [2] P.V. Mockapetris and K.J. Dunlap, Development of the Domain Name
    System.
     