Review: Howard et al., Scale and Perf in DFS

From: Steve Arnold (stevearn_at_microsoft.com)
Date: Mon Feb 23 2004 - 17:33:25 PST

  • Next message: Ian King: "Review: Howard et al., Scale and Performance in a Distributed File System"

    This paper, written by researchers at CMU, is not the first to be
    published on their Andrew system. Rather, it details the changes
    between their prototype and the current system (around 1988). In
    general, Andrew uses a client/server design in which whole files are
    cached on the clients. When a file is opened, it is copied down to the
    client, and only when it is closed are the changes written back to the
    server and made visible to other clients. It uses an RPC system for
    communication. In the original system, performance is much worse than
    local file access, but still better than time-sharing on a central
    server.
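
    The open/close behavior described above can be sketched roughly as
    follows. This is only a toy in-memory model, not Andrew's actual
    interface: the names ToyServer, ToyClient, fetch and store are
    hypothetical, and real concerns such as RPC, disk caching and
    failures are left out.

```python
class ToyServer:
    """Hypothetical in-memory file server that holds whole files."""

    def __init__(self):
        self.files = {}

    def fetch(self, name):
        # Return the entire file contents (empty if it does not exist yet).
        return self.files.get(name, b"")

    def store(self, name, data):
        # Replace the entire file in one operation.
        self.files[name] = data


class ToyClient:
    """Hypothetical client that caches whole files locally."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, name):
        # On open, copy the whole file down into the local cache.
        self.cache[name] = bytearray(self.server.fetch(name))

    def write(self, name, data):
        # Writes touch only the local copy; the server is not contacted.
        self.cache[name] += data

    def read(self, name):
        return bytes(self.cache[name])

    def close(self, name):
        # Only on close is the modified file shipped back to the server.
        self.server.store(name, bytes(self.cache[name]))


server = ToyServer()
a, b = ToyClient(server), ToyClient(server)

a.open("notes.txt")
a.write("notes.txt", b"hello")

b.open("notes.txt")
before_close = b.read("notes.txt")  # a has not closed yet

a.close("notes.txt")
b.open("notes.txt")
after_close = b.read("notes.txt")   # a's changes are now on the server
```

    In this sketch, a second client that opens the file before the first
    one closes it still sees the old contents, which is the sense in
    which updates only take effect at close time.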

     

    Most of the paper details the improvements that they have made to the
    system. First, they changed the system to use callbacks. That is, the
    server promises to notify a workstation when a file it has cached
    changes (previously the workstation checked with the server every
    time a file was opened). They also changed the way that they resolve
    the names of files. Rather than sending full path names to the server,
    clients now use fixed-length file IDs. Also, the server used to spawn
    a separate process for each client. This was replaced by a user-level
    thread system that allows everything to run in one process. Lastly,
    they changed how files are stored in the file system, and made more
    information available to the user.
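
    The callback idea can be illustrated with a small sketch. It is a
    simplified model under my own assumptions, not the paper's code: the
    classes CallbackServer and CachingClient and the method
    break_callback are hypothetical names, and the fetch counter exists
    only to show how many times the server is contacted.

```python
class CallbackServer:
    """Hypothetical server that remembers which clients cache each file."""

    def __init__(self):
        self.files = {}
        self.callbacks = {}  # file name -> set of clients holding a callback
        self.fetches = 0     # how many times clients had to contact us

    def fetch(self, client, name):
        self.fetches += 1
        # Promise to notify this client if the file changes.
        self.callbacks.setdefault(name, set()).add(client)
        return self.files.get(name, b"")

    def store(self, writer, name, data):
        self.files[name] = data
        # Break the callbacks of every other client caching this file.
        for client in self.callbacks.get(name, set()) - {writer}:
            client.break_callback(name)
        self.callbacks[name] = {writer}


class CachingClient:
    """Hypothetical client; a cached entry means its callback is valid."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, name):
        # Contact the server only when there is no valid cached copy.
        if name not in self.cache:
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]

    def store(self, name, data):
        self.cache[name] = data
        self.server.store(self, name, data)

    def break_callback(self, name):
        # The server says the cached copy is stale, so drop it.
        self.cache.pop(name, None)


server = CallbackServer()
reader, writer = CachingClient(server), CachingClient(server)

reader.open("f")
reader.open("f")                      # served from cache, no server traffic
fetches_before_update = server.fetches

writer.store("f", b"v2")              # server breaks reader's callback
data_after_update = reader.open("f")  # reader must fetch again
fetches_after_update = server.fetches
```

    Repeated opens of an unchanged file cost no server traffic here; the
    server is contacted again only after it has broken the callback.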

     

    The rest of the paper goes on to make comparisons. They show that the
    new system scales better and performs much better in general than the
    original implementation. They also show that the system performs
    better than NFS under load; NFS is another distributed file system
    (more of a peer-to-peer design). This is because they take advantage
    of locality and cache whole files.

     

    The last changes that they detail are those that make the system more
    usable (like adding quotas), but I didn't think this was the main point
    of the paper.

     

    I was a little confused at first as to the point of the paper and what
    the authors were trying to convey about the system (it did not become
    clear until toward the end of the paper). I also thought they provided
    way too much data in the tables. Charts would have been more useful,
    and the tables could have been left in an appendix. I did, however,
    find that many of the lessons learned here were also related to some
    of the other papers that we read (such as log-based file systems and
    memory coherency).

     



    This archive was generated by hypermail 2.1.6 : Mon Feb 23 2004 - 17:33:51 PST