Review of AFS

From: Nathan Dire (ndire_at_cs.washington.edu)
Date: Mon Feb 23 2004 - 15:46:01 PST

  • Next message: Muench, Joanna: "Review of a Distributed File System (Andrew)"

    In "Scale and Performance in a Distributed File System", Howard, et al,
    reflect on developments in the Andrew File System project. Andrew is a
    distributed computing environment developed at CMU with the goal of combining
    the functionality of personal computers with that of mainframes. This paper
    focuses on the scalability and performance of the Andrew File System.

    The basic implementation of AFS has two components: client software called
    "Venus" and server software called "Vice". The Venus process on a client
    provides a file system interface that allows a user to see a file system
    distributed across a number of Vice servers. Vice servers use a local file
    system to store files in the shared name space along with hidden files
    containing metadata about location in the system. Venus would cache entire
    files, but would verify the timestamp with the server each time before using
    them.
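
    As a rough illustration of whole-file caching with a timestamp check on
    every open, here is a minimal sketch. The class and method names are my
    own invention, not taken from the paper, and the "timestamp" is just a
    counter. The point is only that each open of a cached file still costs a
    round trip to the server.

```python
class ViceServer:
    """Hypothetical server: stores whole files with a version timestamp."""

    def __init__(self):
        self.files = {}        # path -> (data, timestamp)
        self.clock = 0         # logical clock standing in for real timestamps
        self.validations = 0   # number of timestamp checks served
        self.fetches = 0       # number of whole-file fetches served

    def store(self, path, data):
        self.clock += 1
        self.files[path] = (data, self.clock)

    def timestamp(self, path):
        self.validations += 1
        return self.files[path][1]

    def fetch(self, path):
        self.fetches += 1
        return self.files[path]


class Venus:
    """Hypothetical client: caches whole files, validates on every open."""

    def __init__(self, server):
        self.server = server
        self.cache = {}        # path -> (data, timestamp)

    def open(self, path):
        cached = self.cache.get(path)
        # A cached copy is only used after the server confirms its timestamp.
        if cached is not None and cached[1] == self.server.timestamp(path):
            return cached[0]
        data, ts = self.server.fetch(path)
        self.cache[path] = (data, ts)
        return data
```

    Opening the same unmodified file repeatedly fetches it once, but every
    later open still sends a validation request.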

    The initial implementation suffered from a number of performance problems
    which were addressed in subsequent versions. First, the authors noticed that
    clients were verifying time stamps frequently, so callbacks were added to
    inform clients when a file had been modified. Second, by only using pathnames
    to identify files, Vice servers were spending a lot of time in namei
    operations, so the authors added a unique fixed-length "fid" to identify files.
    Third, Vice server performance suffered from dedicating a process to each
    client, so lightweight processes were added which only served individual
    requests.
    Finally, they linked inode numbers on the server to Vice vnodes to help avoid
    costly lookups. The performance changes in the revised version showed
    dramatic improvements (in Fig 1, for example). The authors were also able to
    show a significant scalability advantage over Sun's NFS.
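
    To make the callback and fid changes concrete, here is a small sketch
    under my own assumptions. The three-field layout of Fid and all class
    and method names are illustrative only. The server remembers which
    clients hold a cached copy and notifies them on a store, so an open of a
    cached file needs no server traffic at all.

```python
from typing import NamedTuple


class Fid(NamedTuple):
    """Fixed-length file identifier; the field layout here is an assumption."""
    volume: int
    vnode: int
    uniquifier: int


class CallbackServer:
    """Hypothetical server that promises to notify clients of changes."""

    def __init__(self):
        self.files = {}       # Fid -> data
        self.callbacks = {}   # Fid -> set of clients holding a promise
        self.fetches = 0

    def fetch(self, fid, client):
        self.fetches += 1
        # Fetching a file registers a callback promise for that client.
        self.callbacks.setdefault(fid, set()).add(client)
        return self.files[fid]

    def store(self, fid, data):
        self.files[fid] = data
        # Break every outstanding callback so stale copies are discarded.
        for client in self.callbacks.pop(fid, set()):
            client.break_callback(fid)


class CallbackClient:
    """Hypothetical client: a cached entry is valid until its callback breaks."""

    def __init__(self, server):
        self.server = server
        self.cache = {}       # Fid -> data

    def open(self, fid):
        if fid not in self.cache:
            self.cache[fid] = self.server.fetch(fid, self)
        return self.cache[fid]

    def break_callback(self, fid):
        self.cache.pop(fid, None)
```

    Compared with checking the timestamp on every open, the server is
    contacted only on a cache miss or after a modification.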

    In addition to performance improvements, manageability was also improved in a
    later version with the addition of volumes. Volumes were subdivisions of
    the shared file tree which could be handled individually. Volumes could be
    copied
    or moved to allow for better load balancing.
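
    A toy sketch of why volumes help with load balancing, again with names
    of my own choosing (Cell, move_volume, and the volume name in the usage
    note are all hypothetical). Files are addressed by volume, and a separate
    location map says which server currently holds each volume, so moving a
    volume changes the map but not how clients name files.

```python
class Cell:
    """Hypothetical set of servers plus a volume location map."""

    def __init__(self, servers):
        # server name -> {volume name -> {path -> data}}
        self.storage = {name: {} for name in servers}
        # volume name -> server name currently holding it
        self.location = {}

    def create_volume(self, volume, server):
        self.storage[server][volume] = {}
        self.location[volume] = server

    def write(self, volume, path, data):
        self.storage[self.location[volume]][volume][path] = data

    def read(self, volume, path):
        return self.storage[self.location[volume]][volume][path]

    def move_volume(self, volume, dest):
        """Relocate a whole volume; only the location map entry changes."""
        src = self.location[volume]
        if src == dest:
            return
        self.storage[dest][volume] = self.storage[src].pop(volume)
        self.location[volume] = dest

    def load(self, server):
        """Number of volumes on a server, a crude stand-in for load."""
        return len(self.storage[server])
```

    For example, after cell.move_volume("user.alice", "vice2"), a call to
    cell.read("user.alice", path) returns the same data as before the move.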

    Even with the improvements, the consistency model is weak, and replication
    seems cumbersome. But these are issues that remain mostly unsolved today, and
    I believe later versions showed improvement in these respects. There's also a
    lot that's left out of this paper, but like the Grapevine paper, what's
    important is that this research fleshed out some major issues in distributed
    file systems and some major approaches to dealing with them.

    A testament to the success of AFS is its implementation in a commercial
    product (DFS), which I believe is still supported by IBM and has been adopted
    as a standard by the OSF. I would be interested to see comparisons of how well
    the more recent incarnations, e.g., AFS 3, DFS, NFSv3, and NFSv4, fare with
    respect to these same performance measurements. I'm not sure that AFS has
    been surpassed by any competing technology.



    This archive was generated by hypermail 2.1.6 : Mon Feb 23 2004 - 15:46:12 PST