Review on Scale and Performance in a Distributed File System

From: Honghai Liu (liu789_at_hotmail.com)
Date: Sun Feb 22 2004 - 09:41:07 PST

  • Next message: ahemavathy: "AFS Distributed File System Review"

    Title: Scale and Performance in a Distributed File System

    Reviewer: Honghai Liu

     

     

    The paper presents the Andrew File System, which performs much better than the other file systems it is
    compared against and scales very well. The paper is significant and perhaps the most important one on
    distributed file systems. It is remarkable that quite a few new ideas came out of this paper and that
    most of them are still highly relevant to today's file system technology.

     

    Prior to the Andrew File System, most distributed file systems followed the traditional client-server
    model and were implemented with synchronous RPC calls, so a client blocked on every file access system
    call until the server returned. This has two important implications. First, every file access request
    has to go through the network and load the server. Second, clients potentially have to wait (block) on
    each remote disk operation.

     

    Callbacks and local caching successfully address the above issues. Specifically, the Andrew File System
    process on each client (Venus) assumes its cache entries are valid unless notified otherwise, and the
    notification arrives through a callback. A callback is an asynchronous mechanism that lets the server
    call the client, so the server plays the client's role and the client plays the server's. The server
    uses it to announce a change to a file to every client that holds a copy of that file. The hope is that
    callback traffic stays low because a file is seldom held by several clients while it is being changed,
    so most file operations can be done locally. This is indeed the case in an operational distributed
    system: people normally do not write to shared files very often.
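
    The callback idea described above can be illustrated with a minimal in-memory sketch. It is not the
    AFS implementation: the class names Server and Client, the fetches counter, and the method names are
    all hypothetical, and real callbacks travel over RPC rather than direct method calls.

```python
class Server:
    """Hypothetical file server that remembers which clients cache each file."""

    def __init__(self):
        self.files = {}      # path -> contents
        self.callbacks = {}  # path -> set of clients promised a notification
        self.fetches = 0     # number of fetch requests served (for illustration)

    def fetch(self, client, path):
        # Hand out the file and record a callback promise for this client.
        self.fetches += 1
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, writer, path, data):
        self.files[path] = data
        # Break the callback on every other client that holds a copy.
        for client in self.callbacks.pop(path, set()):
            if client is not writer:
                client.break_callback(path)
        self.callbacks[path] = {writer}


class Client:
    """Hypothetical stand-in for Venus, the client-side cache manager."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> contents, assumed valid until a callback is broken

    def read(self, path):
        # Contact the server only when there is no valid cached copy.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(self, path)
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        self.server.store(self, path, data)

    def break_callback(self, path):
        # Called by the server: the cached copy is no longer valid.
        self.cache.pop(path, None)
```

    In this sketch, repeated reads of an unchanged file reach the server only once; a write by another
    client invalidates the cached copy, and the next read fetches the new contents.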

     

    The notion of a volume is novel as well, and it appears to fix all the operability problems the authors
    were originally concerned with. A volume is a collection of files forming a partial subtree of the name
    tree. It provides transparency to users as well as to administrators. Moving files, enforcing quotas,
    and backing up were not trivial before, but the volume is the key to all of them. Although it was
    invented for this distributed file system, it is also a valuable tool on a single-server system.
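
    Why a volume makes moving files transparent can be sketched as a lookup table from volume to server.
    This is only an illustration under assumed names: VolumeRegistry, its methods, the volume name
    "user.liu" and the server names are hypothetical and not taken from the paper.

```python
class VolumeRegistry:
    """Hypothetical volume location table: volume name -> server name."""

    def __init__(self):
        self.location = {}  # volume -> server currently holding it
        self.mounts = {}    # mount point in the name tree -> volume

    def create(self, volume, server, mount_point):
        self.location[volume] = server
        self.mounts[mount_point] = volume

    def move(self, volume, new_server):
        # Only the location entry changes; mount points and user paths stay the same.
        self.location[volume] = new_server

    def resolve(self, path):
        # Pick the longest mount point that is a prefix of the path.
        matches = [
            m for m in self.mounts
            if path == m or path.startswith(m.rstrip("/") + "/")
        ]
        if not matches:
            raise KeyError(path)
        volume = self.mounts[max(matches, key=len)]
        return volume, self.location[volume]


registry = VolumeRegistry()
registry.create("user.liu", "server1", "/afs/users/liu")
before = registry.resolve("/afs/users/liu/notes.txt")
registry.move("user.liu", "server2")   # administrator relocates the volume
after = registry.resolve("/afs/users/liu/notes.txt")
```

    The path the user types is the same before and after the move; only the server that the volume maps
    to differs.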

     

    Although there are some other new ideas (e.g. the fid with a new resolution mechanism), callbacks
    (with local caching) and volumes are the most impressive ones because they contribute to the success
    of the Andrew File System in multiple areas.



    This archive was generated by hypermail 2.1.6 : Sun Feb 22 2004 - 09:41:10 PST