From: Nathan Dire (ndire_at_cs.washington.edu)
Date: Mon Feb 23 2004 - 15:46:01 PST
In "Scale and Performance in a Distributed File System", Howard, et al,
reflect on developments in the Andrew File System project. Andrew is a
distributing computing environment developed at CMU with the goal of combining
the functionality of personal computers with mainframes. This paper focuses
on the scalability and performance of the Andrew File System.
The basic implementation of AFS has two components: client software called
"Venus" and server software called "Vice". The Venus process on a client
provides a file system interface that allows a user to see a file system
distributed across a number of Vice servers. Vice servers use a local file
system to store files in the shared name space along with hidden files
containing metadata about their location in the system. Venus caches entire
files but, in the prototype, verified the timestamp with the server each
time a cached file was used.
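
To make the prototype's cost concrete, here is a minimal sketch of that open
path in C, with hypothetical names (venus_open, vice_stat, vice_fetch) rather
than the real AFS interfaces: the client keeps a whole-file copy locally but
pays a server round trip on every open just to validate it.

    #include <time.h>

    /* Hypothetical RPC stubs to a Vice server; illustrative only. */
    time_t vice_stat(const char *path);           /* get current mtime  */
    void   vice_fetch(const char *path, int fd);  /* refetch whole file */

    struct cache_entry {
        char   path[1024];  /* Vice pathname */
        time_t mtime;       /* mtime recorded when the copy was fetched */
        int    local_fd;    /* locally cached whole-file copy */
    };

    /* Prototype behavior: validate the cached copy with the server on
       every open, even when the file has not changed. */
    int venus_open(struct cache_entry *e)
    {
        time_t server_mtime = vice_stat(e->path);  /* server round trip */
        if (server_mtime != e->mtime) {
            vice_fetch(e->path, e->local_fd);      /* whole-file refetch */
            e->mtime = server_mtime;
        }
        return e->local_fd;  /* reads and writes then stay purely local */
    }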
The initial implementation suffered from a number of performance problems,
which were addressed in subsequent versions. First, the authors noticed that
clients were verifying timestamps frequently, so callbacks were added: the
server promises to notify a client when a file has been modified, so the
client can use its cached copy without checking. Second, because files were
identified only by pathname, Vice servers were spending a lot of time in
namei operations, so the authors added a unique fixed-length "fid" to
identify each file. Third, Vice server performance was hurt by the dedicated
per-client process, so a pool of lightweight processes was introduced, with
each process serving individual requests rather than a particular client.
Finally, they mapped Vice vnodes to inode numbers on the server so files
could be accessed by inode, avoiding costly pathname lookups. The revised
version showed dramatic performance improvements (in Figure 1, for example),
and the authors were also able to show a significant scalability advantage
over Sun's NFS.
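
A rough sketch of how the first two changes fit together, reworking the
sketch above (again, the struct layout and names are my own, not the
paper's): the fixed-length fid replaces the pathname on the wire, and a
callback flag lets Venus use its cached copy with no server traffic at all
until the server breaks its promise.

    #include <stdint.h>

    struct fid {              /* fixed-length, location-independent id */
        uint32_t volume;      /* which volume holds the file */
        uint32_t vnode;       /* index within the volume; no namei walk */
        uint32_t uniquifier;  /* guards against vnode-slot reuse */
    };

    struct cache_entry {
        struct fid fid;
        int local_fd;
        int has_callback;     /* server promised to notify on change */
    };

    void vice_fetch_by_fid(const struct fid *f, int fd);  /* hypothetical */

    int venus_open(struct cache_entry *e)
    {
        if (!e->has_callback) {                /* no valid promise held */
            vice_fetch_by_fid(&e->fid, e->local_fd);
            e->has_callback = 1;               /* fetch re-establishes it */
        }
        return e->local_fd;  /* with a callback: zero server round trips */
    }

    /* Run when the server notifies us that another client changed the
       file; the next open revalidates by refetching. */
    void callback_broken(struct cache_entry *e)
    {
        e->has_callback = 0;
    }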
In addition to the performance improvements, manageability was improved in
a later version with the addition of volumes. A volume is a subdivision of
the shared file tree that can be administered as a unit; volumes can be
copied or moved between servers to allow for better load balancing.
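
The key design point is an extra level of indirection, which a short sketch
can show (vldb_lookup and the location map behind it are my invention, using
the fid from the sketch above): fids name a volume rather than a server, so
copying or moving a volume only updates the volume-location map and no fid
ever changes.

    struct server;                                /* opaque server handle */
    struct server *vldb_lookup(uint32_t volume);  /* hypothetical map */

    /* Resolve a fid to the server currently holding its volume. Clients
       never embed server names, so a volume can move between servers for
       load balancing without invalidating any fid. */
    struct server *locate_file(const struct fid *f)
    {
        return vldb_lookup(f->volume);
    }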
Even with the improvements, the consistency model is weak, and replication
seems cumbersome. But these are issues that remain mostly unsolved today,
and I believe later versions improved in these respects. There's also a lot
that's left out of this paper, but as with the Grapevine paper, what's
important is that this research flushed out some major issues in distributed
file systems and some major approaches to dealing with them.
A testament to the success of AFS is its adoption in a commercial product
(DFS), which I believe is still supported by IBM and has been adopted as a
standard by the OSF. I would be interested to see how well the more recent
incarnations, e.g., AFS 3, DFS, NFSv3, and NFSv4, fare with respect to these
same performance measurements. I'm not sure that AFS has been surpassed by
any competing technology.