From: Gail Rahn (grahn_at_cs.washington.edu)
Date: Mon Feb 23 2004 - 18:45:06 PST
Review of "Scale and Performance in a Distributed File System" by Howard et
al
This paper describes the performance considerations for Andrew, a
distributed file system deployed on the campus of CMU and intended
eventually to span up to 10,000 nodes. The paper covers the prototype
implementation of Andrew and the design improvements made obvious by
scaling and performance observations of that prototype.
Andrew consists of Vice and Venus components. All system components run
the Berkeley BSD 4.2 version of UNIX. Vice is (collectively) a set of
trusted servers that provide a homogeneous, location-transparent file
namespace to client workstations. Venus is a user-level client process
running on each workstation that accesses Andrew. When a remote file is
opened, Venus fetches the entire file from Vice and caches it on the local
disk. When the file is closed, Venus stores the modified version of the
file back on the Vice server. All reads and writes are performed on the
cached copy of the file on the local machine and do not involve Vice.
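To make this open/close discipline concrete, here is a minimal Python
sketch. The VICE_ROOT and CACHE_DIR directories stand in for the remote
Vice store and the local Venus cache, and all names here are my own, not
anything from the paper:

    import os, shutil

    VICE_ROOT = "/tmp/vice"    # stand-in for the remote Vice file store
    CACHE_DIR = "/tmp/venus"   # local disk cache managed by Venus

    def _cache_path(path):
        # Flatten the Vice pathname into a single local cache file name.
        return os.path.join(CACHE_DIR, path.strip("/").replace("/", "_"))

    def afs_open(path):
        local = _cache_path(path)
        if not os.path.exists(local):      # cache miss: fetch the *entire* file
            os.makedirs(CACHE_DIR, exist_ok=True)
            shutil.copy(os.path.join(VICE_ROOT, path.strip("/")), local)
        return open(local, "r+b")          # all reads/writes hit the local copy

    def afs_close(f, path):
        f.close()
        # Ship the (possibly modified) whole file back to the server on close.
        shutil.copy(_cache_path(path), os.path.join(VICE_ROOT, path.strip("/")))

The point of the sketch is simply that reads and writes never cross the
network; only open and close do.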
Each Vice server had a directory structure that mirrored the structure of
the Vice files stored on it. File status information was kept in shadow
".admin" directories. Portions of the namespace stored on other servers
were represented by "stub directories" that identified the server holding
the files. Venus presented Vice with full pathnames (the sequence of
directories plus the terminating filename); the prototype implementation
did not implement any lower-level file naming. Files locally cached by
Venus were always treated as suspect: on each open, Venus contacted Vice
to verify that the cached copy's timestamp corresponded to the latest
version of the file.
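A rough sketch of that per-open validation, continuing the stand-in
directories from the sketch above (the status check here is invented for
illustration):

    def vice_mtime(path):
        # Stand-in for asking the custodian Vice server for the file's status.
        return os.path.getmtime(os.path.join(VICE_ROOT, path.strip("/")))

    def prototype_open(path):
        local = _cache_path(path)
        stale = (not os.path.exists(local)
                 or os.path.getmtime(local) < vice_mtime(path))
        if stale:                          # any doubt means a full re-fetch
            os.makedirs(CACHE_DIR, exist_ok=True)
            shutil.copy(os.path.join(VICE_ROOT, path.strip("/")), local)
        return open(local, "r+b")

Every open pays for at least one round trip to Vice, which is exactly the
traffic the callback scheme in the revised design eliminates.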
The creators of Andrew attempted to design for large scale. Evaluation of
the prototype design resulted in the following observations:
* Vice architecture of one dedicated process per client used too
many system resources
* Embedding locations of remote servers in a Vice directory tree
made home directory migration painful.
* Storage quotas could not be enforced for users.
* Servers performed adequately only when limited to fewer than 20
concurrent client connections.
The prototype measurements also showed that the vast majority of
client/server interaction consisted of validating cached files and
fetching file status. Actual fetch and store operations accounted for only
6% of client/server traffic, and of that, 2% was storing a new version of
a file. The most frequently used call, TestAuth (cache validation),
degraded significantly in performance under a load of more than 5
concurrent clients. Further, context switching between the server
processes dedicated to clients was costly in terms of CPU usage.
The next version of Andrew changed in several ways. Venus, the client
process, now caches directory contents and symbolic links in addition to
files. File status information is cached in virtual memory so that status
requests are serviced rapidly. In place of Venus's per-open
cache-verification checks, Venus and Vice communicate via callbacks: Venus
registers a callback with the Vice server for a cached file or directory,
and Vice promises to notify the registered clients before allowing the
file or directory to be modified, so a cached copy can be trusted until
its callback is broken. This dramatically reduces client/server traffic.
File name resolution was improved to use a fixed-length unique identifier
(a fid) rather than the full pathname; Vice servers are unaware of
pathnames and work only with fids. Vice was rewritten to use a pool of
lightweight processes within a single server process rather than one
dedicated process per client. Finally, servers access Vice files directly
by their inodes rather than by pathnames. These changes significantly
improved the performance of the file system under substantial load.
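As a toy sketch of the callback idea (the classes and method names below
are my own, not the paper's): the server records which clients hold a
callback on each fid, a store breaks everyone else's callbacks, and a
client trusts its cache only while its callback is intact.

    from collections import defaultdict

    class ViceServer:
        def __init__(self):
            self.callbacks = defaultdict(set)   # fid -> clients holding a callback

        def fetch(self, fid, client):
            self.callbacks[fid].add(client)     # promise to notify this client
            return "contents of fid %s" % (fid,)

        def store(self, fid, client, data):
            # Break every other client's callback before accepting the new
            # version (the file contents themselves are omitted in this toy).
            for other in self.callbacks[fid] - {client}:
                other.break_callback(fid)
            self.callbacks[fid] = {client}

    class VenusClient:
        def __init__(self, server):
            self.server = server
            self.cache = {}     # fid -> cached contents
            self.valid = set()  # fids for which we still hold a callback

        def open(self, fid):
            if fid not in self.valid:           # no callback: must contact Vice
                self.cache[fid] = self.server.fetch(fid, self)
                self.valid.add(fid)
            return self.cache[fid]              # otherwise trust the cached copy

        def close(self, fid, data):
            self.cache[fid] = data
            self.server.store(fid, self, data)  # write-back on close, as before

        def break_callback(self, fid):
            self.valid.discard(fid)             # next open will re-fetch

As long as a workstation holds a callback it can open the cached file with
no network traffic at all, which is where most of the validation load of
the prototype disappears.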
-- Gail.
-------------
Gail Rahn
grahn_at_cs.washington.edu