From: Jeff Duzak (jduzak_at_exchange.microsoft.com)
Date: Sun Feb 22 2004 - 21:24:43 PST
This paper is very much reminiscent of the Grapevine paper, in that it
discusses a number of issues arising from the growth of a distributed
system. It describes Andrew, a distributed file system
developed with scalability in mind. A prototype of the system was
implemented and tested first, and the lessons learned from the prototype
were applied to a more complete version. The final system was compared
to Sun's NFS system, and was shown to scale considerably better than
NFS.
A design choice that is stressed in the paper is the decision to cache
entire files on the client workstation while they are open. Because of
this caching, the server needs to be contacted only when a file is
opened or closed, not on each read and write. Testing showed that this
decision clearly improves performance in the benchmarked scenarios.
However, it also adds up-front latency at open time compared with a
system that contacts the server for each read or write, since the whole
file must be fetched before the first byte can be read. Further, no
guarantees are made regarding simultaneous writes to a file: if
multiple workstations modify a file at the same time, whichever
workstation closes the file last will silently overwrite the other
workstations' changes.
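This open/close behavior can be sketched in a few lines of Python. The
sketch only illustrates the last-close-wins race described above; the
class and method names are invented for illustration, not the paper's
actual Venus/Vice interfaces.

    # Sketch of Andrew-style whole-file caching (illustrative names only).
    class Server:
        def __init__(self):
            self.files = {}            # path -> bytes, authoritative copy

        def fetch(self, path):
            return self.files.get(path, b"")

        def store(self, path, data):
            self.files[path] = data    # the last store wins unconditionally

    class Client:
        def __init__(self, server):
            self.server = server
            self.cache = {}            # local copies, valid while open

        def open(self, path):
            # The whole file travels once, at open time: the up-front
            # latency mentioned above.
            self.cache[path] = self.server.fetch(path)

        def write(self, path, data):
            self.cache[path] = data    # purely local, no server traffic

        def close(self, path):
            # Changes become visible only at close; concurrent writers race.
            self.server.store(path, self.cache.pop(path))

    server = Server()
    a, b = Client(server), Client(server)
    a.open("/doc"); b.open("/doc")
    a.write("/doc", b"A's version")
    b.write("/doc", b"B's version")
    a.close("/doc")
    b.close("/doc")                    # B closed last...
    print(server.fetch("/doc"))        # ...so b"B's version" survives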
A number of issues were highlighted by thorough testing of the
prototype, and certain changes were made to improve performance in the
production system. For example, the prototype spun off a dedicated
server process for each client to serve that client's requests. This
wasted a lot of CPU time in context switches and, because each process
ran in its own address space, prevented the request handlers from
sharing data such as caches. In the production system, a pool of
lightweight, user-level threads within a single process was used
instead, resulting in much less overhead.
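A rough sketch of the shared-threads design, using Python's standard
thread pool in place of the user-level lightweight-process package the
paper actually used (all names here are hypothetical):

    # All clients are served by one pool of threads that share an
    # in-memory structure, instead of one forked process per client.
    from concurrent.futures import ThreadPoolExecutor
    import threading

    shared_cache = {}                # visible to every worker thread
    cache_lock = threading.Lock()    # shared address space, so synchronize

    def handle_request(client_id, path):
        with cache_lock:
            status = shared_cache.setdefault(path, "status of " + path)
        return "client %d: %s" % (client_id, status)

    # No per-client fork and no context switch between address spaces.
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(handle_request, c, "/vol/u/f1")
                   for c in range(20)]
        for f in futures:
            print(f.result())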
The goal of the system was to be able to service 50 users with each
server, and this goal was apparently attained, under the assumption
that one instance of the testing benchmark generates load equivalent to
typical use by 5 users (so a load factor of 10 approximates the 50-user
target). Compared with the prototype, the production system showed much
more uniform performance with increasing load, validating the design
changes made for the production system.
Andrew shows some of the same limitations as Grapevine. For instance, a
database of volume names and locations must be maintained on each server
in Andrew, just as such a database of registries had to be maintained on
each server in Grapevine. Load balancing was done manually through
distribution of volumes among servers. Grapevine appears to have had
better replication. Each registry in Grapevine was replicated on one or
two other servers. In Andrew, only some system volumes are replicated
in a read-only manner. Most volumes are not replicated.
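The fully replicated volume-location database can be pictured as a
plain map that every server holds a copy of. The following Python
sketch uses invented volume and server names:

    # Every server keeps an identical copy of this map, so any server
    # can tell a client where a volume lives.
    VOLUME_LOCATIONS = {
        "user.alice": ["server1"],
        "user.bob":   ["server2"],
        "sys.bin":    ["server1", "server2", "server3"],  # read-only replicas
    }

    def locate(volume):
        return VOLUME_LOCATIONS[volume]

    def move_volume(volume, new_server):
        # Manual load balancing: an administrator reassigns the volume,
        # and the change must propagate to every server's copy.
        VOLUME_LOCATIONS[volume] = [new_server]

    print(locate("user.alice"))      # ['server1']
    move_volume("user.alice", "server3")
    print(locate("user.alice"))      # ['server3']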
The authors acknowledge that the system likely will only scale well up
to 500 to 700 workstations. More recent (research) systems, such as
CFS, are designed to scale indefinitely.