Distributed File System Paper Review

From: Reid Wilkes (reidwilkes_at_hotmail.com)
Date: Sat Feb 21 2004 - 22:38:51 PST

  • Next message: Honghai Liu: "Review on Scale and Performance in a Distributed File System"

    The paper by Howard, et. al. described a distributed file system called Andrew. The system was developed over several years in the mid to late eighties at CMU. Andrew was designed to run on a collection of 4.2 BSD Unix machines. Two implementations are described in the paper; a prototype implementation and then a more full and scalable implementation that followed. I'm going to talk mainly about the full implementation here. The basic idea is that location of files is transparent to applications; file requests which result in accessing a file located on a remote machine are handled by the internal mechanisms of Andrew. The paper assigns somewhat bizarre and awkward nomenclature for the constituent parts of these internal mechanisms of the file system. "Venus" is a user level process running on each client node of the system performing all necessary tasks on the client. "Vice" is the name for the set of servers over which the physical files are distributed and through which file requests and location lookups are performed. Operations in the distributed system are performed at the granularity of entire files - a client requests to open a file and the Andrew system locates that file and transfers a copy of it to the local cache of the client. The client then performs arbitrary reads and writes on the file; these are local operations only and the distributed file system is not aware of them. When the client closes the file, changes are sent back out to the Vice layer from which the file originated. A callback mechanism is implemented such that if a file is changed at the servers then there is a notification sent to clients who have an active cache copy of that file. In the absence of these callbacks, clients will check for the latest version of a cache file on the server at each open. The performance of this system is predictably much worse that using local files only. A lengthy performance comparison against Sun's NFS is made concluding that Andrew provides a much higher level of scalablity. Knowing nothing more about NFS than what is described in this paper, I would come to the conclusion that it is a pretty poor technology. I thought the NFS semantics of file sharing and synchronization to be somewhat awkward until I read the description of those semantics in NFS - in NFS the semantics seem also too convoluted and counterintuitive to make the system usable. The NFS design is significantly different than Andrew's in that every file read and write is done through the distributed system. It is certainly my opinion that the Andrew system is a much better design; easier to understand and much more scalable. However, I feel unconvinced that the whole concept of a distributed file system is one which bears a great deal of merit. Whereas the distributed VM system described in a previous paper provided a significant simplication of the programming environment for those writing parallel applications, I do not see that the Andrew model of distributed file system has such a compelling case. It seems especially difficult to justify why this kind of distributed file system is necessary in light of specialized file servers and relational databases. However, even if there are examples of application domains where this type of technology provides a significant enhancement, I cannot see how it could be any more than an opt-in subsytem. It seems too far fetched to see this replacing the local file system entirely on a node.


  • Next message: Honghai Liu: "Review on Scale and Performance in a Distributed File System"

    This archive was generated by hypermail 2.1.6 : Sat Feb 21 2004 - 23:15:02 PST