From: Muench, Joanna (jmuench_at_fhcrc.org)
Date: Wed Mar 03 2004 - 15:16:42 PST
Rowstron and Druschel (2001) present PAST, a persistent peer-to-peer storage
utility implemented in Java and run on a Compaq AlphaServer. Their stated
goal is to develop a scalable and self-organized storage system with strong
persistence and a high degree of reliability. The designed system seems in
general to meet those goals through the use of some innovative techniques,
especially involving scalable load balancing.
While inspired by applications such as Napster, PAST does not attempt to
provide searchable storage. Once a file is stored in PAST, retrieving it
requires the unique fileId generated at storage. The fileId provides more
than just a unique identifier for the file. When looked up in Pastry, the routing
substrate, the fileId identifies the nodes where the file is most likely to
reside. PAST nodes are arranged in a circular namespace designed to avoid
correlation between the nodeId and any external node properties such as
geographic location, ownership, etc. Nodes track the status of other nodes
in their neighborhood, updating as needed when nodes fail or are replaced.
Routing between nodes is randomized to decrease vulnerability to malicious
or failed nodes.
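As a rough illustration of how a fileId can identify candidate storage nodes
in a circular namespace, here is a minimal Python sketch. It assumes that a
file belongs on the k nodes whose nodeIds are closest to the fileId on the
ring. The 128-bit id width, the SHA-1 based make_id helper and all function
names are assumptions made for illustration, not PAST's or Pastry's actual API.

```python
import hashlib

ID_BITS = 128            # assumed id width, for illustration only
RING = 1 << ID_BITS      # size of the circular id space


def make_id(data: bytes) -> int:
    """Hypothetical helper: hash arbitrary bytes to a point on the ring."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two ids, allowing wrap-around."""
    d = abs(a - b) % RING
    return min(d, RING - d)


def replica_nodes(file_id: int, node_ids: list, k: int) -> list:
    """Return the k nodeIds closest to file_id on the ring (assumed rule)."""
    return sorted(node_ids, key=lambda n: ring_distance(n, file_id))[:k]
```

Because ids come from a hash, nodes that are adjacent on the ring need not
share location or ownership, which is the property the circular namespace is
meant to provide. A real deployment would reach these nodes by routing, not
by sorting a global list of nodeIds.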
The paper focuses largely on efficient storage policies that work well at a
high level of utilization. PAST uses both replica diversion and file
diversion to deal with the inevitable imbalances associated with a
heterogeneous set of nodes and statistical variation in fileId assignments.
The authors identify three important considerations for replica diversion:
1) don't balance if utilization is low, 2) divert large files in preference
to small ones, and 3) move replicas from nodes with below-average free space
to nodes with above-average free space. They combine these considerations
into a single metric. When replicas are declined repeatedly, the file itself
is diverted. Caching is also important to achieve adequate performance when
retrieving popular files.
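One way a single metric could cover all three considerations is to compare
the ratio of file size to a node's free space against a threshold. The Python
sketch below assumes that form. The threshold values, the function names and
the choice of the neighbor with the most free space are illustrative
assumptions, not the authors' exact policy.

```python
T_PRIMARY = 0.1    # assumed threshold for a node storing its own replica
T_DIVERTED = 0.05  # assumed stricter threshold for accepting a diverted replica


def accepts(file_size: float, free_space: float, threshold: float) -> bool:
    """Accept a replica only if it is small relative to the free space."""
    if free_space <= 0:
        return False
    return file_size / free_space <= threshold


def place_replica(file_size: float, primary_free: float, neighbor_frees: list) -> str:
    """Try the primary node first, then one neighbor, else decline."""
    if accepts(file_size, primary_free, T_PRIMARY):
        return "primary"
    if neighbor_frees:
        # Divert toward the neighbor with the most free space (assumption).
        best = max(range(len(neighbor_frees)), key=lambda i: neighbor_frees[i])
        if accepts(file_size, neighbor_frees[best], T_DIVERTED):
            return f"diverted:{best}"
    # A declined replica is what would eventually lead to file diversion.
    return "declined"
```

With this ratio, a mostly empty node accepts nearly everything, so no
balancing happens at low utilization. Large files are the first to be turned
away as space shrinks, and diverted replicas end up on nodes with more free
space.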
The results section illustrates how well the system handles a high level of
utilization. The authors do note a trade-off between the success rate and
level of utilization, but the rates are so high (above 90%) that criticism
would seem nit-picking.
The PAST system clearly fulfills its goals of scalability,
self-organization, strong persistence and reliability. However, the system
has some restrictions, especially the write-once nature of the files. This
makes the system ideal for storing immutable objects (music and movies
perhaps?) but not useful as a general file system.
The biggest surprise about this paper is how clearly written it
is despite the lack of any obvious affiliation with the UW CS department.
The organization is excellent, although (like this review) it suffers from
extensive use of the passive voice.