Jim Shearer's review of Storage Management and Caching in PAST

From: shearerje_at_comcast.net
Date: Thu Mar 04 2004 - 06:34:17 PST

  • Next message: Manish Mittal: "Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility"

    PAST is a distributed write-once file system built on a non-hierarchical (peer-to-peer) confederation of member nodes. Its logical topology, its selection of which nodes store a particular file, and its routing algorithms are built on the PASTRY distributed hash table. The key to the hash table is a “quasi-unique” fileId: a SHA-1 hash of the file’s name, the originator’s public key, and a random value. PAST requires that originators of files have a smartcard encryption device to supply a public/private key pair. This key pair is also used to generate certificates of file ownership (largely for quota management).
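
    To make the fileId construction concrete, here is a minimal sketch of hashing the three inputs with SHA-1. The function name make_file_id, the order in which the inputs are fed to the hash, the UTF-8 encoding of the name, and the 20-byte random value are all assumptions made for illustration; the review does not say how PAST serializes them.

```python
import hashlib
import os


def make_file_id(file_name: str, owner_public_key: bytes, salt: bytes) -> bytes:
    """Return a 160-bit fileId: SHA-1 over name, owner public key, and salt.

    The input order and encoding are assumptions, not PAST's wire format.
    """
    h = hashlib.sha1()
    h.update(file_name.encode("utf-8"))
    h.update(owner_public_key)
    h.update(salt)
    return h.digest()


# Hypothetical inputs: a real originator would take the key from its smartcard.
public_key = b"example-public-key-bytes"
salt = os.urandom(20)  # the "random value" that makes the fileId quasi-unique
file_id = make_file_id("report.txt", public_key, salt)
print(file_id.hex())
```

    Because the random value is part of the hash input, two files with the same name and owner still get different fileIds, which is presumably why the paper calls the fileId only “quasi-unique” rather than tying it to the name alone.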

    PASTRY itself handles node addition and removal, and file replication is built on top of it. Every file is replicated on some number of nodes so that the files stored on a node remain available after that node is removed. Load balancing and replica diversion strategies make up much of the paper’s content.
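
    A rough sketch of replica placement follows. It assumes, per my reading of the paper, that the k replicas go to the nodes whose nodeIds are numerically closest to the fileId, and it treats the id space as circular. The function name replica_nodes and the small 8-bit id space in the usage lines are hypothetical choices for readability, and replica diversion is left out entirely.

```python
def replica_nodes(file_id: int, node_ids: list[int], k: int, id_bits: int = 128) -> list[int]:
    """Pick the k node ids numerically closest to file_id on a circular id space."""
    ring = 1 << id_bits

    def distance(node_id: int) -> int:
        d = abs(node_id - file_id) % ring
        return min(d, ring - d)  # shorter way around the ring

    return sorted(node_ids, key=distance)[:k]


# Toy 8-bit id space so the numbers are easy to follow.
nodes = [10, 100, 200, 250]
before = replica_nodes(5, nodes, k=2, id_bits=8)

# Remove one replica holder and recompute the placement.
survivors = [n for n in nodes if n != 10]
after = replica_nodes(5, survivors, k=2, id_bits=8)
print(before, after)
```

    In this toy run the node that held the second replica is still in the replica set after the first holder leaves, which is the availability property the replication is meant to provide.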

    PAST’s load-balancing scheme assumes that all “nodes” are within 2 orders of magnitude of each other in storage capacity (no consideration is given to access rate). If a very large physical device joins the network, it is represented as two or more logical nodes to enforce the 2-orders requirement. The discussion of the ramifications of this for common-mode failure was unsatisfactory.
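
    The splitting rule can be sketched as simple arithmetic. The function split_capacity, the reference capacity it compares against, and the equal-sized split are all assumptions for illustration; the review does not say how PAST picks the reference point or divides the space.

```python
import math


def split_capacity(capacity_gb: float, reference_gb: float, max_ratio: float = 100.0) -> list[float]:
    """Split one device into equal logical nodes, each at most max_ratio times the reference."""
    limit = reference_gb * max_ratio
    count = max(1, math.ceil(capacity_gb / limit))
    return [capacity_gb / count] * count


# Hypothetical numbers: a 250 GB device in a network whose reference node has 1 GB.
logical_nodes = split_capacity(250.0, 1.0)
print(len(logical_nodes), logical_nodes[0])
```

    All of the logical nodes produced this way live on the same physical device, so they fail together. That is the common-mode failure concern: replicas that land on sibling logical nodes do not add real redundancy.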

    “Replication” is discussed in the context of file system robustness; “caching” is really the same mechanism, motivated instead by the heavy use of popular files. The paper briefly discusses the GreedyDual-Size (GD-S) policy, which it uses to decide which “cached” copies to replace.
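
    For readers unfamiliar with GD-S, here is a minimal sketch of a GreedyDual-Size cache. It uses the common “inflation value” formulation (each entry’s weight is the current inflation plus cost/size, and the inflation rises to the weight of each evicted entry) with a uniform cost of 1. The class and method names are hypothetical, and this is not claimed to match PAST’s implementation in detail.

```python
class GreedyDualSizeCache:
    """Toy GreedyDual-Size cache: evicts the entry with the lowest weight."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.inflation = 0.0
        self.entries: dict[str, tuple[int, float]] = {}  # name -> (size, weight)

    def _weight(self, size: int, cost: float = 1.0) -> float:
        # Small files get a higher weight, so large files are evicted first.
        return self.inflation + cost / size

    def access(self, name: str) -> bool:
        """Record a hit by refreshing the entry's weight; return False on a miss."""
        if name not in self.entries:
            return False
        size, _ = self.entries[name]
        self.entries[name] = (size, self._weight(size))
        return True

    def insert(self, name: str, size: int) -> bool:
        """Cache a file, evicting lowest-weight entries until it fits."""
        if size > self.capacity:
            return False
        if name in self.entries:
            return self.access(name)
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda n: self.entries[n][1])
            victim_size, victim_weight = self.entries.pop(victim)
            self.used -= victim_size
            self.inflation = victim_weight  # ages every entry that stays behind
        self.entries[name] = (size, self._weight(size))
        self.used += size
        return True


cache = GreedyDualSizeCache(capacity=10)
cache.insert("big", 8)
cache.insert("small", 2)
cache.insert("new", 4)  # does not fit, so the lowest-weight entry is evicted
print(sorted(cache.entries))
```

    With a uniform cost the policy favors keeping many small files over a few large ones, which tends to maximize the hit rate rather than the bytes served from cache.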

    The paper mentions several times that nodes in such a system are “diverse in terms of ... rule of law”. I think there are many more legitimate “public library” or “corporate library” uses of such a system, in which a single geographically dispersed organization owns all the nodes. But such uses are lost in the controversy over intellectual piracy.


