PAST review

From: Praveen Rao (psrao_at_windows.microsoft.com)
Date: Wed Mar 03 2004 - 15:58:58 PST

  • Next message: ahemavathy: "PAST review"

    This paper discusses PAST which is a distributed peer-to-peer storage
    management system. PAST nodes co-operatively route file queries, store
    multiple replicas of files and cache copies of popular files. PAST does
    storing of caching of whole files.

    Storage nodes and files are assigned unique ids and file is stored at
    nodes whose identifier matches that of the file most closely.

    Each node is capable of initiating and routing client requests to insert
    and retrieve files. Node can optionally contribute storage to PAST.
    Inserted files are replicated across multiple nodes for availability.

    PAST uses Pastry routing protocol. The number of nodes traversed and
    messages sent to locate a file are logarithmic order of total number of
    nodes. A client must know the file-id and (decryption key if applicable)
    to locate a file. Local file-system of the nodes acts as a cache for the
    PAST content.

    The interface of PAST is as follows:
    fileId = Insert (name, owner-credentials, k, file)
    file = Lookup (fileId)
    Reclaim (fileId, owner-credentials)

    Reclaim indicates that file is no longer maintained by PAST (and Lookup
    would fail) but Lookup may still succeed. This weaker semantic avoids
    complex agreement protocols among the nodes.

    The Pastry routing protocol maintains a leaf set, routing table and
    neighborhood set on the nodes. Leaf set consists of node ids immediately
    higher and lower the current node. The neighborhood set consists of
    nodes close to the current node based on proximity metric.

    PAST's storage management aims at allowing high global storage
    utilization and graceful perf degradation as the system approaches full
    storage load. For load balancing PAST uses replica diversion (where a
    file is stored on a node that is not among the k closed nodes) and file
    diversion (where a file is diverted to a different fileId space by using
    a different salt in id generation).

    PAST allows caching of the files along the file search path to minimize
    fetch latencies. Caches do not change semantics and hence cached files
    can be discarded anytime only costing perf.

    PAST has limitations due to whole file caching - a very large file may
    not find a node to be stored on even when the system on the whole has
    sufficient storage space. Experiments show that failure rates jump
    drastically once the system is beyond 90% utilization.


  • Next message: ahemavathy: "PAST review"

    This archive was generated by hypermail 2.1.6 : Wed Mar 03 2004 - 15:58:50 PST