Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

From: Manish Mittal (manishm_at_microsoft.com)
Date: Mon Mar 08 2004 - 11:32:48 PST


This paper discusses PAST, an Internet-based, peer-to-peer global
storage utility that aims to provide strong persistence, high
availability, scalability, and security. A PAST system consists of nodes
connected to the Internet. Each node is capable of initiating and
routing client requests to insert or retrieve files. PAST is built on
Pastry, a P2P routing and location scheme.

Pastry is an efficient routing scheme which ensures that client requests
are reliably routed to the appropriate nodes. Built on it, PAST supports
insertion and replication of files, achieves high storage utilization,
and balances load through storage management and caching. Each Pastry
node is assigned a unique identifier (nodeId). Given a key and a
message, Pastry routes the message to the node whose nodeId is
numerically closest to the key.
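
To make "numerically closest" concrete, here is a minimal Python sketch.
It assumes a circular id space and a complete list of nodeIds known in
one place. Real Pastry reaches the same node hop by hop using per-node
routing state, which this sketch does not model. The function name
closest_node and the id_bits parameter are made up for illustration.

```python
def closest_node(key, node_ids, id_bits=128):
    """Return the nodeId numerically closest to key on a circular id space.

    Illustration only: this scans a global list of nodeIds, whereas a
    real overlay would forward the message from node to node.
    """
    space = 1 << id_bits

    def distance(node_id):
        # Distance around the ring, taking the shorter direction.
        d = abs(node_id - key) % space
        return min(d, space - d)

    # Ties are broken by preferring the smaller nodeId (an arbitrary choice).
    return min(node_ids, key=lambda n: (distance(n), n))


if __name__ == "__main__":
    print(closest_node(10, [1, 8, 15], id_bits=4))
```

With a 4-bit id space and nodes 1, 8 and 15, a message with key 10 would
be delivered to node 8.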

PAST exports operations for inserting, deleting, and retrieving files to
its clients. Unlike CFS, PAST stores whole files rather than splitting
them into blocks. One design goal is to let the aggregate size of stored
files come close to the aggregate capacity of the PAST network before
insert requests are rejected. This is done in two ways: 1) replica
diversion and 2) file diversion.

In the replica diversion scheme, if a node cannot store a replica
locally, it asks a node in its leaf set whether it can store it instead.
In the file diversion scheme, if one of the k nodes with nodeIds closest
to the fileId declines to store a replica, the whole file is diverted by
generating a new fileId. Caching is used on nodes to improve throughput
and minimize latency.
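
The two diversion schemes can be sketched roughly as follows. This is a
simplified Python illustration under stated assumptions, not the paper's
implementation. The Node class, the tiny 16-bit id space, the salt-based
fileId derivation, and the parameters k, leaf_size and max_attempts are
all hypothetical. The "leaf set" is approximated by the next-closest
nodes to the fileId, and acceptance is decided purely by free space.

```python
import hashlib

ID_BITS = 16            # tiny id space for illustration (assumption)
SPACE = 1 << ID_BITS


class Node:
    """Hypothetical storage node that accepts a file only if it fits."""

    def __init__(self, node_id, capacity):
        self.node_id = node_id
        self.free = capacity
        self.files = {}  # file_id -> size

    def try_store(self, file_id, size):
        if size > self.free:
            return False
        self.files[file_id] = size
        self.free -= size
        return True

    def discard(self, file_id):
        self.free += self.files.pop(file_id)


def ring_distance(a, b):
    d = abs(a - b)
    return min(d, SPACE - d)


def file_id_for(name, salt):
    # Assumption: a new salt yields a new fileId for the same file name.
    digest = hashlib.sha1(f"{name}:{salt}".encode()).digest()
    return int.from_bytes(digest, "big") % SPACE


def insert(nodes, name, size, k=2, leaf_size=4, max_attempts=4):
    """Try to place k replicas; return (file_id, holder ids) or None."""
    for salt in range(max_attempts):           # each new salt = file diversion
        file_id = file_id_for(name, salt)
        ranked = sorted(
            nodes, key=lambda n: (ring_distance(n.node_id, file_id), n.node_id)
        )
        primaries = ranked[:k]
        spares = ranked[k:k + leaf_size]       # stand-in for leaf-set members
        holders = []
        for primary in primaries:
            # Replica diversion: if the primary declines, ask the spares.
            candidates = [primary] + [s for s in spares if s not in holders]
            holder = next(
                (c for c in candidates if c.try_store(file_id, size)), None
            )
            if holder is None:
                break
            holders.append(holder)
        if len(holders) == k:
            return file_id, [h.node_id for h in holders]
        for h in holders:                      # undo the partial insert
            h.discard(file_id)
    return None                                # insert request is rejected
```

For example, insert(nodes, "a.txt", 10) on a list of Node objects with
enough free space returns the fileId and the nodeIds holding the k
replicas. It returns None once every attempted fileId has failed.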

Experimental results show that storage management is essential for high
storage utilization and that caching improves the throughput of the
system.

 


