From: Brian Milnes (brianmilnes_at_qwest.net)
Date: Wed Mar 03 2004 - 12:37:24 PST
Wide Area Cooperative Storage with CFS - Dabek et al
The authors describe the design of CFS, a distributed, read-only, peer-to-peer
file system that differs from existing systems in being simultaneously
symmetric, decentralized, unmanaged, fast, load-balanced, and fault-tolerant.
CFS is built from the SFS file-system toolkit layered on DHash block storage,
which in turn uses Chord for node lookup. CFS does not attempt to provide
anonymity, so that it can focus on efficiency and reliability.
The system stores data blocks, signed by the publisher, for a fixed period of
time. Clients must renew files to keep them alive, and only the publisher can
modify a directory. Each IP address has a quota to prevent malicious attempts
to fill the file system.
Servers are located using Chord, which provides a highly available,
distributed version of consistent hashing.
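As a rough illustration of the consistent-hashing lookup Chord provides (a toy
sketch only, not the real finger-table protocol; the node names and the 32-bit
ID space are made up here):

```python
import hashlib
from bisect import bisect_left

def ring_id(name: str, bits: int = 32) -> int:
    # Map a name onto the identifier circle. Chord uses 160-bit SHA-1 IDs;
    # 32 bits here just keeps the numbers small.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (1 << bits)

class Ring:
    def __init__(self, nodes):
        self.ids = sorted(ring_id(n) for n in nodes)
        self.by_id = {ring_id(n): n for n in nodes}

    def successor(self, key: str) -> str:
        # A key lives at its successor: the first server whose ID is equal
        # to or follows the key's ID, wrapping around the circle.
        i = bisect_left(self.ids, ring_id(key)) % len(self.ids)
        return self.by_id[self.ids[i]]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owner = ring.successor("block-1234")  # the server responsible for this block
```

The appeal of consistent hashing is visible even in this sketch: when a server
joins or leaves, only the keys on the arc adjacent to it change owners.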
DHash is used to partition the blocks onto many servers. This balances the
load and prevents large files from filling any single server. DHash places k
copies of each block on the k servers that immediately succeed the block's ID
on the ring and re-replicates them when a server leaves. DHash also keeps an
LRU cache of popular blocks, so that as servers request hot blocks the blocks
become replicated throughout the system.
CFS balances load by consistently hashing blocks onto servers, but a server's
share may still vary by a factor of O(log N), so stronger servers run multiple
virtual servers to take on proportionally more load. CFS also limits storage
by allowing each client, identified by IP address, to use only 1/1000 of any
host's available storage.
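The virtual-server idea is simple to demonstrate (hypothetical host names and
capacities; the capacity numbers are arbitrary):

```python
import hashlib
from bisect import bisect_right
from collections import Counter

def ring_id(name, bits=32):
    # Toy stand-in for Chord's SHA-1 identifiers.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (1 << bits)

def build_ring(capacities):
    # A host with capacity c runs c virtual servers, each hashed to its own
    # spot on the ring, so the host owns roughly c arcs' worth of blocks.
    ring = [(ring_id(f"{host}#{v}"), host)
            for host, c in capacities.items() for v in range(c)]
    ring.sort()
    return ring

def owner(ring, key):
    i = bisect_right(ring, (ring_id(key), "")) % len(ring)
    return ring[i][1]

# Hypothetical hosts: "big" advertises 8x the capacity of "small".
ring = build_ring({"small": 4, "big": 32})
load = Counter(owner(ring, f"block-{i}") for i in range(10000))
# With many virtual servers per host, load tracks capacity, so "big"
# absorbs the large majority of the blocks.
```

Averaging over many virtual servers also smooths out the O(log N) variance of
a single hash position.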
CFS is 7,000 lines of C++, including 3,000 lines for Chord, and runs on
several free operating systems. Clients mount CFS as an NFS file system. CFS
uses SFS's RPC library over UDP. This seems dangerous in that they won't get
good retransmission behavior without carefully working out TCP-like
congestion-control and retransmission algorithms.
They deployed the system on 12 machines spread across the Internet and
achieved download speeds near those of a direct client-to-server TCP
transfer, though only a modest 64 KB per second. Their replication and
caching algorithms both seem to work quite well: when as many as 20% of the
servers fail, no lookups fail during reconstruction, and even with as many as
50% of the servers failed the cost averages only one extra RPC per lookup.
CFS does not protect against malicious servers, and it requires a separate
distributed search engine for keyword lookup. This is a very nice, reliable,
read-only peer-to-peer system. I wonder how hard it would be to adapt it for
local file access with read-write capability.
This archive was generated by hypermail 2.1.6 : Wed Mar 03 2004 - 12:37:28 PST