CFS Review

From: Brian Milnes (brianmilnes_at_qwest.net)
Date: Wed Mar 03 2004 - 12:37:24 PST

    Wide Area Cooperative Storage with CFS - Dabek et al.

    The authors describe the design of CFS, a distributed read-only peer-to-peer
    file system that differs from existing systems in being simultaneously
    symmetric, decentralized, unmanaged, fast, load balanced, and fault tolerant.
    CFS is built using the SFS file system code on DHash block storage over
    Chord node lookup. CFS does not attempt to provide anonymity, so it can
    focus on efficiency and reliability.

    The system stores data blocks, with root blocks signed under the publisher's
    public key, for a fixed period of time. Publishers must renew their files to
    keep them stored, and only the publisher can modify its directories. Each IP
    address has a quota to prevent malicious attempts to fill the file system.
    Servers are located using Chord, which provides an available, distributed
    implementation of consistent hashing.
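
    As a rough sketch of the consistent hashing Chord provides (not the authors'
    code): block and server identifiers share one circular ID space, and a block
    belongs to the first server whose ID follows its own. The names below are
    illustrative, std::hash stands in for the SHA-1 IDs CFS actually uses, and
    the real Chord resolves successors in O(log N) hops via finger tables rather
    than one sorted map.

        // Consistent hashing sketch: nodes and blocks hash onto one ring;
        // a block is stored at its "successor", the first node at or after
        // the block's ID, wrapping around the end of the ring.
        #include <cstdint>
        #include <functional>
        #include <iostream>
        #include <map>
        #include <string>

        struct Ring {
            std::map<uint64_t, std::string> nodes;  // node ID -> node name, sorted

            void add_node(const std::string& name) {
                nodes[std::hash<std::string>{}(name)] = name;
            }
            // First node at or after the given ID, wrapping around the ring.
            const std::string& successor(uint64_t id) const {
                auto it = nodes.lower_bound(id);
                if (it == nodes.end()) it = nodes.begin();
                return it->second;
            }
            const std::string& lookup_block(const std::string& block) const {
                return successor(std::hash<std::string>{}(block));
            }
        };

        int main() {
            Ring ring;
            ring.add_node("server-A");
            ring.add_node("server-B");
            ring.add_node("server-C");
            std::cout << "block-42 lives on " << ring.lookup_block("block-42") << "\n";
        }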

    DHash partitions blocks across many servers, which balances load and
    prevents a large file from filling any single server. DHash places k copies
    of each block on the k servers that immediately succeed the block's ID on
    the ring and re-replicates them when a server leaves. DHash also keeps an
    LRU cache of popular blocks on each server, so that as lookups for hot
    blocks pass through the system those blocks become replicated along the
    lookup paths.
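
    A minimal sketch of that replica placement, assuming a sorted ring of server
    IDs (server names and k are placeholders, not the authors' code): the k
    replicas of a block sit on the k servers that immediately succeed the
    block's ID, so losing one server leaves the others to repopulate the next
    successor.

        // DHash-style replica placement sketch: walk the ring from the
        // block's ID and return the next k servers.
        #include <cstdint>
        #include <functional>
        #include <iostream>
        #include <set>
        #include <string>
        #include <utility>
        #include <vector>

        using RingSet = std::set<std::pair<uint64_t, std::string>>;  // (ID, server)

        std::vector<std::string> replica_servers(const RingSet& ring,
                                                 const std::string& block,
                                                 std::size_t k) {
            std::vector<std::string> out;
            uint64_t id = std::hash<std::string>{}(block);
            auto it = ring.lower_bound({id, ""});
            while (out.size() < k && out.size() < ring.size()) {
                if (it == ring.end()) it = ring.begin();  // wrap around the ring
                out.push_back(it->second);
                ++it;
            }
            return out;
        }

        int main() {
            RingSet ring;
            for (std::string name : {"s1", "s2", "s3", "s4", "s5"})
                ring.insert({std::hash<std::string>{}(name), name});
            for (const auto& s : replica_servers(ring, "block-42", 3))
                std::cout << s << " holds a replica of block-42\n";
        }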

    CFS balances load by consistently hashing blocks onto servers, but a single
    server's share can still vary by a factor of O(log N), so CFS runs multiple
    virtual servers per physical host to shift more load onto stronger servers.
    CFS limits storage consumption by allowing each publisher to use only 1/1000
    of any host's available storage.
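
    A small sketch of the virtual-server idea (hosts, capacities, and the hash
    are illustrative): a more capable physical host registers several virtual
    IDs on the ring, so in expectation it is assigned a proportionally larger
    share of the blocks.

        // Virtual servers sketch: each physical host owns one ring entry per
        // unit of capacity; a block is assigned to the host owning its
        // successor entry, so capable hosts take more of the load.
        #include <cstdint>
        #include <functional>
        #include <iostream>
        #include <map>
        #include <string>

        int main() {
            // host -> number of virtual servers it runs (proportional to capacity)
            std::map<std::string, int> capacity = {{"small-host", 1}, {"big-host", 4}};

            std::map<uint64_t, std::string> ring;  // virtual ID -> physical host
            for (const auto& [host, vnodes] : capacity)
                for (int i = 0; i < vnodes; ++i)
                    ring[std::hash<std::string>{}(host + "#" + std::to_string(i))] = host;

            // Count how many of a sample of block IDs each host ends up storing.
            std::map<std::string, int> stored;
            for (int b = 0; b < 10000; ++b) {
                uint64_t id = std::hash<std::string>{}("block-" + std::to_string(b));
                auto it = ring.lower_bound(id);
                if (it == ring.end()) it = ring.begin();  // wrap around the ring
                ++stored[it->second];
            }
            for (const auto& [host, n] : stored)
                std::cout << host << " stores " << n << " of 10000 blocks\n";
        }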

    CFS is about 7,000 lines of C++, including 3,000 lines for Chord, and runs
    on several free operating systems. Clients mount CFS as a local NFS file
    system. CFS communicates over UDP using an SFS RPC library. This seems
    risky: without carefully worked out TCP-like retransmission and congestion
    control, they are unlikely to get good retransmission behavior.
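
    To make that concern concrete, this is the kind of TCP-like retransmission
    policy (exponential backoff with a retry cap) an RPC layer over UDP needs;
    send_request and reply_received are stand-ins, not the SFS RPC interface.

        // Retransmission sketch: retry with a doubling timeout and give up
        // after a bounded number of attempts.
        #include <chrono>
        #include <cstdio>
        #include <thread>

        bool send_request(int attempt) { std::printf("sending attempt %d\n", attempt); return true; }
        bool reply_received() { return false; }  // placeholder: pretend every reply is lost

        int main() {
            auto timeout = std::chrono::milliseconds(100);    // initial timeout guess
            for (int attempt = 1; attempt <= 5; ++attempt) {  // bounded retries
                send_request(attempt);
                std::this_thread::sleep_for(timeout);
                if (reply_received()) return 0;
                timeout *= 2;                                 // exponential backoff
            }
            std::fprintf(stderr, "request failed after repeated timeouts\n");
            return 1;
        }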

    They evaluated CFS on 12 machines spread across the Internet and achieved
    download speeds close to direct client-to-server TCP transfers, though these
    were a modest 64K bytes per second. Their replication and caching algorithms
    both seem to work quite well. When as many as 20% of the servers fail, no
    lookups fail during reconstruction, and even when as many as 50% of the
    servers fail the cost is on average only about one extra RPC per lookup.

    CFS does not protect against malicious servers and still requires a separate
    distributed search engine to find content. This is a very nice, reliable
    read-only peer-to-peer system. I wonder how hard it would be to adapt it for
    local file access with read-write capabilities?

