Review of Dabek et al. "Wide Area Cooperative Storage with CFS"

From: Cliff Schmidt (cliff_at_bea.com)
Date: Wed Mar 03 2004 - 15:22:30 PST

  • Next message: Tarik Nesh-Nash: ""Wide Area Cooperative Storage with CFS" Review"

    CFS is similar to PAST in a lot of ways. I was actually
    surprised that there weren't more comparisons to PAST
    in the paper. So, for this review I'm going to list the
    similarities to and differences from PAST:

    Similarities
    - shared goals of efficiency, robustness, and load
    balancing for a read-only file store (I don't think
    PAST emphasized the read-only attribute as much as the
    CFS paper did, but I think it is equally true of both).
    Both systems allow only the publisher to modify a file
    (I believe this is really only done by deleting it and
    inserting it again with the same id)

    - both enforce publishing quotas

    - both derive ids from a hash of public keys and names

    - similar approach to caching

    - both have a configuration option to adjust the degree
    of replication, and a somewhat similar approach of
    replicating across nodes with numerically nearby ids.
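
    To make the hashing and replica-placement bullets
    concrete, here is a minimal Python sketch. It assumes
    SHA-1 for ids and puts copies on the k nodes that follow
    a key on a circular id space (the successor-style
    variant). The names ring_id and replica_nodes and the
    sample ids are illustrative only and do not come from
    either paper.

```python
import hashlib
from bisect import bisect_left


def ring_id(data: bytes) -> int:
    """Map arbitrary bytes (a key, a name, a block) to a 160-bit id.

    SHA-1 is an assumption made for this sketch.
    """
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def replica_nodes(key: int, node_ids: list[int], k: int) -> list[int]:
    """Return the k node ids that follow `key` clockwise on the ring.

    The first entry is the key's successor (the first node id >= key,
    wrapping around to the smallest id); the rest are the nodes
    immediately after it.
    """
    ring = sorted(node_ids)
    start = bisect_left(ring, key) % len(ring)
    count = min(k, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


if __name__ == "__main__":
    # Hypothetical node names, hashed into ids the same way as content.
    nodes = [ring_id(name.encode()) for name in ("n1", "n2", "n3", "n4", "n5")]
    key = ring_id(b"publisher-public-key" + b"/some/file/name")
    for node in replica_nodes(key, nodes, k=3):
        print(hex(node))
```

    Raising k is the "degree of replication" knob in this
    sketch: the copies always sit on ids adjacent to the key,
    so a lookup that reaches the right neighborhood finds a
    replica.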

    Differences
    - CFS arranges node ids in a circle, with pointers to
    successors and a finger table for jumping ahead in the
    circle (however, the PAST system's log N routing table
    has a similar effect, IMO). The circular layout requires
    CFS to include a more clearly defined node id
    authentication mechanism, so that malicious nodes can't
    keep a query from finding the successor node.

    - CFS stripes the blocks of a file across nodes, rather
    than storing whole files as PAST does, which is a
    better use of storage space and makes load balancing
    much easier.

    - CFS uses virtual servers to represent the greater
    storage capacity that one node may have over another.
    PAST accounted for varying storage sizes, but I believe
    it accomplished this through metadata, rather than
    making one server look like ten virtual servers if it
    has ten times the base storage size. Each additional
    CFS virtual server costs an impressively small 10KB
    (just the extra data structure).

    - CFS has no delete operation. Data is stored only for
    an agreed-upon finite interval, after which it can be
    deleted as necessary unless the publisher requests an
    extension. CFS sees this as a feature that
    automatically gets rid of unneeded data, including
    large amounts of useless data inserted maliciously.
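
    The striping and virtual-server bullets can be sketched
    together in Python. This is only an illustration under
    stated assumptions: ids come from SHA-1, each block is
    stored on the first virtual server whose id follows the
    block's id, and a host with ten capacity units runs ten
    virtual servers. The names build_ring and place_blocks,
    the host names, and the 1024-byte block size are
    hypothetical.

```python
import hashlib
from bisect import bisect_left
from collections import Counter


def ring_id(data: bytes) -> int:
    """160-bit id from SHA-1 (an assumption made for this sketch)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def build_ring(capacity_units: dict[str, int]) -> list[tuple[int, str]]:
    """Create one ring position per virtual server.

    A host with N capacity units gets N positions, so it should end up
    owning roughly N times the id space of a one-unit host.
    """
    ring = [
        (ring_id(f"{host}#{index}".encode()), host)
        for host, units in capacity_units.items()
        for index in range(units)
    ]
    return sorted(ring)


def place_blocks(data: bytes, block_size: int,
                 ring: list[tuple[int, str]]) -> Counter:
    """Split `data` into blocks and count blocks per physical host."""
    ids = [virtual_id for virtual_id, _ in ring]
    load: Counter = Counter()
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # The block's id is the hash of its contents; it is stored on
        # the first virtual server at or after that id, wrapping around.
        owner = bisect_left(ids, ring_id(block)) % len(ring)
        load[ring[owner][1]] += 1
    return load


if __name__ == "__main__":
    ring = build_ring({"small-host": 1, "big-host": 10})
    data = b"".join(i.to_bytes(4, "big") for i in range(262144))  # 1 MB
    print(place_blocks(data, 1024, ring))
```

    Because each block is placed independently, one large
    file spreads over many hosts instead of filling a single
    one, and the extra ring positions are what steer more
    blocks toward the larger host.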

    I was surprised that the "Real life" test of CFS
    involved only a single one-megabyte file. I assume
    the authors felt they had covered routing well enough
    in other tests, but it wasn't clear to me that the
    "Real life" test was valid without the complexity of
    having other files in the system. (I actually think
    this was a problem with their explanation or my
    reading, because it would otherwise be a major hole
    in their testing.)


