From: Cliff Schmidt (cliff_at_bea.com)
Date: Wed Mar 03 2004 - 15:22:30 PST
CFS was similar in a lot of ways to PAST. I was actually
surprised that there weren't even more comparisons to PAST
in the paper. So, for this review I'm going to list the
similarities and differences from PAST:
Similarities
- goals of efficiency, robustness, load-balancing of a
read-only file store (I don't think PAST emphasized
the read-only attribute as much as the CFS paper did,
but I think it is equally true of both). Both
systems only allow the publisher to modify a file (I
believe this is really only done by deleting and
inserting it again with the same id)
- both enforce publishing quotas
- use of a hash on public keys and names
- similar approach to caching
- similar configuration option to adjust the degree of
replication. Somewhat similar approach to replicating
across nodes with numerically adjacent node ids.
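
To make the hashing and replication points above concrete, here is a
minimal sketch. It is not taken from either paper: the use of SHA-1, the
way the key and name are concatenated, and the function names make_id and
closest_nodes are all assumptions for illustration only.

```python
import hashlib

def make_id(public_key: bytes, name: str) -> int:
    """Hypothetical id: SHA-1 over the publisher's public key plus the name."""
    digest = hashlib.sha1(public_key + name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def closest_nodes(file_id: int, node_ids: list[int], k: int) -> list[int]:
    """Return the k node ids numerically closest to file_id.

    Assumption: plain absolute distance, ignoring wraparound in the id space.
    k plays the role of the configurable degree of replication.
    """
    return sorted(node_ids, key=lambda n: abs(n - file_id))[:k]
```

Raising k stores more copies on neighboring ids, which is roughly the
knob both systems expose.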
Differences
- CFS arranges nodes on a circle, each with pointers to
its successors and a finger table for jumping ahead in
the circle (however, the PAST system's log N routing
table had a similar effect, IMO). CFS's circular path
requires a more strictly defined node id authentication
mechanism, so that malicious nodes cannot keep a query
from finding a succeeding node.
- CFS stripes the blocks of a file across nodes, which
is a better use of storage space and makes load
balancing much easier.
- CFS uses virtual servers to represent the greater
storage capacity that one node may have over another.
PAST accounted for varying storage sizes, but I believe
it accomplished this through metadata, rather than
making one server look like ten virtual servers if it
has ten times the base storage size. The CFS virtual
servers take up an impressively small 10KB per
additional virtual server (just the extra data
structure).
- CFS has no delete operation. There is only an
agreed-upon finite interval, after which the data
can be deleted as necessary, unless the publisher
requests an extension. CFS sees this as a feature
that automatically gets rid of unneeded data, including
large amounts of useless data inserted maliciously.
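
The circle, virtual-server, and striping points above can be sketched
roughly as follows. This is an illustration under stated assumptions, not
the CFS implementation: the Ring class, ring_id, and place_blocks are
hypothetical names, SHA-1 is assumed for all ids, and the lookup uses a
sorted global view of the circle rather than per-node finger tables.

```python
import bisect
import hashlib

def ring_id(label: str) -> int:
    """Hypothetical id for a virtual server: SHA-1 of a text label."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")

class Ring:
    def __init__(self):
        # Sorted list of (virtual server id, physical node name).
        self.points = []

    def add_node(self, name: str, capacity_units: int) -> None:
        """A node with N times the base storage appears as N virtual servers."""
        for i in range(capacity_units):
            bisect.insort(self.points, (ring_id(f"{name}#{i}"), name))

    def successor(self, key_id: int) -> str:
        """Physical node owning the first virtual id >= key_id, wrapping around."""
        i = bisect.bisect_left(self.points, (key_id, ""))
        return self.points[i % len(self.points)][1]

def place_blocks(ring: Ring, data: bytes, block_size: int) -> dict:
    """Stripe a file: each block goes to the successor of its own hash."""
    placement = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        block_id = int.from_bytes(hashlib.sha1(block).digest(), "big")
        placement[offset] = ring.successor(block_id)
    return placement
```

For example, after ring.add_node("big", 10) and ring.add_node("small", 1),
the larger node owns ten points on the circle and so should receive
roughly ten times as many blocks on average.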
I was surprised that the "Real life" test of CFS
only involved a single one-megabyte file. I assume
the authors felt they covered the issue of routing
well enough in other tests, but it wasn't clear to
me that the "Real life" test was valid without the
complexity of having other files in the system.
(I actually think this was a problem with their
explanation or my reading, because this would
otherwise be a major hole in their testing.)