From: Andrew Putnam (aputnam@cs.washington.edu)
Date: Sun Oct 10 2004 - 21:33:51 PDT
A Digital Fountain Approach to Reliable Distribution of Bulk Data
John W. Byers, et al.
Summary:
This paper introduces the digital fountain, an efficient mechanism for
distributing files to a large number of destination nodes. The digital
fountain relies on erasure codes for efficient reconstruction of lost
packets and a low total reception time per transfer.
Comments:
TCP was designed primarily for fast point-to-point communication.
IP multicast and broadcast can deliver packets to multiple destinations,
but layering TCP-style reliability on top of them, with per-receiver
acknowledgments and retransmissions, does not scale and becomes
infeasible beyond a small number of destination nodes. So reliably
distributing a large amount of data to a large number of destinations
requires a different mechanism.
The authors propose a data distribution mechanism called a digital
fountain. A digital fountain works by rebroadcasting encoded data
constantly, which allows receivers to tune into the stream at any time
and eventually collect everything they need, without the source having
to process acknowledgments or retransmit particular packets to
particular destination nodes.
The most basic implementation of a digital fountain is a data carousel:
a constant rebroadcast of the file's packets in a fixed order. The
problem with this approach is that a receiver that misses a packet must
wait until the entire file has been sent again before that packet comes
around a second time. For large files, this delay may be unacceptable.
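As a rough illustration (my own sketch in Python, not code from the
paper), a data carousel can be simulated in a few lines; losing even a
single packet costs the receiver a full extra cycle:

    import itertools

    def carousel(n):
        # Sender side: cycle through packet indices 0..n-1 forever.
        return itertools.cycle(range(n))

    def transmissions_needed(n, lost_on_first_pass):
        # Receiver side: count packets sent until all n are held,
        # dropping each index in lost_on_first_pass the first time
        # it appears (a simulated loss).
        got, dropped, sent = set(), set(), 0
        for i in carousel(n):
            sent += 1
            if i in lost_on_first_pass and i not in dropped:
                dropped.add(i)
                continue
            got.add(i)
            if len(got) == n:
                return sent

    # Missing only packet 0 forces a wait of a full extra cycle:
    # 1001 transmissions to collect 1000 packets.
    print(transmissions_needed(1000, {0}))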
A much better implementation uses erasure codes, which allow a
destination node to reconstruct packets that it missed. The source
encodes the file's k packets into n > k encoded packets, and the
receiver can rebuild the original file from (slightly more than) any k
of them, rather than waiting for each specific missed packet to be
resent. The drawback is that the redundancy increases the total size of
the transmission by the stretch factor n/k.
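To make the idea concrete, here is a toy single-parity erasure code (my
own example, not one of the paper's codes): the sender adds one
redundant packet, and a receiver that loses any one packet can rebuild
it immediately instead of waiting for the carousel to come around.
Real codes such as Reed-Solomon or Tornado generalize this to recover
many losses at once.

    from functools import reduce

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def encode(data_packets):
        # k data packets plus one parity packet, the XOR of them
        # all: n = k + 1, a stretch factor of (k + 1) / k.
        return data_packets + [reduce(xor, data_packets)]

    # Packet 2 is lost; the XOR of every surviving packet (data
    # and parity) equals the missing one.
    encoded = encode([b'AAAA', b'BBBB', b'CCCC', b'DDDD'])
    survivors = encoded[:2] + encoded[3:]
    print(reduce(xor, survivors))   # -> b'CCCC'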
A key element of the digital fountain approach is the efficiency of
the Tornado codes developed by the authors. Tornado codes rely on XOR
computations, which are extremely fast and easily implemented in
hardware. This is a substantial improvement over standard erasure codes
such as Reed-Solomon, whose encoding and decoding are far more
expensive.
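The decoding step can be sketched as follows (my own toy construction;
the authors' actual graphs are carefully designed random bipartite
graphs): each check packet is the XOR of a few data packets, and the
decoder repeatedly finds a check with exactly one unknown neighbor,
recovering that packet with a single XOR.

    from functools import reduce

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    # Toy graph: check c is the XOR of the listed data packets.
    CHECKS = [(0, 1), (1, 2), (2, 3), (0, 3)]

    def encode(data):
        return [reduce(xor, [data[i] for i in idxs])
                for idxs in CHECKS]

    def decode(data, checks):
        # data: list with None for lost packets; checks: received
        # check packets keyed by their index in CHECKS. Repeatedly
        # solve any check with exactly one unknown neighbor.
        progress = True
        while progress and None in data:
            progress = False
            for c, idxs in enumerate(CHECKS):
                if c not in checks:
                    continue
                unknown = [i for i in idxs if data[i] is None]
                if len(unknown) == 1:
                    known = [data[i] for i in idxs
                             if data[i] is not None]
                    data[unknown[0]] = reduce(xor,
                                              known + [checks[c]])
                    progress = True
        return data

    original = [b'AA', b'BB', b'CC', b'DD']
    checks = dict(enumerate(encode(original)))
    received = [b'AA', None, None, b'DD']  # packets 1 and 2 lost
    print(decode(received, checks))        # all four recovered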
One questionable part of the analysis is the authors' choice of the
Internet trace used to demonstrate performance. The authors state that
the selected trace segment has just over 11% loss. This is remarkably
high for a standard Internet connection, particularly a wired one.
Since their approach outperforms the alternatives most dramatically in
networks with high loss rates, this choice is suspect: it caters to
their system's strength without demonstrating a benefit in the typical
case of < 1% loss.
The authors are also not careful in their comparison between the
Tornado Z code and interleaved codes. They state only that the
comparison is against "an interleaved code with comparable
inefficiency." This is not specific enough to allow others to reproduce
the results.