A Digital Fountain Approach to Reliable Distribution of Bulk Data
-----------------------------------------------------------------
Byers et al., 1998.

1. Summary

This paper explores how forward error correction (FEC; error-correcting codes) can be used instead of acknowledgements and retransmissions (ARQ) to provide reliable data transfer in practice. The setting here is multicast (one-to-many), for which FEC has a particular advantage: different receivers can lose different packets and it does not matter, whereas with ARQ every packet lost by any receiver would have to be retransmitted. Specifically, the paper describes the concept of an ideal digital fountain and shows how recent codes (Tornado codes) come closer to realizing it in practice (in terms of channel efficiency for the available computational overhead) than earlier schemes (based on parity over blocks of packets or Reed-Solomon codes).
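To make the "ideal digital fountain" idea concrete -- any sufficiently large subset of the encoded packets lets a receiver reconstruct the original data, regardless of which packets were lost -- here is a minimal toy sketch in Python. It uses dense random XOR combinations decoded by Gaussian elimination over GF(2); this is not the Tornado construction from the paper (Tornado codes use sparse graphs to get roughly linear-time encoding and decoding, at the cost of needing slightly more than k packets), and the names and parameters are illustrative only.

```python
import random

def encode(source_blocks, num_encoded, seed=0):
    """Emit encoded 'packets', each the XOR of a random subset of the k
    source blocks.  The subset is shipped alongside the value here; a real
    system would derive it from a small seed carried in the packet header."""
    rng = random.Random(seed)
    k = len(source_blocks)
    encoded = []
    for _ in range(num_encoded):
        # Non-empty random subset of source indices (dense, ~k/2 on average).
        subset = [i for i in range(k) if rng.random() < 0.5] or [rng.randrange(k)]
        value = 0
        for i in subset:
            value ^= source_blocks[i]
        encoded.append((subset, value))
    return encoded

def decode(received, k):
    """Gaussian elimination over GF(2).  Each received packet is one linear
    equation over the k unknown source blocks; return the blocks once the
    equations have full rank, else None (meaning: keep listening)."""
    rows = [(sum(1 << i for i in subset), value) for subset, value in received]
    solved = [None] * k                      # solved[i]: pivot row for unknown i
    for mask, value in rows:
        for i in range(k):                   # reduce against existing pivots
            if (mask >> i) & 1 and solved[i] is not None:
                pmask, pvalue = solved[i]
                mask, value = mask ^ pmask, value ^ pvalue
        if mask:                             # new pivot at the lowest set bit
            pivot = (mask & -mask).bit_length() - 1
            solved[pivot] = (mask, value)
    if any(row is None for row in solved):
        return None
    for i in reversed(range(k)):             # back-substitute to isolate each unknown
        mask, value = solved[i]
        for j in range(i + 1, k):
            if (mask >> j) & 1:
                jmask, jvalue = solved[j]
                mask, value = mask ^ jmask, value ^ jvalue
        solved[i] = (mask, value)
    return [value for _, value in solved]

if __name__ == "__main__":
    random.seed(1)
    source = [random.randrange(2 ** 32) for _ in range(8)]    # 8 source "packets"
    fountain = encode(source, num_encoded=64, seed=2)
    random.shuffle(fountain)                                   # which packets survive is arbitrary
    received, recovered = [], None
    while recovered is None and fountain:                      # collect until the data decodes
        received.append(fountain.pop())
        recovered = decode(received, k=len(source))
    print(f"decoded from {len(received)} packets:", recovered == source)
```

The point of the demo is the fountain property: the receiver does not care which encoded packets arrive or in what order, only that enough of them do.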
2. Questions

Q: To compare ARQ and FEC, when would you use one or the other?
A: Note that FEC can be applied at different granularities, e.g., within a packet, across a small number of packets, or across large files. There are several factors in general:
-Timeliness. Retransmissions take time. FEC can be used over small amounts of data when the receiver cannot wait for a retransmission, e.g., real-time video streaming.
-Error model. Retransmissions are most efficient when packets are expected to cross the channel successfully but occasionally fail to arrive. FEC is needed when packets are not expected to cross the channel successfully without it; e.g., an error rate of 1/1000 could mean one bit error per 1000 bits or bursts of 1000 errored bits at once, and the two call for different mechanisms.
-Scalability. FEC across packets interacts well with multicast because the different receivers can lose different packets without penalty (the simulation sketch at the end of these notes illustrates this).
-Reverse Channel. Retransmissions require one; FEC depends on it much less. The digital fountain can work well without any reverse channel at all, which is rather surprising.

Q: What is the contribution of this paper?
A: It's not the codes (since they are covered and claimed elsewhere), so it must be something else. It is the concept of a digital fountain and the application of Tornado codes to a networking problem to come closer to achieving it in practice (in terms of channel efficiency for the available computational overhead) than before. Note that the computational difference is in the order of growth, not just a constant factor, so the advantage will stay relevant as file sizes increase even as processors get faster.

3. Insights

This is a fairly radical alternative way to provide reliability compared to ARQ -- you don't even need a reverse channel! Yet it has not caught on very rapidly for content distribution. There are a number of reasons why this might be the case:
-It is relatively complex (congestion control, dependence on IP-level multicast, coding) but not really necessary. Other content systems based on caching work well enough, e.g., CDNs and, more recently, BitTorrent; caching in effect spreads the multicast fan-out over time.
-It makes little difference over the earlier coding schemes when the loss rate is low, which may be the common case, and the advantages are somewhat exaggerated anyway, e.g., parallelism could be used to speed up decoding.

The study using MBONE traces for the loss events and the consideration of congestion control are very nice; they make the resulting system much more real than a direct analytic evaluation of the algorithm. Both of these factors also bring up issues -- how to handle periods of very high loss, and the interleaving problem across layers being sent at different rates.

4. Concepts

You should be clear on ARQ and FEC from Peterson and the paper.
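As a rough feel for the scalability factor above (why FEC across packets interacts well with multicast), here is a back-of-the-envelope simulation. It is a sketch under idealized assumptions -- independent per-receiver losses, a perfect fountain code, and only the first repair round counted for ARQ -- not a model of the paper's system, and the parameter values are made up.

```python
import random

def multicast_overhead(num_receivers=100, num_packets=1000, loss_rate=0.01,
                       trials=10, seed=0):
    """Extra packets the sender must multicast when every receiver independently
    drops each packet with probability loss_rate (first repair round only)."""
    rng = random.Random(seed)
    arq_total = fec_total = 0
    for _ in range(trials):
        losses = [{p for p in range(num_packets) if rng.random() < loss_rate}
                  for _ in range(num_receivers)]
        # ARQ: any packet lost by *any* receiver must be multicast again.
        arq_total += len(set().union(*losses))
        # Ideal fountain FEC: any num_packets received packets reconstruct the
        # file, so the sender only needs as many extras as the worst receiver lost.
        fec_total += max(len(lost) for lost in losses)
    print(f"{num_receivers} receivers, {num_packets} packets, {loss_rate:.0%} loss:")
    print(f"  ARQ repairs needed (union of losses), avg: {arq_total / trials:.0f}")
    print(f"  FEC extras needed (worst receiver),   avg: {fec_total / trials:.0f}")

if __name__ == "__main__":
    multicast_overhead()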