From: Tom Christiansen (tomchr@ee.washington.edu)
Date: Tue Oct 12 2004 - 19:51:07 PDT
In this rather wordy article, the authors discuss various reliability
issues with file and data transfer via networks. The first part of the
article deals with the different error modes ranging from physical media
read/write errors through CPU/RAM errors to catastrophic errors such as
server crashes, power failures, etc. The remaining part is essentially a
list of different reliability/performance tradeoffs.

The main argument throughout the article is that end-to-end error
detection, management, and correction (via retransmission) are best handled
at the application layer (or even through user-to-user interaction), as this
provides the best tradeoff between reliability and performance. This was
probably the case in 1984, when the paper was written and computing power,
especially that of embedded systems, was limited. These days, fast FPGAs,
microcontrollers, etc. make it entirely feasible to implement error detection
on network adapters, routers, and gateways without any significant
degradation in performance.
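
To make that end-to-end idea concrete, here is a minimal Python sketch of an
application-layer check with correction via retransmission. The names
(send_with_end_to_end_check, flaky_channel) are my own illustration rather
than anything from the paper, and sender and receiver are collapsed into one
function for brevity; in a real transfer the verification would happen at the
receiving host, which would NAK a corrupted copy.

    import hashlib

    def send_with_end_to_end_check(payload, transmit, max_retries=3):
        # Append a SHA-256 digest, push the frame through an unreliable
        # channel, recompute the digest on what came out, and retry the
        # whole transfer until the two digests agree.
        digest = hashlib.sha256(payload).digest()
        for attempt in range(max_retries):
            received = transmit(payload + digest)
            data, tail = received[:-32], received[-32:]
            if hashlib.sha256(data).digest() == tail:
                return data  # verified end to end
        raise IOError("end-to-end check failed after %d attempts" % max_retries)

    def flaky_channel():
        # Toy channel that corrupts one byte on the first attempt only.
        state = {"first": True}
        def transmit(frame):
            if state["first"]:
                state["first"] = False
                return bytes([frame[0] ^ 0x01]) + frame[1:]
            return frame
        return transmit

    print(send_with_end_to_end_check(b"hello, world", flaky_channel()))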

According to the authors, hardware cannot be trusted. Thus, when
transferring a file from one host to another, the communications protocol
should verify that the data was read correctly, transmitted across the
network correctly, and written correctly to the storage media at the
receiver before acknowledging the transmission. Naturally, this cannot all
happen in the lower transport layers. This argument was likely true in
1984, but it is much less the case today: if the data make it safely across
the network, the transfer can generally be considered successful, since the
main sources of transmission errors are network collisions and data
corruption caused by electrical interference, electrostatic discharge,
lightning strikes, etc.
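
As a rough sketch of that receiver-side discipline (again my own
illustration, not code from the paper), the receiver writes the file, reads
it back from storage, and only acknowledges once the read-back copy matches
the digest supplied by the sender. On a modern OS the read-back may well be
served from the page cache rather than the physical disk, so this only
approximates "written correctly to the storage media".

    import hashlib, os, tempfile

    def receive_and_acknowledge(data, expected_digest, dest_path):
        # Write the received data, force it out to the device, then read it
        # back and compare against the sender's digest before acknowledging.
        with open(dest_path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        with open(dest_path, "rb") as f:
            written = f.read()
        return "ACK" if hashlib.sha256(written).digest() == expected_digest else "NAK"

    payload = b"example file contents"
    path = os.path.join(tempfile.gettempdir(), "e2e_demo.bin")
    print(receive_and_acknowledge(payload, hashlib.sha256(payload).digest(), path))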

Another major difference between then and now is the development of
operating systems. Modern OSes contain event handlers to deal with errors
when communicating with local hardware. This was not the case in the
mid-1980s.

The authors' approach allows application programmers to tailor the error
detection/correction algorithms directly to the needs of each particular
application, but because every application must then implement its own error
handling, it also makes application development a somewhat slow process. The
modern philosophy is to encapsulate and reuse program modules so that
applications can be developed more quickly and with fewer errors (assuming
the support libraries are well tested).

The paper does present one good point, though: there are applications
(voice communication and voice messaging systems being the examples
mentioned) where some level of data corruption or loss can be tolerated.