Cerf & Kahn, A Protocol for Packet Network Interconnection, 1974
----------------------------------------------------------------

1. Summary

The paper presents a concrete design for an early version of the
internetworking and transport protocol that evolved to become TCP/IP. For
practical reasons, it is based on internetworking at a higher layer rather
than on converting between the protocols of different networks at gateways.
The paper identifies and reasons about the problems that must be solved to
make such a system work, presenting candidate mechanisms for each problem:

 o Variation in packet sizes   -> fragmentation
 o Variation in addresses      -> higher layer of addressing, IP and ports
 o Reliability across networks -> acknowledgements and retransmissions
 o Variation in bandwidths     -> sliding window flow control
 o Buffer management           -> headers to simplify implementation

2. Questions

Q: Why is internetworking hard?
A: Heterogeneity of networks coupled with minimal assumptions about what
   native services the underlying networks can provide. The heterogeneity
   shows up in bandwidth, sender/receiver capabilities such as buffering,
   addressing, packet sizes, error reporting, delays, ... These are
   essentially the problems above for which mechanisms are proposed.

Exercise: List these factors and the mechanisms to deal with them. It
should produce a list like the one in the summary above.

Q: What has changed today because it didn't pan out? (Perhaps these are
   non-obvious by definition :)
A: Here is my ordered list:

 i.   Split of TCP into IP + TCP/UDP/other. They assumed "TCP" would be
      able to support services all the way from virtual circuits (reliable,
      ordered bytestreams) to datagrams (single-packet connections). This
      turned out not to be the case, and today we have multiple transport
      protocols for different services, all on top of IP.
 ii.  Congestion collapse! TCP retransmission with a fixed window, e.g., 8
      packets, isn't good enough when the network is heavily loaded. This
      resulted in meltdowns by the 80s, which we will study in later papers.
 iii. The size of addresses was way too small -- 256 networks in the entire
      world! Note that this is before LANs were invented. The address space
      has been expanded at least twice (to 32 bits of class A, B, C
      networks, and then to 32 bits of classless prefixes) and is still
      under pressure to be expanded, as it is in the IPv6 protocol.
 iv.  The TCP handshake. They have a two-packet model in mind. This has
      reliability problems and was changed to the three-packet handshake
      that we use today, as proposed by Tomlinson.
 v.   Fragmentation. This is an interesting case study. They already
      propose fragmentation inside the network but reassembly only at the
      destination, to avoid excessive overhead at routers. However, even
      this turns out to have a number of gotchas -- it is a loss/bandwidth
      multiplier because any lost fragment dooms the entire packet; it is a
      security hole because end-hosts must maintain state to do the
      reassembly; it is a burden on routers (as they are currently
      engineered); and it decreases end-to-end performance. Instead, modern
      implementations mark packets as "do not fragment" and get an error
      indication from the network if the packet is too large to send,
      thereby learning the size of the largest packet that will fit down
      the path they are using. (See the sketch after this list.)
 vi.  There are many, many more icky details that mostly correspond to
      refinements in corner cases that have been discovered over time
      (checksums cover the IP pseudo-header for end-to-end reliability, the
      Nagle algorithm, PAWS, timestamps, silly window syndrome, window
      probes, MTU discovery, ...)
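To make the loss-multiplier point in (v) concrete, here is a minimal Python
sketch, not from the paper: a packet is fragmented to fit a path MTU,
reassembly happens only at the destination, and losing any one fragment
loses the whole packet. The MTU, header size, loss rate, and the helper
names (fragment, reassemble, send_over_lossy_link) are all illustrative
assumptions, not anything the paper specifies.

    import random

    MTU = 576      # assumed path MTU in bytes (illustrative)
    HEADER = 20    # assumed per-fragment header overhead (illustrative)

    def fragment(packet: bytes, mtu: int = MTU):
        """Split a packet into fragments that fit the path MTU."""
        payload = mtu - HEADER
        return [packet[i:i + payload] for i in range(0, len(packet), payload)]

    def reassemble(fragments):
        """Destination-only reassembly: fails if any fragment was lost."""
        if any(f is None for f in fragments):
            return None              # one lost fragment dooms the packet
        return b"".join(fragments)

    def send_over_lossy_link(fragments, loss_rate=0.01):
        """Each fragment independently survives with prob. 1 - loss_rate."""
        return [f if random.random() > loss_rate else None for f in fragments]

    if __name__ == "__main__":
        random.seed(0)
        packet = bytes(8000)         # an 8 KB packet -> ~15 fragments
        frags = fragment(packet)
        trials = 10_000
        lost = sum(reassemble(send_over_lossy_link(frags)) is None
                   for _ in range(trials))
        # With per-fragment loss p and n fragments, packet loss is about
        # 1 - (1-p)^n, i.e. roughly n*p for small p: the loss multiplier.
        print(f"{len(frags)} fragments, 1% fragment loss "
              f"-> packet loss ~{lost / trials:.1%}")

With these illustrative numbers the simulated packet loss comes out around
14%, even though each fragment is lost only 1% of the time, which is why
destination-only reassembly across a lossy path is a bad deal.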
Exercise: Go over the key aspects of mechanism that are needed, looking at
a modern TCP/IP header:
    hierarchical addressing
    fragmentation
    ports
    sliding windows
    flow control
    connection setup
(A sketch of where these fields sit in modern IPv4/TCP headers appears at
the end of these notes.)

3. Insights

They had a single-transport-centric view; this was before the TCP/IP split.
They thought TCP would support services all the way from virtual circuits
to datagrams. The latter conceptually fits as a single packet with a SYN
and a REL (FIN).

It's inspiring how much they got right, and how much terminology was
carried over. This is 30 years back, and the framework and many mechanisms
are still visible.

This is engineering. They are concerned with designing mechanisms to solve
problems in a cost-effective way. The technology at the time put bandwidth,
host software processing, and complexity at a premium, favoring small
headers and simple-to-implement schemes, e.g., for buffer management and
fragmentation. Should we still be concerned with the number of bits in the
header?

Their worldview was "loss is rare". Earlier network links were designed to
minimize loss by allocating buffers ahead of time, e.g., X.25 [Check this].
Loss is sometimes not rare at all, particularly in congested networks as
began to occur later, peaking with the congestive collapses in the mid to
late 80s and the invention of other mechanisms to handle congestion.

4. Concepts

From the paper and Peterson you should be familiar with all of the
following and how they work:
 -Ports
 -Hierarchical addressing
 -Fragmentation
 -Sliding windows
 -Acknowledgements and retransmissions
 -Flow control
 -Connection setup
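As a concrete reference for the exercise and the concepts above, here is a
minimal Python sketch (not from the paper or Peterson) that packs a
bare-bones IPv4 and TCP header with the standard library's struct module.
It shows where the mechanisms live: addresses and the fragmentation fields
in the IP header; ports, sequence/acknowledgement numbers, SYN/FIN flags
for connection setup, and the advertised window for sliding-window flow
control in the TCP header. The helper names and field values are
illustrative assumptions, and checksums are left at zero.

    import socket
    import struct

    # IPv4 header (20 bytes, no options): hierarchical addresses live here,
    # along with identification / flags / fragment offset for fragmentation.
    def ipv4_header(src, dst, payload_len, ident=1):
        ver_ihl = (4 << 4) | 5        # version 4, header length 5 * 4 bytes
        flags_frag = 0x4000           # DF bit set: "do not fragment"
        return struct.pack(
            "!BBHHHBBH4s4s",
            ver_ihl, 0, 20 + payload_len,  # version/IHL, TOS, total length
            ident, flags_frag,             # identification, flags + frag offset
            64, 6, 0,                      # TTL, protocol = TCP, checksum (0)
            socket.inet_aton(src), socket.inet_aton(dst))

    # TCP header (20 bytes, no options): ports, sequence and acknowledgement
    # numbers, SYN/FIN flags, and the advertised window.
    def tcp_header(sport, dport, seq, ack, flags, window):
        offset = 5 << 12              # data offset = 5 32-bit words
        return struct.pack(
            "!HHIIHHHH",
            sport, dport, seq, ack,
            offset | flags,           # SYN = 0x02, ACK = 0x10, FIN = 0x01
            window, 0, 0)             # window, checksum (0), urgent pointer

    if __name__ == "__main__":
        syn = tcp_header(50000, 80, seq=1000, ack=0, flags=0x02, window=65535)
        pkt = ipv4_header("10.0.0.1", "10.0.0.2", len(syn)) + syn
        print(f"{len(pkt)}-byte SYN segment: {pkt.hex()}")

Walking through which field each concept maps onto (and noticing that the
1974 paper put all of this in one header) is a good way to do the exercise.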