Wireshark is already installed on CSE machines. If it is not already installed on your machine, your standard mechanism for installing software might provide it, or else you can download it from www.wireshark.org.
Trace File
The trace file is here.
The Wireshark GUI has three main sections, as shown in the figure below. In the top panel is a list of the packets in the trace. The bottom two panels show details on a single packet, selected by clicking on one in the top panel. The middle panel shows header fields - each protocol layer adds a header that encapsulates the information passed down by the layer above. The bottom panel shows the raw bytes that make up the packet. (Note that we're using "packet" as a general term here. Strictly speaking, a unit of information at the link layer (which is what is captured in the trace) is called a frame. At the network layer (IP), the unit is called a packet, at the transport layer a datagram or a segment (depending), and at the application layer a message. We'll often use packet out of habit, though.)
Select a packet for which the Protocol column is HTTP and the Info column says it is a GET. This is a packet that carries a web (HTTP) request, for instance as sent from your browser to a web server. Let's have a closer look to see how the packet structure reflects the protocols that are in use.
Figure 1: Example Wireshark session
Since we are fetching a web page, we know that the protocol layers being used are as shown below. That is, HTTP is the application layer protocol used to fetch URLs. Like many Internet applications, it runs on top of the TCP transport layer, which itself runs on top of the IP network layer. IP runs on top of some link/physical layer protocols, depending on the physical network. These are typically combined by Wireshark and displayed as Ethernet, if the trace was captured on a wired interface, or 802.11, if it was captured on a wireless interface.
With the HTTP GET packet selected, you can examine the protocol header for each layer using the middle panel. Expand a layer by clicking on the expander icon next to it (a + or an arrow, depending on your version of Wireshark) to see the details of the information it provides.
Figure 2: Protocol stack for a web fetch
Figure 3: Inspecting an HTTP 200 OK response
Draw a figure of the HTTP GET packet (packet 4 in the trace) that shows the position and size in bytes of the HTTP, TCP, IP and Ethernet protocol headers. Your figure can simply show the overall packet as a long, thin rectangle. Leftmost elements are the first sent on the wire. On this drawing, show for each protocol layer the byte range containing the protocol header. If the topmost layer has data, it will be contained in the final segment of the packet. In that case, show its byte range as well. (So, your diagram partitions the bytes of the packets into many protocol headers and possibly one data segment.)
To work out sizes, observe that when you select a protocol block in the middle panel by clicking on it, Wireshark highlights the bytes it corresponds to in the packet in the lower panel and displays their length at the bottom of the window.
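Once you have read the length of each header from Wireshark, turning those lengths into byte ranges for the drawing is just a running sum. The following Python sketch illustrates the bookkeeping. The function name `header_ranges` is hypothetical, and the lengths in the example are made-up placeholders, not values taken from the trace; substitute the lengths Wireshark reports for packet 4.

```python
def header_ranges(layers):
    """Return (name, first_byte, last_byte) for each layer, in wire order.

    `layers` is a list of (name, length_in_bytes) pairs, outermost header
    first. Byte offsets are 0-based and inclusive.
    """
    ranges = []
    offset = 0
    for name, length in layers:
        ranges.append((name, offset, offset + length - 1))
        offset += length
    return ranges


# Placeholder lengths for illustration only; read the real ones off Wireshark.
example_layers = [("Ethernet", 10), ("IP", 24), ("TCP", 32), ("HTTP", 100)]

for name, first, last in header_ranges(example_layers):
    print(f"{name:10s} bytes {first}-{last} ({last - first + 1} bytes)")
```

The last byte of the final range plus one should equal the total frame length that Wireshark shows for the packet, which is a quick way to check your figure.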
Estimate the download protocol overhead, that is, the percentage of the download bytes taken up by protocol headers rather than content. To do this, consider HTTP data (headers and message) to be useful data for the network to carry, and lower layer headers (TCP, IP, and Ethernet) to be the overhead. We would like this overhead to be small, so that most bits are used to carry content that applications care about.
To work this out, first look at only the packets in the download direction for a single web fetch. (The GET travels upstream. The other direction is downstream.) The packets should start with a short TCP packet described as a SYN ACK, which is the beginning of a connection. It will be followed by mostly longer packets in the middle (of roughly 1 to 1.5 KB), of which the last one is an HTTP packet. This is the main portion of the download. The sequence will likely end with a short TCP packet that is part of ending the connection. For each packet, you can inspect how much overhead it has in the form of Ethernet / IP / TCP headers, and how much useful HTTP data it carries in the TCP payload. You may also look at the HTTP packet in Wireshark to learn how much data is in the TCP payloads over all download packets.
Q3: Estimate the download protocol overhead for the entire HTTP response, as defined above.
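As a rough guide to the arithmetic, the overhead is the number of non-HTTP bytes divided by the total number of bytes, summed over all downstream packets. The Python sketch below shows one way to compute it. The function name `overhead_percent` and the packet sizes in the example are assumptions made up for illustration; they are not measurements from the trace.

```python
def overhead_percent(packets):
    """Percentage of downstream bytes that are Ethernet/IP/TCP headers.

    `packets` is a sequence of (frame_length, http_bytes) pairs, one per
    downstream packet: the total frame size and the HTTP bytes it carries
    in its TCP payload (0 for packets such as the SYN ACK).
    """
    packets = list(packets)
    total = sum(frame for frame, _ in packets)
    useful = sum(http for _, http in packets)
    if total == 0:
        raise ValueError("no bytes to measure")
    return 100.0 * (total - useful) / total


# Made-up sizes for illustration: a short opening packet, two data packets,
# and a short closing packet.
example = [(60, 0), (1000, 900), (1000, 900), (60, 0)]
print(f"overhead: {overhead_percent(example):.1f}%")
```

Note that short packets with no HTTP payload count entirely as overhead, so they should be included in the total.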
When an Ethernet frame arrives at a computer, the Ethernet layer must hand the packet it contains to the next higher layer to be processed. There can be many "next higher layers" installed on any particular system, and the act of finding the right one to hand any particular incoming packet to is called demultiplexing. We know that in our case the higher layer is IP. But how does the Ethernet protocol know this? We have the same issue at the IP layer: IP must be able to determine that the contents of an IP packet is a TCP segment so that it can hand it to the TCP protocol to process. The answer is that protocols have fields in their headers indicating what the next higher level protocol is. These fields, called demultiplexing keys, are filled in by the protocol layer on the sender side and are read by the protocol layer on the receiving side, since the path up through the layers on the receiver should be the same as the path down through the layers on the sender.
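The mechanism can be sketched as a table lookup at each layer. The Python example below uses a toy stack in which every header is a single byte holding the demultiplexing key. The header format, the key values (0xB2 and 0xA1), and all function names are invented for illustration; they are deliberately not the real Ethernet or IP fields and values, which the questions below ask you to find in the trace.

```python
def demultiplex(unit, handlers):
    """Strip a one-byte toy header and dispatch the payload on its key."""
    if not unit:
        raise ValueError("empty unit")
    key, payload = unit[0], unit[1:]
    try:
        handler = handlers[key]
    except KeyError:
        raise LookupError(f"no protocol registered for key {key:#04x}") from None
    return handler(payload)


def transport_receive(segment):
    # Top of the toy stack: report what arrived.
    return f"transport layer got {segment!r}"


def network_receive(packet):
    # The network layer reads its own key to pick the transport protocol.
    return demultiplex(packet, NETWORK_HANDLERS)


# Toy key values, chosen arbitrarily for this sketch.
NETWORK_HANDLERS = {0xA1: transport_receive}
LINK_HANDLERS = {0xB2: network_receive}

# The sender wrote the keys on the way down; the receiver reads them going up.
frame = bytes([0xB2, 0xA1]) + b"hello"
print(demultiplex(frame, LINK_HANDLERS))
```

An unknown key raises an error here; a real protocol stack would typically discard a packet whose key it does not recognize.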
Look at the Ethernet and IP headers of a download packet in detail to answer the following questions:
Q5: Which IP header field is the demultiplexing key indicating that the next higher layer is TCP? What value is used in this field to indicate TCP?