|
|
|
|
Homework 3: Protocol Layers
Out: Wednesday April 24
Due: Wednesday May 1 (midnight)
Turnin: Online
Teams No, do this individually
Objectives
To learn to capture and analyze packets using wireshark.
To learn how protocols and layering are represented in packets.
Maximum Score: 50 points
Requirements
Wireshark: This lab uses the Wireshark
software tool to capture and examine a packet trace. A packet trace is a record
of traffic at a location on the network, as if a snapshot was taken of all the
bits that passed across a particular wire. The packet trace records a
timestamp for each packet, along with the bits that make up the packet, from
the lower-layer headers to the higher-layer contents. Wireshark runs on most
operating systems, including Windows, Mac and Linux. It provides a graphical UI
that shows the sequence of packets and the meaning of the bits when interpreted
as protocol headers and data. It color-codes packets by their type, and has
various ways to filter and analyze packets to let you investigate the behavior
of network protocols. Wireshark is widely used to troubleshoot networks. You
can download it from www.wireshark.org if it is not already installed on your
computer.
Running Wireshark requires root privileges. On your
own machine, that shouldn't be a problem. For use on attu and
the lab Linux workstations, there is a special installation of Wireshark at /usr/sbin/wireshark that doesn't
require root, but limits you to examining log files (rather than capturing live
data from the network). (Note: Do
NOT just issue the command 'wireshark' on CSE machines. That will launch a
version that will fail (on an insufficient privileges problem). Instead, give
the full pathname to the correct version, /usr/sbin/wireshark .)
Launch Wireshark from the
command line. Select "Open" on the screen that appears to open the
log file.
wget / curl: This lab uses wget
(Linux and Windows) and curl (Mac) to fetch web
resources. wget and curl
are command-line programs that let you fetch a URL. Unlike a web browser, which
fetches and executes entire pages, wget and curl
give you control over exactly which URLs you fetch and when you fetch them. Under
Linux, wget can be installed via your package
manager. Under Windows, wget is available as a binary;
look for download information on http://www.gnu.org/software/wget/. Under
Mac, curl comes installed with the OS. Both have many
options (try wget --help or curl
--help to see) but a URL can be fetched simply with wget
URL or curl URL .
Both curl and wget are installed on attu.
Trace File is here
Step 1(Optional): Capture a Trace
This part teaches you to
capture a network trace using wireshark. The graded assignment, however, will
ask you to use the given trace. Thus, you may skip this if you want.
We want this trace to look
at the protocol structure of packets. A simple Web fetch of a URL from a server
of your choice to your computer, which is the client, will serve as traffic.
Pick a URL and fetch it with wget
or curl. For example,
wget http://www.cs.washington.edu
or curl http://www.cs.washington.edu.
This will fetch the resource and either write it to a file (wget) or to the screen (curl). You are checking to see that
the fetch works and retrieves some content. A successful example is shown below
(with added highlighting) for wget.
You want a single response with status code 200 OK. If the fetch does
not work then try a different URL; if no URLs seem to work then debug your use
of wget/curl or your Internet connectivity.
Figure 1: Using wget
to fetch a URL
Close
unnecessary browser tabs and windows. By minimizing browser activity you
will stop your computer from fetching unnecessary web content, and avoid
incidental traffic in the trace.
3. Launch
Wireshark and start a capture with a filter of tcp
port 80 and check enable network name resolution. This
filter will record only standard web traffic and not other kinds of packets
that your computer may send. The checking will translate the addresses of the
computers sending and receiving packets into names, which should help you to
recognize whether the packets are going to or from your computer. Your capture
window should be similar to the one pictured below, other than our highlighting.
Select the interface from which to capture as the main wired or wireless
interface used by your computer to connect to the Internet. If unsure, guess
and revisit this step later if your capture is not successful. Uncheck capture
packets in promiscuous mode. This mode is useful to overhear packets sent
to/from other computers on broadcast networks. We only want to record packets
sent to/from your computer. Leave other options at their default values.
The capture filter, if present, is used to prevent the capture of other traffic
your computer may send or receive. On Wireshark 1.8, the capture filter box is
present directly on the options screen, but on Wireshark 1.9, you set a capture
filter by double-clicking on the interface.
Figure 2: Setting up the capture options
4. When
the capture is started, repeat the web fetch using wget/curl
above. This time, the packets will be recorded by Wireshark as the content
is transferred.
5. After
the fetch is successful, return to Wireshark and use the menus or buttons to
stop the trace. If you have succeeded, the upper Wireshark window will show
multiple packets, and most likely it will be full. How many packets are
captured will depend on the size of the web page, but there should be at least
8 packets in the trace, and typically 20-100, and many of these packets will be
colored green. An example is shown below. Congratulations, you have captured a
trace!
Figure 3: Packet trace of wget traffic
Step 2: Inspect the
Trace
Wireshark will let us select a packet (from the top panel)
and view its protocol layers, in terms of both header fields (in the middle
panel) and the bytes that make up the packet (in the bottom panel). In the
figure above, the first packet is selected (shown in blue). Note that we
are using "packet" as a general term here. Strictly speaking, a unit of information
at the link layer is called a frame. At the network layer it is called a
packet, at the transport layer a segment, and at the application layer a
message. Wireshark is gathering frames and presenting us with the
higher-layer packet, segment, and message structures it can recognize that are
carried within the frames. We will often use packet for convenience, as
each frame contains one packet and it is often the packet or higher-layer
details that are of interest.
Select a packet for which the Protocol column is HTTP
and the Info column says it is a GET. It is the packet that carries the web
(HTTP) request sent from your computer to the server. (You can click the column
headings to sort by that value, though it should not be difficult to find an
HTTP packet by inspection.) Lets have a closer look to see how the packet
structure reflects the protocols that are in use.
Since we are fetching a web page, we know that the protocol
layers being used are as shown below. That is, HTTP is the application layer
web protocol used to fetch URLs. Like many Internet applications, it runs on
top of the TCP/IP transport and network layer protocols. The link and physical
layer protocols depend on your network, but are typically combined in the form
of Ethernet (shown) if your computer is wired, or 802.11 (not shown) if your
computer is wireless.
Figure 4: Protocol stack for a web fetch
With the HTTP GET packet
selected, look closely to see the similarities and differences between it and
our protocol stack as described next. The protocol blocks are listed in the
middle panel. You can expand each block (by clicking on the + expander or
icon) to see its details.
The first Wireshark block is Frame. This is not a protocol, it
is a record that describes overall information about the packet, including when
it was captured and how many bits long it is.
The second block is Ethernet. This matches our diagram! Note
that you may have taken a trace on a computer using 802.11 yet still see an
Ethernet block instead of an 802.11 block. Why? It happens because we asked Wireshark
to capture traffic in Ethernet format on the capture options, so it converted
the real 802.11 header into a pseudo-Ethernet header.
Then come IP, TCP, and HTTP, which are just as we wanted. Note
that the order is from the bottom of the protocol stack upwards. This is
because as packets are passed down the stack, the header information of the
lower layer protocol is added to the front of the information from the higher
layer protocol, as in Fig. 1-15 of your text. That is, the lower layer
protocols come first in the packet on the wire.
Now find another HTTP packet,
the response from the server to your computer, and look at the structure of
this packet for the differences compared to the HTTP GET packet. This
packet should have 200 OK in the Info field, denoting a successful fetch. In
our trace, there are two extra blocks in the detail panel as seen in the next
figure.
The first extra block says [11 reassembled TCP segments ].
Details in your capture will vary, but this block is describing more than the
packet itself. Most likely, the web response was sent across the network as a
series of packets that were put together after they arrived at the computer.
The packet labeled HTTP is the last packet in the web response, and the block
lists packets that are joined together to obtain the complete web
response. Each of these packets is shown as having protocol TCP
even though the packets carry part of an HTTP response. Only the final packet
is shown as having protocol HTTP when the complete HTTP message may be understood,
and it lists the packets that are joined together to make the HTTP response.
The second extra block says Line-based text data . Details in
your capture will vary, but this block is describing the contents of the web
page that was fetched. In our case it is of type text/html, though it could
easily have been text/xml, image/jpeg, or many other types. As with the Frame
record, this is not a true protocol. Instead, it is a description of packet
contents that Wireshark is producing to help us understand the network traffic.
Figure 5: Inspecting a HTTP 200 OK response
Step 3: Packet
Structure
For the graded questions in the next parts, please use the trace that can be downloaded from here. To load a trace in wireshark, just File->Open it.
To show your understanding of packet structure, draw a
figure of an HTTP GET packet that shows the position and size in bytes of the
TCP, IP and Ethernet protocol headers. Your figure can simply show the
overall packet as a long, thin rectangle. Leftmost elements are the first sent
on the wire. On this drawing, show the range of the Ethernet header and the
Ethernet payload that IP passed to Ethernet to send over the network. To show
the nesting structure of protocol layers, note the range of the IP header and its payload as well as the layers within.
To work out sizes, observe that when you click on a protocol
block in the middle panel (the block itself, not the + expander) then Wireshark
will highlight the bytes it corresponds to in the packet in the lower panel and
display the length at the bottom of the window. For instance, clicking on the
IP version 4 header of a packet in our trace shows us that the length is 20
bytes. (Your trace will be different if it is IPv6, and may be different even
with IPv4 depending on various options.) You may also use the overall packet
size shown in the Length column or Frame detail block.
[10 points] Hand in your packet drawing.
Step 4: Protocol Overhead
Estimate the download protocol overhead, or percentage of
the download bytes taken up by protocol overhead. To do this, consider HTTP
data (headers and message) to be useful data for the network to carry, and
lower layer headers (TCP, IP, and Ethernet) to be the overhead. We would
like this overhead to be small, so that most bits are used to carry content
that applications care about. To work this out, first look at only the packets
in the download direction for a single web fetch. You might sort on the Destination
column to find them. The packets should start with a short TCP packet described
as a SYN ACK, which is the beginning of a connection. They will be followed by
mostly longer packets in the middle (of roughly 1 to 1.5KB), of which the last
one is an HTTP packet. This is the main portion of the download. And they will
likely end with a short TCP packet that is part of ending the connection. For
each packet, you can inspect how much overhead it has in the form of Ethernet /
IP / TCP headers, and how much useful HTTP data it carries in the TCP payload.
You may also look at the HTTP packet in Wireshark to learn how much data is in
the TCP payloads over all download packets.
[5 points] Estimate the download protocol
overhead on packet 7 in the given trace.
[10 points] Estimate the download
protocol overhead for the entire HTTP data, as defined above. Tell us whether
you find this overhead to be significant.
Step 5: Demultiplexing Keys
When an Ethernet frame arrives at a computer, the Ethernet
layer must hand the packet that it contains to the next higher layer to be
processed. The act of finding the right higher layer to process received
packets is called demultiplexing. We know that in our case the higher layer is
IP. But how does the Ethernet protocol know this? After all, the higher-layer
could have been another protocol entirely (such as ARP). We have the same issue
at the IP layer – IP must be able to determine that the contents of IP
message is a TCP packet so that it can hand it to the TCP protocol to process. The
answer is that protocols use information in their header known as a
demultiplexing key to determine the higher layer.
Look at the Ethernet and IP
headers of a download packet in detail to answer the following questions:
[5 points] Which Ethernet header field is
the demultiplexing key that tells it the next higher layer is IP What value is
used in this field to indicate IP?
[5 points] Which IP header field is the
demultiplexing key that tells it the next higher layer is TCP? What value is
used in this field to indicate TCP?
More Questions
[5 points] Look at a short TCP packet
that carries no higher-layer data. To what entity is this packet destined?
After all, if it carries no higher-layer data then it does not seem very useful
to a higher layer protocol such as HTTP!
[5 points] In the classic layered model
described above, lower layers append headers to the messages passed down from
higher layers. How will this model change if a lower layer adds encryption?
[5 points] In the classic layered model
described above, lower layers append headers to the messages passed down from
higher layers. How will this model change if a lower layer adds compression?
|