HW3 -- Packet Analysis

Version 1.0. 1/25/06. 11.11am.
Out: Wed 1/25/06
Due Back: Fri 2/3/06 11.59 PM

Background

In class, we've been talking about the behavior of sliding window. TCP is a protocol that uses sliding window aggressively. And, as you know from the first two assignments, tcpdump is a program that lets us monitor network activity at the packet level.

For this assignment, you will be analyzing some packet traces produced by tcpdump. For some of the questions, you may need to write a program. You may use any programming language you like, including shell scripts, or existing Unix tools. For others, you can probably just eyeball the output from tcpdump. To be clear, this is not a programming assignment, but you may use your programming skills to help you complete it.

A few details about tcpdump and tcp

For this assignment, you can use the standard tcpdump program, rather than tcpdump461. This is because you are working on a pre-existing trace file, so there are no privacy policy issues involved in using the program. Use tcpdump -r tracefile to look at packets in file tracefile.

There are two things to keep in mind about the output of tcpdump. The first is that some options are generic to all protocols and have to do with how information is formatted as it is printed. These are set by command-line flags. Here are some that you might find useful.

The second is that tcpdump includes a powerful expression-based filtering language that can be used to extract only certain packets from a trace. For example, to extract only those packets sent from www.cs.washington.edu, one would say tcpdump -r tracefile src host www.cs.washington.edu

TCP is bi-directional, meaning that any given packet can carry both data intended for the peer and an acknowledgement for data sent from the peer. For many of the questions, you will want to look at data sent in only one direction.
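If you end up post-processing the text that tcpdump prints rather than using its filter language, here is a minimal Python sketch of picking out one direction. It assumes each line has the "IP src.port > dst.port:" layout of the example line shown on this page; the function name sent_from and the second sample line are made up for illustration.

```python
import re

# Matches the "IP src.port > dst.port:" part of a tcpdump text line.
ENDPOINTS = re.compile(r"\bIP (\S+) > (\S+):")

def sent_from(lines, host):
    """Return only the lines whose source host (port stripped) equals host."""
    kept = []
    for line in lines:
        m = ENDPOINTS.search(line)
        if not m:
            continue  # not an IP line in the assumed layout
        src_host = m.group(1).rsplit(".", 1)[0]  # drop the trailing ".port"
        if src_host == host:
            kept.append(line)
    return kept

sample = [
    # Example line from this page.
    "20:32:35.437157 IP www.cs.washington.edu.http > 10.0.1.4.60258: P 1:1184(1183) ack 202 win 1716",
    # Hypothetical reply in the other direction, invented for this sketch.
    "20:32:35.437300 IP 10.0.1.4.60258 > www.cs.washington.edu.http: . ack 1184 win 65535",
]
print(len(sent_from(sample, "www.cs.washington.edu")))  # 1
```

The tcpdump filter expression does the same job with less work; the sketch is only useful if you are already processing the text output in a script.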

TCP

A TCP sequence number refers to a byte, not a packet. Hence, any given packet is described by a pair of sequence numbers: the first and the last. For example, 1:1184 are the start and end in the following packet:
20:32:35.437157 IP www.cs.washington.edu.http > 10.0.1.4.60258: P 1:1184(1183) ack 202 win 1716 

The difference between the two (1184 - 1 = 1183, the number shown in parentheses) is the size in bytes of the TCP portion of the packet. Questions that refer to packet size refer to just this portion.
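If you decide to pull these numbers out with a program, the following Python sketch shows one way to do it. It assumes the data field always looks like first:last(nbytes), as in the example line above; the names SEQ_FIELD and parse_seq are my own and not part of any tool.

```python
import re

# Matches the "first:last(nbytes)" field, e.g. "1:1184(1183)".
SEQ_FIELD = re.compile(r"\b(\d+):(\d+)\((\d+)\)")

def parse_seq(line):
    """Return (first, last, nbytes) from a tcpdump line, or None if there is no such field."""
    m = SEQ_FIELD.search(line)
    if m is None:
        return None
    first, last, nbytes = (int(g) for g in m.groups())
    return first, last, nbytes

line = ("20:32:35.437157 IP www.cs.washington.edu.http > 10.0.1.4.60258: "
        "P 1:1184(1183) ack 202 win 1716")
first, last, nbytes = parse_seq(line)
print(first, last, last - first, nbytes)  # 1 1184 1183 1183
```

Note that the timestamp also contains colons, but it is never followed by a parenthesized number, so the pattern skips it.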

A packet may also include an acknowledgement, where the acknowledgement contains a sequence number that describes the largest byte received such that all bytes of smaller sequence number have also been received. In the above example, the ack is for sequence number 202, which is the sequence number of the last in-order byte received. So, for example, if three packets arrive, in this order, with sequence numbers [1,100], [200,215], [216,220], then the TCP receiver does not generate an ack having a sequence number in excess of 100 until it receives the data packet that begins at sequence number 101. It may however generate repeated acks for sequence number 100 as a "signal" to the sender that something is quite literally "out of sorts" with the packets that have been received.
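The rule in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, using this page's convention that the ack names the last in-order byte and that byte ranges are inclusive; the function name cumulative_ack and the filler segment (101, 199) are assumptions made for the example.

```python
def cumulative_ack(segments, start=1):
    """Largest byte number such that every byte from start up to it has arrived.

    segments: iterable of inclusive (first, last) byte ranges, in any order.
    Returns start - 1 if the byte numbered start is still missing.
    """
    acked = start - 1
    for first, last in sorted(segments):
        if first > acked + 1:
            break  # gap: bytes acked+1 .. first-1 have not arrived yet
        acked = max(acked, last)
    return acked

received = [(1, 100), (200, 215), (216, 220)]
print(cumulative_ack(received))                 # 100
print(cumulative_ack(received + [(101, 199)]))  # 220
```

Once the missing range arrives, the ack jumps straight to 220, because the later packets were already being held by the receiver.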

By default, tcpdump displays the true sequence number for the first packet in a conversation, and then reverts to a delta for subsequent packets. This makes the trace easier to read. You can use the -S option to change this behavior.

Here's the output of running tcpdump -r smalltrace.raw. I recommend that you open it in a new window and STRETCH it in order to look at it. You won't need to know what all the fields are. The most important ones are the timestamp, source and destination hosts, sequence number, and ack (with sequence number).

The Assignment

There are two parts to this assignment. The first part involves answering a few questions about a very short packet trace representing a brief conversation between my home computer and a computer somewhere on the internet. The entire conversation fits on half a screen, and you should be able to answer all the questions in the first part by running tcpdump on the raw packet trace directly.

The second part involves analyzing a much longer packet trace, again between my home computer and a server on the internet. This file is too large for you to "eyeball" the results, so you will need to process it somehow.

The way in which you answer each question is just as important as your answer. That means that you should answer each question by providing the information being asked for, as well as an explanation of how you produced the answer. Your explanation should make clear both your strategy and your mechanics. Strategy says "what to look for." Mechanics says "how to look for it."

For example, suppose the question was "A pure-ack packet is a packet that carries no data. It contains only an ack. How many pure-ack packets in the trace file were sent to www.cs.washington.edu?" A good answer would be:

Alternatively, you could write a program that computed this directly in any programming language you like.

Or, even more alternatively, you could use a combination of simple tools to extract portions of the data and put it in an easy-to-use format, and then write a program that produces the final result. The choice is yours. You will not be graded on the basis of your choice of mechanics, but some choices will be much more time-consuming than others. You will find this assignment easiest if you first think about each question carefully in order to convince yourself that you can answer it easily without a lot of mechanics. None of the questions require much.
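To give a feel for the "write a program" route on the pure-ack example, here is a rough Python sketch. It rests on two assumptions that you should check against the real trace: that every line has the "IP src.port > dst.port:" layout of the example line on this page, and that a pure ack is printed with an ack field but no first:last(nbytes) field. The function name and the two sample lines sent by 10.0.1.4 are invented for illustration.

```python
import re

ENDPOINTS = re.compile(r"\bIP (\S+) > (\S+):")   # "IP src.port > dst.port:"
SEQ_FIELD = re.compile(r"\b\d+:\d+\(\d+\)")      # "first:last(nbytes)"

def count_pure_acks(lines, dst_host):
    """Count lines sent to dst_host that have an ack but no sequence-range field."""
    count = 0
    for line in lines:
        m = ENDPOINTS.search(line)
        if not m:
            continue  # not in the assumed layout
        if m.group(2).rsplit(".", 1)[0] != dst_host:
            continue  # wrong direction
        if " ack " in line and SEQ_FIELD.search(line) is None:
            count += 1
    return count

sample = [
    # Example line from this page (data sent by the server).
    "20:32:35.437157 IP www.cs.washington.edu.http > 10.0.1.4.60258: P 1:1184(1183) ack 202 win 1716",
    # Hypothetical lines, invented for this sketch.
    "20:32:35.437300 IP 10.0.1.4.60258 > www.cs.washington.edu.http: . ack 1184 win 65535",
    "20:32:35.440000 IP 10.0.1.4.60258 > www.cs.washington.edu.http: P 202:300(98) ack 1184 win 65535",
]
print(count_pure_acks(sample, "www.cs.washington.edu"))  # 1
```

Whatever tool you use, your write-up still needs to state the strategy (what makes a packet a pure ack) separately from the mechanics (how you matched those lines).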

The Traces

You can get both the short and long trace files here.

Part A: Short Trace File

These questions can be answered with the data in smalltrace.raw.

Questions

These questions shouldn't require that you do anything more than look at the trace file. In some cases, you may need to use tcpdump to look at the raw trace file. In others, you can get the answer from the text dump above. For all questions, your solution should follow the strategy/mechanics/answer format.

Part B: The Big Trace File

Now that you've gotten your feet wet, it's time to tackle a larger trace (bigtrace.raw) and some more difficult questions. This file shows my network activity while I was fetching a pdf file.

Questions

In addition to using the strategy/mechanics/answer format, in a few cases, I ask you to also discuss the meaning of the answer. Please be brief and thoughtful.

What to turn in

You should turnin a directory called HW3 that contains an html file called "index.html." Turnin instructions will be posted before the due date. Please use this template. The name of the game for the html is SIMPLE SIMPLE SIMPLE. Look at the source for this page to get an idea of what I mean by simple. I use fewer than half a dozen html features, and while it doesn't look great, it does the job.

If you need to include any out-of-line figures (pdf or gif only), shell scripts, or program source code, do so by including them as links in your html file. If you want to include them in line in your web page, know how, and they don't distract too much from the flow, go ahead. But, don't worry if you don't know how.

Finally, make sure your web page works before you turn it in. We're not going to be able to debug your html.


Last modified: Wed Jan 25 11:22:32 PST 2006