CSE 461: Introduction to Computer Communication Networks, Spring 2013

Project 1

Out: Monday, April 8
Due: Thursday, April 18 (midnight)
Turnin: Online
Teams: Groups of two or three



Assignment Overview

You'll write code to measure client-server throughput, latency, and error rate using both UDP and TCP. There are a number of goals for this activity:
  • To gain some programming experience using raw sockets for communication, on both the client and server sides.

  • To gain some experience with protocol headers and encapsulation.

  • To get some sense of TCP and UDP throughput, latency, and error rates.

  • To gain some appreciation of the tradeoff between performance and reliability.

  • To gain some appreciation of the need for message framing.

  • To verify that you're set up to work on the projects.

A major goal of this assignment is to verify that you have set up the infrastructure required to do future projects. This includes forming a team, figuring out what platform you will work on, and making sure you can deal with Eclipse. Experience says that this usually goes smoothly, but you should make sure you can build the project in Eclipse as soon as possible, as resolving difficulties with that can take some time.

The ping Application

ping is a program that measures the time required for a client to contact a server and receive a reply. We do that by misusing an echo server that is already built and is part of the source distribution.

Together, the echo client and server move an arbitrary string from the client to the server and back to the client, which then prints what it gets. ping does the same, except that (a) it always sends the null string as the payload, and (b) it measures and reports latency: how long it takes for the null string to make the round trip. To be more specific, we define round-trip latency to be the interval from just before creation of the client-side socket to just after the reply is fully received.

To save us some work, we'll simply re-use the echo server in building ping - our ping client sends the null string to the already implemented echo server. You'll need to build a new client, but that mostly involves making simple modifications to the existing echo client.

echo supports both UDP and TCP, and your ping client should as well. Because you're modifying the echo client, the details of the protocol's message representation are already implemented, but they're presented here for completeness.

UDP message formats

Both the client-server request message and the server-client response are single UDP packets. Each carries an echo header, followed by a payload. The single packet restriction limits the size of the payload that the echo application can handle when using UDP, but is irrelevant to ping, since it sends a null payload.

The client-server message consists entirely of the echo header, which is four bytes long. The header consists of the ASCII character values in "echo":

echo

If there were a non-null payload, it would follow the header.

The server-client response (which also carries a null payload) looks like this:

okay
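
As an illustration only, one UDP ping trial along the lines described above might be sketched as follows. The class and method names (UdpPingSketch, pingOnce, isOkayResponse) are hypothetical and are not part of the course infrastructure or the PingRaw.java skeleton; returning -1 on a timeout or an unexpected reply is likewise just an assumption made for this sketch.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; not the course's PingRaw implementation.
class UdpPingSketch {
    static final byte[] REQUEST_HEADER = "echo".getBytes(StandardCharsets.US_ASCII);
    static final byte[] RESPONSE_HEADER = "okay".getBytes(StandardCharsets.US_ASCII);

    // True if the first `length` bytes of buf are exactly the "okay" header
    // (a ping reply carries a null payload, so nothing may follow the header).
    static boolean isOkayResponse(byte[] buf, int length) {
        return length == RESPONSE_HEADER.length
                && Arrays.equals(Arrays.copyOf(buf, length), RESPONSE_HEADER);
    }

    // Runs one trial. Returns the round-trip time in nanoseconds, or -1 on a
    // timeout or an unexpected reply (an assumption made for this sketch).
    static long pingOnce(String host, int port, int timeoutMs) throws IOException {
        long start = System.nanoTime();            // just before socket creation
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMs);
            InetAddress addr = InetAddress.getByName(host);
            socket.send(new DatagramPacket(REQUEST_HEADER, REQUEST_HEADER.length, addr, port));

            byte[] buf = new byte[64];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            socket.receive(reply);                 // blocks until a packet arrives or the timeout fires
            long elapsed = System.nanoTime() - start;
            return isOkayResponse(reply.getData(), reply.getLength()) ? elapsed : -1;
        } catch (SocketTimeoutException e) {
            return -1;
        }
    }
}
```

The timer starts before the socket is created so that the measurement matches the definition of round-trip latency given earlier.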

TCP message formats

The message formats when using TCP are essentially the same. The client sends a four byte header, whose contents correspond to the ASCII character values in "echo", followed by a payload (which for ping is null). The server responds with a four byte header, whose contents correspond to the ASCII character values in "okay", followed by any bytes in the received payload. The client closes its connection when it receives back what it sent. The server closes when it detects end-of-file on the stream.
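
A comparable sketch for TCP is shown below, again with hypothetical names (TcpPingSketch, readFully, pingOnce) that do not come from the course code. The point it tries to show is that TCP delivers a byte stream, so the client loops until it has all four header bytes rather than assuming a single read() returns the whole reply.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; not the course's PingRaw implementation.
class TcpPingSketch {
    static final byte[] REQUEST_HEADER = "echo".getBytes(StandardCharsets.US_ASCII);
    static final byte[] RESPONSE_HEADER = "okay".getBytes(StandardCharsets.US_ASCII);

    // Reads until buf is full or end-of-file; returns the number of bytes read.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) break;                      // end-of-file before buf was filled
            total += n;
        }
        return total;
    }

    // Runs one trial on a fresh connection. Returns the round-trip time in
    // nanoseconds, or -1 on a timeout or an unexpected reply (an assumption).
    static long pingOnce(String host, int port, int timeoutMs) throws IOException {
        long start = System.nanoTime();            // just before socket creation
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write(REQUEST_HEADER);             // null payload: header only
            out.flush();

            byte[] reply = new byte[RESPONSE_HEADER.length];
            int got = readFully(socket.getInputStream(), reply);
            long elapsed = System.nanoTime() - start;
            boolean ok = got == reply.length && Arrays.equals(reply, RESPONSE_HEADER);
            return ok ? elapsed : -1;
        } catch (SocketTimeoutException e) {
            return -1;
        }
    }
}
```

Leaving the try-with-resources block closes the client's connection once the reply has been read, which is when the server sees end-of-file on its side.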

The dataxfer Application

dataxfer measures the server to client data transfer and error rates. You should write both the server and a console client for this application, supporting both UDP and TCP data transfers. Your code should run repeated trials of the transfer, and report mean results.

A data transfer error is a failure to correctly receive all of the expected data. We simply count the number of bytes received, and assume the results are correct if we receive the number we expect and incorrect otherwise. The error rate is simply the fraction of trials for which there was an error.

If there is no error, we define the transfer rate to be the amount of data transferred divided by the transfer time. The transfer time is the time from just before creation of the client-side socket until the final byte of data arrives at the client.
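
To make these definitions concrete, here is a small sketch that summarizes a set of trials. The class name TransferStats and its fields are hypothetical, and averaging the per-trial rates of the error-free trials is only one reasonable reading of "mean results"; the handout does not specify how the mean should be taken.

```java
// Hypothetical sketch of the error-rate and transfer-rate definitions.
class TransferStats {
    final double errorRate;            // fraction of trials with a byte-count mismatch
    final double meanBytesPerSecond;   // mean rate over error-free trials; NaN if there were none

    TransferStats(double errorRate, double meanBytesPerSecond) {
        this.errorRate = errorRate;
        this.meanBytesPerSecond = meanBytesPerSecond;
    }

    // bytesReceived[i] and elapsedNanos[i] describe trial i.
    static TransferStats summarize(long expectedBytes, long[] bytesReceived, long[] elapsedNanos) {
        if (bytesReceived.length == 0 || bytesReceived.length != elapsedNanos.length) {
            throw new IllegalArgumentException("need one byte count and one time per trial");
        }
        int errors = 0;
        double rateSum = 0.0;
        for (int i = 0; i < bytesReceived.length; i++) {
            if (bytesReceived[i] != expectedBytes) {
                errors++;              // wrong byte count: the trial counts as an error
            } else {
                // bytes per second = bytes / (nanoseconds / 1e9)
                rateSum += bytesReceived[i] * 1e9 / elapsedNanos[i];
            }
        }
        int successes = bytesReceived.length - errors;
        double mean = successes == 0 ? Double.NaN : rateSum / successes;
        return new TransferStats((double) errors / bytesReceived.length, mean);
    }
}
```

Only error-free trials contribute to the rate, since the transfer rate is defined only when there is no error.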

To see how transfer rate depends on the total amount of data sent, the server offers a few transfer sizes. In particular, for both UDP and TCP, the server allocates four consecutively numbered ports (e.g., 22000, 22001, 22002, and 22003). It returns 1000 bytes of data on the lowest numbered port, and a factor of 10 more bytes on each successively higher numbered one. (So it returns 1,000,000 bytes on the highest numbered port.)
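
The port-to-size rule can be written down in a few lines. The sketch below uses hypothetical names (DataXferPorts, transferSize); the base port is passed in as a parameter because 22000 is only the handout's example.

```java
// Hypothetical sketch of the port-to-transfer-size rule.
class DataXferPorts {
    static final int NUM_PORTS = 4;     // four consecutively numbered ports
    static final int BASE_SIZE = 1000;  // bytes returned on the lowest numbered port

    // Returns the number of bytes the server sends on `port`,
    // given the lowest numbered dataxfer port `basePort`.
    static int transferSize(int basePort, int port) {
        int offset = port - basePort;
        if (offset < 0 || offset >= NUM_PORTS) {
            throw new IllegalArgumentException("not a dataxfer port: " + port);
        }
        int size = BASE_SIZE;
        for (int i = 0; i < offset; i++) {
            size *= 10;                 // a factor of 10 more on each higher port
        }
        return size;
    }
}
```

With the example ports, DataXferPorts.transferSize(22000, 22002) evaluates to 100000.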

UDP message formats

The client sends a UDP packet containing just the dataxfer header:

xfer

The server responds with one or more UDP packets, each containing a four-byte dataxfer header followed by a data payload:

okay | payload
okay | payload
  ...
okay | payload (final packet)

Except for the last packet, the payload should be 1000 bytes. So, for instance, if a total of 3500 bytes will be sent, the server would respond with three packets containing 1000 payload bytes and a final packet with 500 payload bytes. (The transfer lengths offered by the four dataxfer ports are all multiples of 1000, but that won't be the case in later projects.)
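
The packetization rule above might be sketched as follows. UdpPacketizer and buildPackets are hypothetical names, the payload bytes are left as zeros because the handout does not say what the data should contain, and the sketch assumes a positive total length.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting a transfer into UDP response packets.
class UdpPacketizer {
    static final byte[] HEADER = "okay".getBytes(StandardCharsets.US_ASCII);
    static final int MAX_PAYLOAD = 1000;   // payload bytes per packet, except possibly the last

    // Builds the response packets for a transfer of totalBytes payload bytes.
    // Each packet is the 4-byte header followed by up to MAX_PAYLOAD bytes.
    static List<byte[]> buildPackets(int totalBytes) {
        List<byte[]> packets = new ArrayList<>();
        int remaining = totalBytes;
        while (remaining > 0) {
            int payloadLen = Math.min(MAX_PAYLOAD, remaining);
            byte[] packet = new byte[HEADER.length + payloadLen];
            System.arraycopy(HEADER, 0, packet, 0, HEADER.length);
            // Payload bytes stay zero here; real data content is an open choice.
            packets.add(packet);
            remaining -= payloadLen;
        }
        return packets;
    }
}
```

For the 3500-byte example, this produces three packets of 1004 bytes and a final packet of 504 bytes, counting the header.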

TCP message formats

The TCP server uses the same four port numbers for TCP sockets that it uses for UDP sockets, and transfers the same amount of data on each as in the UDP case. (There is no port number conflict because the sockets support different protocols.)

The TCP message formats are similar. The client writes four bytes to its outgoing stream:

xfer

The server responds with a header and then all of the data bytes:

okay | all data bytes

Your implementation should be capable of transferring any amount of data, and in particular amounts far larger than the size of main memory. That means the server cannot try to write all of the data to the output stream with one write() call, and that the client can't try to read it all with one read() call. Instead, each should read or write some fixed, maximum amount in a single operation, and use repeated operations to achieve the desired transfer size.

Note: The client interface asks you to assemble all received bytes into a single byte array, violating the intent of what was just said. This is an artificial feature intended solely to support debugging code that tries to verify that the transfer was successful. You should still read the incoming data in fixed maximum size chunks, though.

Note: The size of the chunks you read and write can significantly affect the transfer rates you achieve.
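
A minimal sketch of the chunked approach, assuming a positive chunk size, is shown below. ChunkedTransfer, writeChunked, and readCounted are hypothetical names, and the sketch deals only with the data bytes; sending and checking the four-byte header is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of fixed-size chunked writing and reading.
class ChunkedTransfer {
    // Server side: writes totalBytes bytes using one reusable chunk buffer,
    // so memory use does not grow with the transfer size.
    static void writeChunked(OutputStream out, long totalBytes, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long remaining = totalBytes;
        while (remaining > 0) {
            int n = (int) Math.min(chunk.length, remaining);
            out.write(chunk, 0, n);
            remaining -= n;
        }
        out.flush();
    }

    // Client side: reads up to expectedBytes bytes in chunks and returns how
    // many actually arrived. A result below expectedBytes means end-of-file
    // came early, which the caller would count as a transfer error.
    static long readCounted(InputStream in, long expectedBytes, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        while (total < expectedBytes) {
            int want = (int) Math.min(chunk.length, expectedBytes - total);
            int n = in.read(chunk, 0, want);   // may return fewer than `want` bytes
            if (n < 0) break;                  // end-of-file
            total += n;
        }
        return total;
    }
}
```

Because chunkSize is a parameter, the same code can be rerun with different values to see how chunk size affects the measured transfer rate.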

What Do I Do?

Finish the implementation of the following 3 files:

Source File          Interface File           Eclipse Project
PingRaw.java         PingInterface.java       ConsoleApps (edu.uw.cs.cse461.consoleapps.solution)
DataXferRaw.java     DataXferInterface.java   ConsoleApps (edu.uw.cs.cse461.consoleapps.solution)
DataXferRawService   None                     Net (edu.uw.cs.cse461.net.base.service)

The Interface File column indicates what promises are made specifically by each class. Your implementations need to conform to these interfaces to operate within the infrastructure.

Running Your Code

To do most anything, you need to be running two instances of the code. We call one the client and one the server. The client runs the echo, ping, and dataxfer applications. Communication starts at the client, which contacts the server. It gets tedious to type the IP address of the server into the client application while you're debugging, so we put that (and much other) information in a configuration file. Most client applications use the IP address defined by field net.server.ip from the configuration file.

We distribute two configuration files, client.config.ini and server.config.ini. They're set up with the assumption that you will run both the client and the service instance on the machine you're sitting at. That's the default configuration, and is recommended for this project because it tends to simplify a lot of non-networking issues involved in simultaneously debugging on two machines. If you would like to run distributed, it's simple enough: run the instances on separate machines, update the config file on the client to set net.server.ip to the IP address of the server, and comment out the line net.host.ip=localhost in the server's config file.

We are also providing a fully functional implementation of the projects, as a jar file. You can use it as one of your two instances while developing your code. It's much easier to build one side of the communication when the other side is working than to simultaneously debug both sides.

Testing Your Code

We have provided two test apps that you can use to exercise your client-side implementations of ping and dataxfer. These can be accessed from the main app list, or from a provided app called testdriver.

Our tests use your client-side implementations of ping and dataxfer to talk to a special service that we have written. So, in order to run these tests, you'll need to have both client and server instances of your code running, which means having at least one set of your code running outside of Eclipse. As an example setup, you might export a jar of your project (as described at the bottom of the software infrastructure page), and run this locally using the server.config.ini config file. At the same time, you'd run your code in Eclipse using client.config.ini, and run the test apps from there.

The tests are not meant to imply completeness, but you can use them as a sanity check.

What to Turn In

Turn-in is online. We'd like a single file containing the items listed next. The file should be in .tar.gz, .zip, or .jar format. The file name should list the UWNetIDs of the members of your team, following the format of this example name: mbforbes_bhora_arvind.tar.gz. (Substitute .zip or .jar for .tar.gz if you use either of those formats.)

You should include in that file:

  1. Your modified PingRaw.java, DataXferRaw.java, and DataXferRawService.java source files, and possibly a README.txt file.
    We will insert your source into a test harness (the solution code, essentially) and run it. For that to work, you should not have done anything unexpected, like changing class or package names, or introducing new source files. If you have done something unexpected, and can't undo it before turn-in, add a README.txt file that explains what you've done and what we're going to need to do to build our solution code with your code injected into it. Note that we will check for cheating against other students' submissions from this and previous quarters.

    If your code isn't fully working, please characterize what doesn't work in file README.txt.

  2. An answers.txt or answers.pdf file answering the following questions:

    1. Why might the UDP ping client time out?
      Suppose your UDP ping client experiences a timeout while waiting for a response. List all distinct possible causes of the timeout.

    2. What is an appropriate response to a timeout?
      Suppose your ping client is asked to make 100 trial UDP pings. The client code experiences a timeout on trial number 5, and no other timeouts.

      1. There's an argument that when the timeout occurs, the client code should give up, not perform the remaining 95 trials, and return "I don't know" as the estimated ping time. What is that argument?

      2. There's also an argument that the code should go ahead and perform the remaining 95 trials, returning the average of the 99 successful trials and an indication that one trial failed. (This response is interpreted as "When the ping is successful, the average latency is xx msec., but it's successful only Y% of the time," which isn't quite the original goal but is still something worth knowing.) Briefly explain that argument.

      3. There's no argument that you should just keep running trials until you have 100 completed without timeout, and then simply return the mean of those 100 trials. Briefly explain why not.

      4. Very briefly, what does your UDP ping code implementation do when a timeout occurs? (This is just a question of what you actually do. You don't have to have done "the best imaginable thing." In fact, the project discourages you from trying to do the best imaginable thing because it's much too complicated for this project.)

    3. The UDP ping protocol we asked you to implement doesn't actually work, in some abstract sense at least. Explain how it can fail.
      Hint: Consider the case where the caller's timeout is relatively short, say, about the same as the time it will take to receive the server's response in the normal case.

    4. In contrast, the TCP ping protocol does work, not because TCP is reliable and UDP isn't, but because our ping implementation using TCP creates a new connection for each ping attempt. Explain how to adapt that solution to the UDP case. (I.e., how could you change the UDP implementation so that it solved the problem in a manner analogous to the TCP implementation?)

    5. The solution developed in the preceding question is "heavyweight," in the sense of generating a lot of seemingly unnecessary overhead. Explain how to fix the UDP problem by changing the format of the echo header, but without modifying how you use UDP sockets.

Setup Details

There are two primary issues to set up: the general one of getting your build/test environment set up, and the specific code base used in this project.

Project 1 Software: Modifications and Execution

You may have to make some modest changes and additions to the distributed software. A separate page describes the software architecture, and how to run the software.

Downloads

The "downloads" are actually files in the CSE file system that you copy to your platform (not web downloads).

  1. The skeleton code is located at /cse/courses/cse461/13sp/461projects.jar. Un-jar it to produce an Eclipse workspace. (You'll need to follow the directions above to establish your build environment before the projects in the workspace will build without problems.)

  2. The solution files are located in directory /cse/courses/cse461/13sp/461solution/. Copy that entire directory to your machine. The directory contains the solution code for the first project (461solutionP1.jar), and two sample configuration files (client.config.ini, and server.config.ini).

    To run the solution, first look at and perhaps modify the configuration file (for instance, if you are talking to a server/client besides localhost). Then, while in the directory with the 461solutionP1.jar file, say

    $ java -jar 461solutionP1.jar -d .
