CSE/EE 461, Winter 1998 Project 2

CSE 461 Project 2: Routing and Congestion Control

Due March 2, 1998
Design reviews February 24-25

For project 2 we will delve a little deeper into the network, and look at routing and congestion control. The network simulator has been expanded so that the network connecting senders and receivers can be modeled as a switched network consisting of many links and routers. Each of these links will simulate a simple delivery model, and you will have to write the code for the routers to efficiently discover the network topology, route packets to their destinations, and handle congestion, load variation and link failures. You will also need to be able to handle multiple simultaneous conversations on the network, and for the first time these conversations will come and go.

New Failure test setup

Here is a version of the square config that should work for testing failures.

failure.cfg

New Congestion test setup

Here is a config file that will exhibit congestion problems (in the second phase between 120000 and 170000). The theoretical max should be 0.5 Mb/s for both phases (split between both converstations in the second phase).

dogbone.cfg

New Patches available

Some enhancements to the basic distribution are available. Only the changed files are provided, so you will have to copy them into your workspace if you want to use them.

Project 2 Patches.

Online distribution

The online distributions are here. Choose either the Windows or Unix version. You can use WinZip on the NT machines to unzip the Windows distribution.

Windows distribution: proj2-dist.zip
Unix distribution: proj2-dist.tar.gz

Part 1a: Routing

The first part of this project involves implementing a packet routing algorithm similar to that used by IP. Routing involves two parts:

Topology discovery
Routing packets

You may implement any of the topology discovery algorithms discussed in lecture, or one of your own creation. For this part you will need to modify the implementations of two classes:

Router implements the internal nodes in the network. The current implementation uses a repeater algorithm with no topology discovery.
RoutingNetworkEdge implements the edge nodes. There is exactly one client (either a sender or a receiver) associated with each RoutingNetworkEdge. Edge nodes differ from internal nodes in that they are aware of the host connected to them (which gives them some additional information to contribute to the topology discover algorithm), and in that they support two interfaces: that of a generic node to the rest of the network, and that of a complete network to the client. The existing implementation simply recognizes arriving packets destined for the host, and forwards all outgoing packets to the rest of the network.

Your routing algorithm should, after the topology is discovered, route packets along the best route, for any reasonable definition of best you choose.

Routing evaluation

To evaluate your routing algorithm, we will provide a few sample network topologies with simple conversations and the property that congestion is not an issue. You will need to achieve an total conversation bandwidth (the sum of the observed bandwidths of all conversations) of at least half the "theoretical maximum," which we will compute as the optimal bandwidth assuming each conversation takes the shortest route. Your algorithm will be given some initial time to stabilize the routing tables before the traffic starts.

Part 1b: Failure Management

Once you have routing working, we can introduce link failures. When a link fails, the two routers that used to connect to it will be immediately notified. Make sure that your algorithm can successfully deal with failures and still get packets through along the new optimal routes (assuming that the network is not partitioned by the failure).

Failure management evaluation

To evaluate your failure management, we will extend the routing tests so that various link failures are "scheduled" for different points in the test run, so that the topology progresses through a series of configurations. For each configuration we will compute the same "theoretical maximum" bandwidth, and you need to achieve at least half of this within some reasonable time after the failure occurs.

Part 2: Congestion Control

Congestion control is much more interesting. To deal with it, you will need to modify the sliding window algorithm to enable backoff. You can choose either TCP style end-to-end congestion control, or if you want you may experiment with a "pushback" algorithm of some flavor. You will need to be able to deal with conversations that come and go and recover available bandwidth, a typical situation might look like:

Long conversation 1 starts, uses all available bandwidth
Short conversation 2 starts, 1 and 2 share bandwidth more or less evenly.
Short conversation 2 ends, conversation 1 reverts to using all available bandwidth.

You do not need to implement truly fair congestion control, but your algorithm should not be unreasonably unfair. You can implement congestion control using either your solution to project 1 (which will have to be adapted to the new framework) or the sample solution (which will be set up so that the window size can be adjusted dynamically).

Congestion control evaluation

As before, we will provide network topologies, this time with the property that some links can be overloaded. The "theoretical maximum" will be computed assuming all conversations take the shortest path, and that all congested links are shared fairly. As before, you need to achieve at least half the theoretical value, again stabilizing within some reasonable amount of time from when the load situation changes.

Extra credit

All through this document we have used the word "theoretical maximum" in quotes. For extra credit, we will supply a few topologies where it is possible to substantially exceed the "theoretical maximum" bandwidth, either by routing away from congested links or by splitting traffic from a single conversation. Extra credit of some form will be available for those who can substantially exceed the "theoretical maximum" on these topologies.

Code

The code for this project is substantially larger than that for the last one. The simulator has been broken into several Java packages:

debug for the debugging message code
timer for the timer and associated stuff
link for the link-level parts
network for the network-level parts (which is where routing fits in)
reliability for the reliable-transport-layer parts

The other directory in the distribution is doc, which contains the javadoc-generated documentation for all the provided code. That documentation is also available online.

The code is available in two distributions: a zip file for Windows/Visual J++ and a tar file for Unix. The only differences are the CR/LF convention on the source files, and the workspace/makefile setup:

The Visual J++ version comes with a single workspace containing a project for each package/directory. Also set up in the workspace are the classes to run and the CLASSPATH modifications to find the other packages in the distribution.
The Unix version comes with a Gnu makefile and a README describing how to set up your environment and run the program.

The synchronization methodology has also been changed in an attempt to make managing the parallelism easier. All operations are now synchronized exclusively through the global timer, making it substantially harder to deadlock the system. More explanation will be provided in section.

The driver for this project reads the network topology and load configuration from a config file. Documentation and the provided configs are in the directory config. The simulator arguments are:

   java network.project2 [seed=num]         (default current time)
                         [config=filename]  (default "project2.cfg")
                         [debug=codes]

to set the random number seed (strongly recommended) and the config file to use.

Groups

Just as in project 1, this assignment should be done and handed in in groups of 2-3.

Design reviews

Design reviews will be held on or around February 24-25. Groups will need to sign up for a timeslot and all group members will need to attend. A few review slots may be available on the 19th, talk to Andy.

Turnin

Online turnin will be used as in project 1. The due date is March 2. The turnin program and server will be available shortly before the due date. You may use your slip days on this assignment if you wish.

Turnin is now available. Grab the file turnin.class, and turn in this project just like project 1:

Grab the file turnin.class and put it in your project directory.
Make sure all your .java files should be in this directory, and no extraneous .java files are there.
Make a file readme.txt in your project directory that (at the very least) gives your names, describes your algorithm, and tells us how we should run it to see it work. A sample file, which shows what I'm after, is here. Remember: we will be reading this file, and it will benefit you to tell us anything that might be helpful when grading your project. Show us that it works.
Also note: short and sweet should be the readme rule of thumb. Remember, you're also turning in all your code and I can run the program if I want to see all the output. The readme should be a roadmap to your project, and should show why you think it works and tell what you learned.
In your project directory, run the command (from the command line)
```
java turnin
```
(or jview turnin for Visual J++ systems). You will need to be on a machine which is connected to the internet. Be patient. The server is slow and single threaded.

The turnin program prompts you for the student ID for the "first" student in the group. I don't really care whose SID you use, but make sure that if you submit more than once you keep using the same SID to identify the group. As long as you use the same SID you can submit as many times as you like. All turnins are saved, but unless we have some reason to do otherwise we will only look at the last one.

What you get back from turnin is a short receipt listing files and sizes received, the turnin time and the number of slip days used for this assignment.

If you have any problems with turnin, be sure to mail Andy (acollins@cs.washington.edu, and/or post to the email list.