In this project you'll write code that interacts with a registration service. The function of the service is to keep track of nodes that participate in some distributed application, so that participants can find each other. When a node comes up, it announces its presence by registering with the service, providing it with an IP address and port at which it can be contacted. Just before a node goes down, it should unregister. Any node can ask the registration service for a list of currently registered nodes. The registration service is therefore a discovery mechanism -- a way for nodes to find each other. Like many things in distributed systems, the information it provides should be considered a hint, since it can be incorrect.
This sounds simple enough, but there are quite a few details required for it to work robustly. Those are described below.
The code in this project is intended to be a utility for the ultimate project, Tor61. Tor61 is an overlay network whose goal is to enable anonymous use of the Internet. Like the Internet, Tor61 implements routing. The routers it uses need to find each other. We'll use the registration infrastructure from this project to achieve that.
You'll be implementing this project as a standalone application, but you should keep in mind that you'll want to be including its functionality basically as a library in Tor61. It's worth being careful about modularity now to avoid having to rewrite project 1 as part of doing project 3.
Note: The code you build for this assignment can be packaged for re-use as a library. In most cases, that requires that your future projects be built using the same language as this library. Because registration is not performance critical, you can sidestep the permanent marriage to a language by packaging registration as a process. When you later write the Tor61 router implementation, your router could invoke a registration process, handing it data like IP and port as (basically) command line arguments. Personally, I like this architecture, as it stretches you a bit more than you most likely have been in the past. (The fact that it lets you change your mind about implementation language between projects is probably of only minor value, but understanding that you can package code this way is definitely useful.)
Communication involving the registration service is always "request-response": one side sends a message to the other, the other responds, and that's the end of it. The possible exchanges are these:
Register registers an instance. The server returns a Registered response.
Unregister removes a registration. The server responds with an ACK.
Fetch requests information about registrations. The server responds with a FetchResponse message that contains a few randomly chosen registrations.
Probe is simply a request for an ACK response. It is used to test if the other end is still there.
Probe may be initiated by the client or by the server. All other exchanges are initiated only by the client.
Note: We have purposefully omitted any kind of error response, in an attempt to contain the work this project requires. If the server detects an error, it simply doesn't respond. That may turn out to be a protocol design mistake though -- time will tell (and, by the end of the quarter, you can be the judge).
Here's a picture of the architecture we're headed to:
"Some service" wants to register the port its socket is bound to (q) with the registration service. In this project you simply make up some port number to register, since we're not actually implementing a client service at this time.
With that small issue ignored, the rest of the figure is accurate, even in this project. The registration agent itself has a distinct pair of datagram sockets that it uses when communicating with the registration service. The sockets are bound to consecutive ports (p and p+1, for some arbitrary p). One socket is used when the agent wants to initiate a request-response exchange with the registration service. The other is there for the registration service to use if it wants to initiate an exchange. Having separate sockets avoids confusions like: (a) the agent sends a Register to the registration service, and expects to receive a Registered message back; but (b) at the same time, the registration service sends a Probe to the agent. If only a single port were used at the agent for both purposes, the agent would read a Probe when it expected a Registered, and get confused. (It's possible to make the one port approach work, but it's clumsy.)
The registration service requires the ports to which the agent's two sockets are bound be consecutive because the agent messages never explicitly give the agent's own port numbers. Instead, the registration service learns the sending port number (p) because it's in the UDP header of the messages it receives from the agent. It deduces the port number it should send Probes to (p+1) because of the protocol agreement that it should be one higher than the agent's sending port number.
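Getting two sockets onto consecutive ports takes a little care, because the port after the one you were given may already be in use. Here is a minimal sketch in Python of one way to do it: let the OS pick p, try to bind p+1, and start over if that fails. The function name, the `attempts` parameter, and the retry strategy are illustrative assumptions, not something the protocol dictates.

```python
import socket

def bind_consecutive_udp(host="", attempts=50):
    """Return (request_sock, probe_sock, p), bound to ports p and p+1.

    host="" binds on all interfaces; pass a specific address if you prefer.
    """
    for _ in range(attempts):
        request_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        probe_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            request_sock.bind((host, 0))            # port 0: the OS picks p
            p = request_sock.getsockname()[1]
            probe_sock.bind((host, p + 1))          # fails if p+1 is taken
            return request_sock, probe_sock, p
        except (OSError, OverflowError):
            # OSError: p+1 already in use. OverflowError: p was 65535.
            request_sock.close()
            probe_sock.close()
    raise OSError("could not bind two consecutive UDP ports")
```

The agent would send its requests from `request_sock` and listen for server-initiated Probes on `probe_sock`.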
Summarizing the basic operation, Register:
All registration messages begin with a common header consisting of the two-byte value 0xC461, a one-byte unsigned integer sequence number, and a one-byte message type. All multi-byte numeric values are in network order (big endian). Sequence numbers from a single sender form an increasing sequence, except that:
As well as providing the IP address and port at which the service can be contacted, the service provides two additional pieces of information. The "service data" is an arbitrary 32-bit data item the service wants to publish. It is communicated as an unsigned 32-bit integer in network byte order. The "service name" is a variable length string describing the service. The service name len field gives the length of the service name as an unsigned, 8-bit integer.
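To make the byte-level encoding concrete, here is a small Python sketch that packs the common header (0xC461, sequence number, message type) and the service data / service name fields in network byte order. Treat it as an illustration only: the helper names are made up, the message type is passed in as a plain number because this sketch does not define the type codes, and the order of the service fields and the ASCII encoding of the name are assumptions. The message layouts in this spec are the authority.

```python
import struct

MAGIC = 0xC461
HEADER = struct.Struct("!HBB")   # "!" = network order: magic, seq, type

def pack_header(seq, msg_type):
    # struct raises struct.error if seq or msg_type does not fit in one byte.
    return HEADER.pack(MAGIC, seq, msg_type)

def parse_header(packet):
    """Return (seq, msg_type), or raise ValueError on a malformed header."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than the 4-byte header")
    magic, seq, msg_type = HEADER.unpack_from(packet)
    if magic != MAGIC:
        raise ValueError("bad magic value 0x%04X" % magic)
    return seq, msg_type

def pack_service_fields(service_data, service_name):
    # Assumed order: 32-bit service data, 8-bit name length, then the name.
    name = service_name.encode("ascii")   # assumption: names are ASCII
    if len(name) > 255:
        raise ValueError("service name does not fit an 8-bit length field")
    return struct.pack("!IB", service_data, len(name)) + name

def unpack_service_fields(buf):
    service_data, name_len = struct.unpack_from("!IB", buf)
    name = buf[5:5 + name_len]
    if len(name) != name_len:
        raise ValueError("service name is truncated")
    return service_data, name.decode("ascii")
```

For example, `pack_header(5, 1)` yields the four bytes `C4 61 05 01`; the type value 1 there is just a placeholder.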
The lifetime field is an unsigned integer value indicating how long the registration service will keep the data it received in the Register message before deleting it. If the client service wants to maintain its registration, it must re-register before the lifetime expires. The lifetime is given in seconds. Note: all spec-compliant implementations must automatically re-register active services before they expire without the user needing to do this manually! Services that have been manually unregistered should not be re-registered.
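One way to arrange automatic re-registration is a timer that fires some time before the lifetime runs out. The Python sketch below assumes a hypothetical `register` callable that performs the Register exchange and returns the lifetime (in seconds) granted by the service. The class name, the `margin` parameter, and the choice to renew at half the lifetime are illustrative, not part of the protocol. It does not handle a `register` call that fails; a real agent would need to decide what to do then.

```python
import threading

class AutoRenewer:
    """Re-register a service before its lifetime expires, until stop()."""

    def __init__(self, register, margin=0.5, timer_factory=threading.Timer):
        self._register = register            # callable -> lifetime in seconds
        self._margin = margin                # renew after margin * lifetime
        self._timer_factory = timer_factory  # replaceable for testing
        self._timer = None
        self._active = False
        self._lock = threading.Lock()

    def start(self):
        with self._lock:
            self._active = True
        self._renew()

    def stop(self):
        # Call this on a manual unregister so the service is not re-registered.
        with self._lock:
            self._active = False
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None

    def _renew(self):
        with self._lock:
            if not self._active:
                return
        lifetime = self._register()
        with self._lock:
            if not self._active:             # stop() ran during register()
                return
            self._timer = self._timer_factory(lifetime * self._margin,
                                              self._renew)
            self._timer.daemon = True        # don't keep the process alive
            self._timer.start()
```

Your unregister command would call `stop()` before sending Unregister, so that a pending timer cannot bring the registration back.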
The server tries to return all matching registrations, but imposes a maximum packet length for its response that may cause it to omit some registrations. The server's packet length limit ensures that the response fits in a single UDP packet, and, further, that there is at least a reasonable chance of delivery (based on Internet behavior observed in some simple experimentation).
Each returned entry looks like this:
Unregister expects an ACK response on success.
Service errors are things like responding with the wrong message type or a bad sequence number, or not at all (having crashed, for instance).
When network errors occur, you should try to overcome them. Service implementation errors will presumably just be repeated, so you can give up and report an error if you encounter one.
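The usual way to overcome a lost packet is to time out and resend. Here is a minimal Python sketch of one request-response exchange that does that, while giving up immediately on a reply that looks wrong. The function name, the `is_expected` callback (which you would write to check the header, sequence number, and message type), and the timeout and retry values are all assumptions made for illustration.

```python
import socket

class ProtocolError(Exception):
    """The service answered, but not with what the protocol calls for."""

def request_response(sock, server_addr, request, is_expected,
                     timeout=1.0, max_tries=3):
    """Send request and return the reply, resending on timeout."""
    sock.settimeout(timeout)
    for _ in range(max_tries):
        sock.sendto(request, server_addr)
        try:
            reply, _addr = sock.recvfrom(2048)
        except socket.timeout:
            continue                  # network error: assume loss, resend
        if is_expected(reply):
            return reply
        # Service error: retrying would presumably just repeat it.
        raise ProtocolError("unexpected reply: %r" % (reply,))
    raise TimeoutError("no reply after %d tries" % max_tries)
```

Note that a resend can produce two replies to one request. This sketch does not deal with the leftover reply, so a real client needs some way to recognize and discard stale responses.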
It's typically hard to debug the code intended to deal with network errors, because they occur infrequently. To help debug, the service instance deployed on cse461.cs.washington.edu (explained later) artificially drops some incoming packets. This is done probabilistically. The service simulates bursty errors. At any moment it is in one of two states, one with a low artificial drop rate and one with a very high rate. It transitions from one state to the other at random.
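If you want to exercise your retry code without depending on the deployed service, you can simulate the same kind of bursty loss locally. The Python sketch below is a two-state drop model in the spirit of the description above. It is not the server's actual code, and the class name, drop rates, and switching probability are made-up values for illustration.

```python
import random

class BurstyDropper:
    """Two-state packet drop model: a 'good' state and a 'bad' state."""

    def __init__(self, low_rate=0.05, high_rate=0.9, switch_prob=0.1,
                 rng=None):
        self.rates = (low_rate, high_rate)  # drop probability per state
        self.switch_prob = switch_prob      # chance of changing state per packet
        self.state = 0                      # 0 = low drop rate, 1 = high
        self.rng = rng or random.Random()

    def should_drop(self):
        # First decide whether to flip state, then whether to drop.
        if self.rng.random() < self.switch_prob:
            self.state = 1 - self.state
        return self.rng.random() < self.rates[self.state]
```

A test harness could call `should_drop()` for each packet it receives and silently discard the packet when the answer is true.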
$ ./run <registration service host name> <service port>
Your client should accept commands from stdin that cause it to send messages to the registration service identified by the command line arguments, to read the responses the service sends, and to print appropriate messages about the result of each interaction. You must support the following commands. You may also support additional command formats that you find useful. (For example, the sample solution often provides default arguments if required arguments are not provided.)
We aren't too picky about the format of your output, but you should try to make it clear to a TA reading it.
It can be useful when debugging your client code to have some idea what the server is receiving and sending, and what registrations it holds. For that reason, the registration service provides a web interface:
http://cse461.cs.washington.edu:8080/status
Displays the current registration data.
http://cse461.cs.washington.edu:8080/log
The server produces a log of activity, including incoming packets shown as hex strings and as parsed messages. This page shows the most recently written (up to) 250 lines of the log.