Project 3: Remote Procedure Call
Out: Tuesday, May 21
Due: Friday, June 7 (11:59 PM)
Turnin: Online
Teams: Yes
1. Assignment Overview
Remote procedure call (RPC) is a mechanism intended to greatly
simplify writing networked code, relative to using raw sockets. RPC
mimics the semantics of procedure call: generally, the caller invokes
a method of the callee; both arguments and return values are typed
data, with some form of type checking enforced by the RPC system;
exceptions may be thrown; and execution of the callee is synchronous
(meaning the caller waits until the method returns before continuing
to execute instructions). Different RPC systems have deviated from
these local procedure call semantics in various ways, but that's the
general idea.
For this project you build what might be considered a simple RPC
system. It supports all of the functionality just mentioned, except
for type checking, which is left to the application. (Importantly, true RPC
systems typically have some form of compiler support; our system
doesn't have any.)
We'll continue to use ping and data transfer as applications. We need
applications for debugging the RPC implementation, and these are about
the simplest applications imaginable. They also are just the tools we
need to determine whether the overheads of our RPC implementations are
trivial or significant.
2. Our RPC System Overview
Suppose some class Foo implements method bar, whose signature
is shown below, along with a typical local invocation:
// Original, local invocation code
public int bar(int x, int y, String who) throws Exception;
try {
...
int result = Foo.bar(10, 20, "pluto");
...
} catch (Exception e) {
...
}
To build our RPC system, we first impose a restriction that supported
methods take a single JSONObject as an argument and return a JSONObject
as a result. In many cases, it's simple to translate from an arbitrary
method's type to one that can be supported by our RPC. For the example above,
the translation is:
// Translated, local invocation code
public JSONObject bar(JSONObject args) throws Exception;
try {
...
JSONObject args = new JSONObject().put("x", 10).put("y",20).put("who","pluto");
JSONObject resultObj = Foo.bar(args);
int result = resultObj.getInt("retval");
...
} catch (Exception e) {
...
}
Here the field name "retval" for the returned value
was chosen by the implementor of bar(). The implementor
of the caller has to know this, perhaps by looking at bar()'s documentation.
Now we're ready to move bar() to a remote machine. When we do, the
invoking code looks like this:
// RPC invocation code
try {
...
JSONObject args = new JSONObject().put("x", 10).put("y",20).put("who","pluto");
JSONObject resultObj = RPCCall.invoke(serverIP, serverRPCPort,
"Fooservice", "bar", args);
int result = resultObj.getInt("retval");
...
} catch (Exception e) {
...
}
The caller now directly invokes our RPC infrastructure, asking it to carry the call to the remote
machine, and to return the result. The remote machine is named by an IP address and port
(the first two arguments to RPCCall.invoke()). To identify the procedure to invoke, we need
to name a remote body of code ("Fooservice") and a method for which it allows RPC
invocation ("bar"). (Both of those names are created by the remote code. They are just
strings, and so do not necessarily correspond to Java class or method names.)
Finally, we need to pass the arguments.
RPCCall.invoke() then causes messages to be sent to the RPC infrastructure
running on the callee's machine, which performs a local invocation of the bar procedure,
collects its return value, and sends it back.
RPCCall.invoke() then returns to the caller, handing back the return value
it obtained from the remote system.
During this process, an error might occur and an exception be thrown.
That could happen in the code of the remote procedure
(e.g., it has been passed invalid arguments)
or in RPC infrastructure code (e.g., RPCCall.invoke() can't
connect to the remote side, or the remote side can't find a method with the specified name).
In these cases, RPCCall.invoke() returns to the caller by throwing
an exception (rather than by returning a value).
For this to work, the remote RPC infrastructure must have a way to return an exception
to RPCCall.invoke(), in addition to being able to send back return values.
3. RPC Protocol
3.1 RPC: Implementation Summary
- RPCs take place over a TCP connection.
- A typical RPC results in four messages:
- An initial RPC "handshake," consisting of one message in
each direction. This establishes the RPC channel.
- An invocation message sent from the caller to the callee, followed by a return message
from the callee to the caller.
- In the basic protocol, only a single RPC may be issued across an
established connection; both sides close the connection when they have
handled the first call.
- You will also implement a more advanced version of this protocol with
"persistent connections": once an RPC connection is
established, any number of RPCs may be issued across it.
The protocol specifies how persistent connections are negotiated and managed, and the solution server supports them.
Details are in another section.
- RPC messages are exchanged between the two RPC instances
(not between the calling and called application code).
In memory, an RPC message is a JSONObject. (Note that the RPC message JSONObject may itself
contain an application layer JSONObject representing, say, the call arguments, but the two
are entirely distinct.)
On the wire, an RPC message is a JSON-encoded string representation of
the JSONObject message. Those strings are sent using TCPMessageHandler
encoding.
3.2 The Control Handshake
The initial handshake is communication between the calling RPC service and the
remote RPC service. It is not part of application communication, i.e., not
application to application.
Connect Message
The handshake begins with the caller sending an RPC connect control message.
Connect messages look like this.
{
"id":2,
"host":"cse461",
"action":"connect",
"type":"control",
"options":{
"connection":"keep-alive"
}
}
All of this can be thought of as the RPC header:
- The host field identifies the sending RPC service. The protocol
doesn't rely on it, but it's useful for debugging. Setting it to the IP address
is a reasonable choice for this project.
- The id field is an ID. The combination of host and
id is a unique ID for this message.
(While RPC service implementations can simply count up by one to
generate these IDs, they are not sequence numbers; the recipient won't
necessarily see consecutive IDs.)
- The type field indicates this is a control message, i.e., the
intended recipient is the remote RPC service, not some application on the remote
machine.
- The action field indicates what kind of control message this is.
(There's only one, but we send this field to allow later extension of the protocol.)
- The options field is optional.
The one in the example contains a connection
field indicating that the caller would like the connection to be persistent.
A message may omit the options field entirely.
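As a rough illustration of the header fields just described, a connect message could be assembled along these lines. This is only a sketch: Map<String, Object> stands in for the project's JSONObject so the example is self-contained, and the class name ConnectMessageSketch and its connect() helper are hypothetical, not part of the distributed code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: Map<String, Object> stands in for JSONObject, and the
// class and method names here are hypothetical.
final class ConnectMessageSketch {
    // Counting up by one is enough to make (host, id) unique for one sender.
    private static final AtomicInteger NEXT_ID = new AtomicInteger(1);

    /** Builds a connect control message, optionally asking for keep-alive. */
    static Map<String, Object> connect(String host, boolean keepAlive) {
        Map<String, Object> msg = new LinkedHashMap<>();
        msg.put("id", NEXT_ID.getAndIncrement());
        msg.put("host", host);
        msg.put("action", "connect");
        msg.put("type", "control");
        if (keepAlive) {
            // The options field is left out entirely when nothing is requested.
            Map<String, Object> options = new LinkedHashMap<>();
            options.put("connection", "keep-alive");
            msg.put("options", options);
        }
        return msg;
    }
}
```

With a real JSONObject the same fields would be set with put(), as in the invocation examples of Section 2.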
Success Response Message
If the remote RPC service is willing to connect, it sends a success response.
(In our implementations, the remote service is always willing to connect.
In general, though, it could reject a connection if it thought it were already too busy,
or for any other reason.)
For our example, the response looks like:
{
"id":1,
"host":"",
"callid":2,
"type":"OK"
}
The type field indicates this is a success response.
The callid says what it is a response to - it's
the id field of the message being responded to.
The id and host fields have the same meaning as in
the connect message. (The name of this remote
host happens to be the empty string, a special name in Project 4.)
Error Response Message
An error response to the same call looks like this:
{
"id":1,
"host":"",
"callid":2,
"type":"ERROR",
"msg":"Max connections exceeded"
}
The msg field is a free form error message, intended to help the caller understand
what the problem is.
3.3 RPC Call
RPC Invocation
Suppose echo is a remote service (analogous to a local class)
and echo() a method it supports.
Here's how the remote call echo.echo("test message") is encoded:
{
"id":4,
"app":"echo",
"host":"cse461",
"args":{
"header":{
"tag":"ehco"
},
"payload":"test message"
},
"method":"echo",
"type":"invoke"
}
The type field indicates this is an
application-to-application invocation message. The app
field is the (unique) name of the remote application.
The method field indicates which of the remote
application's methods the caller wants to invoke.
The args field carries the JSONObject argument for that
invocation.
Success Response
A success response looks like this:
{
"id":3,
"host":"",
"callid":4,
"value":{
"header":{
"tag":"okay"
},
"payload":"test message"
},
"type":"OK"
}
The value field is the return value of the invocation.
Unless the remote procedure returns the equivalent of void,
the value field is always a JSONObject.
(In the example, the application is echo, which just
returns anything sent to it, so the return value matches the invocation argument.)
If the remote procedure has the equivalent of a void return,
it returns null when invoked by its RPC infrastructure, and that infrastructure
does not produce a value field in the returned message.
Error Response
An error response looks like this:
{
"message":"some error message",
"id":3,
"host":"",
"callid":4,
"type":"ERROR",
"callargs":{
"id":6,
"app":"echo",
"host":"cse461",
"args":{
"header":{
"tag":"ehco"
},
"payload":"test message"
},
"method":"echo",
"type":"invoke"
}
}
The new field here is callargs, which simply sends back the invocation arguments. It can be useful in debugging
the caller.
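On the caller side, the two response types map naturally onto "return a value" and "throw an exception". One way this might look is sketched here, under stated assumptions: Map<String, Object> stands in for JSONObject, and the names RpcResponseSketch, unwrap() and RemoteCallException are hypothetical. The sketch uses an unchecked exception for brevity, whereas the project's invoke() may well throw a checked one.

```java
import java.util.Map;

// Sketch only: Map<String, Object> stands in for JSONObject; all names
// in this class are hypothetical.
final class RpcResponseSketch {
    /** Hypothetical exception used to hand a remote failure to the caller. */
    static final class RemoteCallException extends RuntimeException {
        RemoteCallException(String message) {
            super(message);
        }
    }

    /** Returns the "value" of an OK response, or throws for anything else. */
    static Object unwrap(int sentId, Map<String, Object> response) {
        // callid must echo the id of the message being answered.
        Object callid = response.get("callid");
        if (!(callid instanceof Integer) || (Integer) callid != sentId) {
            throw new RemoteCallException("Response does not match call id " + sentId);
        }
        Object type = response.get("type");
        if ("OK".equals(type)) {
            return response.get("value"); // absent (null) for a void-like method
        }
        if ("ERROR".equals(type)) {
            // The examples above use both "msg" and "message"; accept either.
            Object text = response.containsKey("message")
                    ? response.get("message")
                    : response.get("msg");
            throw new RemoteCallException("Remote error: " + text);
        }
        throw new RemoteCallException("Unexpected response type: " + type);
    }
}
```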
4. Persistent Connections
A persistent connection is a TCP connection that can be used to make more than one call.
In our system, there may be only a single call in progress at a time on a TCP connection. We
do not pipeline or multiplex multiple calls over a connection; we simply cache TCP connections
in case another request is made to the same host in the near future.
If the caller would like to establish a persistent connection with the remote system, its
initial handshake message looks like this:
{
"id":1,
"host":"default.uw12au.cse461",
"action":"connect",
"type":"control",
"options":{
"connection":"keep-alive"
}
}
If the server is willing to engage in persistent connections, it responds something
like this:
{
"id":2,
"host":"default.uw12au.cse461",
"callid":1,
"value":{
"connection":"keep-alive"
},
"type":"OK"
}
If the caller doesn't receive a keep-alive response, the connection is non-persistent: both sides
should close it after a single RPC call.
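The rule above (treat the connection as persistent only if the handshake response explicitly says keep-alive) might be checked as in the following sketch. It assumes Map<String, Object> as a stand-in for JSONObject; KeepAliveSketch and isPersistent() are hypothetical names.

```java
import java.util.Map;

// Sketch only: Map<String, Object> stands in for JSONObject; the names
// are hypothetical.
final class KeepAliveSketch {
    /** True only if the handshake response is OK and grants keep-alive. */
    static boolean isPersistent(Map<String, Object> handshakeResponse) {
        if (!"OK".equals(handshakeResponse.get("type"))) {
            return false;
        }
        Object value = handshakeResponse.get("value");
        if (!(value instanceof Map)) {
            return false; // a plain OK with no value means non-persistent
        }
        return "keep-alive".equals(((Map<?, ?>) value).get("connection"));
    }
}
```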
Implementing persistent connections is a bit tricky. You must keep a cache of persisted connections.
The cache is cleaned by connections timing out when they haven't been used for too long.
There are races to worry about in maintaining the cache, and a race between the two ends of the connection
about when the connection is closed. Finally, it can be quite hard to determine that the other end
of a TCP connection has closed it. In Java, you have to read from the connection, which isn't a suitable
mechanism if all you want to do is determine if the connection is still open (since you'll potentially consume some
data that should be part of a call).
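One possible shape for such a cache is sketched here; it is an illustration under assumptions, not the solution's design. The class name IdleCacheSketch, the generic connection type C (in the project this would be whatever wraps the socket), and the choice to pass the current time in as a parameter are all hypothetical. Taking a connection removes it from the cache, so two threads cannot issue calls on the same connection at once, and expired connections are handed back to the caller to be closed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Sketch only: a cache of idle connections keyed by "host:port".
final class IdleCacheSketch<C> {
    private static final class Entry<C> {
        final C connection;
        final long lastUsedMillis;

        Entry(C connection, long lastUsedMillis) {
            this.connection = connection;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    private final Map<String, Entry<C>> idle = new HashMap<>();
    private final long maxIdleMillis;

    IdleCacheSketch(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    /** Removes and returns a fresh cached connection, or null if there is none. */
    synchronized C take(String hostAndPort, long nowMillis) {
        Entry<C> entry = idle.get(hostAndPort);
        if (entry == null || nowMillis - entry.lastUsedMillis > maxIdleMillis) {
            return null; // expired entries stay put until evictExpired() collects them
        }
        idle.remove(hostAndPort);
        return entry.connection;
    }

    /** Caches a connection after a call; returns any displaced one to be closed. */
    synchronized C put(String hostAndPort, C connection, long nowMillis) {
        Entry<C> displaced = idle.put(hostAndPort, new Entry<>(connection, nowMillis));
        return displaced == null ? null : displaced.connection;
    }

    /** Removes connections idle for too long and returns them for closing. */
    synchronized List<C> evictExpired(long nowMillis) {
        List<C> expired = new ArrayList<>();
        Iterator<Entry<C>> it = idle.values().iterator();
        while (it.hasNext()) {
            Entry<C> entry = it.next();
            if (nowMillis - entry.lastUsedMillis > maxIdleMillis) {
                expired.add(entry.connection);
                it.remove();
            }
        }
        return expired;
    }
}
```

A timer thread could call evictExpired(System.currentTimeMillis()) periodically and close whatever it returns. The sketch does not address the harder problem noted above of detecting that the remote end has already closed a cached connection.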
5. RPC Implementation
The RPC implementation consists of two main classes: RPCService, which implements
the receiving side (accepting calls from remote systems), and RPCCall,
which implements sending remote calls. Both sides are NetLoadableServices,
and so must be listed in the config file to be loaded.
The simpler part of this is the caller side. RPCCall exposes a single, static method,
invoke(...). The method is static so that client code can be simply RPCCall.invoke(...),
rather than the tedious line required to look up the RPCCall service using NetBase.
Because its only interesting method is static, there is no interface file for it. However, RPCCall.java
contains the full implementation of the static invoke(...), as well as the signature of a private
_invoke(...) method that is the actual implementation. An example use of RPCCall
is provided in file EchoRPC.java.
The RPCService implementation is more complicated. There is an interface file for it,
containing really only one method, registerHandler(). That method allows client code running
on the same machine to
expose some of its methods through RPC: the client "registers" itself and a set of its methods with the
RPCService.
When the RPCService receives an incoming invocation naming the client and one of its methods,
it performs a callback to the client's code. A full implementation of EchoRPCService is provided
as an example. It is client code of the RPCService. It registers its name ("echo") and one
method ("echo"). Remote code can effect an invocation by contacting the
RPCService on the target machine and specifying application echo and method echo.
RPCService basically demuxes incoming messages, sending them on to client code that has previously
registered itself. It sits on a single TCP port waiting for connections. All RPC calls to that machine are
directed to the port the RPCService is listening to.
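The demultiplexing step can be pictured as a two-level lookup from app name and method name to a callback. The sketch below is a hedged illustration only: it does not reproduce the signature of the real registerHandler(), Map<String, Object> stands in for JSONObject, and HandlerRegistrySketch, Handler, register() and dispatch() are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: not the project's RPCService interface; all names are hypothetical.
final class HandlerRegistrySketch {
    /** Callback supplied by client code for one exported method. */
    interface Handler {
        Map<String, Object> call(Map<String, Object> callArgs);
    }

    // app name -> (method name -> handler)
    private final Map<String, Map<String, Handler>> handlers = new HashMap<>();

    /** Called by local client code to expose one of its methods. */
    synchronized void register(String app, String method, Handler handler) {
        handlers.computeIfAbsent(app, k -> new HashMap<>()).put(method, handler);
    }

    /** Called for each incoming invoke message to run the matching callback. */
    Map<String, Object> dispatch(String app, String method, Map<String, Object> callArgs) {
        Handler handler;
        synchronized (this) {
            Map<String, Handler> methods = handlers.get(app);
            handler = (methods == null) ? null : methods.get(method);
        }
        if (handler == null) {
            // The service would turn this into an ERROR response message.
            throw new IllegalArgumentException("No handler for " + app + "." + method);
        }
        // Run the callback outside the lock so slow handlers don't block registration.
        return handler.call(callArgs);
    }
}
```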
6. DataXfer: Payload as a Single Reply
It's easy to convert ping to use RPC: you send an RPC to the remote echo service and
wait for a response. Implementing dataxfer is trickier, because of a mismatch between
its communication pattern and that of RPC: dataxfer is one message from client to server
followed by many messages back. RPC is one message in each direction.
Just implement dataxfer so that the client sends a length as an argument and the server returns that many bytes in a single reply. (In future projects we would have worked around this issue to send large amounts of data; can you imagine how you might do this?)
7. Ping and DataXfer Protocols
Having introduced RPC, there are now two interfaces of interest:
the Java interfaces that we're used to, and the RPC interfaces.
The Java interfaces describe the methods available to other Java code running in the same JVM.
The RPC interfaces describe the methods available to any code, running anywhere.
Ping
Client
The ping client should implement the Java interface given in file PingInterface.java.
Because PingRPC uses EchoRPCService as the service it talks to, it expects the 'header' JSONObject passed into PingRPC.ping(...) to look like:
{
"tag":"echo"
}
Note that the testing code will pass in a different header.
The 'args' JSONObject that PingRPC will send to RPCCall.invoke(...) looks like:
{
"header":<headerJSONObject>,
"payload":""
}
where <headerJSONObject> should be replaced with the header JSONObject passed into ping(...).
Service
Ping invokes EchoRPCService, whose full implementation was distributed.
EchoRPCService has RPC app name "echorpc". It exports one method, echo().
echo() accepts a JSONObject argument containing an arbitrary number of fields of arbitrary
names, each holding a String. It expects that there will be one field, "header", which will contain a JSONObject, which must in turn contain the field "tag" mapped to "echo". That is, the minimum required argument to echo() should contain:
{
"header":{
"tag":"echo"
}
}
echo() returns whatever JSONObject argument is passed to it, modifying the "tag" field in the "header" object to say "okay" rather than "echo". For example, a response to a normal ping request would look like:
{
"header":{
"tag":"okay"
},
"payload":""
}
Note that the response given by the test service will be different.
DataXfer
Client
The data xfer client should implement the Java interface given in file DataXferInterface.java.
The format of the JSONObjects used in this protocol is theoretically flexible, as long as the transfer length is sent and data is returned. To be compatible with the solution code, DataXferRPC expects the 'header' JSONObject passed into DataXferRPC.DataXferRate(...) and DataXferRPC.DataXfer(...) to look like:
{
"tag":"xfer",
"xferLength":<xferLen>
}
Where <xferLen> is an int that is the amount to transfer.
Note that the testing code will pass in a different header.
The 'args' JSONObject that DataXferRPC will send to RPCCall.invoke(...) looks like:
{
"header":<headerJSONObject>
}
where <headerJSONObject> should be replaced with the object passed into DataXferRate(...) and DataXfer(...).
The response the client should get back from the DataXferRPCService is detailed below in the service description.
Note that the response for the test app will be different.
To convert the returned String value back to a byte[], the client must invoke Base64.decode().
Service
The data transfer service should be implemented by class edu.uw.cs.cse461.service.DataXferRPCService.
Its RPC app name is "dataxferrpc".
It exports a single method, dataxfer(), that, in the abstract, returns a byte[].
In practice, we have a bit of extra data packed in the response value as part of our protocol. The response returned to the client should look like:
{
"header":{
"tag":"okay",
"xferLength": <xferLen>
},
"data": "Base64.encode(byte[])"
}
Thus, given what the client sends and what the service returns, we can construct the signature of the data transfer RPC method as:
argument:
{"header": {"tag": "xfer", "xferLength": int } }
method name:
dataxferrpc.dataxfer
return value:
{"header": { "tag": "okay", "xferLength": int}, "data": "Base64.encode(byte[])" }
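To make the signature concrete, the body of such a method might be shaped like the sketch below. It is an illustration under assumptions rather than the solution: Map<String, Object> stands in for JSONObject, java.util.Base64 stands in for the Base64 class from the util project, the payload is simply zero-filled, and the names DataXferReplySketch and dataxfer() are hypothetical.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: Map stands in for JSONObject and java.util.Base64 for the
// course's Base64 class; the class name is hypothetical.
final class DataXferReplySketch {
    /** Builds the reply for {"header": {"tag": "xfer", "xferLength": n}}. */
    static Map<String, Object> dataxfer(Map<String, Object> callArgs) {
        Object headerObj = callArgs.get("header");
        if (!(headerObj instanceof Map)) {
            throw new IllegalArgumentException("Missing header");
        }
        Map<?, ?> header = (Map<?, ?>) headerObj;
        if (!"xfer".equals(header.get("tag"))) {
            throw new IllegalArgumentException("Unexpected tag: " + header.get("tag"));
        }
        Object lengthObj = header.get("xferLength");
        if (!(lengthObj instanceof Integer)) {
            throw new IllegalArgumentException("Missing xferLength");
        }
        int xferLength = (Integer) lengthObj;
        if (xferLength < 0) {
            throw new IllegalArgumentException("Negative xferLength");
        }

        byte[] data = new byte[xferLength]; // zero-filled payload for the sketch

        Map<String, Object> replyHeader = new LinkedHashMap<>();
        replyHeader.put("tag", "okay");
        replyHeader.put("xferLength", xferLength);
        Map<String, Object> reply = new LinkedHashMap<>();
        reply.put("header", replyHeader);
        // byte[] cannot be stored directly, so it travels as a Base64 string.
        reply.put("data", Base64.getEncoder().encodeToString(data));
        return reply;
    }
}
```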
More on Base64 encoding
JSONObjects don't support byte[] fields.
To deal with that, the data xfer service
base 64 encodes the byte array into a string with Base64.encode(byte[]).
(Note that this is done at the application layer - it's not a feature of the RPC system.)
Base 64 encoding converts a byte[] to a String, increasing the length of the data as it does so.
Therefore, if a client requests a 10,000 byte transfer,
the code will transfer a string of more than 10,000 characters. That's okay.
It's an overhead inherent in the design of our RPC system and its decision to use JSONObjects for
argument and return values. (Any application trying to transfer a byte[] using our RPC system
will pay this, or a similar, penalty.)
The Base64 class was included as part of the original source distribution, in the util project.
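To put a number on the overhead: standard padded Base64 emits 4 characters for every 3 bytes, rounded up, so a 10,000 byte transfer becomes 13,336 characters. The sketch below checks that arithmetic using java.util.Base64 from the JDK, which is an assumption made to keep the example self-contained; the Base64 class in the util project may differ in API and in details such as padding. Base64OverheadSketch and encodedLength() are hypothetical names.

```java
import java.util.Base64;

// Sketch only: uses the JDK's java.util.Base64, not the course's Base64 class.
final class Base64OverheadSketch {
    /** Length of the padded Base64 encoding of n bytes: 4 * ceil(n / 3). */
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[10_000];
        String encoded = Base64.getEncoder().encodeToString(payload);
        byte[] decoded = Base64.getDecoder().decode(encoded);

        System.out.println("payload bytes:      " + payload.length);
        System.out.println("encoded characters: " + encoded.length());
        System.out.println("predicted:          " + encodedLength(payload.length));
        System.out.println("round trip intact:  " + (decoded.length == payload.length));
    }
}
```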
8. What to Implement
- The caller and callee sides of RPC.
- Console versions of
ping and dataxfer clients.
- The
dataxfer service.
9. Testing
Note: You are not required to conform to the JSON 'args' and 'value' structures of the solution / testing code. However, to test against the solution and run the testers, you will need to.
Step 1: Update solution
Download 461solutionP3.jar from /cse/courses/cse461/13wi/461solution/461solutionP3.jar. Running the solution and choosing the 'version' app should produce "Version 1.1.5 3/3/2013".
Step 2: Update test jars
Download Tester.jar from /cse/courses/cse461/13wi/Tester.jar and replace the one in your Lib/ directory. Running java -jar Tester.jar should produce "Version 1.1.5 3/3/2013".
Step 3: Add new tests to client config file
In your client config file, add to the test.driver.console.apps entry the following classes:
- edu.uw.cs.cse461.consoleapps.grading.PingRPCTester
- edu.uw.cs.cse461.consoleapps.grading.DataXferRPCTester
Step 4: Ensure your 'args' and 'value' JSON Objects for PingRPC and DataXferRPC are formatted the way the solution code expects.
This has been detailed above in the 'ping' and 'dataXfer' client and server sections.
In theory, you can structure the 'args' objects in any way that both your client and server understand. But for compatibility with the solution code, you must conform to the format that they expect.
Step 5: Run against the solution jar
Try running your code as the client with the solution as the server, and vice versa. Run both the PingRPC and DataXferRPC methods.
Step 6: Run the tests
To exercise your client-side implementation, run PingRPCTester and DataXferRPCTester. (Note that this will only test your RPCCall invoke and client-side apps. For testing your server-side code, use the solution jar as a client, as in step 5).
10. What to Turn In
Submit a single file in the format of the previous projects.
Include in it all files you've changed or implemented.
To reduce the odds that you forget to include something, we'll remind you that you should have:
- PingRPC.java (consoleapps.solution)
- DataXferRPC.java (consoleapps.solution)
- DataXferRPCService.java (service)
- Your entire (net.rpc) package, including
- RPCCall.java
- RPCService.java
- Any additional classes you changed or modified. It's fine to just turn in the whole package.
- Any additional files you modified (such as the config files)
Please also include a very brief README file (.txt or .pdf) which describes your approach for persistent connections, along with any additional design decisions you want us to notice. If you modified and are turning in any other files, please list them here, too. This is not meant to take you much time, but to help us in looking through your project and what you've done.
We're not asking for a report, but make sure to
run pingrpc and dataxferrpc and compare
results with their raw and TCPMessageHandler counterparts.