Project 3: Remote Procedure Call
Out: Thursday October 11
Due: Monday October 29 (11:59 PM)
Note: There is a midterm on Friday 10/26
Turnin: Online
Teams: Pairs
1. Assignment Overview
Remote procedure call (RPC) is a mechanism intended to greatly
simplify writing networked code, relative to using raw sockets. RPC
mimics the semantics of procedure call: generally, the caller invokes
a method of the callee; both arguments and return values are typed
data, with some form of type checking enforced by the RPC system;
exceptions may be thrown; and execution of the callee is synchronous
(meaning the caller waits until the method returns before continuing
to execute instructions). Different RPC systems have deviated from
these local procedure call semantics in various ways, but that's the
general idea.
For this project you build what might be considered a simple RPC
system. It supports all of the functionality just mentioned, except
for type checking, which is left to the application. (Importantly, true RPC
systems typically have some form of compiler support; our system
doesn't have any.)
We'll continue to use ping and data transfer as applications. We need
applications for debugging the RPC implementation, and these are about
the simplest applications imaginable. They also are just the tools we
need to determine whether the overheads of our RPC implementations are
trivial or significant.
2. Our RPC System Overview
Suppose some class Foo implements method bar, whose type
is shown below, along with a typical local invocation:
// Original, local invocation code
public int bar(int x, int y, String who) throws Exception;
try {
...
int result = Foo.bar(10, 20, "pluto");
...
} catch (Exception e) {
...
}
To build our RPC system, we first impose a restriction that supported
methods take a single JSONObject as an argument and return a JSONObject
as a result. In many cases, it's simple to translate from an arbitrary
method's type to one that can be supported by our RPC. For the example above,
the translation is:
// Translated, local invocation code
public JSONObject bar(JSONObject args) throws Exception;
try {
...
JSONObject args = new JSONObject().put("x", 10).put("y",20).put("who","pluto");
JSONObject resultObj = Foo.bar(args);
int result = resultObj.getInt("retval");
...
} catch (Exception e) {
...
}
Here the field name "retval" for the returned value
was chosen by the implementor of bar(). The implementor
of the caller has to know this, perhaps by looking at bar()'s documentation.
Now we're ready to move bar() to a remote machine. When we do, the
invoking code looks like this:
// RPC invocation code
try {
...
JSONObject args = new JSONObject().put("x", 10).put("y",20).put("who","pluto");
JSONObject resultObj = RPCCall.invoke(serverIP, serverRPCPort,
"Fooservice", "bar", args);
int result = resultObj.getInt("retval");
...
} catch (Exception e) {
...
}
The caller now directly invokes our RPC infrastructure, asking it to carry the call to the remote
machine, and to return the result. The remote machine is named by an IP address and port
(the first two arguments to RPCCall.invoke()). To identify the procedure to invoke, we need
to name a remote body of code ("Fooservice") and a method for which it allows RPC
invocation ("bar"). (Both of those names are created by the remote code. They are just
strings, and so do not necessarily correspond to Java class or method names.)
Finally, we need to pass the arguments.
RPCCall.invoke() then causes messages to be sent to the RPC infrastructure
running on the callee's machine, which performs a local invocation of the bar procedure,
collects its return value, and sends it back.
RPCCall.invoke() then returns to the caller, handing back the return value
it obtained from the remote system.
During this process, an error might occur and an exception be thrown.
That could happen in the code of the remote procedure
(e.g., it has been passed invalid arguments)
or in RPC infrastructure code (e.g., RPCCall.invoke() can't
connect to the remote side, or the remote side can't find a method with the specified name).
In these cases, RPCCall.invoke() returns to the caller by throwing
an exception (rather than by returning a value).
For this to work, the remote RPC infrastructure must have a way to return an exception
to RPCCall.invoke(), in addition to being able to send back return values.
3. RPC Implementation
3.1 RPC: Implementation Summary
- RPCs take place over a TCP connection.
- A typical RPC results in four messages:
- An initial RPC "handshake," consisting of one message in
each direction. This establishes the RPC channel.
- An invocation message sent from the caller to the callee, followed by a return message
from the callee to the caller.
- In the basic protocol, only a single RPC may be issued across an
established connection; both sides close the connection when they have
handled the first call. This is the only behavior required by this
project.
In theory, you could implement
"persistent connections": once an RPC connection is
established, any number of RPCs may be issued across it.
The protocol specifies how persistent connections are negotiated and managed, and the solution server supports them.
Details are in another section.
Implementing persistent connections is entirely optional.
- RPC messages are exchanged between the two RPC instances
(not between the calling and called application code).
In memory, an RPC message is a JSONObject. (Note that the RPC message JSONObject may itself
contain an application layer JSONObject representing, say, the call arguments, but the two
are entirely distinct.)
On the wire, an RPC message is a JSON encoded string representation of
the JSONObject message. Those strings are sent using TCPMessageHandler
encoding.
3.2 The Control Handshake
The initial handshake is communication between the calling RPC service and the
remote RPC service. It is not part of application communication, i.e., not
application to application.
Connect Message
The handshake begins with the caller sending an RPC connect control message.
Connect messages look like this.
{"id":2,"host":"cse461","action":"connect","type":"control",
"options":{"connection":"keep-alive"}}
You can think of all of this as the RPC header:
- The host field identifies the sending RPC service. The protocol
doesn't rely on it, but it's useful for debugging. Setting it to the IP address
is a reasonable choice for this project.
- The id field is an ID. The combination of host and
id is a unique ID for this message.
(While RPC service implementations can simply count up by one to
generate these IDs, they are not sequence numbers; the recipient won't
necessarily see consecutive IDs.)
- The type field indicates this is a control message, i.e., the
intended recipient is the remote RPC service, not some application on the remote
machine.
- The action field indicates what kind of control message this is.
(There's only one, but we send this field to allow later extension of the protocol.)
- The options field is optional.
The one in the example contains a connection
field indicating that the caller would like the connection to be persistent.
A message need not include an options field at all.
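As a rough illustration of the fields above, here is a minimal sketch that assembles a connect message as a string. The class and method names are hypothetical (they are not part of the project skeleton), and the string is built by hand only so that the sketch runs without a JSON library; with JSONObject you would build the same message with put() calls, as in the earlier examples.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not part of the project skeleton.
final class ConnectMessageSketch {
    // Counts up by one per message. Recipients must not treat these as
    // sequence numbers, since they may not see consecutive IDs.
    private static final AtomicInteger NEXT_ID = new AtomicInteger(1);

    // Builds the connect control message as a JSON string.
    // Assumes host contains no characters that need JSON escaping.
    static String connect(String host, boolean keepAlive) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"id\":").append(NEXT_ID.getAndIncrement())
          .append(",\"host\":\"").append(host).append('"')
          .append(",\"action\":\"connect\",\"type\":\"control\"");
        if (keepAlive) {
            // The options field is optional; it is omitted entirely otherwise.
            sb.append(",\"options\":{\"connection\":\"keep-alive\"}");
        }
        return sb.append('}').toString();
    }
}
```

Calling ConnectMessageSketch.connect("cse461", true) yields a message of the same shape as the example above; passing false leaves out the options field.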
Success Response Message
If the remote RPC service is willing to connect, it sends a success response.
(In our implementations, the remote service is always willing to connect.
In general, though, it could reject a connection if it thought it was already too busy,
or for any other reason.)
For our example, the response looks like:
{"id":1,"host":"","callid":2,"type":"OK"}
The type field indicates this is a success response.
The callid says what it is a response to - it's
the id field of the message being responded to.
The id and host fields have the same meaning as in
the connect message. (The name of this remote
host happens to be the null string, a special name in Project 4.)
Error Response Message
An error response to the same call looks like this:
{"id":1,"host":"","callid":2,"type":"ERROR","msg":"Max connections exceeded"}
The msg field is a free form error message, intended to help the caller understand
what the problem is.
3.3 RPC Call
RPC Invocation
Suppose echo is a remote service (analogous to a local class)
and echo() a method it supports.
Here's how the remote call echo.echo("test message") is encoded:
{"id":4,"app":"echo","host":"cse461","args":{"msg":"test message"},"method":"echo",
"type":"invoke"}
The type field indicates this is an
application-to-application invocation message. The app
field is the (unique) name of the remote application.
The method field indicates which of the remote
application's methods the caller wants to invoke.
The args field carries the JSONObject argument for that
invocation.
Success Response
A success response looks like this:
{"id":3,"host":"","callid":4,"value":{"msg":"test message"},
"type":"OK"}
The value field is the return value of the invocation.
Unless the remote procedure returns the equivalent of void,
the value field is always a JSONObject.
(In the example, the application is echo, which just
returns anything sent to it, so the return value matches the invocation argument.)
If the remote procedure has the equivalent of a void return,
it returns null when invoked by its RPC infrastructure, and that infrastructure
does not produce a value field in the returned message.
Error Response
An error response looks like this:
{"message":"some error message","id":3,"host":"","callid":4,"type":"ERROR",
"callargs":{"id":6,"app":"echo","host":"cse461","args":{"msg":"test message"},
"method":"echo","type":"invoke"}}
The new field here is callargs, which simply sends back the invocation message, including its arguments. It can be useful in debugging
the caller.
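On the caller side, the response formats above suggest a small decoding step: return the value field on success, and turn an error response into an exception. The sketch below is one way to do that under stated assumptions. The class and method names are hypothetical, and a plain Map stands in for the parsed JSONObject so the code runs without a JSON library.

```java
import java.util.Map;

// Hypothetical caller-side helper; a Map stands in for the parsed JSONObject.
final class ResponseSketch {
    // Returns the "value" field of an OK response (null for a void method),
    // or throws if the response is an ERROR or answers a different call.
    static Object unwrap(Map<String, Object> response, int sentId) throws Exception {
        Object callid = response.get("callid");
        if (!(callid instanceof Integer) || (Integer) callid != sentId) {
            throw new Exception("response does not match call id " + sentId);
        }
        Object type = response.get("type");
        if ("OK".equals(type)) {
            // The value field is absent when the remote method returns void.
            return response.get("value");
        }
        if ("ERROR".equals(type)) {
            // The handshake example uses "msg"; the invocation example uses "message".
            Object text = response.containsKey("message")
                    ? response.get("message") : response.get("msg");
            throw new Exception("remote error: " + text);
        }
        throw new Exception("unexpected response type: " + type);
    }
}
```

Checking callid against the id that was sent is a defensive choice in this sketch; with only one call in progress per connection, a mismatch would indicate a bug on one side.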
4. Persistent Connections (Optional)
A persistent connection is a TCP connection that can be used to make more than one call.
In our system, there may be only a single call in progress at a time on a TCP connection. We
do not pipeline or multiplex multiple calls over a connection; we simply cache TCP connections
in case another request is made to the same host in the near future.
If the caller would like to establish a persistent connection with the remote system, its
initial handshake message looks like this:
{"id":1,"host":"default.uw12au.cse461","action":"connect","type":"control",
"options":{"connection":"keep-alive"}}
If the server is willing to engage in persistent connections, it responds something
like this:
{"id":2,"host":"default.uw12au.cse461","callid":1,"value":{"connection":"keep-alive"},
"type":"OK"}
If the caller doesn't receive a keep-alive response, the connection is non-persistent: both sides
should close it after a single RPC call.
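The negotiation rule can be sketched as a single check on the handshake response. This is an illustration only: the class name is hypothetical, and a Map stands in for the parsed JSONObject so the code runs without a JSON library.

```java
import java.util.Map;

// Hypothetical helper; a Map stands in for the parsed JSONObject.
final class KeepAliveSketch {
    // True only if the handshake succeeded and the callee answered keep-alive.
    // Any other response means the connection is closed after one call.
    static boolean isPersistent(Map<String, Object> handshakeResponse) {
        if (!"OK".equals(handshakeResponse.get("type"))) {
            return false;
        }
        Object value = handshakeResponse.get("value");
        if (!(value instanceof Map)) {
            return false;   // no value field: the callee did not agree
        }
        return "keep-alive".equals(((Map<?, ?>) value).get("connection"));
    }
}
```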
Implementing persistent connections is a bit tricky. You must keep a cache of persisted connections.
The cache is cleaned by connections timing out when they haven't been used for too long.
There are races to worry about in maintaining the cache, and a race between the two ends of the connection
about when the connection is closed. Finally, it can be quite hard to determine that the other end
of a TCP connection has closed it. In Java, you have to read from the connection, which isn't a suitable
mechanism if all you want to do is determine if the connection is still open (since you'll potentially consume some
data that should be part of a call).
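To make the cache and its timeout concrete, here is a minimal sketch of one possible design. Everything in it is an assumption rather than the solution's approach: the class name, the "ip:port" string key, the choice to keep at most one idle connection per key, and passing the current time in as a parameter (which keeps the sketch easy to test; real code would likely use System.currentTimeMillis()). It does not address the race between the two ends over when the connection is closed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical cache of idle connections. C is the connection type,
// for example a wrapper around a Socket.
final class ConnectionCacheSketch<C> {
    private static final class Entry<C> {
        final C conn;
        final long lastUsedMs;
        Entry(C conn, long lastUsedMs) { this.conn = conn; this.lastUsedMs = lastUsedMs; }
    }

    private final Map<String, Entry<C>> idle = new HashMap<>();
    private final long timeoutMs;

    ConnectionCacheSketch(long timeoutMs) { this.timeoutMs = timeoutMs; }

    // Removes and returns a fresh cached connection, or null if there is none.
    // Removing it while in use keeps other threads from issuing a second call
    // on the same connection. Expired entries are left for evictExpired().
    synchronized C take(String key, long nowMs) {
        Entry<C> e = idle.get(key);
        if (e == null || nowMs - e.lastUsedMs > timeoutMs) return null;
        idle.remove(key);
        return e.conn;
    }

    // Caches a connection after a call completes. Returns the connection it
    // displaced (which the caller should close), or null if there was none.
    synchronized C put(String key, C conn, long nowMs) {
        Entry<C> old = idle.put(key, new Entry<>(conn, nowMs));
        return old == null ? null : old.conn;
    }

    // Removes entries idle for longer than the timeout and returns them
    // so that the caller can close them.
    synchronized List<C> evictExpired(long nowMs) {
        List<C> dropped = new ArrayList<>();
        Iterator<Entry<C>> it = idle.values().iterator();
        while (it.hasNext()) {
            Entry<C> e = it.next();
            if (nowMs - e.lastUsedMs > timeoutMs) {
                dropped.add(e.conn);
                it.remove();
            }
        }
        return dropped;
    }
}
```

All three methods are synchronized so that a lookup, an insertion, and a cleanup pass cannot interleave, which is one simple way to handle the cache races mentioned above.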
5. RPC Implementation
The RPC implementation consists of two main classes: RPCService, which implements
the receiving side (accepting calls from remote systems), and RPCCall,
which implements sending remote calls. Both sides are NetLoadableServices,
and so must be listed in the config file to be loaded.
The simpler part of this is the caller side. RPCCall exposes a single, static method,
invoke(...). The method is static so that client code can be simply RPCCall.invoke(...),
rather than the tedious line required to look up the RPCCall service using NetBase.
Because its only interesting method is static, there is no interface file for it. However, RPCCall.java
contains the full implementation of the static invoke(...), as well as the signature of a private
_invoke(...) method that is the actual implementation. An example use of RPCCall
is provided in file EchoRPC.java (a new download for this project).
The RPCService implementation is more complicated. There is an interface file for it,
containing really only one method, registerHandler(). That method allows client code running
on the same machine to
expose some of its methods through RPC: the client "registers" itself and a set of its methods with the
RPCService.
When the RPCService receives an incoming invocation naming the client and one of its methods,
it performs a callback to the client's code. A full implementation of EchoRPCService is provided
as an example. It is client code to the RPCService. It registers its name ("echo") and one
method ("echo"). Remote code can effect an invocation by contacting the
RPCService on the target machine and specifying application echo and method echo.
RPCService basically demuxes incoming messages, sending them on to client code that has previously
registered itself. It sits on a single TCP port waiting for connections. All RPC calls to that machine are
directed to the port the RPCService is listening to.
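The demultiplexing step can be sketched as a two-level lookup from app name and method name to a registered callback. This is an illustration under assumptions, not the project's API: the Handler interface and the registerHandler() signature shown here are hypothetical (the real interface file defines its own), a Map stands in for JSONObject so the code runs without a JSON library, and the reply omits the id and host fields for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of RPCService's demultiplexing; names are illustrative.
final class DemuxSketch {
    // Hypothetical callback type implemented by client code.
    interface Handler {
        Map<String, Object> handle(Map<String, Object> args) throws Exception;
    }

    // app name -> (method name -> handler)
    private final Map<String, Map<String, Handler>> apps = new HashMap<>();

    // Hypothetical signature; the project's registerHandler() may differ.
    synchronized void registerHandler(String app, String method, Handler h) {
        apps.computeIfAbsent(app, k -> new HashMap<>()).put(method, h);
    }

    private synchronized Handler lookup(Object app, Object method) {
        Map<String, Handler> methods = apps.get(app);
        return methods == null ? null : methods.get(method);
    }

    // Turns an invoke message into a reply: OK with an optional value,
    // or ERROR with a message and the original invocation in callargs.
    Map<String, Object> dispatch(Map<String, Object> invoke) {
        Map<String, Object> reply = new HashMap<>();
        reply.put("callid", invoke.get("id"));
        try {
            Handler h = lookup(invoke.get("app"), invoke.get("method"));
            if (h == null) throw new Exception("no such app or method");
            @SuppressWarnings("unchecked")
            Map<String, Object> args = (Map<String, Object>) invoke.get("args");
            Map<String, Object> value = h.handle(args);
            reply.put("type", "OK");
            if (value != null) reply.put("value", value);  // omitted for void
        } catch (Exception e) {
            // Exceptions from the handler or from the lookup become ERROR replies.
            reply.put("type", "ERROR");
            reply.put("message", String.valueOf(e.getMessage()));
            reply.put("callargs", invoke);
        }
        return reply;
    }
}
```

An echo service would register itself with something like registerHandler("echo", "echo", a -> a), after which an invoke message naming app echo and method echo produces an OK reply whose value equals its args.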
6. RPC and DataXfer
It's easy to convert ping to use RPC: you send an RPC to the remote echo service and
wait for a response. Implementing dataxfer is trickier, because of a mismatch between
its communication pattern and that of RPC: dataxfer is one message from client to server
followed by many messages back. RPC is one message in each direction.
We'll confront this issue more fully in a later project. For now, just implement dataxfer
so that the client sends a length as an argument and the server returns that many bytes in a single reply.
7. Ping and DataXfer Interfaces
Having introduced RPC, there are now two interfaces of interest:
the Java interfaces that we're used to, and the RPC interfaces.
The Java interfaces describe the methods available to other Java code running in the same JVM.
The RPC interfaces describe the methods available to any code, running anywhere.
Ping
Service
Ping invokes EchoRPCService, whose full implementation was distributed.
EchoRPCService has RPC app name "echorpc". It exports one method, echo().
echo() accepts a JSONObject argument containing an arbitrary number of fields of arbitrary
names, each holding a String. echo() returns whatever is passed to it.
Extending Java slightly, the signature of the echo service is something close to
{String ... args} echorpc.echo({String ... args});
The returned JSONObject is a copy of the argument JSONObject.
Client
The ping client should implement the Java interface given in file PingInterface.java.
DataXfer
Service
The data transfer service should be implemented by class edu.uw.cs.cse461.Net.RPC.DataXferRPCService.
Its RPC app name is "dataxferrpc".
It exports a single method, dataxfer(), that, in the abstract, returns a byte[].
In practice, JSONObjects don't support byte[] fields.
To deal with that, the data xfer service
base 64 encodes the byte array, resulting
in:
{data: Base64.encode(byte[])} dataxferrpc.dataxfer({xferLength: int});
(Note that this is done at the application layer - it's not a feature of the RPC system.)
Base 64 encoding converts a byte[] to a String, increasing the length of the data as it does so.
Therefore, if a client requests a 10,000 byte transfer,
the code will transfer a string of more than 10,000 characters. That's okay.
It's an overhead inherent in the design of our RPC system and its decision to use JSONObjects for
argument and return values. (Any application trying to transfer a byte[] using our RPC system
will pay this penalty, or a similar one.)
The Base64 class was included as part of the original source distribution, in the util project.
Client
The data xfer client should implement the Java interface given in file DataXferInterface.java.
To convert the returned String value back to a byte[], the client must invoke Base64.decode().
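The size overhead of Base 64 encoding is easy to see in a small round trip. The sketch below uses the standard java.util.Base64 class as a stand-in for the Base64 class in the util project, whose exact method signatures may differ; the class and method names in the sketch are hypothetical.

```java
import java.util.Base64;

// Illustration only: java.util.Base64 stands in for the course's Base64 class.
final class DataXferEncodingSketch {
    // Service side: turn the payload into a String that fits in a JSON field.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Client side: recover the original bytes from the returned String.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    // Padded Base 64 emits 4 characters for every 3 input bytes,
    // rounding the input up to a whole group of 3.
    static int encodedLength(int byteCount) {
        return 4 * ((byteCount + 2) / 3);
    }
}
```

For a 10,000 byte transfer, encodedLength(10000) is 13,336, so the reply carries about a third more characters than the requested byte count.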
8. What to Implement
- The caller and callee sides of RPC.
- Console versions of ping and dataxfer clients, and the
dataxfer service.
- Update Android ping to use RPC (in place of just TCPMessageHandler).
This is worth doing just so you can verify that your RPC implementation runs on the phones: there
can be small incompatibilities between Android and non-Android Java, and you want to deal with them
now rather than finding them in a later project.
9. Downloads
10. What to Turn In
Submit a single file in the format of the previous projects.
Include in it all files you've changed or implemented.
To reduce the odds that you forget to include something, we're
asking for full directories:
your full Net/src/.../Net/RPC/ directory
(which should include your data transfer service as well as the
RPC system implementation) and your full ConsoleApps/src/.../ConsoleApps/
directory.
We're not asking for a report, but make sure to
run pingrpc and dataxferrpc and compare
results with their raw and TCPMessageHandler counterparts.