# Lecture 3: RPC

## Agenda

- RPC
  - intro
  - failures
  - semantics
  - in practice

## Announcements

- President informs us that we should provide "increased flexibility" for rest of month
  - This message was just sent today, so we are still trying to figure out how to interpret it
  - Our options:
    - Original plan: in person lecture starting Monday with Panopto recording for those who can't make it
    - Full remote plan: just like week 1
    - Something more synchronous hybrid-y
      - James and whoever wants to be in person
      - But also logged into Zoom from the classroom
      - (Lots to work out here -- would be buggy at first)
  - Challenges:
    - The numbers are bad
    - The university guidance is not very useful
    - A big benefit of being in person is that most people are able to make it
    - A bad outcome (it seems to James): half of your courses are in person only and half are remote only

- Problem Set 1 due tonight
  - Intended to just get you thinking about unreliable message delivery -- don't worry if not super clear yet
- Fill out the partner form by Monday
- Start on Lab 1!
- Office hours
  - I will have office hours after class today until 5:30 or so in the lecture Zoom room
  - If people are around on discord and want it, I will have another office hour later tonight

## Remote Procedure Call (RPC)

### Executive Summary

- RPCs allow nodes to call functions that execute on other nodes using convenient syntax
- The key difference from a local function call is what happens when things fail
- To tolerate failures, use sequence numbers and retransmissions

### Intro to RPC

#### What is it?

- It's a programming model for distributed computation
- "Like a procedure call, but remote"
- The client wants to invoke some procedure (function/method), but wants it to run on the server
  - To the client, it's going to look just like calling a function
  - To the server, it's going to look just like implementing a function that gets called
  - Whatever RPC framework we use will handle the work of actually calling the server's implementation when the client asks

- For context, Google does about \(10^{10}\) RPCs per second

#### Local procedure call recap

- Remember roughly how local function calls work:
  - In the high-level language (e.g., C :joy:), we write the name of the function and some expressions to pass as arguments
  - The compiler generates assembly code to:
    - in the caller, evaluate the arguments to values
    - arrange the **arguments** into registers/the stack according to the calling convention
    - use a call instruction of some kind to jump to the **label** of the callee function
      - the caller's instruction pointer (return address) is saved somewhere, typically on the stack
    - the callee executes and can get at its arguments according to the calling convention
      - the callee might call other functions
    - the callee eventually **returns** by jumping back to the return address, passing **returned values** according to the calling convention
- C programmers rarely have to think about these details (good!)

#### RPC workflow

- Key difference: the function to call is on a different machine
- Goal: make calling remote functions as easy as local functions
  - Want the application programmer to be able to just say `f(10)` or whatever, and have that automatically invoke `f` on a remote machine, if that's where `f` lives.
- Mechanism for achieving this: an RPC framework that will send request messages to the server that provides `f` and response messages back to the caller
- Most of the same details from the local case apply
  - Need to handle **arguments**, which function to call (the **label**), where to **return** to, and **returned values**.
- Instead of jumping, we're going to send network messages. So here's the workflow:
  - In the high-level language, we write the name of the function and some expressions to pass as arguments
  - The compiler and RPC framework:
    - orchestrate evaluating the arguments on the client
    - the client makes a normal, local procedure call to something called a "stub"
    - the stub is a normal local function implemented/autogenerated by the RPC framework that:
      - serializes the arguments (converts them into an array of bytes that can be shipped over the network)
      - sends the function name being called and the arguments to the server (this is called a request message)
      - waits for the server to respond with the returned value (this is called a response message)
      - deserializes the returned value and then does a normal, local return from the stub to return that value to the caller
  - The server sits there waiting for requests. When it receives one, it:
    - parses the message to figure out what function is being requested
    - deserializes the arguments buffer according to the type signature of the function
    - invokes the requested function on the deserialized arguments
    - serializes the returned value and sends a response to the client

Here is a diagram of what happens when a client invokes an RPC:
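A rough text version of that picture, for a hypothetical call `f(10)`:

```
client                                 server
   |                                      |
   |  calls the stub for f(10)            |
   |----- request: f, args = [10] ------->|
   |                                      |  runs f(10)
   |<-------- response: result -----------|
   |  stub returns the result             |
   v                                      v
                (time flows down)
```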
Now is a good time to get familiar with this style of diagram. We will be seeing them a lot this quarter. Key points:

- Time flows down
- Columns are nodes
- Diagonal arrows are messages (they arrive after they were sent, hence pointing slightly down)

#### A silly example

Suppose we have this function on the server:

```java
int total = 0;

int incrementBy(int n) {
    total += n;
    return total;
}
```

And suppose we set up our server with our RPC framework to accept calls to `incrementBy` (and probably other functions too).

Now we have a client that wants to call `incrementBy`, as in:

```java
void main() {
    int x = 10;
    int ans = incrementBy(x); // *
    print(ans);
}
```

And suppose we set up our RPC framework on the client, told it where to find the server, and told the framework about `incrementBy` and its type signature.

Here is what will happen:

- The client starts running until line `*`.
- When the client invokes `incrementBy`, it's really invoking a stub
- The client stub for `incrementBy` will:
  - Construct a message containing something like "Please call `incrementBy` on argument `10`"
    - This involves serializing all the arguments (in this case, just `10`)
  - Send this message to the server and wait for a response
- When the request message arrives at the server, the RPC framework will:
  - Parse the message, figure out what function is being requested, and deserialize the data to the right argument types
  - Invoke the "real" implementation of `incrementBy` on the provided arguments
  - Construct a message containing the result
    - This involves serializing the result (in this case, just `10`)
  - Send this message back to the client
    - Such messages are called "responses"
- When the response arrives back on the client, the stub continues executing:
  - Parse the response message
  - Return the result from the stub to the caller.

The key points to notice about RPCs are:

- When we implemented `incrementBy` on the server, we *just* wrote a normal function
- When we wanted to call `incrementBy` on the client, we *just* wrote a normal function call
- The RPC library has to do a bunch of work to make this nice interface possible (see the sketch below)

Finally, notice that this RPC is stateful: it causes the server to update some state

- In this very simple example, just a global variable on the server
- But state is very common, and often the server would actually store the state in a database or some other backend (and the server would communicate with that backend via further RPCs)
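To make the stub's work concrete, here is a minimal hand-rolled sketch of what an autogenerated client stub and server dispatch loop *might* look like for `incrementBy`. The class names, the port, and the wire format ("function name, then one `int` argument") are all made up for illustration; a real RPC framework generates this plumbing for you and handles many functions, errors, and concurrency.

```java
import java.io.*;
import java.net.*;

// Hypothetical hand-written "stub": looks like a normal method to the caller,
// but really sends a request message and waits for the response.
class IncrementByStub {
    private final String host;
    private final int port;

    IncrementByStub(String host, int port) {
        this.host = host;
        this.port = port;
    }

    int incrementBy(int n) throws IOException {
        try (Socket s = new Socket(host, port);
             DataOutputStream out = new DataOutputStream(s.getOutputStream());
             DataInputStream in = new DataInputStream(s.getInputStream())) {
            out.writeUTF("incrementBy");  // which function to call (the "label")
            out.writeInt(n);              // serialized argument
            out.flush();
            return in.readInt();          // deserialized returned value
        }
    }
}

// Hypothetical server side: wait for requests, dispatch to the "real"
// implementation, and send back the serialized result.
class IncrementByServer {
    static int total = 0;

    // The normal function the server author wrote.
    static int incrementBy(int n) {
        total += n;
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket listener = new ServerSocket(9999)) {
            while (true) {
                try (Socket s = listener.accept();
                     DataInputStream in = new DataInputStream(s.getInputStream());
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    String name = in.readUTF();        // which function was requested
                    if (name.equals("incrementBy")) {
                        int n = in.readInt();          // deserialize the argument
                        out.writeInt(incrementBy(n));  // invoke, then serialize the result
                        out.flush();
                    }
                }
            }
        }
    }
}
```

A client would then do something like `new IncrementByStub("server.example.com", 9999).incrementBy(10)` (the host name here is made up). Note that the caller just invokes a method and the server author just writes an ordinary `incrementBy`; the messy parts live in the stub and the dispatch loop.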
### Failures in RPC

- From a distributed systems perspective, we are going to be really interested in what happens when certain components break.
- Now is a good time to get familiar with thinking about failures.
- For each kind of failure, we want to understand how our system handles it.

#### Three important failure cases for RPC

- (Consider the asynchronous unreliable network model with fail-stop node crashes.)
- What if the request message gets dropped?
- What if the server crashes?
  - Before the request arrives?
  - While executing the request?
  - After the response is sent?
- What if the response message gets dropped?

#### Detecting failures

In order to recover from failures, systems often take some corrective action (retransmit the message, talk to somebody else instead, start up a new machine, etc.). A fundamental limitation in distributed computing is that it is impossible to accurately detect many kinds of failure:

- Some are easy: if messages arrive out of order, you can tell by just numbering them sequentially.
- Some are hard: if a node is really slow, another node might think it has crashed.
  - The only way to check if a node is up is to send it a message and hope for a response
  - But if the node is just super slow, you won't get a response for a while, and during that time you have no way to know whether it's because the node crashed or is slow (or maybe the network is slow or dropped your message or the response)
- Another hard one: a "network partition"
  - Nodes 1 and 2 can talk to each other just fine, and nodes 3 and 4 can do the same among themselves, but neither 1 nor 2 can talk to 3 or 4, or vice versa.
  - Very confusing if your mental model is "a node is either up or down"
    - In a sense, nodes 1 and 2 are "down" from the perspective of 3 and 4.
  - Important to realize that this is not a "new" kind of failure:
    - If you assume the network can drop any message, then the network can "simulate" a network partition.

The takeaway here is that, just like in the muddy foreheads game, nodes have only partial information about the global state of the system. We can't know whether the server received our message unless we hear back from the server.

#### Towards tolerating failures in RPC

The first step is figuring out what we *want* to happen.

- When a client receives a response message, what *should* that *mean*?
  - Fancy word for "meaning": semantics

Three options for RPC semantics:

- At least once (NFS, DNS, lab 1b)
- At most once (common, lab 1c)
- Exactly once

### RPC Semantics

#### Our options in more detail

At least once

- If the client receives a response message to a request:
  - then the request was executed at least one time (perhaps more than once)
- If the client does not receive a response message:
  - then the request *might* have been executed (or it might not, or it might have been executed more than once)

At most once

- If the client receives a response message to a request:
  - then the request was executed exactly once
- If the client does not receive a response message:
  - then the request *might* have been executed (or it might not)

Exactly once

- If the client receives a response message to a request:
  - then the request was executed exactly once
- If the client does not receive a response message:
  - it blocks forever, retransmitting the request and waiting until it receives a response

#### Implementing at least once

- send the request and wait for a response
- if you wait for a while and hear nothing, re-transmit the request
- if you re-transmit a few times and still hear nothing, return an error to the calling application (see the sketch below)
  - Typically an RPC framework would throw some kind of exception from the stub, or maybe return a special error value to indicate that the call failed
  - This is pretty different from a local function call! It's an error that says "I couldn't call the function (I think, but I might have)"

Advantages:

- The server keeps no state -- just executes the requests it receives

Disadvantages:

- Can be difficult to build applications that tolerate operations happening more than once

When to use it:

- Operations are pure (no side effects) or idempotent (doing it more than once is the same as doing it once)
- For example:
  - reading some data
  - taking the max with some new data
    - say `n` is a global `int` on the server; then the operation `n = max(n, x)` is idempotent (where `x` is an argument to the request)
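Putting the retransmission steps above into code, here is a minimal sketch of an at-least-once client, assuming some message-based transport; `sendRequest` and `tryReceiveResponse` are hypothetical placeholders for the network plumbing, and the timeout and retry count are made up.

```java
import java.io.IOException;

// Minimal at-least-once client sketch: retransmit on timeout, then give up.
class AtLeastOnceClient {
    static final int MAX_TRIES = 3;
    static final int TIMEOUT_MS = 1000;

    static byte[] call(byte[] request) throws IOException {
        for (int attempt = 0; attempt < MAX_TRIES; attempt++) {
            sendRequest(request);                              // may be dropped by the network
            byte[] response = tryReceiveResponse(TIMEOUT_MS);  // null on timeout
            if (response != null) {
                return response;  // the request executed at least once (maybe more)
            }
            // Timed out: we don't know whether the request executed; try again.
        }
        // Still nothing: report an error. The request might or might not have
        // executed (possibly more than once).
        throw new IOException("RPC failed after " + MAX_TRIES + " attempts");
    }

    // Placeholders for the actual network plumbing (not shown in this sketch).
    static void sendRequest(byte[] request) throws IOException { /* ... */ }
    static byte[] tryReceiveResponse(int timeoutMs) throws IOException { return null; /* ... */ }
}
```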
#### Implementing at most once

Key idea: still need to retransmit to tolerate message drops, but filter out duplicate requests on the server.

- Client sends each request tagged with a "sequence number"
  - gives each request a unique identifier
  - can just start at 0 and increment for each new request the client wants to send
- Client waits for a response
- If the client doesn't hear a response after a while, re-transmit the request *with the same sequence number*
- If still nothing after a few retries, give up and return an error
- On the server, keep track of which `(client, sequence number)` pairs have been executed, and don't re-execute duplicate requests. Instead, return the previously-computed response. (See the sketch below.)

Advantages:

- Usually easier to program on top of
- Works well for stateful operations

Disadvantages:

- Server has to keep state proportional to the number of requests (but see below for optimizations)

Implementation challenges:

- If the client crashes and reboots, it may lose track of the last sequence number it used and restart numbering at 0, which would cause bugs.
  - In the labs, we assume a fail-stop model where nodes crash but don't reboot.
  - In practice, one way to handle this is to change the client's name every time it restarts.
- How can we reduce the server state?
  - Option 1: client tells the server "I have heard your response to all sequence numbers \(\le x\)"
    - server can discard responses for those requests
  - Option 2: only allow one outstanding RPC per client at a time
    - when the request numbered \(x + 1\) arrives, can discard all previous state about that client
  - The labs use option 2.
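Here is a minimal sketch of the server-side duplicate filtering described above. The class and method names are made up, the table is a plain in-memory map, and a real implementation would also handle concurrency and discard old entries using option 1 or option 2.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal at-most-once server sketch: remember which (client, sequence number)
// pairs have executed, and replay the stored response for duplicates.
class AtMostOnceServer {
    // Previously computed responses, keyed by (client, sequence number).
    private final Map<String, byte[]> executed = new HashMap<>();

    byte[] handleRequest(String client, int seqNum, byte[] args) {
        String key = client + ":" + seqNum;
        byte[] cached = executed.get(key);
        if (cached != null) {
            // Duplicate (probably a retransmission): do NOT re-execute.
            // Just resend the previously computed response.
            return cached;
        }
        byte[] response = executeHandler(args);  // run the real handler once
        executed.put(key, response);             // remember the result for future duplicates
        return response;
    }

    // Placeholder for dispatching to the actual RPC handler (not shown).
    private byte[] executeHandler(byte[] args) { return new byte[0]; /* ... */ }
}
```

The client side is the same retransmission loop as in the at-least-once sketch, except that every retransmission of a given request carries the same sequence number.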
#### Does TCP solve all our problems?

TCP: a reliable bi-directional byte stream between two nodes

- Retransmits lost packets
- Detects and filters out duplicate packets
- Useful! Most RPCs are sent over TCP in practice

But TCP itself can time out

- For example, if the server crashes or the network goes down for long enough.
- Usually TCP will retransmit a few times, but after, say, 30 seconds, it gives up and returns an error.
- The application needs to be able to recover, which usually involves establishing a *new* TCP connection.
- Question: on reconnection, were my old requests executed or not?

#### What if the server crashes?

- If the list of all previous responses is stored in server memory, it will be lost on reboot
  - After reboot, the server would incorrectly re-execute old requests if it received a duplicate.
- One option would be to keep the state in non-volatile storage (HDD, SSD)
- Another option is to give the server a new address on reboot (fail-stop model)
  - this is what the labs do

### More about RPC in practice

#### Serialization

- Refers to converting the in-memory representation of some (typed) data into a linear sequence of bytes
  - Usually so we can write those bytes to the network or to disk.
- Above we did an example that took an `int` and returned an `int`.
  - Serialization just encodes the `int` as its four bytes.
  - Other primitive types (float, long, etc.) work similarly.
- What about pointers?
  - Should we just send the pointer over the network?
    - No, that doesn't make sense, because the server won't be able to dereference it
    - Whatever that pointer points to on the client is probably not even in the server's memory, much less at the same address.
  - Could convert to a "global reference", or use some kind of global addressing system
    - Definitely possible! Complicated.
  - Instead, most of the time what we do is pass a copy of the data pointed to by the pointer.
    - For example, if we have an RPC to write to a file on the server like `void write(char* data)` or something, the client will send the server a copy of the `data` buffer.
    - Similarly, if we have an RPC to read the file like `char* read()`, then we want to get a copy of that returned data back on the client.
  - More interestingly, what if we had a `void read(char* dest)` function that wrote the answer into its argument?
    - The `dest` pointer is an "out pointer": it is passed just so that the function `read` can write to it.
    - Then we don't need to send anything, really, to the server
    - But we need the server to send us back the contents of `dest` after the function call!
    - Such "out pointers" have to be handled specially by RPC frameworks
- In the labs, we use Java, which has serialization built in. This will make implementing our RPCs relatively easy.

#### RPC vs procedure calls

- From the application programmer's perspective, very similar to a normal (local) procedure call.
- Some additional complexities under the hood that don't show up with local procedure calls.
  - "Binding": the client RPC library needs to know where to find the server
    - Need to make sure the function we want to call actually exists on the server
    - And that the server is running the version of the software we expect
    - Binding is often solved through "service discovery"
      - Have one well-known server whose job it is to keep track of all the servers, their names/addresses, what RPCs they support, what version they're running, etc.
      - Then clients first talk to the service discovery server to find a server that will work for them.
  - Implementing the stubs
    - The RPC framework often has a compiler-like thing that will autogenerate code to do serialization, send, recv, deserialization, etc.
    - Takes as input the signatures of the procedures.
  - Performance
    - A local procedure call is very fast, on the order of 10 instructions (a few nanoseconds)
    - RPC to a machine in the same data center: about 100 microseconds (10k times slower than a local call)
    - RPC to a machine on the other side of the planet: about 100 milliseconds (10 million times slower than a local call)
    - Solutions:
      - Issue multiple requests in parallel (sketched below)
      - Batch requests together
      - Cache results of requests
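As a sketch of the first mitigation (issuing requests in parallel), here is one way to overlap several slow RPCs using a thread pool; `rpcCall` is a hypothetical stand-in for any blocking RPC stub, and the keys and pool size are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Minimal sketch of hiding RPC latency by issuing several requests in parallel.
class ParallelRpcExample {
    // Hypothetical blocking RPC stub; imagine ~100 microseconds to
    // ~100 milliseconds of network round trip in here.
    static String rpcCall(String key) {
        return "value-for-" + key;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        List<Future<String>> futures = new ArrayList<>();

        // Issue all the RPCs up front; each one runs on its own thread.
        for (String key : List.of("a", "b", "c", "d")) {
            Callable<String> task = () -> rpcCall(key);
            futures.add(pool.submit(task));
        }

        // Collect the responses; the total wait is roughly one round trip,
        // not one round trip per request.
        for (Future<String> f : futures) {
            System.out.println(f.get());
        }
        pool.shutdown();
    }
}
```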