Summary |
For assignment #2, you will implement two-phase commit over a replicated, durable key-value store, using RPC to coordinate the replica processes. There are three major pieces you will need to implement. First, you will build a durable key/value store. Second, you will implement a two-phase commit protocol to coordinate writing values atomically to replicas of the key/value store. Third, you will expose an RPC interface to the two-phase commit coordinator for clients to communicate with.
Part A: Key/Value Store |
In this part of the assignment, your job is to implement a durable key/value store library. Your library should expose three methods:
The major requirement for your store is that it is durable; if your process exits (or is killed) and restarts, any values that were previously stored using a "put" must still be retrievable using a "get", and similarly, any value removed using a "del" must stay removed. It's up to you to decide on appropriate types for key and value, and to decide whether get/put/del can return failure in some cases.
I strongly recommend that you find existing code for this; you might, for example, consider using the sqlite database library. If you decide to roll your own library, this will take a lot of effort, particularly to get post-failure recovery correct.
Part B: Two-Phase Commit |
Part B contains the bulk of this assignment. Your job is to implement a replicated key/value store, building on Part A. The architecture of your replicated key-value store is fairly simple; it consists of a single "master" process and multiple "replica" processes:
The master should be concurrent; it should permit multiple clients to connect to it and it should be able to process multiple operations concurrently.
The replica should be concurrent: it should be able to process multiple commands concurrently. Thus, it needs to implement some form of concurrency control. For this assignment, the concurrency control scheme is very simple: if two concurrent operations manipulate different keys, they can safely proceed concurrently. If two concurrent operations manipulate the same key (e.g., two "put" operations to the same key show up at the same time), the first to arrive should be able to proceed while the second's two-phase commit should abort.
Your implementation needs to be fault-tolerant. If either the master or one or more replicas fail, your two-phase commit implementation needs to do the right thing. Similarly, after a failed component recovers, your two-phase commit implementation needs to recover and proceed as appropriate.
It's fine for you to hard-code knowledge in the master and replicas about the configuration of the system, i.e., which processes exist and where they are running.
Part C: Clients |
In part C, you should implement a client program that connects to the master and issues a stream of get, put, and del requests. You should be able to launch multiple clients in order to drive the master concurrently -- i.e., each separate client should be issuing its own stream of get/put/del operations.
Use your client programs to test your implementation of two-phase commit. Be sure to find a way to test as many corner cases as you can, such as the master failing during the period of uncertainty.
What to turn in |
You should turn in the following: