CSE550 Problem Set #2

out: Thursday January 23rd, 2014
due: Monday February 10th, 2014 by 5:00pm.

[ Summary | Paxos | Lock Service | How to submit | Grading ]

Summary.

In order to build an highly available service such as Google's MegaStore/Spanner storage systems or Facebook's social networking site, one of the most important building blocks is to provide a way in which the underlying data is replicated and can be accessed even in the presence of failures. To this end, this assignment will require you to build a replicated state machine based on Paxos. The focus of this assignment will be on the backend Paxos algorithms and in designing and building a highly reliable distributed storage system. More importantly, we would like to provide you with substantial flexibility so that you can choose any language of your choice to demonstrate your work on the Paxos implementation.

Paxos Implementation

The core of the assignment is to develop a Paxos replicated state machine. The basic idea of the Paxos RSM is described in Section 3 of the "Paxos made simple" paper by Lamport. A quick summary of the implementation strategy is as follows:

You can organize your code in whatever mechanism that you think is appropriate. For example, you could have a single entity that serves all three roles of proposer, acceptor, and listener. The implementation of Paxos should be robust to message loss and node failures and should be able to make progress as long as a majority of nodes are online. Note that a Paxos node could be offline for a period of time, but when it comes back online, it can help constitute a majority for passing Paxos commands. (This is the traditional fail-stop-recover model that Paxos supports.) Include a brief discussion of design issues that came up in a README.TXT file, as well as a rough overview of what your code does.

Lock Service

Demonstrate the correctness of your Paxos implementation by developing a simple lock server as the replicated state machine. The lock server should have the following functionality:

We would like you to focus primarily on the Paxos implementation, so you can make any simplifying assumptions that you feel appropriate with respect to the actual lock service exported by the replicated RSM. For instance, you can assume the following:
What to turn in

When you're ready to turn in your assignment, do the following:

  1. Create a directory called "problemset2". In it, you should have the code for the assignment, a README.TXT file, and a writeup.

  2. The README.TXT file should contain your name, student number, and UW email address, as well as instructions on how to launch your server.

  3. A major component of your grade will be based on a project write-up, in which your group should explain the protocol and any relevant implementation details. You should include, at a minimum, the following:
    • a detailed description of your group's implementation
    • any assumptions you made
    • how to use your implementation
    • any outstanding issues
    • anything else that you feel is important to discuss (e.g. quirks, interesting behavior, performance characteristics, etc)
  4. Use the course dropbox (there is a link on the course homepage) to submit that tarball.

Grading

We will be basing your grade on several elements: