University of Washington Department of Computer Science & Engineering
 CSEP 552: PMP Distributed Systems, Spring 2012

Overview

Instructor: Arvind Krishnamurthy
Office hours: Wednesday 5:00pm-6:00pm, in CSE544, or by appointment.

TA: Vincent Liu (vincent AT cs DOT washington DOT edu)
Office hours: by appointment (send email)

Lectures: Wednesdays, 6:30pm-9:20pm, in Johnson 075.

CSEP552 is a graduate course on distributed systems. Distributed systems have become central to many aspects of how computers are used, from web applications to e-commerce to content distribution. This course will cover abstractions and implementation techniques for the construction of distributed systems, including client-server computing, the web, cloud computing, peer-to-peer systems, and distributed storage systems. Topics will include remote procedure call, maintaining consistency of distributed state, fault tolerance, high availability, and load balancing. As we believe the best way to learn the material is to build it, there will be a substantial programming project.

Prerequisites: the basic prerequisite is to have taken an undergraduate operating systems course (CSE 451 or equivalent) or an undergraduate networks course (CSE 461 or equivalent). If you haven't taken an undergrad OS or networks course, please come talk to Arvind. We will not be covering undergraduate material in this course.

Papers: you will be responsible for reading approximately three papers before each class, and contributing your thoughts on each assigned paper to the class discussion board before the class that covers it.

Project: The core of the course is the project: to design and build a fault-tolerant peer-to-peer Facebook application by the end of the quarter. The project is to be done in groups of three students. The project has three pieces, each building on the previous ones, due roughly every three weeks. This is a very aggressive schedule, so we need every group to get an early start on the assignments. We will provide you with some basic message-passing code as well as a framework for simulating and emulating a distributed system to aid in debugging. The first project assignment is to build a simple client-server storage system. We then add crash recovery and high availability to this basic system. The final assignment is to use these pieces to construct a fault-tolerant Facebook application that can run without a central server, on a cluster of nodes in the undergraduate PC lab.

Exam: near the end of the quarter, I'll hand out a take-home final exam. The questions will test your understanding of the material we cover in the reading and during lectures. You'll have a few days to finish and turn in the final exam.

Administrivia

Mailing list: When you register for the course, you'll automatically be added to the class mailing list (csep552m_sp12@uw.edu for 5th-year master's students and csep590a_sp12@uw.edu for PMP students). The list will be created on March 26th. To manage your subscription after that date, visit the mailing list web page for 5th-year master's students or the mailing list web page for PMP students. You will be subscribed using your u.washington.edu email address, but you can modify your subscription to use an email address of your choice. Note that you can only post to the mailing list from your subscribed email address.

Announcements:

Discussion Board

Here's the link to the class discussion board:
https://catalyst.uw.edu/gopost/board/arvindk/26947/
and the link to the assignment dropbox:
https://catalyst.uw.edu/collectit/dropbox/arvindk/21374

Paper schedule

Here is the schedule of papers, which might be tweaked as the quarter progresses. Note that you're required to read assigned papers, but the optional additional papers are just that: purely optional, for your interest, if you choose to go deeper on your own. Discussion board entries for the assigned papers are due by noon on the day of the associated lecture.

Date      Reading                       Notes                     Assignments
March 28  Introduction                  intro, rpc, time
April 4   Global states, Consistency    GPD, SVM
April 11  Consistency                   consistency
April 18  Transactions and Replication  transactions w/ white bg
April 25  Paxos                         paxos w/ white bg
May 2     Overlays, DHTs                DHTs w/ white bg
May 9     Cloud Storage                 storage w/ white bg
May 16    Map-reduce                    MR/Dryad w/ white bg
May 23    Peer-to-peer systems          P2P w/ white bg
May 30    Security                      BFT w/ white bg

Project

Everybody registered for the course should already have had an instructional UNIX account created for them by the department support staff, and have been notified of it. Using this account, you can remotely log into (via ssh) the attu.cs.washington.edu compute cluster. You can find more information about instructional resources here.

You should also be able to do the programming assignments on your own personal machines; none of them require large or exceptionally powerful machines. I'd recommend doing your work on Linux; I'd start with a standard Ubuntu Linux distribution. Note that the department has made virtual machine images available with the departmental Linux installation on them. You'll need to obtain VMware to use them.

A few rules of the road are worth mentioning. For design questions, you should feel free to talk with each other about the question and the ideas that you come up with. You should not, however, share your written answers with each other directly. If you do discuss ideas with each other, please cite who you discussed them with in your turned-in work. This is mostly so that you get in the habit of properly attributing collaborations.

Similarly, you should feel free to talk with each other about the programming assignments, and share ideas as you see fit. You can also make use of Google or other resources. However, you must not share code with each other, or rely on code you find elsewhere, such as the Web, to solve the programming assignment directly: you must implement your own code to solve each programming assignment. Unless the programming assignment specifies otherwise (and a few of them will), you can pick whatever programming environment or tools you like to build on -- e.g., you can make use of shells, interpreters, and within reason, libraries or other building blocks that don't directly solve the problem for you. As before, if you do discuss a programming assignment with someone else or find useful sources of information (e.g., code or technical descriptions on the Web), please cite or otherwise attribute all of your sources.