University of Washington Department of Computer Science & Engineering
 CSE 552: Parallel and Distributed Systems, Spring 2019

Overview

Instructors: Jon Howell (jonh AT cs) and Jay Lorch (lorch AT cs)
Office hours: Wed 2–3 pm in CSE 332, or by appointment

TA: Niel Lebeck (nl35 AT cs)
Office hours: Thu 2:00–3:00 pm in CSE 220, or by appointment

Classes: MW 11:30am–12:50pm in CSE2 G04

Large-scale distributed systems have become pervasive, underlying virtually all widely used services. In tandem, our understanding of how to build scalable, robust, efficient, and secure systems is increasingly well-founded. This class will attempt to bring students to the state of the art in distributed systems research and practice as well as to identify a set of open research problems.

This course will focus on correctness. Distributed systems are particularly difficult to make correct. Concurrent updates to distributed state introduce subtle race conditions. Component failures are inevitable, and "distributed" means the system is expected to proceed despite them. What invariants and progress conditions ensure that systems operate correctly despite concurrency and failures?

This iteration of the class will NOT cover parallel computing.

Prerequisites: You should have taken CSE 550, CSE 551, or CSE 452. The course listing names CSE 551 as a prerequisite; this will not be enforced, and it won't matter if you haven't taken it. However, it's essential that you understand, and have experience with, programming with threads before taking this class; at UW, you can get that experience in, e.g., CSE 451.

Further, we'll assume basic knowledge of distributed-systems topics, including remote procedure calls (RPC), two-phase commit, distributed time, serializability, and MapReduce. You can obtain this knowledge by taking CSE 550 before CSE 552 (which is the normal order), and a motivated student will be able to pick up these topics independently. See the background section below for more information. Students who have taken an undergraduate distributed-systems class will find that about 25% of the material overlaps; in particular, it's permitted to take both CSE 452 and CSE 552.

Research Project: A major part of the course will be an independent research project on a topic of your choosing related to distributed systems, with team members of your choosing. A strict requirement is that every project must have a quantitative result — in other words, purely paper designs are not sufficient.

Course Reading and Discussion: Another major part of the course will be a group discussion of various assigned papers both before and during class. The goal will be to develop your ability to uncover the broader implications of research papers. Most systems research starts with this process: what can one conclude from a research result, beyond what's written by the authors?

Mailing List

When you register for the course, you'll automatically be added to the class mailing list (cse552a_sp19@uw.edu). To manage your subscription, visit the mailing list web page. You've been subscribed using your u.washington.edu email address. You can, however, modify your subscription to use an email address of your choice. Note that you can only post to the mailing list from your subscribed email address.

Course Reading and Discussion

Each class will focus on a single topic, with discussion oriented around two papers. Generally, one of those papers is mandatory for you to have read before the class and the other is optional. See below for a list of the papers assigned for each class.

For each class except the first, we'll have a forum thread that any student may post to and read. Each student must, before class begins, post a paragraph that makes an interesting point about one of that class's papers.

The class discussion will be divided into two parts. First, we'll discuss the plain content of the paper: What did the authors think the paper was about? Second, we'll examine the subtext and context of each paper: What do we (the lecturers and the students) think is really interesting about the paper? For instance, what limits and opportunities does the paper miss?

For the first part of the discussion, we'll lecture on mechanical elements of the papers that are hard to discuss without prepared slides. We'll then kick off an interactive discussion. The class-participation portion of your grade will depend on the extent to which you participate meaningfully in each such discussion.

Prior to each class, we'll post a short list of questions to get the discussion started, but we'll also cover topics from the forum thread and topics that arise naturally from the class discussion.

At the end of each class, we'll spend 5–10 minutes preparing you to read the papers for the next class. We'll present background material that gives context for them, and we may suggest questions to think about while reading them.

Discussion Board

Here's the link to the Canvas page for the class discussion board and assignment uploads:
https://canvas.uw.edu/courses/1272963
The discussion board can be used for two purposes:

Background

As discussed above, we'll assume basic knowledge of some distributed-systems topics. So, if you're not familiar with the following papers, it will be helpful to read them within the first two or three weeks of class:

Schedule of Non-Class Events

Date                        Time                  Event
Friday, April 12            5:00 pm               Initial one-page research project proposal due
Friday, April 19            5:00 pm               Problem set 1 due
Friday, May 3               5:00 pm               Three-page outline of research project, including a complete introduction, due
Sunday, May 5               5:00 pm               Problem set 2 due
Tuesday, May 21             5:00 pm               Problem set 3 due
Friday, May 24              5:00 pm               Five-page version of research project paper, including a complete discussion of related work, due
Monday, May 27              11:30 am – 12:50 pm   No class (Memorial Day)
Monday–Friday, June 10–14   Various               Research project presentations, scheduled individually
Tuesday, June 11            5:00 pm               Final research project paper due
Wednesday, June 12          2:00 pm – 4:20 pm     Final exam

Class Paper Schedule

Here's the schedule of papers to be discussed during class. For each class, there's one primary paper you must read before class, and one or more optional papers we encourage you to also read before class. Your post to the forum thread before class can be on either the primary paper or an optional paper.

Date                  Reading                                             Slides
Monday, April 1
Wednesday, April 3    Paxos and Raft                                      Class Intro slides, Lecture slides
Monday, April 8       Performance enhancements to Paxos                   Background slides, Lecture slides
Wednesday, April 10   Distributed system verification                     Background slides, Lecture slides
Monday, April 15      Chain replication                                   Background slides, Lecture slides
Wednesday, April 17   Byzantine fault tolerance                           Background slides, Lecture slides
Monday, April 22      Distributed logging                                 Background slides, Lecture slides
Wednesday, April 24   Byzantine-fault-tolerant distributed logging
Monday, April 29      Blockchains
Wednesday, May 1      Peer-to-peer systems
                      Failure detection
                      Distributed file systems
                      Byzantine-fault-tolerant distributed file systems
                      Consistency models
                      Storage semantics
                      Relational storage
                      Remote memory direct access (RDMA)
                      In-network computation
                      In-network computation

Research Project and Presentation

An independent research project is required, on a topic of your choosing, with team members of your choosing. Projects can be done by teams of 1–3 students, and we'll expect more from larger teams; we expect that most people will choose to work in teams of 2. Projects can relate to any distributed systems topic, and we encourage you to relate them to your own research. We're handing out a list of possible topics for those needing a starting point. A strict requirement is that every project must have a quantitative result; in other words, purely paper designs are not sufficient.

We've provided a few suggestions for project ideas. But you're welcome (indeed, encouraged!) to devise your own project ideas not on that list.

The project will be due in five steps:

Everybody registered for the course should already have had an instructional UNIX account created for them by the department support staff, and have been notified of it. Using this account, you can remotely log into (via ssh) the attu.cs.washington.edu compute cluster. You can find more information about instructional resources here.

If the compute cluster doesn't meet your needs, you may want to consider one of the following experimental platforms, which we can help you get access to:

Problem Sets

Three problem sets will be assigned during the quarter to verify your understanding of the concepts being presented in the papers. A thorough understanding of the papers is a prerequisite to being able to understand other work in the area, or to put your own research in an appropriate context.

Final

The final will take place Wednesday, June 12 from 2:00 pm to 4:20 pm. It will constitute 20% of your grade. Questions on the final will be drawn from the required portions of the reading list.

Grading

We'll grade the course using the following breakdown: