CSE 452: Distributed Systems

Course Overview

Distributed systems are central to many aspects of how computers are used, from web applications to e-commerce to content distribution. This senior-level course will cover abstractions and implementation techniques for the construction of distributed systems, including client-server computing, the web, cloud computing, peer-to-peer systems, and distributed storage systems. Topics will include remote procedure call, preventing and finding errors in distributed programs, maintaining consistency of distributed state, fault tolerance, high availability, and scaling.

Alumni of this course say that it is among the most intellectually challenging courses that they have taken at UW, and among the most relevant to their future careers. We will attempt to live up to that recommendation. We believe the best way to learn the material is to implement the ideas presented in the course, and so there is a substantial programming project.

The calendar page provides a detailed topic list for the course, including readings, problem sets, and labs. In addition, we are using GitLab, Ed, Canvas, and Gradescope for various parts of the course.

Lecture and Sections

Lectures are MWF at 11:30 in CSE2 G20. (Please double-check all room assignments, as they may change.) We have set up the class for Panopto lecture capture. However, we encourage you to attend lecture in person if at all possible.

Sections will largely focus on the labs and (unlike many CSE classes) you will need to attend to be able to complete the assignments. Sections are not Panopto captured - you must attend live. (Let us know if you are quarantined, and we will try to work something out.)

Course Resources

Course Staff

Tom Anderson, instructor tom@cs.washington.edu
Rich Chen, TA rc2002@cs.washington.edu
Anthony Chung, TA achung99@cs.washington.edu
David Dai, TA kun02@cs.washington.edu
Scott Dang, TA scottvd@cs.washington.edu
Theo Gregersen, TA theoag@cs.washington.edu
Caleb Huang, TA ch233@cs.washington.edu
Yafqa Khan, TA yafqak@cs.washington.edu
Khushi Khandelwal, TA khushik@cs.washington.edu
Sidharth Lakshmanan, TA sidlak@cs.washington.edu
Ashay Manocha, TA ashay9@cs.washington.edu
Arman Mohammad, TA ibm5@cs.washington.edu

You can email the entire staff (including Tom, despite the fact that the email address says "TAs") at cse452-tas@cs.washington.edu. However, you will likely receive a more reliable response if you make a private post on Ed (see below).

Staff Contact

The best way to contact the staff is to make a post on Ed. If your question is likely to be useful to other students, please consider making it public. (You can make the post "anonymous" to hide your identity from your classmates, but note that course staff can still see your identity on anonymous posts. See anonymous feedback at the very bottom of this page for submitting feedback without revealing your identity to course staff.) If your message is not relevant to other students or if it contains source code, make it private. We prefer you send messages via Ed if at all possible, because Ed lets us track whether we have handled the request. If you need to contact an individual staff member directly, you can also use email.

Assignments

There are four kinds of assignments in this class.

Labs. The labs are significant programming projects. There are four total. The first lab is done individually; the remainder are done in teams of two.
Design documents. There is one for each lab. The design doc for Lab 1 is to get you familiar with the concept. The ones for labs 2, 3, and 4 are due a week before the lab, so that we can provide timely feedback.
Problem sets. Most of the problem sets are focused on helping you think through the labs. Some test specific lecture topics. All deadlines will be in Gradescope, and you will have at least a week to complete each problem set. These are to be done individually unless otherwise noted.
Blog posts. During the last two weeks of the quarter, we will assign and discuss three research papers. Prior to class you will write a post on the discussion board about some aspect of the paper. This is due before the lecture in which the paper is discussed.

See the sections below for more information on each kind of assignment, as well as the sections on grading, the late policy, and academic honesty.

Labs

The core of the course is to build a highly available, scalable, fault tolerant, and transactional key-value store. Key-value stores are widely used in cloud computing. The project is written in Java, derived from a similar one designed for the MIT graduate distributed systems course. A hallmark of our project infrastructure is extensive support for thorough testing and debugging. Each lab has a model-checking-based test suite that you can use to validate your implementation.

Lab 0 and Lab 1 of the project are to be done individually. (Note that Lab 0 does not have a turnin.) The other labs (2-4) are to be done with a partner. After the introductory labs (Lab 0 and Lab 1), the remaining labs are due two to three weeks apart.

The labs are autograded by a model checker, and that means you will need to come up with solutions that work in all cases. Except for the writeup, the labs are (for the most part) self-grading - we give you all of the test cases we run.

The best way to write code that works in all cases is to think carefully through your design before writing any code. Design before code is especially important for distributed systems, where the number of possible code paths is exponential in the number of messages, the cost of uncaught errors in production code is enormous, and it is often infeasible to catch every possible error through testing.

This is very different from other CSE classes you may have taken. If you think we are just saying that and we don't really mean it, please understand that it may take you as much as ten times as long to complete the assignments if you start writing code before you carefully think through the design with your partner. Debugging typos can be laborious but is a reality we all face. Debugging distributed system design errors by testing is extremely time consuming.

We strongly recommend you get an early start on the labs. Many students underestimate the difficulty of the assignments and leave themselves too little time to finish. The most common comment about the labs is that students wished they had gotten an earlier start. See below for the late policy.

Lab Design Documents

To encourage you to design first and code second, we ask for a design document for each lab, worth a sizeable portion of the total value of the lab. Except for Lab 1, the design document is due a week in advance of the lab due date. It can then optionally be revised and resubmitted with the lab for W credit. You and your partner should agree on the design document before writing any code. Of course, you may find from time to time that you need to update the design, but you want to try to minimize that.

The spec for the design document is here. An example design document (for a very simple protocol we are not asking you to build) is here.

Finally, we also ask you to provide a short post-mortem with each lab: what worked for you, and what didn't.

Problem Sets

For most lecture topics (which may span one or more class meetings), we will assign a problem set specific to that topic. These are to be completed individually.

Readings and Blogs

There is no textbook for this course. Instead, we will assign various tutorial and research papers. Although we do not have assignments directly relating to the tutorials, they are important for understanding the material in the lectures, labs, and problem sets.

In the final two weeks of the class, we will shift gears and read and discuss three research papers describing various practical distributed systems, along with a lecture or two on more recent research topics. Our goal with this portion of the class is to expose you to more of the context for how the ideas in the class might be applied.

A secondary purpose is to help make you more comfortable with reading papers. At first, reading papers can seem extremely hard, as they often assume knowledge beyond what you know. However, with time and practice, you can get better at it. Most of UW CSE's graduate classes are based on reading and analyzing papers, so if you might want to take further courses in systems topics at UW (e.g., attending the systems seminar CSE 590S, or in the fifth-year master's program), this is a good, low-stakes opportunity to practice.

Being able to read research papers is also an essential skill if you want to do undergraduate research. And many companies will expect you (especially as you are promoted into positions of leadership) to be able to read and understand recent research papers, as they often suggest new avenues for improving a company's existing products.

For the three papers we've assigned, we ask students to submit a blog post on each paper. The blog discussions are submitted on Canvas under the Discussions tab. Note that the course is mislabeled in Canvas: it appears as M552 Distributed Systems rather than 452 Distributed Systems.

Blog posts must have two clearly labeled parts.

First, semi-objectively reconstruct one aspect of the paper in your own words. The goal is to help you summarize and formalize your thoughts about the paper. Since we are only reading famous, well-cited papers, you may find it easy to use online tools to generate this portion. Please don't. This is an exercise intended to help you improve.
We emphasize: pick one aspect to summarize. Here are some questions you can use as starting points; please use each perspective at most once across the different papers.
- Goal statement: What problem is the paper trying to solve, and why is it important?
- Hypothesis: Identify one key hypothesis of the paper (there almost always is at least one), and summarize what data the paper presents in support of that hypothesis.
- Design: Pick one aspect of the design and explain it, as if to someone else in the class.
- Evaluation: Pick a graph from the evaluation section, explain what it shows, and what conclusion one can draw from the data.
Usually this part will not contain first-person pronouns like "I" and "me".
Second, extend or expand on some aspect of the paper. Easiest is to expand on something you just summarized. Typically this part will contain first-person pronouns like "I" and "me", and it is acceptable/expected for you to draw on your own personal experience in writing this paragraph.
One way to extend your analysis of a paper is to pick one of the perspectives below, and discuss the paper from that perspective. Across the assigned papers, please use a specific perspective at most once - we want you to practice thinking about papers from multiple points of view.
- Background: Did you find something confusing about the paper, where it assumed you had some prior knowledge that was necessary to understand it? Go investigate that topic and write a short paragraph explaining it to other students.
- Related work: All research papers include some discussion of prior solutions to the problem being posed, even if they think their approach is better. Summarize one of those papers, and then compare and contrast that work with the paper we are reading. Is one always better, or are there some cases where the prior work will be better?
- Context: Did the paper make implicit or explicit technical assumptions (e.g., on the application mix or hardware capabilities such as the relative speed of the CPU and the network)? Identify one such assumption and discuss whether it still applies.
- Open questions: No paper is a complete answer. What additional work or open questions are left by the paper, and how might you answer them?
- Later work: What happened after the paper was published, in terms of other projects or technology change, to amplify or undercut the approach? For example, did someone later on take a different approach, and was it successful?
- Skeptic: You've been tasked with identifying the weaknesses of the approach, design, or evaluation. What are they?
- Advocate: You've been tasked with identifying the strengths of the approach, design, or evaluation. What are they?
- Side view: Pick some other part of computer science (operating systems, machine learning, databases, programming languages, software engineering, etc.) where you think the ideas in the paper might apply, and explain. Feel free to be creative.
- Future retrospective: In ten years, how well do you think the hypothesis of this paper will stand up to the passage of time, or will it be invalidated by further technology developments?
- Follow up: Pick an answer given by one of the other students in the class that you disagree with, and (in the same thread) explain your point of disagreement.

Typically, each part of your blog will be a paragraph long, so your entire blog will be two paragraphs. We prefer quality over quantity. Try to focus deeply on just one aspect of the paper about which you have something interesting to say. If you have several ideas, pick just one. If you find that you don't end up having anything interesting to say, change your topic.

Your blog post may also extend a discussion started by someone else. When writing this type of post, you still need to reconstruct and extend. For example, you might add or better explain details that the original post missed in your reconstruction, and then use these additional details to respond to the original extension (either by strengthening the original argument by presenting new evidence, or by arguing for a different conclusion by presenting other evidence).

Blog posts are graded with 15 points for each part. Occasionally we will award additional points for especially insightful posts.

Blog posts are due before the lecture in which the paper is discussed.

Grades

The class will be graded on an additive points system.

Lab 1: 160 points (lab 1 tests sum to 320, so for this lab points are divided by 2) + 30 design document
Lab 2: 330 points + 60 design document
Lab 3: 355 points + 70 design document
Lab 4: 495 points + 60 design document
Problem sets: 350 total points
Blog posts: 90 total points, 30 for each of 3 papers
Total points possible: 2000
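
As a quick arithmetic check on the breakdown above, the following sketch adds up the categories and shows how much each contributes to the total. The point values are copied from the list; the dictionary layout and the names POINTS, TOTAL, and share are illustrative assumptions, not part of any course tooling.

```python
# Point values copied from the breakdown above. The names used here are
# hypothetical and exist only for this illustration.
POINTS = {
    "Lab 1": 320 // 2 + 30,   # test points halved, plus the design document
    "Lab 2": 330 + 60,
    "Lab 3": 355 + 70,
    "Lab 4": 495 + 60,
    "Problem sets": 350,
    "Blog posts": 3 * 30,     # 30 points for each of 3 papers
}

TOTAL = sum(POINTS.values())


def share(category):
    """Fraction of the total points contributed by one category."""
    return POINTS[category] / TOTAL


if __name__ == "__main__":
    for name, pts in POINTS.items():
        print(f"{name}: {pts} points ({share(name):.1%})")
    print(f"Total: {TOTAL}")
```

The labs and their design documents together account for 1560 of the 2000 points, so most of the grade rests on the project.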

There is no midterm or final exam.

The course is not curved. Your grade on a 4.0 scale is computed by the following formula. \[ \min\left( \left.\left\lfloor\frac{\mathrm{points}}{40}\right\rfloor \middle/ 10 - 0.7 \right.,\ 4.0 \right) \] In other words, your grade will be computed by the following table:

Points	Grade on 4.0 scale
≥1880	4.0
≥1840	3.9
≥1800	3.8
≥1760	3.7
≥1720	3.6
≥1680	3.5
≥1640	3.4
≥1600	3.3
...
≥1480	3.0
...
≥1360	2.7 (grad sat)
...
≥1080	2.0 (undergrad sat)
...
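
For those who prefer code to tables, here is a minimal sketch of the same formula. The function name grade is an assumption made for illustration. The sketch follows the formula literally, so it does not clamp very low point totals at 0.0; the formula above does not say how those are handled.

```python
def grade(points):
    """Grade on the 4.0 scale: min(floor(points / 40) / 10 - 0.7, 4.0).

    The 0.7 offset is subtracted as 7 tenths before dividing by 10, which
    keeps results like 3.9 free of floating-point round-off.
    """
    tenths = points // 40 - 7      # floor(points / 40), shifted by 0.7 * 10
    return min(tenths / 10, 4.0)   # cap at 4.0


if __name__ == "__main__":
    for pts in (2000, 1880, 1879, 1700, 1360, 1080):
        print(pts, grade(pts))
```

Because of the floor, grades move in steps of 40 points: 1879 points earns the same 3.9 as 1840 points.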

This grading scheme may be somewhat unfamiliar to you. We will discuss it on the first day of lecture. Be sure you understand it, and feel free to ask any questions about it.

An important consequence of the additive points-based grading scheme is that there is a sense in which every assignment is optional. If you are unable to complete some assignment, simply make sure to complete the remaining ones - we give you room to miss points and still get a good grade. For example, if you miss 15% of the available points, that is still some form of an A grade.

Since you will receive the same number of points as your partner for your combined work on labs 2-4, it is essential during partner-matching that you communicate expectations about your grade target. If you find yourself stuck in a situation where your partner wants to do significantly more or less work than you do, please contact the staff.

Most students find that they are able to complete all of the assignments to a high degree of quality, but that the assignments require a decent amount of effort. Students find the class time-consuming but rewarding, and so grades in this class are generally high.

If you are taking the class S/NS, note that per university policy, undergraduates receive S credit for any grade 2.0 or higher, while graduate students (including BS/MS students) receive S credit for any grade 2.7 or higher. Thus, an undergraduate would need to earn at least 1080 points to receive S credit, while a graduate student would need to earn at least 1360 points to receive S credit.

Late Policy

Our late policy is designed to give students maximal flexibility without having to ask us for permission in most cases, while still allowing us to grade assignments and get them back to students in a timely fashion. If you have an extenuating circumstance that prevents you from completing the work on time, especially one outside of your control (such as your lab partner falling ill or dropping the class), please contact the staff or email the instructor. Often, we are able to work something out that is agreeable to everyone involved.

Each kind of assignment has a separate late policy. Be sure you understand the differences. Contact course staff if you have any questions.

All assignments, regardless of late policy or extensions, must be turned in by the end of the day Thursday, June 6, unless the instructor has given specific and individual permission. We generally only grant this for students who would otherwise fail the class.
Each lab (except the last one!) comes with a 48-hour grace period, during which work is accepted without penalty. We then deduct 1% from your score, just for that assignment, for each additional day that it is late. In other words, if you are making progress, even slowly, you should keep working on the lab, but if you are stuck you should go ahead and turn it in. Since turning assignments in late cuts into the available time you have for the next lab, only use this flexibility if necessary. To turn in your lab after the 48-hour grace period, contact the staff.
For design documents, there is no grace period. This is because these are graded manually and quick feedback is essential.
Problem sets also have a 48-hour grace period, during which work is accepted without penalty. No credit is granted after the grace period expires.
Blog posts are due before the beginning of the lecture in which the paper is discussed. You may turn in a blog post within 24 hours after the original deadline for half credit. After 24 hours, no credit will be given.
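
The rules above can be sketched as a small function. This is an illustration under stated assumptions, not the staff's grading code: the function name and the kind labels are hypothetical, a partial late day is assumed to count as a full day, and the 1% per day is assumed to be taken from the score you earned. The last lab, design documents, and the end-of-quarter cutoff are left out because the policy does not spell out how late work is scored in those cases.

```python
import math


def late_adjusted_score(kind, score, hours_late):
    """Score after applying the late rules for one assignment.

    kind is "lab" (any lab with a grace period), "problem_set", or "blog".
    hours_late is the time past the original deadline, 0 if on time.
    Assumptions: partial late days round up to a full day, and the lab
    penalty is 1% of the earned score per day past the grace period.
    """
    if hours_late <= 0:
        return score
    if kind == "lab":
        # 48-hour grace period, then 1% per additional day.
        days_past_grace = max(0, math.ceil((hours_late - 48) / 24))
        return score * max(0.0, 1 - 0.01 * days_past_grace)
    if kind == "problem_set":
        # Full credit inside the 48-hour grace period, nothing after.
        return score if hours_late <= 48 else 0
    if kind == "blog":
        # Half credit within 24 hours of the deadline, nothing after.
        return score / 2 if hours_late <= 24 else 0
    raise ValueError(f"unknown assignment kind: {kind}")
```

Under these assumptions, a lab turned in three days after its deadline loses 1% (one day past the grace period), while a problem set turned in at the same point earns nothing.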

Note that, unlike some other course policies you might be familiar with, in this class there is no cap on how much total grace time you can use over the quarter. You can use the grace period on every single problem set and lab (except the last one) and still get full credit. Note that blog posts are different, since the 24-hour grace period on a blog post is for half credit.

Academic Honesty

Please read CSE's Academic Misconduct Policy.

You are encouraged to discuss all aspects of the course with, and ask for help from, the instructor, the TAs, and other students. However, do not cross this line: Do not share code or written text. Do not look at lab or problem set solutions that might be on the Internet. Do not use someone else's code or text in your solutions or responses. Do not use any online AI tool to write your code. It is OK to share ideas, explain your code to someone to see if they know why it doesn't work, or to help someone else debug if they've run into a wall.

Some work in this class will be completed with a partner or group. The rest is to be done individually. We will clearly mark each piece of work as to whether it is to be done with your partner or individually. Please contact the staff if you are unsure about any part of how this policy applies to an assignment.

Lab 1 is individual, as is the design document for Lab 1.
Labs 2-4 (and their design documents) are with a partner.
Problem sets are individual.
Blog posts are individual.

Partner work

The labs are difficult, and working with a partner can make things easier because you can discuss the details of your design together. Working with a partner is also a serious responsibility: your partner is relying on you to communicate and collaborate effectively. Do not agree to work with a partner if you are not willing to commit to these responsibilities.

A common misconception about working with a partner is that you should "split up the work". This is usually a terrible idea, especially in this class, because the hard part of the lab is the design process. It might take you 10 or more iterations to design your protocol correctly, and all parts of a distributed protocol can depend heavily on all other parts of the protocol, so there is no way to "split up" this design work. Instead, you should plan to pair program (i.e., work together synchronously) during the design process until you are confident that you have a correct design and have written the design document together. After that, you can either continue to pair program your implementation (we recommend this approach) or you can try to split up the implementation work. When in doubt, work together synchronously. When you do work separately, establish a clear communication channel and keep each other posted about where you are stuck.

To summarize, by entering a partnership, you are agreeing, at minimum, to:

Communicate frequently with your partner during the weeks where design documents and labs are due.
Collaborate synchronously on the design document and any challenging implementation tasks.
Inform your partner in advance if you will be unavailable for communication or collaboration.

We take the partner agreement extremely seriously and will enforce it. If you flagrantly and repeatedly abandon your partner, you can expect to take a zero on the lab regardless of how much work you put into it.

If your partner breaks this agreement, first try to communicate with them about it. If that doesn't work, contact the course staff.

If you do decide to work with a partner, you can either find a partner on your own (and let us know) or you can ask us to match you with someone. We will send more information about the partnering process later in the quarter closer to lab 2.

When looking for a partner, be sure to communicate about grade expectations. Also, if you plan to opt in to W credit, you should find a partner who wants W credit as well.

452 vs M 552

We are not offering M 552 this quarter.

W credit

We are continuing the pilot program to offer opt-in W credit for CSE 452.

All students, whether aiming to receive W credit or not, will need to do a substantial amount of writing for this class (see above under design documents).

W credit requires not only a substantial amount of writing, but also revision of that writing. So students who want W credit in 452 need to submit revised versions of their design documents for labs 2-4 (due dates on Gradescope). The revisions should take into account staff feedback on the design, and include any updates to the design that you made while implementing the lab.

Anonymous Feedback

Anonymous feedback can be sent to the instructor or TAs via feedback.cs.washington.edu.