# CSE 452: Distributed Systems

## Course Overview

Distributed systems are central to many aspects of how computers are used, from web applications to e-commerce to content distribution. This senior-level course will cover abstractions and implementation techniques for the construction of distributed systems, including client-server computing, the web, cloud computing, peer-to-peer systems, and distributed storage systems. Topics will include remote procedure call, preventing and finding errors in distributed programs, maintaining consistency of distributed state, fault tolerance, high availability, and scaling.

Alumni of this course say that it is among the most intellectually challenging courses that they have taken at UW, and among the most relevant to their future careers. We will attempt to live up to that recommendation. We believe the best way to learn the material is to implement the ideas presented in the course, and so there is a substantial programming project.

The calendar page provides a detailed topic list for the course, including readings, problem sets, and labs. In addition, we are using GitLab, Ed, and Gradescope for various parts of the course.

## Lecture and Sections

Following university guidance, the first week of the course will be offered remotely on Zoom. Starting in the second week, lectures will be in person in SIG 134 and recorded via Panopto.

Sections will largely focus on the labs, and you will need to attend them to be able to complete the assignments.

Lectures will be recorded and made available to all students.

## Course Staff

• James Wilcox, instructor, he/him or they/them, jrw12@cs.washington.edu
• Ani Canumalla, TA, he/him, anirudhc@cs.washington.edu
• Xun Cao, TA, he/him, whdecx@cs.washington.edu
• Logan Gnanapragasam, TA, he/him, gnanabit@cs.washington.edu
• Chase Lee, TA, he/him, chase412@cs.washington.edu
• Jolin Tsai, TA, she/her, ylt1215@cs.washington.edu
• Ivy Wang, TA, she/her, fw29@cs.washington.edu
• Robin Yang, TA, he/him, yangy87@cs.washington.edu

You can email the entire staff (including James, despite the fact that the email address says "TAs") at cse452-tas@cs.washington.edu, but we usually prefer you make a private post on Ed if possible.

## Staff Contact

The best way to contact the staff is to make a post on Ed. If your question is likely to be useful to other students, please consider making it public. (You can make the post "anonymous" to hide your identity from your classmates, but note that course staff can still see your identity on anonymous posts. See anonymous feedback at the very bottom of this page for submitting feedback without revealing your identity to course staff.) If your message is not relevant to other students, make it private. We prefer you send messages via Ed if at all possible, because it allows any staff member to assist you. If you need to contact an individual staff member directly, you can also use email.

## Assignments

There are three kinds of assignments in this class.

• Problem sets. Assigned weekly during the first eight weeks, in weeks when no lab is due (5 total). Typically due on Friday night.
• Labs. Four total, due roughly every three weeks. Typically due on Friday night.
• Blog posts. For every lecture in the last three weeks of the quarter, we will read a modern research paper, and you will write a post on the discussion board about some aspect of the paper. Due the night before the lecture in which the paper is discussed.

See the sections below for more information on each kind of assignment, as well as the sections on grading, the late policy, and academic honesty.

### Problem Sets

We will assign a small number of exercises for almost every lecture throughout the quarter. To get the most out of the class, we encourage you to work the problems as soon after the corresponding lecture as possible, but in order to provide flexibility, they are not due until the end of the following Friday. (The exception is the first week, where we will make the questions easy enough that you can do them by Friday.) By default, problems are to be completed individually. We will label some problems specifically to be worked on with your lab partner. See below for the late policy.

### Labs

The core of the course is to build a highly available, scalable, fault tolerant, and transactional key-value store. Key-value stores are widely used in cloud computing. The project is written in Java, derived from a similar one designed for the MIT graduate distributed systems course. A hallmark of our project infrastructure is extensive support for thorough testing and debugging. Each lab has a model-checking-based test suite that you can use to validate your implementation.

Lab 0 and Lab 1 of the project are to be done individually. (Note that Lab 0 does not have a turnin.) The other labs (2-4) are to be done with a partner. After the introductory labs (lab 0 and lab 1), each of the other labs is due roughly every three weeks.

The labs are autograded by a model checker, and that means you will need to come up with solutions that work in all cases. We strongly recommend that you think through your design before writing any code. Debugging typos can be laborious but is a reality we all face. Debugging design errors by testing is extremely time consuming, particularly if you are rushing to meet a deadline.

We strongly recommend you get an early start on the labs. Many students underestimate the difficulty of the assignments and leave themselves too little time to finish. The most common comment about the labs is that students wished they had gotten an earlier start. See below for the late policy.

### Blog Posts

There is no textbook for this course. Instead, we will assign various tutorial and research papers.

Some of the research papers in the reading list are marked "(blog)" in the calendar. For these papers, you may earn 50 points for participating in an online discussion of the paper on Ed.

Blog posts must have two clearly labeled parts.

• First, semi-objectively reconstruct some aspect of the paper in your own words. For example, if the paper presents a system design, you might summarize the design. If the paper presents empirical evidence for a conclusion, summarize the statement of the conclusion, the evidence for it, and why the conclusion actually follows from the evidence. Usually this part will not contain first-person pronouns like "I" and "me".

• Second, subjectively critique the aspect of the paper you reconstructed in the first part. "Critique" does not necessarily have a negative connotation. You can critique by agreeing with the paper, for example by presenting concurring evidence or drawing additional connections that were not included in the paper. Other critiques might include: Do you buy the argument, or is it flawed or just confusing? Is the empirical evidence convincing, or is it missing something? Is the design still relevant today, and does it apply to other important problems? Can you compare the paper or system to other work you know about in the area, especially work that was done after the paper was published? Was there some aspect of the paper that was confusing? An important kind of critique is asking questions about the design. You do not necessarily need to answer questions that you pose. Usually this part will contain first-person pronouns like "I" and "me".

Because the two parts of the blog post are related (the second part critiques the same aspect that the first part reconstructed), you will usually have to have the second part in mind when choosing which aspect to reconstruct. Typically, each part of your blog will be one relatively short paragraph of just 3–5 sentences. (So your entire blog will be two such paragraphs.) Try to focus on just one aspect of the paper about which you have something interesting to say. If you have several ideas, pick just one.

Your blog post may also extend a discussion started by someone else. When writing this type of post, you still need to reconstruct and critique. For example, you might add or better explain details that the original post missed in your reconstruction, and then use these additional details to respond to the original critique (either by strengthening the original argument by presenting new evidence, or by arguing for a different conclusion by presenting other evidence).

Blog posts are graded on a completion basis with 25 points for each part. Occasionally we will award an additional 10 points or so for especially insightful posts.

Blog posts are due the night before the lecture in which the paper is discussed. See below for the late policy.

## Grading

• Problem sets: 50 points per assignment for 3 assignments, totaling 150 points
• 250 free points due to canceled problem sets, totaling 250 points
• Lab 1: 320 points
• Lab 2: 325 points
• Lab 3: 355 points plus 10 free points due to a typo on an earlier version of this page
• Lab 4: 495 points
• Blog posts: 50 points per post for 8 posts, totaling 400 points
• Total points possible: 2305

There is no midterm or final exam.
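
As a quick check of the arithmetic, the point values listed above can be tallied as follows. This is only an illustrative sketch: the dictionary labels and the `total_points` helper are made up for this example, and the numbers are copied from the list.

```python
# Point values copied from the list above; the labels are illustrative only.
POINTS = {
    "problem sets (3 x 50)": 3 * 50,
    "free points (canceled problem sets)": 250,
    "lab 1": 320,
    "lab 2": 325,
    "lab 3": 355,
    "lab 3 free points (typo)": 10,
    "lab 4": 495,
    "blog posts (8 x 50)": 8 * 50,
}


def total_points(points=POINTS):
    """Sum every category to get the total points possible."""
    return sum(points.values())


if __name__ == "__main__":
    print(total_points())
```

Note that the stated total of 2305 is reached only when the 10 free points attached to Lab 3 are counted.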

The course is not curved. Your grade on a 4.0 scale is computed by the following formula:

$$\min\left( \left.\left\lfloor\frac{\mathrm{points}}{50}\right\rfloor \middle/ 10 \right.,\ 4.0 \right)$$

In other words, your grade is given by the following table:

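| Points | Grade |
|--------|-------|
| ≥2000  | 4.0   |
| ≥1950  | 3.9   |
| ≥1900  | 3.8   |
| ≥1850  | 3.7   |
| ≥1800  | 3.6   |
| ≥1750  | 3.5   |
| ≥1700  | 3.4   |
| ≥1650  | 3.3   |
| ≥1600  | 3.2   |
| ≥1550  | 3.1   |
| ≥1500  | 3.0   |

The table continues in the same way, with the grade dropping by 0.1 for every 50 points below 1500.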

This grading scheme may be somewhat unfamiliar to you. We will discuss it on the first day of lecture. Be sure you understand it, and feel free to ask any questions about it.
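
For readers who prefer code to formulas, here is a minimal sketch of the grade computation described above. The function name `grade` is our own choice for illustration, and the sketch assumes `points` is a non-negative integer.

```python
def grade(points):
    """Grade on a 4.0 scale: min(floor(points / 50) / 10, 4.0).

    Assumes `points` is a non-negative integer point total.
    """
    # Integer floor division gives the number of complete 50-point steps.
    steps = points // 50
    # Each step is worth 0.1 grade points, capped at 4.0.
    return min(steps / 10, 4.0)


if __name__ == "__main__":
    for pts in (2305, 2000, 1999, 1500):
        print(pts, grade(pts))
```

Because of the floor, a total of 1999 points falls in the 3.9 row of the table, and any total of 2000 or more is capped at 4.0.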

An important consequence of the additive points-based grading scheme is that there is a sense in which every assignment is optional. You are most welcome to select a grade to target and strategically decide to simply not turn in assignments whose points you do not need.

Since you will receive the same number of points as your partner for your combined work on labs 2-4, it is essential during partner-matching that you communicate expectations about grade achievements. If you find yourself stuck in a situation where your partner wants to do significantly more or less work than you do, please contact the staff.

Most students find that they are able to complete all of the assignments to a high degree of quality, but that the assignments require a decent amount of effort. Students find the class time-consuming but rewarding, and grades in this class are generally very high.

If you are taking the class S/NS, note that per university policy, undergraduates receive S credit for any grade 2.0 or higher, while graduate students (including BS/MS students) receive S credit for any grade 2.7 or higher. Thus, an undergraduate would need to earn at least 1000 points to receive S credit, while a graduate student would need to earn at least 1350 points to receive S credit.
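
If you are targeting a particular grade, the point thresholds above follow from inverting the grading formula: each 0.1 of grade corresponds to 50 points. The sketch below shows that calculation; the helper name `min_points_for` is hypothetical and not part of any course tooling.

```python
def min_points_for(target_grade):
    """Smallest point total that earns `target_grade` on the 4.0 scale."""
    # Work in whole tenths of a grade point to avoid floating-point surprises.
    tenths = round(target_grade * 10)
    if not 0 <= tenths <= 40:
        raise ValueError("target grade must be between 0.0 and 4.0")
    # Every tenth of a grade point costs 50 points.
    return tenths * 50


if __name__ == "__main__":
    print(min_points_for(2.0))  # S credit threshold for undergraduates
    print(min_points_for(2.7))  # S credit threshold for graduate students
```

This matches the S/NS thresholds stated above: 1000 points for a 2.0 and 1350 points for a 2.7.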

## Late Policy

Each kind of assignment has a separate late policy. Be sure you understand the differences. Contact course staff if you have any questions, or if you need an extension.

• For every problem set, there is a 48-hour grace period, during which work is accepted without penalty. No credit after the grace period expires.
• Each lab comes with a 5-day grace period, during which work is accepted without penalty. No credit after the grace period expires.
• Blog posts are due the night before the lecture in which the paper is discussed. Alternatively, you can turn in a blog post up to 48 hours after the original deadline for half credit. After 48 hours, no credit will be given.

Note that, unlike some other course policies you might be familiar with, in this class there is no cap on how much total grace time you can use over the quarter. You can use all 48 grace hours on every single problem set and all 5 grace days on every lab, and still get full credit. (But blog posts are different, since the 48-hour period on a blog post is for half credit only.)
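
The three late policies can be summarized in a few lines of code. This is an informal sketch, not an official grading script: the names `GRACE_HOURS` and `credit_fraction` are invented for illustration, and treating a submission at exactly the end of the window as still inside it is our assumption.

```python
# Grace periods with no penalty, in hours (labs get 5 days).
GRACE_HOURS = {"problem_set": 48, "lab": 5 * 24}

# Blog posts have no free grace period, only a 48-hour half-credit window.
BLOG_HALF_CREDIT_HOURS = 48


def credit_fraction(kind, hours_late):
    """Fraction of credit available for work turned in `hours_late` hours late.

    Assumption: a submission exactly at the end of a window still counts.
    """
    if hours_late <= 0:
        return 1.0  # on time
    if kind == "blog":
        return 0.5 if hours_late <= BLOG_HALF_CREDIT_HOURS else 0.0
    if kind in GRACE_HOURS:
        return 1.0 if hours_late <= GRACE_HOURS[kind] else 0.0
    raise ValueError(f"unknown assignment kind: {kind!r}")


if __name__ == "__main__":
    print(credit_fraction("lab", 100))         # inside the 5-day grace period
    print(credit_fraction("problem_set", 50))  # past the 48-hour grace period
    print(credit_fraction("blog", 10))         # half-credit window
```

Since there is no cap on total grace time, the function needs no per-student state: each assignment is evaluated on its own.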

Finally, if you have an extenuating circumstance that causes you not to be able to complete the work on time, please contact the staff. Often, we are able to work something out that is agreeable to everyone involved.