CSE 452: Distributed Systems

Course Overview

Distributed systems have become central to many aspects of how computers are used, from web applications to e-commerce to content distribution. This senior-level course will cover abstractions and implementation techniques for the construction of distributed systems, including client server computing, the web, cloud computing, peer-to-peer systems, and distributed storage systems. Topics will include remote procedure call, preventing and finding errors in distributed programs, maintaining consistency of distributed state, fault tolerance, high availability, and scaling.

Alumni of this course say that it is among the most intellectually challenging courses that they have taken at UW, and among the most relevant to their future careers. We will attempt to live up to that recommendation. We believe the best way to learn the material is to implement the ideas presented in the course, and so there is a substantial programming project. There will be no midterm or final exam. The course is not curved.

The calendar page provides a detailed syllabus for the course, including readings and lab due dates. In addition, we are using Canvas, GitLab, EdStem, and Gradescope for various parts of the course.

Lecture and Sections

The calendar page provides lecture and section topics, assigned readings, and pointers to videos and slides. Videos are also available through the Canvas Panopto tab (for pre-taped videos) and Zoom recordings (for live-taped recordings).

We have pre-taped videos for most of the content we would normally present live in lecture, so that we can make the scheduled "lecture" slot more interactive. Much of the lecture time will use Zoom for small group exercises - while the main portion will be taped, the small group exercises will not be. If you aren't attending live, you will need to do these exercises on your own. You do not need to ask our permission for that.

In place of multi-topic problem sets, we will have smaller, more frequent exercises tied to each topic. These will generally be due one week after the relevant lecture. These assignments are on Gradescope.

In addition, there will be some graded in-class small group assignments. These serve two purposes. One is to ask students to explore possible solutions to problems, such as load balancing, before we describe how things are done in practice. This approach is called constructive failure, and has been shown to lead to better learning outcomes than the traditional lecture approach. The other is to practice design as a group, through group design exercises later in the quarter.

Our intent is to keep the amount of content the same as previous quarters (in particular the labs are similar), so we may not use all of the assigned lecture time. The recorded lectures will be as long as needed for a particular topic - they are not in 50 minute chunks.

We ask everyone to complete any assigned videos before the relevant live class. Since you won't be able to interrupt these videos for questions, we ask you to pause and post to EdStem any question you have while watching - we'll create a thread for each class to gather those in one place. We will try to answer them directly or live in class.

Where there is assigned reading, our recommendation is that you watch the video first, then do the reading, then come to class, and then do the assigned exercises for the class.

For the live portion of lecture and section, we will set things up so that your audio will be muted when you log in. Even a small amount of background noise can become distracting to everyone else. You may use the Zoom chat room to ask further questions, and it is also ok to unmute and interrupt.

Sections will largely focus on the labs and you will need to attend to be able to complete the assignments.

The instructor and TAs will attempt to remain long enough to answer all pending questions at the end of lecture/section.

Video capture of lectures and sections should appear in Canvas a few hours after the end of the lecture/section. If this doesn't happen, ping us in EdStem - the postprocessing by Zoom is a bit unpredictable, and then there is a manual step where we need to release the video.

We encourage students to use the video for review purposes only. Despite our best effort, the video capture will likely fail some percentage of the time, and it is best to get your questions answered in real time.

Lectures    MWF 3:30-4:20      Tom
Section AE  Thurs 11:30-12:20  Aileen, Andrew, Guramrit
Section AA  Thurs 12:30-1:20   Luxi, Jiuru, Samantha
Section AB  Thurs 1:30-2:20    Anirudh, Roy, Adnan
Section AC  Thurs 2:30-3:20    CJ, Dao, Eddy

Instructor Contact and Office Hours

Email sent to the staff address will likely be answered more quickly than email sent to individual TAs.

Person, Role                Email           Office Hours
Any Staff                   cse452-tas@cs
Tom Anderson, Instructor    tom@cs          Wednesday 4:30 - 5:30pm
Adnan Ahmad, TA             adnana2@cs      Tuesday 6:00 - 8:00pm
Aileen Zeng, TA             aileenz@cs      Monday 1:30 - 2:30pm
Andrew Wei, TA              nowei@cs        Tuesday, Wednesday 8:30 - 9:30pm
Anirudh Canumalla, TA       anirudhc@cs     Friday 9:30 - 10:30am
CJ Lin, TA                  xijiel@cs       Monday 10:30 - 11:30am
Dao Yi, TA                  daoyee@cs       Thursday 5:00 - 6:00pm
Eddy Zhou, TA               zty0911@cs      Monday 8:30 - 9:30pm
Guramrit Singh, TA          gsingh98@cs     Thursday 4:00 - 5:00pm
Jiuru Li, TA                lij93@cs        Tuesday 2:30 - 3:30pm
Luxi Wang, TA               louis99@cs      Tuesday 5:30 - 6:30pm
Raden (Roy) Pradana, TA     rrp2901@cs      Wednesday 9:30 - 10:30am
Samantha Miller, TA         sm237@cs        Wednesday 11:00 - 12:00pm

Project

The core of the course is to build a highly available, scalable, fault tolerant, and transactional key-value store. Key-value stores are widely used in cloud computing. The project is written in Java, derived from a similar one designed for the MIT graduate distributed systems course. Relative to earlier versions, we have completely re-written the testing and debugging framework for the labs, and we have extended the project to include multi-key transactions. Each lab has an extensive test suite that you can use to validate your implementation.

Lab 0 and Lab 1 of the project are to be done individually. (Note that Lab 0 does not have a turnin.) The other labs (2-4) are to be done in teams of 2. After the introductory labs (Lab 0 and Lab 1), the remaining labs are due roughly three weeks apart. Note that the exercises are to be completed and turned in individually.

The labs are autograded by a model checker, and that means you will need to come up with solutions that work in all cases. We strongly recommend that you think through your design before writing any code. Debugging typos can be laborious but is a reality we all face. Debugging design errors by testing is extremely time consuming, particularly if you are rushing to meet a deadline.

We strongly recommend you get an early start on the labs. Many students underestimate the difficulty of the assignments, and leave themselves too little time to finish. The most common comment about the labs is that students wished they had gotten an earlier start.

All of the labs are available, and we have taped versions of all of the relevant lectures and sections describing each lab, available under the Canvas Panopto tab.

However, since it is so common for students to run out of time, we would like you to focus on getting the project done, and less on the implications for your grade. For flexibility, we will automatically grant each group six slip days for the project assignments, for you to use at your discretion on labs 1, 2, and 3. These are calendar days - weekends and holidays count. There are no slip days for the exercises. Regardless of your remaining slip days, all assignments must be turned in by the end of the day of the final exam for this class, at 11:59pm on June 10. In other words, slip days are not available for the last assignment. Generally speaking, students who fall behind find it difficult to complete the final lab. That is, it is best to plan to use zero slip days - they are meant for unforeseen circumstances.

Once you have used up your slip days, a 1% per-day penalty will be applied to the grade of any lab assignment that you turn in late (note that exercises may not be turned in late). This penalty is small enough that if you need to take a few extra days to complete an assignment, it is generally worth it to do so, but you should turn it in once you have stopped making progress. This late penalty will be applied at the time we calculate final grades - we will otherwise grade those assignments normally.
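
As a rough illustration of how slip days and the late penalty interact, here is a minimal Java sketch. It is not the staff's grading code. It assumes that slip days are consumed before any penalty applies, and that "1% per day" means the lab grade is scaled down by one percent for each late day not covered by a slip day. The class and method names are hypothetical.

```java
// Hypothetical sketch of the late policy; names and the exact formula are assumptions.
final class LatePenalty {
    /** Slip days a late lab would consume: the smaller of days late and days remaining. */
    static int slipDaysUsed(int daysLate, int slipDaysRemaining) {
        return Math.min(Math.max(daysLate, 0), Math.max(slipDaysRemaining, 0));
    }

    /**
     * Lab grade after the late penalty. Assumes each late calendar day that is
     * not covered by a slip day reduces the grade by 1% of its raw value.
     */
    static double penalizedGrade(double rawGrade, int daysLate, int slipDaysRemaining) {
        int uncoveredDays = Math.max(daysLate, 0) - slipDaysUsed(daysLate, slipDaysRemaining);
        double factor = Math.max(0.0, 1.0 - 0.01 * uncoveredDays);
        return rawGrade * factor;
    }
}
```

For example, a lab with a raw grade of 90 that is five days late with two slip days remaining uses both slip days and is penalized for the other three days. For the last lab, where slip days are not available, the caller would pass zero remaining slip days.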

Finally, students sometimes figure out how to do a lab only after they turn it in. That's ok! We will allow you to turn in a revised lab 2 or lab 3, for half credit of any points you missed, with the same due date as the lab 4 deadline. Lab 4 builds on lab 3. You can get partial credit on some parts of lab 4 without a completely working lab 3, but you will need a mostly working version of lab 3 before moving on to lab 4.

The TAs from last quarter have written a set of helper notes for navigating the labs, and we will pin those in EdStem.

Grades

There is no midterm or final exam.

The course is not curved. Most students find that they are able to complete all of the assignments to a high degree of quality. We do not consider small deductions particularly significant; you can get a few points off per assignment and still get an A.

If you are taking the class Sat/UnSat, note that the cutoff for a SAT is roughly 65%. That is, it is perfectly acceptable to completely skip lab 4 and still pass the course. Other grades are interpolated between those extremes.

Readings

There is no textbook for this course. Instead, we will assign various tutorial and research papers; these readings are to be done before the class discussion, as we will take the papers as a starting point, not the end point, of the class discussion.

All of the readings are available for free for UW students; this should work automatically even if you are off campus (via your UW NetID). If you run into trouble, append offcampus.lib.washington.edu to the domain name inside the link, as in http://dl.acm.org.offcampus.lib.washington.edu/citation.cfm?id=359563.
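
If you would rather not edit links by hand, the rewrite described above can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and it assumes the link is a well-formed URL with a host name.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: adds the UW library proxy suffix to a link's domain name.
final class OffCampusLink {
    static final String PROXY_SUFFIX = ".offcampus.lib.washington.edu";

    static String viaLibraryProxy(String link) {
        URI uri = URI.create(link); // throws IllegalArgumentException if malformed
        String host = uri.getHost();
        if (host == null) {
            throw new IllegalArgumentException("link has no host name: " + link);
        }
        if (host.endsWith(PROXY_SUFFIX)) {
            return link; // already rewritten
        }
        try {
            // Rebuild the URL with the same scheme, port, path, and query, changing only the host.
            return new URI(uri.getScheme(), uri.getUserInfo(), host + PROXY_SUFFIX,
                    uri.getPort(), uri.getPath(), uri.getQuery(), uri.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("could not rewrite link: " + link, e);
        }
    }
}
```

Calling OffCampusLink.viaLibraryProxy("http://dl.acm.org/citation.cfm?id=359563") produces the example link given above.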

Readings to be completed before each lecture are listed on the course calendar.

Blogs

For some of the research papers in the reading list (excluding some of the tutorials), we will post a few discussion questions a few days prior to the lecture where we plan to discuss the paper.

We would like you to practice reading the papers critically -- to try to understand them, but also to relate them to other material that you may know, or to later work that occurred after the original publication.

To that end, we will divide students into online discussion groups, one per section, organized through Canvas. Prior to the start of the class discussing each paper, students are required to post a short reply to the discussion thread in Canvas with a comment, observation, or question about the paper being discussed. You may use one of the questions we provide, or pose a question of your own. Valid posts can also take a position on a design choice in the paper, discuss the applicability of the paper to an important problem, or add some recent perspective. Posts may also pose a question about the paper, e.g., if you found something confusing. You may also answer someone else's question. You can also explain why you don't agree with someone else's post (debate is good!).

Posts can be as short as a few sentences; a single paragraph is normal. We aren't looking for essays.

Blog posts are graded, but we use the best four over the quarter so you don't need to write more than four posts (out of eight papers). We will not grade any blog entries posted after the class period begins.

We grade blog posts on a five-point scale. We give extra credit if we think you have something insightful to say, and full credit if your post convinces us that you engaged with the paper. Less than full credit implies you should try to be more substantive.
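
To make the best-four rule concrete, here is a small Java sketch of how a blog total could be computed from individual post scores. It is an assumption-laden illustration, not the staff's grading script: it assumes scores are whole numbers, that posts you did not write are simply left out or counted as 0, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: sum of the best four blog post scores.
final class BlogGrade {
    static final int POSTS_COUNTED = 4;

    /** Returns the sum of the highest POSTS_COUNTED scores (fewer if fewer were posted). */
    static int bestFourTotal(int[] scores) {
        int[] sorted = scores.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);           // ascending order
        int total = 0;
        int firstCounted = Math.max(0, sorted.length - POSTS_COUNTED);
        for (int i = firstCounted; i < sorted.length; i++) {
            total += sorted[i];
        }
        return total;
    }
}
```

With six posts scored 5, 3, 0, 4, 4, and 2, only the 5, 4, 4, and 3 would count under this sketch.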

Discussion board

We will be using EdStem for mediating questions about the lectures, labs, and exercises.

Academic Honesty

Please read CSE's Academic Misconduct Policy.

For the project, you are encouraged to ask for help from the instructor, the TAs, and other students. A general rule of thumb is to ask for help if you remain stuck for more than half an hour.

However, do not cross this line: For the project, do not share code or text. Do not look at project solutions that might be on the Internet. Do not use someone else's code or text in your solutions. Sharing ideas, explaining your code to someone to see if they know why it doesn't work, even helping someone else debug if they've run into a wall, all that is ok.

Similarly, discussions in class about the homework exercises are allowed, but otherwise all work on the exercises must be done individually. You may ask clarifying questions of the TAs and the instructor.


Anonymous Feedback

Anonymous feedback can be sent to the instructor or TAs via feedback.cs.washington.edu.