CSE 453/M553 (490L/599L): Data Center Systems

Course Overview

Warehouse-scale data centers have become the dominant computing infrastructure for large-scale applications, from scalable online services like Facebook and Zoom to enterprise computing of all sorts. Data centers are also the fastest growing segment of the computer industry, with intense competition and rapid adoption of new technologies, changing almost every aspect of how computer systems are built and used.

This course will be a cross-disciplinary investigation of the technologies underlying next-generation data centers and some of the challenges that must be addressed to leverage those technologies.

Pre-requisite: CSE 332 and 333; recommended: CSE 451 or 452.

We believe the best way to learn the material is to implement the ideas presented in the course, and so there is a substantial programming project, written in Rust. There will be no midterm or final exam. The course is not curved.

The calendar page provides a detailed syllabus for the course, including readings and lab due dates. In addition, we are using Canvas, GitLab, edstem, and Gradescope for various parts of the course. We would like to thank the Whiteley Center for hosting some of the initial planning work for the first version of this class.

Lecture and Sections

The calendar page provides lecture and section topics, assigned readings, and pointers to slides and lecture notes. Lecture (and hopefully section) videos will be available through the Canvas Panopto tab.

We have set up Panopto automated lecture capture for those who need to miss class, and we hope to do something similar for one of the sections. We encourage students to use the video only when it is not possible to attend class, or for review purposes. Despite our best effort, it is likely the video capture will fail to work some percentage of the time (e.g., if the microphone doesn't work). It is also best to get your questions answered in real time.

We plan to have frequent exercises tied to each topic. In addition, we plan to use some of the sections for an extended group design exercise. These assignments will appear on Gradescope and will generally be due one week after the relevant lecture/section.

Where there is assigned reading, you should do the reading before class.

Most sections will focus on the labs and it is likely that you will need to attend to be able to complete the assignments.

Lectures MWF 3:30-4:20 Tom
Section LA Thurs 1:30-2:20 Anirudh, Guramrit
Section LB Thurs 2:30-3:20 Anirudh, Kevin
Section LC Thurs 3:30-4:20 Kevin, Guramrit

Instructor Contact and Office Hours

Email sent to the staff address will likely be answered more quickly than email directed to individual TAs.

Person, Role                 Email              Office Hours
Any Staff                    cse453-staff@cs
Tom Anderson, Instructor     tom@cs             Monday 7:00 - 8:00pm (on Zoom)
Anirudh Canumalla, TA        anirudhc@cs
Jiuru Li, TA                 lij93@cs
Guramrit Singh, TA           gsingh98@cs
Steven Su, TA                suz8@cs
Kevin Zhao, TA               kwzhao@cs

Project

The project for the course consists of three labs and a fourth open-ended assignment. The labs ask you to build key parts of a multi-tier web application, including caching, bounded stale cache consistency, and gossip-based load balancing. The open-ended assignment can extend one of the earlier labs or implement some other concept related to the course (such as a resource orchestrator). We also welcome prototypes of labs we could use in future iterations of this class.

Students may work on the labs in groups of 2 or 3, and groups can include a mixture of undergraduates and master's students. Note, however, that our expectations for the open-ended assignment will scale up for groups of three and in proportion to the number of master's students participating in each group.

The first three labs (as well as the getting started lab 0) are written in Rust, a language we chose because it is both cool and (once mastered) will lead you to spend less time debugging without sacrificing real-time performance.

For flexibility, we will automatically grant each group six slip days for the lab assignments, for you to use at your discretion on labs 1, 2, and 3. These are calendar days - weekends and holidays count.

There will be a 1% per-day penalty applied to that lab assignment grade for any group who turns in a late assignment, once you have used up your slip days. This is small enough that if you need to take a few extra days to complete an assignment, it is generally worth it to do so, but you should turn it in once you have stopped making progress. This late penalty will be applied at the time we calculate final grades - we will otherwise grade those assignments normally.
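
As a rough illustration of how the slip days and the late penalty interact, here is a minimal sketch in Python. It is not the staff's grading script. The function name `apply_lateness` and the example lateness values are hypothetical, and it assumes that a group spends its remaining slip days before any penalty accrues and that the penalty is a flat 1% per remaining late day.

```python
SLIP_DAYS_PER_GROUP = 6      # from the policy above
PENALTY_PER_LATE_DAY = 1.0   # percent per late day, from the policy above


def apply_lateness(days_late, slip_days_left):
    """Return (penalty_percent, slip_days_left) after one lab submission.

    Assumption (not stated in the syllabus): remaining slip days are
    spent first, and only the days beyond them are penalized.
    """
    if days_late < 0 or slip_days_left < 0:
        raise ValueError("day counts must be non-negative")
    slip_used = min(days_late, slip_days_left)
    penalized_days = days_late - slip_used
    return penalized_days * PENALTY_PER_LATE_DAY, slip_days_left - slip_used


if __name__ == "__main__":
    remaining = SLIP_DAYS_PER_GROUP
    # Hypothetical lateness (in calendar days) for labs 1, 2, and 3.
    for lab, days_late in [(1, 2), (2, 5), (3, 3)]:
        penalty, remaining = apply_lateness(days_late, remaining)
        print(f"lab {lab}: {days_late} days late, "
              f"penalty {penalty}%, slip days left {remaining}")
```

Under these assumptions, a group that is 2, 5, and 3 days late on the three labs uses all six slip days by lab 2 and ends up with penalties of 0%, 1%, and 3% on the respective lab grades.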

There are no slip days for blogs, problem sets, or the open-ended assignment due at the end of the quarter. We will not be able to grade any of these assignments that are turned in late.

Grades

There is no midterm or final exam.

The course is not curved. We would like nothing better than to give everyone an A. We do not consider small deductions particularly significant.

If you are taking the class Sat/UnSat, the cutoff for a SAT is roughly 65%. Other grades are interpolated between those extremes.

Readings

The textbook for this course is the third edition of The Datacenter as a Computer, by Barroso, Hölzle, and Ranganathan. You can request a free PDF of this book from the publisher website linked above.

In parts of the class, we will go into topics at much greater depth than the textbook. For these, we will assign a set of recent research papers as well as papers describing industrial practice. These readings are to be done before the class discussion, as we will take the papers as a starting point, not the end point, of the class discussion.

All of the readings are available for free for UW students; this should work automatically even if you are off campus (via your UW NetID). If you run into trouble, append offcampus.lib.washington.edu to the domain name inside the link, as in http://dl.acm.org.offcampus.lib.washington.edu/citation.cfm?id=359563.

Readings to be completed before each lecture will be listed on the course calendar. Note: We expect to add readings to the initial syllabus.

Blogs

For the papers in the reading list, we will post a few discussion questions to Canvas a few days prior to the lecture. We would like you to practice reading the papers critically -- to try to understand them, but also to relate them to other material that you may know (e.g., from summer internships).

To that end, prior to the start of the class discussing each paper, students are required to post a short reply to the discussion thread in Canvas with a comment, observation, or question about the paper being discussed. You may use one of the questions we provide, or pose a question of your own. Valid posts can also take a position on a design choice in the paper, discuss the applicability of the paper to an important problem, or add some recent perspective. Posts may also pose a question about the paper, e.g., if you found something confusing. You may also answer someone else's question. You can also explain why you don't agree with someone else's post (debate is good!).

Posts can be as short as a few sentences; a single paragraph is normal. We aren't looking for essays.

Blog posts are graded, but we throw out the lowest few scores. Undergraduates are required to blog on half of the papers listed, rounded down; we encourage you to read the blogs for the remainder. Master's students are required to blog on 80% of the papers in the reading list, rounded down. We will not grade blog entries posted after the class period begins.
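
The blog quotas above round down (floor), which can be sketched as follows. This is only an illustration: the function name `required_blog_posts` and the paper count of 21 are hypothetical, since the final reading list may change.

```python
def required_blog_posts(num_papers, is_masters):
    """Minimum number of papers to blog on, rounded down.

    Integer arithmetic is used so that the 80% case cannot be thrown
    off by floating-point rounding.
    """
    if num_papers < 0:
        raise ValueError("num_papers must be non-negative")
    if is_masters:
        return (num_papers * 4) // 5   # floor of 80%
    return num_papers // 2             # floor of 50%


if __name__ == "__main__":
    papers = 21  # hypothetical size of the reading list
    print("undergraduate:", required_blog_posts(papers, is_masters=False))
    print("master's:", required_blog_posts(papers, is_masters=True))
```

With a hypothetical list of 21 papers, this gives 10 posts for undergraduates and 16 for master's students.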

We grade blog posts on a five point scale. We give extra credit if we thought you had something insightful to say, and full credit if your post convinced us that you engaged with the paper. Less than full credit implies you should try to be more substantive in your replies.

Discussion board

We will be using edstem for mediating questions about the lectures, labs, and exercises.

Academic Honesty

Please read CSE's Academic Misconduct Policy.

For the labs, you are encouraged to ask for help from the instructor, the TAs, and other students. A general rule of thumb is to ask for help if you remain stuck for more than half an hour.

However, do not cross this line: Do not share code or text. Do not look at project solutions that might be on the Internet. Do not use someone else's code or text in your solutions. Sharing ideas, explaining your code to someone to see if they know why it doesn't work, even helping someone else debug if they've run into a wall, all that is ok.

Similarly, discussions in class about the homework exercises are allowed, but otherwise all work on the exercises must be done individually. You may ask clarifying questions of the TAs and the instructor.


Anonymous Feedback

Anonymous feedback can be sent to the instructor or TAs via feedback.cs.washington.edu.