Reliable Software Systems
DESCRIPTION
Software engineers today build pieces of systems that rely on other systems, and that other systems rely on in turn. In this interconnected world, every engineer needs to know how to identify and mitigate failures in their system. Many companies also have their software engineers do operational and on-call work for their own systems, which gives engineers even more reason to build reliable systems. This seminar will help students become familiar with industry practices for creating and running reliable software systems.
ADMINISTRATIVE
Instructor: Alyssa Pittman
Contact: smooo [at] cs.washington.edu
(yes that's three o's, make sure you get them all)

Expectations: This is a one-credit, C/NC course. Please show up, ask questions, and discuss!

Feedback: I welcome your input, either in person or online (anonymous feedback is supported).
Not sure how to say it? The SBI (Situation-Behavior-Impact) model offers a structure for giving effective feedback.

SCHEDULE

Week 1: Reliability? Systems?

Slides: PDF
Motivating outage: Maersk Shipping
Optional readings:

Week 2: Expect Failure

Slides: PDF
Motivating outage: Slack
Optional readings:

Week 3: Monitoring

Slides: PDF
Motivating outage: Instapaper
Demo: Honeycomb monitoring
Optional readings:

Week 4: Preproduction

Slides: PDF
Motivating example: Healthcare.gov launch
Optional readings:

Week 5: Production

Slides: PDF
Motivating example: Windows 10 October 2018 Update
Optional readings: Testing in Production the Netflix Way (a bonus interview on chaos engineering, since we ran out of time in class)

Week 6: Data Integrity

Slides: PDF
Motivating example: GitLab data loss
Optional readings:

Week 7: Designing Scalable APIs

Slides: PDF
Motivating example: Twitter v1 API outages and retirement
Optional readings:

Week 8: Scalable Design Patterns

Slides: PDF
Motivating example: Foursquare
Optional readings:

Week 9: Team Culture

Slides: PDF
Motivating example: Titan II
Optional readings:

Week 10: Students' Choice: Netflix, YouTube, Twitter Architecture

Slides: PDF
Motivating example: early Netflix
Optional readings: