CSEP 590A Syllabus

Syllabus (Updated: 04/26/2006)

If you need a login to retrieve any of these papers, use the UW Library Off-Campus Proxy system.

Week 1 (3/29/2006): Motivating Examples
- Eric Brewer, Lessons from Giant Scale Services, IEEE Internet Computing, 2001. (pdf)
- Mockapetris and Dunlap, Development of the Domain Name System, SIGCOMM 1995. (link)
- Peterson, Anderson, Culler, and Roscoe. A Blueprint for Introducing Disruptive Change into the Internet. HotNets 2002. (link)
- Guest Speaker: Dennis Lee, Amazon (students who took P551 in winter quarter may arrive at 7:30)
Week 2 (4/5/06): Distributed Synchronization
- Leslie Lamport. Time, Clocks, and the Ordering of Events in a Distributed System. Communications of the ACM, Vol. 21, No. 7 (July 1978), pp. 558-565. (pdf)
- Chandy and Lamport, Distributed Snapshots: Determining the Global States of a Distributed System, ACM TOCS, pp. 63-75, Feb. 1985. (link)
- F. Cristian, A Probabilistic Approach to Clock Synchronization, Proceedings of the 9th International Conference on Distributed Computing Systems (ICDCS), pages 288-296, Newport Beach, California, June 1989. (link)
- Birman, Chapters 13-14 (ignore)
- OPTIONAL: Babaoglu and Marzullo, Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms, 1993. Textbook version of the week 3 and 4 material. (link)
Week 3 (4/12/06): Process Groups / Causal Ordering
- Slides for this week (CSENetId-protected)
- D.R. Cheriton and D. Skeen, Understanding the Limitations of Causal and Totally Ordered Multicast, SOSP 1993. (link)
- Babaoglu and Marzullo, Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms, 1993. Textbook version of the week 3 and 4 material. (link)
- Kenneth P. Birman, The Process Group Approach to Reliable Distributed Computing. Communications of the ACM (CACM), 36(12):37-53, December 1993. (link)
- OPTIONAL: Kenneth P. Birman, A Response to Cheriton and Skeen's Criticism of Causal and Totally Ordered Communication, October, 1993. (link)
- OPTIONAL: Birman, Chapter 15-16
Week 4 (4/19/06): Distributed Agreement
- Slides for this week (CSENetId-protected)
- Leslie Lamport, The Part-Time Parliament, ACM TOCS vol. 16, no. 2, 133-169, May 1998. (pdf)
- Leslie Lamport, Paxos Made Simple, ACM SIGACT News, 2001. (pdf)
Week 5 (4/26/06): Programming Models: RPC, SOAP, web services, Grid, AJAX
- Slides: Atomic Commit (pdf), GENI (pdf), Consistency (ppt)
- J. Ousterhout. The Role of Distributed State. CMU Computer Science: A 25th Anniversary Commemorative. ACM Press Anthology Series, R. Rashid (Ed.), July 1991. (link)
- Jim Gray, Distributed Computing Economics, Microsoft Technical Report, 2003. (link)
- Tim O'Reilly, What is Web 2.0?, September 2005. (link)
- Philip A. Bernstein, Vassos Hadzilacos and Nathan Goodman. Distributed Recovery. Chapter 7 in Concurrency Control and Recovery in Database Systems. (link)
- Skim the following:
  - RPC: Birman, Chapter 4 (preferred), or Andrew D. Birrell and Bruce Jay Nelson. Implementing Remote Procedure Calls. ACM Trans. on Computer Systems 2(1), February 1984, pp. 39-59. (pdf)
  - Web services, etc.: Birman, Chapter 10 (preferred), or the Microsoft developer's guide to web services (link)
Week 6 (5/3/06): Fault Tolerance
- Slides:
  - Paxos (slide 24+) (pdf)
  - Paxos Wrapup (slides 1-8) (pdf)
  - State Machine Replication (pdf)
  - More State Machine Replication (pdf)
  - Byzantine Fault Tolerance (slides 135-167) (pdf)
  - Paxos Byzantine Fault Tolerance (pdf)
- Fred B. Schneider, Implementing Fault Tolerant Services Using the State Machine Approach: A Tutorial, ACM Computing Surveys, 22(4): 299-319, December 1990. (link)
- Castro and Liskov, Practical Byzantine Fault Tolerance, OSDI 1999. (link) (pdf)
- Lamport, Shostak and Pease. The Byzantine Generals Problem. ACM TOPLAS, July 1982. (pdf)
- OPTIONAL: Lowell, Chandra, Chen. Exploring Failure Transparency and the Limits of Generic Recovery, OSDI 2000. (pdf)
- OPTIONAL: Birman, Chapter 24
Week 7 (5/10/06): Weakly Consistent Distributed Systems
- Guest Lecturer: Arvind Krishnamurthy (his slides)
- Demers et al., Epidemic Algorithms for Replicated Database Maintenance, PODC, 1987. (link)
- James Kistler and M. Satyanarayanan. Disconnected Operation in the Coda File System. ACM Trans. on Computer Systems 10(1), February 1992, pp. 3-25. (link)
- Terry et al. Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System. SOSP 1995. (pdf)
- OPTIONAL: Dahlin et al., End to End WAN Service Availability, USITS 2001. (link)
Week 8 (5/17/06): Scalability and Peer to Peer
- Stoica et al., Chord: A scalable peer-to-peer lookup service for Internet applications, SIGCOMM 2001. (link)
- Cohen, Incentives Build Robustness in BitTorrent, Workshop on Economics of Peer-to-Peer Systems, 2003. (link)
- Levis et al., Trickle: A Self-Regulating Algorithm for Code Propagation and Maintenance in Wireless Sensor Networks, NSDI 2004. (link)
- Kleinberg, Navigation in a Small World, Nature, 2000. (pdf)
- OPTIONAL: T. Anderson, M. Dahlin, J. Neefe, D. Patterson, D. Roselli, and R. Wang. Serverless Network File Systems. ACM TOCS 1996. (pdf)
Week 9 (5/24/06): Security and Robustness
- Slides for this week: (ppt)
- Ellison and Schneier, Ten Risks of PKI, Computer Security Journal, 2000. (link)
- Lampson, Computer Security in the Real World, 2001. (pdf) (ppt)
- Bellovin and Merritt, Limitations of the Kerberos Authentication System, USENIX 1991. (link)
- Anderson et al., Design Guidelines for Robust Internet Protocols, HotNets 2002. (pdf)
- OPTIONAL: Birman, Chapter 22
Week 10 (5/31/06): Putting it all Together
- Slides for this week:
  - FarSite (ppt)
  - Google File System (pdf)
  - Pastry (ppt)
- Rowstron and Druschel, Storage Management and Caching in PAST, A Large Scale Peer-to-Peer Storage Utility, SOSP 2001. (pdf)
- Adya et al., FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment, OSDI 2002. (pdf)
- Ghemawat et al., The Google File System, SOSP 2003. (pdf)
- OPTIONAL: Schroeder, Birrell, and Needham, Experience with Grapevine: The Growth of a Distributed System, ACM TOCS 1984. (link)

CSEP 590A: Distributed Systems Syllabus
