University of Washington Computer Science & Engineering
 CSE 490h: Problem-solving on large-scale clusters: theory and applications

The following is a list of the topics covered in this course, in their approximate order. Readings should be completed before that day's lecture.

Wed, Jan 3

We will explore the underlying concepts behind the MapReduce system: fold and map.

Project: Project 0 released and completed (cancelled due to problems with cluster)
Reading: None
Slides: PowerPoint
Ungraded Quiz: Student survey
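
As a rough illustration of the two concepts named for this session, the sketch below applies map and fold to a small list. It is not taken from the course slides; the function and variable names are made up for the example, and Python's `functools.reduce` stands in for fold.

```python
from functools import reduce


def square(x):
    """Hypothetical per-element function used with map."""
    return x * x


numbers = [1, 2, 3, 4]

# map: apply the same function to every element independently.
# No element depends on another, so the work could be split across machines.
squares = list(map(square, numbers))

# fold: combine the elements into one result, carrying an accumulator
# that starts at the initial value 0.
total = reduce(lambda acc, x: acc + x, squares, 0)

print(squares)
print(total)
```

Because each call to `square` is independent of the others, the map step is the part that parallelizes naturally, while the fold step is where results are brought back together.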

Mon, Jan 8

On day 2, we will introduce parallel systems. In particular, we will discuss design considerations for building a parallel architecture.

Project: None; begin experimenting with Hadoop (cancelled due to problems with cluster)
Reading: None
Slides and Exercises: PowerPoint slides and group exercises
Quiz: fold() and map() review

Wed, Jan 10

Building on our understanding of fold and map, we will describe the design of the MapReduce system and go over the Hadoop implementation of MapReduce.

Project: Project 1 released; begin thinking about project 2
Reading: MapReduce paper and discussion questions (cancelled due to shortened lecture)
Slides: PowerPoint
Quiz: data dependencies quiz (cancelled due to shortened lecture)
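
To make the MapReduce design discussed in this session more concrete, here is a minimal single-process sketch of a word count. It only mimics the map, group-by-key, and reduce stages; it is not Hadoop's API, and all function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.split():
        yield (word.lower(), 1)


def shuffle(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Fold the list of counts for one word into a single total."""
    return (key, sum(values))


def word_count(documents):
    # Each document is mapped independently (this is the parallelizable part).
    pairs = (pair for doc in documents for pair in map_phase(doc))
    groups = shuffle(pairs)
    # Each key is reduced independently of every other key.
    return dict(reduce_phase(k, v) for k, v in groups.items())


print(word_count(["the cat", "The dog"]))
```

In a real cluster the map calls and the per-key reduce calls would run on different machines, with the grouping step handled by the framework rather than by an in-memory dictionary.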

Tue, Jan 16

Alden's lab hours are cancelled.

Wed, Jan 17

Albert and Hannah will have make-up office hours from 2:30-3:30 in the lab and will be available for scoping project 2 proposals.

In class, we will discuss the MapReduce paper in groups, focusing on its design assumptions and tradeoffs.

Project: Project 1 continues; continue thinking about project 2
Reading: MapReduce paper and discussion questions
Slides: A paper discussion doesn't have slides

Thu, Jan 18

Project: Project 1 due at 6 PM

Fri, Jan 19

Project: Project 2 initial proposals due (via email)

Mon, Jan 22

We will do an in-depth study of Sawzall, Google's language for analyzing log data, which is built on MapReduce.

Project: Incorporate project 2 feedback
Reading: Sawzall
Slides: A paper discussion doesn't have slides

Wed, Jan 24

Tech talks from Googlers

Project: All project 2 feedback returned; begin project work in earnest
Reading: None
Slides: No slides from this tech talk

Fri, Jan 26

Project: Final project 2 proposals due

Mon, Jan 29

More tech talks from Googlers

Project: Continue working on project 2
Reading: None
Slides: Not yet posted

Wed, Jan 31

We will give a high-level overview of the GFS architecture and introduce principles of distributed system design.

Project: Continue working on project 2
Reading: Introduction to Distributed Systems
Slides: PowerPoint with accompanying worksheet

Mon, Feb 12

Presentation day will take place several weeks after lectures have ended. Students will have a chance to present their completed project 2 to Google engineers and to other students.

Project: Project 2 completed
Reading: Not yet posted
Slides: Not yet posted