Reading Assignments
The course will rely on various readings which will be posted here. This page will also have
resources and background reading which may be useful at various times. Copyrighted works may require UWNETID.
The course does not have a fixed text, I will be posting readings for the various subjects, and I encourage you
to look for other resources as well. The course is based on the previous versions of the course taught by Karlin and Lee.
We will start with randomized algorithms, and then progress to hashing techniques and algorithms for streamed data.
The course material will be beyond an undergraduate course, but material will be covered from the ground up.
My assumption is that the typical student in the class has had five years to forget their undergraduate algorithms course.
For background references on algorithms, I recommend the Tardos Klienberg text (algorithm design). For the first few weeks,
I am drawing on the text Randomized Algorithms by Motwani and Raghavan.
For course topics, I will start with some material on randomized algorithms and average case.
On source of relevant material are some online lectures by Tim Roughgarden from a course he taught at Stanford.
These are all on youtube – there is a playlist by Rahul Madhavan with 170 videos.
I recommend: 7.1, 7.2 (Probability), 5.1-5.4 (Quicksort), 8.1-8.2 (Randomized Selection), 9.1-9.5 (Minimum Cuts), Future material, 14 (Hash Tables),
15 (Universal Hashing), 16 (Bloom Filters)
Readings
- January 12, Motwani-Raghavan, Chapter 1, Sections 1.1 and 1.3 (Access requires uw net ID).
- January 19, Kleinberg-Tardos, Chapter 1, Section 1 (Access requires uw net ID).
- January 21, Motwani-Raghavan, Chapter 14, Sections 14.6 (Access requires uw net ID).
- January 21, Motwani-Raghavan, Appendices (Access requires uw net ID).
- January 26, Bloom Filter survey, Broder and Mitzenmacher (a good read)
- January 26, Hashing notes from a CMU course. These look pretty good, although some
of the more theoretical comments are not going to be covered.
- January 26, Burton Bloom's original CACM paper introducing the eponymous filter. (Feel free to ignore the math.)
- January 28, Talk by Mike Mitzenmacher at Georgia Tech
- February 2, Lecture notes from Stanford on Heavy Hitters (Section 2.5 is optional)
- February 9, Hyperloglog from Facebook
- February 9, Over the top slide deck (but interesting)
- February 9, Optional reading. Mikkel Thorup, High Speed Hashing for Integers and Strings. Probably more than you want to know about hash functions.
- February 16, There are a bunch of new topics in this lecture - so it would be useful to look at some basic internet resources (e.g., Wikipedia) on the following topics: Coding Theory,
Block Code, Nearest Neighbor Search, Quadtree, Voronoi Diagram, Fortune's algorithm.
- February 18. Bentley's original KD tree paper
- February 23. Lecture notes from Stanford: Similarity metrics, Dimension Reduction
- February 23. Optional stuff: Andrei Broder's Altavista paper, Document Similarity, chapter by Rajaraman, Leskovec and Ullman
- March 9, 11. Linear Programming. Chapter from DasGupta, Papadimitriou, and Vazarani