Welcome to DATA516/CSED516: Scalable Data Systems and Algorithms!

In this course, we will study the specialized systems and algorithms that have been developed to work with data at scale, including parallel database systems, MapReduce and its contemporaries, graph systems, and streaming systems. We will also cover core techniques of cloud platforms and important scalable algorithms.

Instructor: Dan Suciu; Office hours: Wednesday 3:30-4:20pm

TA: Remy Wang; Office hours: Fridays 2:00-3:00pm

TA: Zechariah Cheung

Canvas (for Zoom link and grades): here








Reading assigned papers and writing short statements (15%)

Each statement should be at most one page long, written as a set of bullet points. It should demonstrate that you have read and thought about the paper.

Homework assignments (60%)

Three assignments involving three big data systems: Redshift, Spark, and Snowflake. There will also be some mini-assignments using stream and/or graph databases.

Short final project (25%)

Each week, after lecture, we will have a 50-minute section with hands-on demonstrations and tutorials of various big data systems and cloud services. Each section will be connected to either a full assignment or a mini-assignment.

Your GitLab repository:

https://gitlab.cs.washington.edu/csed516-2020au/csed516-YourUwIdHere