CSE 344, Winter 2018

Intro to Data Management

Course Information and Policies

Instructor: Evan McCarty, CSE 214
Lecture: Monday, Wednesday, Friday, 3:30-4:20, SIEG 134

Contact Information and Office Hours

Course Email List: You should be subscribed to this list automatically and will receive important email through it. Most announcements, however, will be made through the course Piazza page.

Course Staff and Office Hours:
  Instructor: Evan McCarty: ejmcc@cs.washington.edu   Office hours: Mondays and Fridays 4:30-6:00 or by appointment, CSE 214

TAs:
  Joshua Bean: jbean96@cs   Office hours: Tuesday 1:00-2:00, CSE 023
  Allison Chou: aachou@cs   Office hours: Monday 1:00-2:00, CSE 025
  Colin Evans: colin21@cs   Office hours: Wednesday 2:00-3:00, 5th Floor Breakout
  Jayanth Garlapati: jayanth@cs   Office hours: Monday 11:00-12:00, CSE 220
  Jonathan Leang: jleang@cs   Office hours: Tuesday 4:00-5:00, CSE 023
  Cindy Suripto: cindysuu@cs   Office hours: Tuesday 2:00-3:00, CSE 023
  James Wang: jamesw96@cs   Office hours: Tuesday 10:00-11:00, CSE 220

Lecture Materials

Slides from each lecture will be posted here by the end of that day. Topics for the following lecture will be posted along with that day's slides, together with the relevant chapters from the Database Systems textbook by Garcia-Molina, Ullman, and Widom (GUW). Readings that are required before lecture will be indicated in bold.
  1. January 3rd: Course Introduction and Motivation   [ pptx | pdf ]
         No reading from GUW
  2. January 5th: The Data Model and introduction to relational databases   [ pptx | pdf ]
         GUW 2.1-2.2
  3. January 8th: SQLite Demo   [ pptx | pdf | code ]
  4. January 10th: Joins   [ pptx | pdf ]
         GUW 6.1-6.2
  5. January 12th: Joins and Group By   [ pptx | pdf ]
         GUW 6.3-6.4
  6. January 17th: Subqueries   [ pptx | pdf ]
         GUW 6.3
  7. January 19th: Subqueries II   [ pptx | pdf ]
         GUW 6.3
  8. January 22nd: Relational Algebra   [ pptx | pdf ]
         GUW 2.4
  9. January 24th: Relational Algebra II   [ pptx | pdf ]
         GUW 2.4
  10. January 26th: Datalog   [ pptx | pdf ]
         GUW 5.3
  11. January 29th: Datalog II   [ pptx | pdf ]
         GUW 5.4
  12. January 31st: Intro to Semi-structured data   [ pptx | pdf ]
         GUW 11.1
         Comparing relational to semi-structured and distributed databases
  13. February 2nd: Data Management with Semi-structured data   [ pptx | pdf ]
         GUW 11.2-11.4; note that the textbook uses XML, not JSON
  14. February 5th: SQL++   [ pptx | pdf ]
         SQL++ Manual
  15. February 7th: Exam Prep   [ pptx | pdf ]
         Practice Midterm | Practice Solutions
  16. February 12th: Physical Plans   [ pptx | pdf ]
         GUW 15.1-15.2
  17. February 14th: Indexing   [ pptx | pdf ]
         GUW 14.1-14.3, 15.6
  18. February 16th: Disk Accesses   [ pptx | pdf ]
         GUW 15.2-15.3
  19. February 21st: Plan Cost Estimation   [ pptx | pdf ]
         GUW 15.2-15.3
  20. February 23rd: Intro to Parallel Databases   [ pdf ]
         GUW 20.1, 20.3
  21. February 26th: Map/Reduce   [ pptx | pdf ]
         GUW 20.2
  22. February 28th: Entity Relation   [ pptx | pdf ]
         GUW 4.1-4.3
  23. March 2nd: E/R Part 2 / Transaction Intro   [ pptx | pdf ]
         GUW 4.4, 4.5
  24. March 5th: Transactions   [ pptx | pdf ]
         GUW 18.1-18.3
  25. March 7th: Transactions   [ pptx | pdf ]
         GUW 18.3-18.5
  26. March 9th: Transactions   [ pptx | pdf ]
         GUW 18.3-18.5
  27. March 11th: Exam Review. 5pm EEB 045   [ Practice Final | Practice Final Solutions ]

Sections

Section material distributed to TAs will be made available here. Solutions to the problems posted here must be obtained in section from your TA.
Sections (All times on Thursdays):
AA: Joshua Bean - 8:30 ARC G070
AB: Colin Evans - 9:30 ARC G070
AC: Jonathan Leang - 10:30 JHN 175
AD: Jonathan Leang - 12:30 PAA A110

TA-led sections are held weekly on Thursdays. You are expected to attend the section you registered for. Sections are valuable for review, hands-on practice with the material, and hints on your homework. Please bring your laptop so that you can follow along with the examples presented in section.

  1. Section 1: Setting up Git and SQLite   Help with setup
  2. Section 2: Basic SQL   Slides   Worksheet   Solution
  3. Section 3: More SQL   Slides   Worksheet   Solution
  4. Section 4: SQL and RA   Slides   Worksheet   Solution
  5. Section 5: Datalog   Slides   Worksheet   Solution
  6. Section 7: SQL++   Slides   Worksheet   Solution   Worksheet 2   Solution 2
  7. Section 8: Cost Estimation and MapReduce   Slides   Worksheet   Solution
  8. Section 10: Design Theory   Slides   Notes
  9. Exam Review   Slides

Homework Assignments

Turn in your assignments through the Canvas course page. In general, homework will be posted on Wednesdays and due the following Wednesday, at 11:30 for coding assignments and at 11:00 for online quizzes.

Coding Assignments: 30% of your grade

Written Quizzes: 10% of your grade

Exams

The midterm will be held on Friday, February 9th, from 3:30 to 4:50 in the usual lecture room and is worth 25% of your grade.
The final will be held on Thursday, March 15th, from 2:30 to 4:20 in the usual lecture room and is worth 35% of your grade.

The textbook is Database Systems: The Complete Book, 2nd edition, by Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom.


Acknowledgments: Many of the materials posted here and used in the course have been shared and refined by instructors and TAs in previous offerings of CSE 344. This version of the course draws in particular on previous offerings by Profs. Cheung and Suciu.