CSE 344, Spring 2018
Intro to Data Management
Course Information and Policies
Instructor: Evan McCarty CSE 214
Lecture: Monday, Wednesday, Friday 9:30-10:20 MLR 301
Contact Information and Office Hours
Course Email List: You should be subscribed to this list automatically and will receive important email through it. Most announcements, however, will be posted on the course Piazza page.
Course Staff and Office Hours:
Instructor: Evan McCarty: ejmcc@cs.washington.edu Monday and Wednesday, 12:30-1:30 or by appointment; Room CSE 214
TAs:
Sravan Konda: sravan75@uw Tuesday 3:30-4:20, CSE 007
Ariel Lin: arielin@uw Friday 2:30-3:20, 2nd Floor Breakout and Monday 2:00-2:50, CSE 007
Matthew Liu: liux44@uw Wednesday 1:30-2:20, 4th Floor Breakout
Michelle Prawiro: mp19@uw Monday 11:30-12:20, CSE 007 and Friday 11:30-12:20, 2nd Floor Breakout
Jason Tan: jct96@uw Wednesday 4:30-5:20, CSE 220
Lecture Materials
Slides from each lecture will be posted here by the end of that day. Topics for the following lecture will be posted along with that day's slides, together with the relevant chapters from the Database Systems textbook by Garcia-Molina, Ullman and Widom (GUW). Readings that are required before lecture will be indicated in bold.
- 1. March 26th: Course Introduction and Motivation [ pptx | pdf ] No reading from GUW
- 2. March 28th: The Data Model and introduction to relational databases [ pptx | pdf ] GUW 2.1-2.2
- 3. March 30th: Joins [ pptx | pdf ] GUW 6.1-6.2
- 4. April 2nd: Grouping and Aggregation [ pptx | pdf ] GUW 6.4
- 5. April 4th: Subqueries [ pptx | pdf ] GUW 6.3
- 6. April 6th: Relational Algebra [ pptx | pdf ] GUW 2.4
- 7. April 9th: Datalog [ pptx | pdf ] GUW 5.3
- 8. April 11th: Datalog [ pptx | pdf ] GUW 5.4 Souffle Guide
- 9. April 13th: Intro to Semi-structured data [ pptx | pdf ] GUW 11.1 Comparing relational to semi-structured and distributed databases
- 10. April 16th: Data Management with Semi-structured data [ pptx | pdf ] GUW 11.2-11.4; Note that the textbook uses XML, not JSON
- 11. April 18th: SQL++ [ pptx | pdf ] SQL++ Manual
- 12. April 20th: Physical Plans [ pptx | pdf ] GUW 15.1-15.2
- 13. April 23rd: Indexing [ pptx | pdf ] GUW 14.1-14.3,15.6
- 14. April 25th: Disk Accesses [ pptx | pdf ] GUW 15.2-15.3
- 15. April 27th: Plan Cost Estimation [ pptx | pdf ] GUW 15.2-15.3
- 16. April 30th: Intro to Parallel Databases [ pptx | pdf ] GUW 13.3,20.1,20.3
- 17. May 2nd: Map/Reduce [ pptx | pdf ] GUW 20.2
- 18. May 4th: Map/Reduce II [ pptx | pdf ] GUW 20.2
- 19. May 7th: Exam Review [ pptx | pdf ] Practice Midterm. Solutions.
- 20. May 9th: Midterm Exam [ No Slides ]
- 21. May 11th: Entity Relations [ pptx | pdf ] GUW 4.1-4.3
- 22. May 14th: E/R constraints [ pptx | pdf ] GUW 4.3-4.6
- 23. May 16th: Normalization [ pptx | pdf ] GUW 3.1-3.3
- 24. May 18th: Lossless Decomposition and SQL Views [ pptx | pdf ] GUW 3.4-3.5
- 25. May 21st: Transactions [ pptx | pdf ] GUW 18.1-18.3
- 26. May 23rd: Scheduling [ pptx | pdf ] GUW 18.3-18.5
- 27. May 25th: Isolation [ pptx | pdf ] GUW 18.3-18.5
- 28. May 30th: Analysis and Ethics [Not on Final Exam] [ pptx | pdf ] Bad Data Science: Debt and Growth NYPD CompStat
- 29. June 1st: Review [ pptx | pdf ] Practice Final. Solutions.
Sections
Section material distributed to TAs will be made available here. Solutions to problems posted here must be obtained in section from the TA.
Sections (all times on Thursdays):
- AA: Matthew Liu - 8:30 MGH 238
- AB: Matthew Liu - 9:30 MGH 242
- AC: Sravan Konda - 12:30 MGH 228
- AD: Jason Tan - 1:30 DEN 212
TA-led sections will be held weekly on Thursdays. You should expect to attend your registered section each week. Sections are useful for review, hands-on practice with the material, and hints on your homework. Please bring your laptop to section so that you can follow along with the examples.
- Section 1: Setting up Git and SQLite Help with setup
- Section 2: Basic SQL Slides Worksheet Solution
- Section 3: Relational Algebra Slides Worksheet Solution
- Section 4: Datalog Slides Worksheet Solution
- Section 5: SQL++ Slides Worksheet Solution
- Section 6: Cost Estimation + Parallel Slides Worksheet Solution Cost Estimation Guide
- Section 7: Map/Reduce + Spark Slides Worksheet Solution Extra Questions Solution
- Section 8: Design Theory Slides Worksheet Solution
- Section 9: Transactions Slides Worksheet Solution
Homework Assignments
Turn in your assignments through the Canvas course page. In general, homework will be posted on Wednesdays and due the following Wednesday, at 11:30pm for coding assignments and 11:00pm for the online quizzes. Use git pull upstream master to get the new assignments.
Coding Assignments: 30% of your grade
- HW #1. Due Wednesday, April 4th, 2018, 11:30pm
- HW #2. Due Wednesday, April 11th, 2018, 11:30pm
- HW #3. Due Wednesday, April 18th, 2018, 11:30pm
- HW #4. Due Wednesday, April 25th, 2018, 11:30pm
- HW #5. Due Wednesday, May 2nd, 2018, 11:30pm
- HW #6. Due Wednesday, May 16th, 2018, 11:30pm
- HW #7. Due Wednesday, May 23rd, 2018, 11:30pm
- HW #8. Due Friday, June 1st, 2018, 11:30pm
Written Quizzes: 10% of your grade
- Quiz #1. Due Friday, April 6th, 2018, 11:00pm
- Quiz #2. Due Friday, April 13th, 2018, 11:00pm
- Quiz #3. Due Friday, April 13th, 2018, 11:00pm
- Quiz #4. Due Wednesday, April 18th, 2018, 11:00pm
- Quiz #5. Due Wednesday, April 25th, 2018, 11:00pm
- Quiz #6. Due Wednesday, May 23rd, 2018, 11:00pm
- Quiz #7. Due Wednesday, May 30th, 2018, 11:00pm
Exams
The midterm for this course will be Wednesday, May 9th, from 9:30-10:20 in MLR 301 and will be 25% of your grade. Here is the practice midterm. Here are solutions. Also, here is a collection of previous 344 exams.
The final for this course will be Wednesday, June 6th, from 8:30-10:20 in MLR 301 and will be 35% of your grade.
The textbook is Database Systems: The Complete Book by Hector Garcia-Molina, Jeffrey D. Ullman and Jennifer Widom, 2nd edition.
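The grade weights stated on this page (coding assignments 30%, written quizzes 10%, midterm 25%, final 35%) add up to 100%. As a rough illustration only, the sketch below combines them into one overall percentage. It assumes each component is already expressed as a percentage from 0 to 100 and that no curve or other adjustment applies; the page does not say how component scores are computed, and the names WEIGHTS and course_percentage are hypothetical.

```python
# Weights as listed on this page; they sum to 100%.
WEIGHTS = {
    "coding": 0.30,   # coding assignments
    "quizzes": 0.10,  # written quizzes
    "midterm": 0.25,
    "final": 0.35,
}


def course_percentage(scores):
    """Return the weighted overall percentage.

    `scores` maps each component name in WEIGHTS to a percentage (0-100).
    Assumption: a plain weighted sum, with no curve or dropped scores.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores, not real data.
example = {"coding": 90.0, "quizzes": 80.0, "midterm": 70.0, "final": 85.0}
print(round(course_percentage(example), 2))  # prints 82.25
```

With these made-up scores the contributions are 27 + 8 + 17.5 + 29.75 = 82.25.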
Acknowledgments: Many of the materials posted here and used in the course have been shared and refined by other instructors and TAs in previous offerings of CSE 344. This version of the course is based in particular on previous offerings by Profs. Cheung and Suciu.