CSE599D: Advances in Query Processing and Query Optimization

Modern query engines face the dual challenge of increased data size and increased query complexity. SQL queries are increasingly generated by machines (GenAI, applications, user interfaces), can become very complex, and require novel approaches for optimization and for execution. This class covers some query optimization and execution techniques that have been introduced recently to address these challenges. Starting from the two state of the art optimization methods (the Cascades framework, and the Dynamic Programming framework), we plan to discuss extensions to handle large number of joins (hundreds), modern adaptations of Yannakakis' algorithm, factorized databases, approaches that forgo optimization (SkinnerDB and Robust plans), Worst-case Optimal Joins and its implementation in Umbra and in FreeJoin, the use of specialized operators (triangle query or matrix multiplication), and some applications of information theory to cardinality estimation and to query evaluation. The course will consists of lectures, paper readings, in-class discussions, and light homework assignments.

Pre-requisites: a prior course in data management (344 or 444 or 544)

Instructor: Dan Suciu

Lectures: Monday, Wednesday 10-11:20 Room CSE2 271

Calendar here. Please subscribe.

Lecture list: here (content of future lectures is tentative).

Canvas: here.

Ed discussion board: here

Homework list here. Please submit on canvas I strongly recommend using Latex to write your answers. I prepared a simple template here; you may use it, or use your own preferred Latex style.

Zoom link: on canvas.

Grading:

By default, the course is Credit/No-credit.

If you need a numerical grade, send me an email. In that case I will ask you to submit all the homework assignments Notice that, if you want to use the course for CSE++ PhD course requirements or any BSMS course requirements you need to take the course for a numerical grade. A CR/NC will not count towards the course requirements, instead it will function more like a 590/591.

Privacy policy and terms of use