From: Joe Xavier (joexav@microsoft.com)
Date: Tue May 25 2004 - 16:17:59 PDT
Surajit Chaudhuri - An overview of QO in Relational Systems
This is a survey paper covering a number of aspects of query
optimization.
The three main concepts identified to solve the problem of query
optimization are:
- a search space of possible plans
- techniques to extimate the cost of various plans
- an algorithm to search through the execution space
The interesting ideas in the section on the System-R optimizer are:
- the search space for select-projection-join queries corressponds to
the linear sequence of join operators
- the cost model relies on
1. statistics on indexes and relations
2. estimation of the selectivity of predicates
3. estimation of CPU and I/O costs of query execution for each
operator
- the use of dynamic programming and interesting order
dynamic programming - reduce the search space. select only left linear
plans. reduces to O(2**n-1) from O(n!)
interesting order - generalized to physical order - characteristic of a
plan that is not shared by all plans for the same logical expression but
can impace the cost of subsequent operations. Can handle violation of
the principle of optimality in the cost model.
Interesting ideas in the Search Space section:
- query graph model (used in the Startburst system) - nodes are
relations and edges are join predicates between relations
- bushy joins require materialization of intermediate results
- Join(R, S LOJ T) = Join(R,S) LOJ T => joins can be freely reordered
among themselves since the sequence of joins and outerjoins do not
freely commute
- specific cases when the GROUP BY can be pushed down without arbitrary
side-effects
- correlated query blocks (covered in more detail in the Accesss Paths
paper)
Interesting ideas in the Statistics and Cost Estimation section:
- the estimation framework used in System-R
- histograms on base data
- sampling for statistics
Enumeration Architectures:
- Starburst
Query Graph Model representation of the SQL query
- Volcano/Cascades
It would have been nice to have seen some more detail in the section on
Materialized Views
Overall it's a great paper to touch upon the areas of interest in the
topic but doesn't provide an in-depth treatment.
This archive was generated by hypermail 2.1.6 : Tue May 25 2004 - 16:18:02 PDT