From: Steven Balensiefer (alaska@cs.washington.edu)
Date: Mon May 24 2004 - 10:33:10 PDT
Access Path Selection in a Relational Database System
The System R optimizer attacks the problem of query optimization from a
very specific starting point. It only attempts to optimize linear series
of joins (i.e. left-deep trees). Though they don't explicitly say so, this
choice is used to restrict the number or possible orderings when
enumerating possible plans.
The evaluation of possible plans relied both on the cost of the plan, and
the presence of "interesting orderings". In determining the cost, they
introduced the idea of a selectivity factor, which roughly relates to the
fraction of the table that will satisfy a given predicate from the query.
It's imporant to note that the cost derived depended on the number of
tuples, this selectivity factor, and some representation of the cost for
doing the physical access.
The concept of interesting orderings was a more nebulous concept. In
essence, an interesting ordering is a plan where slightly more costly
subplans (i.e. merge-sort join on two tables) yield results which
significantly decrease the cost of a future operation. It felt like they
added this as a concession to the fact that this kind of optimization may
not be perfectly suited for a dynamic programming approach.
The inclusion of the interesting orderings prevents the exclusion of
subplans that aren't locally best, but contribute to the best
solution.
The only difficulty I had with this paper was interpretting the figures 5
and 6. I'm still confused the purported extension from one to the next,
because I fail to see how they map to each other. Also, the inclusion of
the root node confuses me. Fortunately, the description in the text was
more helpful.
This archive was generated by hypermail 2.1.6 : Mon May 24 2004 - 10:33:11 PDT