An Overview of Querry Optimization in Relational Systems

From: Lucas Kreger-Stickles (lucasks@cs.washington.edu)
Date: Mon May 24 2004 - 12:08:55 PDT

  • Next message: Bhushan Mandhani: "Paper 9 Review"

    In this paper the authors intend to provide an overview of the
    techniques employed in optimizing querries for relational databases.
    They begin by indicating querries must be transformed into phsical
    plans, but that many such plans exist with widely differeng costs.
    Therefore, in order to constuct optimal plans one must be able to know
    and enumerate the space of plans which properly execute the given querry
    and have a way to estimate the cost of various concrete plans.

         The author then explors the contributions of system R, a relational
    database that was one of the first to employ many of the still prevelent
    techniques used today in querry optimization. This system used
    statistics regarding realations and indexes, formulas to estimate the
    output of various operations, and formulas to estimate CPU costs
    associated with every operator to estimate the relative costs of various
    querries. One of the nice things about System R is its use of dynamic
    programming techniques such that subplans can be used to create optimal
    plans of greater size (and furthermore can be re-used). The author
    indicates that despitre many of the groundbreaking features of System R,
    its focus on join ordering limited its feasibility in other ares of
    importance to querry optimization.

    Next, the author discusses the search space of various querries. One of
    the interesting comments in this section is that different plans for
    querry exectuion are not necessariuly better, and thus search must be
    constrained in such a way as to not reduce the benifit of optimization.

    The author then goes on to discuss techniques for reducing multi-block
    querreis to querries over a single block and some of the benefits/limits
    of these approaches.

    Next, the author discusses means by which statistics and models can be
    used for cost estimation of plans. One observation here is that cost
    data must be efficent since it is used quite frequently. Another
    interesting point to note is that the statisical summarty of a plan is a
    result of the tables and fields in that plan and is therefore a logical
    property (being the same for all plans which execute that querry),
    however, costs of a plan are physical operations which relate to the
    actuall execution plan.

    Overall, I found the paper to be quite informative, and it ddid a good
    job of explaining and exposing aspects of System R which I did not fully
    understand when I first read the paper. In addition, it brought up many
    good points relating to the potential benifits that can be gained from
    techniques as opposed to the cost of actually employing that technique.

    -Lucas Kreger-Stickles


  • Next message: Bhushan Mandhani: "Paper 9 Review"

    This archive was generated by hypermail 2.1.6 : Mon May 24 2004 - 12:09:00 PDT