review : Query Evaluation Techniques for Large Databases

From: Joe Xavier (joexav@microsoft.com)
Date: Wed May 19 2004 - 11:12:09 PDT

  • Next message: Steven Balensiefer: "The big paper"

    Review: Goetz Graefe - Query Evaluation Techniques for Large Databases

    This paper reviews different operational aspects of a DBMS's query
    execution engine. The most prominent aspects of the sections we reviewed
    are the treatment of sorting and hashing algorithms.

    Sorting - the algorithm described for sorting is essentially quicksort
    of datasets that fit in memory and merge for larger datasets.
    Sorting-based aggregation outperforms nested-loops and aggregating as
    early as possible is useful. Merge-join, the sorting based join
    algorithm is widely used since it's rough cost can be estimated and it
    performs well enough. Can also be used to implement division between
    tables.
    Hashing - hybrid hashing writes only necessary data to disk saving I/O
    cost. Aggregation based on hybrid hash performs better than
    sorting-based algorithm with early aggregation. Although there is an
    overhead in computing the hash value, the search space is made smaller
    and so it outperforms merge-join. Combined with bit map, the hash method
    is efficient for processing division operation.

    The paper also briefly covers Indexing. The paper compares various index
    structures according to their support for ordering and sorted scans and
    their support for point data and range data.

    It would've taken a few more days to do a more thorough review ...


  • Next message: Steven Balensiefer: "The big paper"

    This archive was generated by hypermail 2.1.6 : Wed May 19 2004 - 11:12:18 PDT