Review #7

From: Atri Rudra (atri@cs.washington.edu)
Date: Wed May 19 2004 - 02:25:46 PDT

  • Next message: Aaron Chang: "big paper review"

    Query Evaluation Techniques for Large Databases
    ------------------------------------------------------------

    G. Graefe

    This (humongous) survey looks at various techniques for query execution
    for large databases with emphasis on complex queries. We looked at the
    architecture of query execution engines; Sorting and Hashing; Disk access;
    Aggregation algorithms based on nested loops, sorting and hashing; Binary
    Matching algorithms based on nested loops, sorting and hashing; Scheduling
    and resource allocation in complex queries; Nested Relations and Bit
    Vector Filtering.

    Here are my questions/comments:

    (i) The one things that stood out for me in this paper is the omnipresence
    of nested loops, sorting and hashing. Since this survey was written has
    any new way of doing the algorithms evolved ? If not (which is my guess),
    any particular reason for the non-existence of an alternate paradigm?

    (ii) The interplay between logical and physical algebra is basically the
    association of data structures to algorithms: which begs a question which
    is perhaps related to the last one-- have any other "sophisticated" data
    structures made their way into query execution ?

    (iii) I found the references to iterators in the Sorting section
    interesting as I have
    used iterators in my project (though iterators were used there to go
    through all possible database instances).

    (iv) The part about using statistical information to create better hash
    buckets (specially in the recursive partitioning case) seemed to be "ripe"
    ground for using online learning techniques-- has there been any work in
    this direction ?

    (v) The author mentions aggregations where inputs items are not divided
    into equivalence classes: do people use such queries in real life and/or
    is there a motivating example for such unusual aggregates ?

    (vi) The author says things like what would
    happen if object oriented databases completely take over relational
    databases which is interesting in the view of the Decade of turmoil paper
    we read earlier in the quarter.

    (vii) The author mentions some work using classical economic concepts in
    Sec 8. Two questions: (i) was the work done in the online or offline case
    and (ii) [This is way off on the tangent] Has there been work using
    game theory to solve some database problem ?

    (viii) The concept of having stop points in some query processing
    algorithms seems similar to the OS concepts of timesharing and context
    switching.

    (ix) Nested Relations seem similar to semi structured data like XML: is
    this connection valid ? Also in queries involving NF^2, can the select
    have conditions which involve sets ?

    (x) Overall a pretty comprehensive survey though I am glad we did not have
    to go through all of it :-)


  • Next message: Aaron Chang: "big paper review"

    This archive was generated by hypermail 2.1.6 : Wed May 19 2004 - 02:25:47 PDT