Review

From: Ankur Jain (ankur@cs.washington.edu)
Date: Wed May 19 2004 - 14:29:24 PDT


Query Evaluation Techniques for Large Databases - Goetz Graefe

The paper presents a survey of the several practical techniques that are
used to evaluate queries in databases.

An interesting concept presented in the paper was the iterator model. It
provides a framework to solve arbitarily complex
query plans within a single process, without having the problems of
scheduling, synchronization and performance that
some of the other alternatives face. I was interested though in knowing
why user-level threads each solving seperate
subplans were not used as an alternative. They could have dealt with the
problem of needless context switches that were
the cause of concern with having multiple processes each solving a subplan.
I also liked the idea of having "split" iterators to deliver data to
multiple consumers, though I didnt come across
any physical operator where this was actually used.

The sections that followed essentially talked about the various
techniques that can be used to implement the various
operators. One issue that kept on cropping up in my mind as I read the
various algorithms was whether they are
still used today in the same form when memory has become orders of
magnitude larger and the disk speed not really
showing that kind of exponential growth. I would believe that by now
there would be new techniques that exploit this disparity.

It would also be interesting to see how these algorithms would perform
in multiprocessor settings - that I
am sure will open up an entirely new set of problems and interesting
solutions. I believe that
would be the theme of section 10 of the paper, which unfortunately I
didn't get the time to read.

I really liked the idea presented at the fag end of section 8 where they
talk how plans could be dynamically changed
during the execution of a plan depending upon the outcome of the
execution of subplans. This infact brings to the
next question that I had. I got the impression that this kind of dynamic
change of plans is not done. However,
choosing the best technique would most often involve looking not just at
the schema of the
relations but also some statistics or some properties about the table
itself such as number of tuples it has, etc.
The paper conspicuously missed how these properties are figured out and
how these decisions are taken,
that is if at all this is done.



This archive was generated by hypermail 2.1.6 : Wed May 19 2004 - 14:29:24 PDT