From: Li Yan (lanti@u.washington.edu)
Date: Mon May 24 2004 - 02:53:50 PDT
This paper is mainly about access path selection, one part of
query optimization. Because the computation spent on optimization
is insignificant compared with the total cost of query
execution, and a good plan can reduce that execution cost
substantially, it is worthwhile to perform optimization to get
a good query plan.
One thing in System R that looks a bit counterintuitive is
having multiple relations in a single segment, so that a segment
scan may read tuples of relations other than the requested
one. I am not sure what motivated this storage organization.
The so-called sargable predicate serves as a simple yet good
example of pushing down selection: tuples that fail the
predicate are rejected early and never processed further, in an
effort to save CPU. However, the paper says that most of System
R's CPU time is spent in the RSS. Is that based on empirical
measurement, or is there a justification, say, that memory
management takes up the majority of the computation?
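
To make the pushdown idea concrete, here is a minimal Python sketch, not
System R's actual RSS interface. It assumes a hypothetical layout in which a
segment is a list of pages and each record is tagged with its relation name,
and it simplifies the search arguments to a conjunction of
(column, comparison, constant) terms. The relation and column names and the
data are made up for illustration.

```python
import operator
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout: (relation name, column values).
Record = Tuple[str, Dict[str, Any]]
# One search-argument term: (column, comparison function, constant).
Sarg = Tuple[str, Any, Any]


def segment_scan(pages: Iterable[List[Record]],
                 relation: str,
                 sargs: List[Sarg]) -> Iterator[Dict[str, Any]]:
    """Touch every page once; return only matching tuples of `relation`."""
    for page in pages:
        for rel, row in page:
            if rel != relation:
                # The segment may also hold tuples of other relations.
                continue
            # Apply the predicates inside the scan, so rejected tuples
            # never reach the caller.
            if all(op(row[col], const) for col, op, const in sargs):
                yield row


# Made-up segment holding two relations on two pages.
pages = [
    [("EMP", {"NAME": "SMITH", "SAL": 8500}), ("DEPT", {"DNO": 50})],
    [("EMP", {"NAME": "JONES", "SAL": 12000}),
     ("EMP", {"NAME": "DOE", "SAL": 15000})],
]
matches = list(segment_scan(pages, "EMP", [("SAL", operator.gt, 10000)]))
```

Only the tuples that pass the filter cross the scan boundary, which is where
the CPU saving is supposed to come from.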
Statistics are key to cost estimation, but it is too
expensive to update the values after every modifying command
(INSERT/UPDATE/DELETE). What exactly does the dynamic
updating do to keep them current, then? Does it simply store a
copy of temporary values and periodically lock and update the
statistics?
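
As a reminder of how the statistics feed the estimates, here is a small
Python sketch of the paper's selectivity rule for an equality predicate
(column = value): F = 1/ICARD when an index on the column exists, and a
default of 1/10 otherwise. The function names and the sample numbers are my
own assumptions; only the two formulas come from the paper.

```python
from typing import Iterable, Optional


def eq_selectivity(icard: Optional[int]) -> float:
    """Selectivity factor F for `column = value`.

    icard is the number of distinct keys in an index on the column,
    or None when no such index exists.
    """
    if icard:
        return 1.0 / icard
    return 0.1  # default guess when no index statistics are available


def estimated_rows(ncard: int, factors: Iterable[float]) -> float:
    """Relation cardinality times the product of the selectivity factors."""
    rows = float(ncard)
    for f in factors:
        rows *= f
    return rows


# Made-up statistics: 10,000 tuples, 50 distinct keys in the index.
with_index = estimated_rows(10_000, [eq_selectivity(50)])
without_index = estimated_rows(10_000, [eq_selectivity(None)])
```

If the stored ICARD or NCARD drifts away from the real data, every estimate
built on it drifts as well, which is why the question of how the statistics
are refreshed matters.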
The selection criteria take interesting orderings into
account, which makes sense, since there is a cost
associated with sorting the resulting tuples if such an order is
required.
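
A rough Python sketch of that trade-off, under my own assumptions: the
optimizer keeps the cheapest plan for each distinct output order (plus the
cheapest unordered one) and charges a sort only to plans that do not already
deliver the required order. The Plan class, the plan names, and all cost
numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Plan:
    name: str
    cost: float
    order: Optional[str]  # column the output is sorted on, None if unordered


def prune(plans: List[Plan]) -> List[Plan]:
    """Keep only the cheapest plan for each distinct output order."""
    best: Dict[Optional[str], Plan] = {}
    for p in plans:
        current = best.get(p.order)
        if current is None or p.cost < current.cost:
            best[p.order] = p
    return list(best.values())


def choose(plans: List[Plan], required_order: Optional[str],
           sort_cost: float) -> Plan:
    """Pick the cheapest plan after adding a sort where one is needed."""
    def total(p: Plan) -> float:
        if required_order is None or p.order == required_order:
            return p.cost
        return p.cost + sort_cost
    return min(prune(plans), key=total)


plans = [
    Plan("segment scan", 100.0, None),
    Plan("index scan on DNO", 130.0, "DNO"),
    Plan("index scan on SAL", 150.0, "SAL"),
]
expensive_sort = choose(plans, "DNO", sort_cost=50.0)
cheap_sort = choose(plans, "DNO", sort_cost=20.0)
```

With the expensive sort the ordered index scan wins (130 against 100 + 50);
with the cheap sort the unordered segment scan followed by a sort wins
(100 + 20 against 130). That is why the more expensive ordered plan cannot
be discarded early.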
It is worth noting that hash join is missing from the
discussion of multi-relation joins, perhaps for historical
reasons.