From: Aaron Chang (anc327@yahoo.com)
Date: Wed May 05 2004 - 10:37:41 PDT
cse544
rev6
The central point of this paper is to show that it is feasible to
evaluate XPath expressions over XML streams at pretty decent read rates
using a DFA. The question then becomes whether a lazy or an eager
approach is better. The authors conclude that a lazy approach might be
better because of the number of DFA states that need to be processed.
A minimized procedure saves time without sacrificing accuracy. Overall,
a paper of practical significance for optimizing XML querying in
real-time data processing.
The general approach in using DFAs is to combine the XPath expressions
into one DFA. This can be done at runtime while processing SAX events as
they are read in from the XML document. A DFA can be exponential in
size, so some kind of optimization must be done. The key difference
between eager and lazy DFAs is that the size of the latter depends on
depth and not on the number of XPath expressions, in which the eager DFA
can be exponential. I liked the usage of examples to hammer home this
point.
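
To make the lazy idea concrete, here is a rough sketch of my own (not
the paper's implementation). It assumes a tiny XPath fragment with only
the child (/) and descendant (//) axes, name tests and *, and no
predicates. The names parse_xpath and LazyDFA are made up for
illustration. Each DFA state is a set of NFA states, and a transition is
only computed the first time a SAX event actually needs it.

```python
import xml.sax


def parse_xpath(expr):
    """Split an absolute path like '/a//b' into [('/', 'a'), ('//', 'b')]."""
    steps, i = [], 0
    while i < len(expr):
        axis = '//' if expr.startswith('//', i) else '/'
        i += len(axis)
        j = expr.find('/', i)
        j = len(expr) if j == -1 else j
        steps.append((axis, expr[i:j]))
        i = j
    return steps


class LazyDFA(xml.sax.ContentHandler):
    """One DFA for all queries; states are built only when first visited."""

    def __init__(self, xpaths):
        super().__init__()
        self.xpaths = xpaths
        self.queries = [parse_xpath(x) for x in xpaths]
        # NFA state (q, k): query q has matched its first k steps.
        start = frozenset((q, 0) for q in range(len(self.queries)))
        self.table = {}        # (dfa_state, tag) -> dfa_state, filled on demand
        self.stack = [start]   # one DFA state per open element
        self.path = []         # names of the currently open elements
        self.matches = []      # (xpath, element path) pairs

    def _step(self, state, tag):
        key = (state, tag)
        if key not in self.table:      # lazy: expand only on first use
            nxt = set()
            for q, k in state:
                steps = self.queries[q]
                if k == len(steps):    # already accepting, nothing to extend
                    continue
                axis, name = steps[k]
                if name in (tag, '*'):
                    nxt.add((q, k + 1))
                if axis == '//':       # descendant axis may skip this element
                    nxt.add((q, k))
            self.table[key] = frozenset(nxt)
        return self.table[key]

    def startElement(self, name, attrs):
        state = self._step(self.stack[-1], name)
        self.stack.append(state)
        self.path.append(name)
        for q, k in sorted(state):
            if k == len(self.queries[q]):
                self.matches.append((self.xpaths[q], '/' + '/'.join(self.path)))

    def endElement(self, name):
        self.stack.pop()               # return to the parent's DFA state
        self.path.pop()


doc = b"<a><b><c/></b><x><b/></x></a>"
dfa = LazyDFA(["/a/b", "//b", "/a//c"])
xml.sax.parseString(doc, dfa)
print(dfa.matches)
print(len(dfa.table), "transitions materialized")
```

The table only ever holds transitions for element paths that really
occur in the input, which is the sense in which the lazy DFA stays small
even when many expressions are loaded.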
Given the modest hardware used in testing this algorithm, 5.4 Mb/s is
pretty good. However, I suspect that at some point even this data
processing rate might be too slow, which makes me think XPath
expressions might need some kind of further simplification.
This archive was generated by hypermail 2.1.6 : Wed May 05 2004 - 10:37:47 PDT