rev 6

From: Aaron Chang (anc327@yahoo.com)
Date: Wed May 05 2004 - 10:37:41 PDT

  • Next message: Alexander Moshchuk: "Processing XML Streams with Deterministic Automata"

    cse544
    rev6

    the central point of this paper is to show that it is feasible to
    evaluate XPath expressions over XML streams at a pretty decent
    throughput using a DFA. the question then becomes whether it is
    better to build the DFA eagerly or lazily. the authors conclude that
    a lazy approach might be better because of the number of DFA states
    that would otherwise need to be built. constructing only the states
    that are actually reached saves time without sacrificing accuracy.
    overall, a paper of practical significance for optimizing XML
    querying in real-time data processing.

    the general approach is to combine all of the XPath expressions
    into a single DFA. the DFA can be built at runtime, while the SAX
    events are being read in from the XML document. a DFA can be
    exponential in size, so some kind of optimization is needed. the key
    difference between the eager and lazy DFAs is that the size of the
    latter depends on the depth of the document and not on the number of
    XPath expressions, in which the eager DFA can be exponential. i liked
    the use of examples to hammer home this point.
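
    to make the lazy idea concrete, here is a minimal sketch in Python.
    it is not the authors' implementation. the names (parse_path,
    LazyDfa, run) are made up for illustration, and it assumes a tiny
    subset of XPath: absolute linear paths using only /, // and name
    tests or *, with no predicates. a DFA state is a set of
    (query, position) pairs, and a transition is computed only the first
    time the stream actually needs it.

```python
from typing import Dict, FrozenSet, List, Tuple

Step = Tuple[bool, str]      # (is_descendant_axis, name test; "*" matches any name)
NfaState = Tuple[int, int]   # (query index, number of steps already matched)
DfaState = FrozenSet[NfaState]


def parse_path(path: str) -> List[Step]:
    """Parse an absolute linear path such as '/a//b/*' into steps (assumed syntax)."""
    steps: List[Step] = []
    i = 0
    while i < len(path):
        descendant = path.startswith("//", i)
        i += 2 if descendant else 1
        j = i
        while j < len(path) and path[j] != "/":
            j += 1
        steps.append((descendant, path[i:j]))
        i = j
    return steps


class LazyDfa:
    """Hypothetical lazy DFA: transitions are built on demand and memoized."""

    def __init__(self, paths: List[str]) -> None:
        self.queries = [parse_path(p) for p in paths]
        self.start: DfaState = frozenset((q, 0) for q in range(len(self.queries)))
        self.table: Dict[Tuple[DfaState, str], DfaState] = {}

    def step(self, state: DfaState, label: str) -> DfaState:
        key = (state, label)
        if key not in self.table:
            nxt = set()
            for q, k in state:
                steps = self.queries[q]
                if k == len(steps):
                    continue              # already fully matched, nothing to extend
                descendant, name = steps[k]
                if descendant:
                    nxt.add((q, k))       # '//' lets this element be skipped
                if name == "*" or name == label:
                    nxt.add((q, k + 1))   # this element satisfies the next step
            self.table[key] = frozenset(nxt)
        return self.table[key]

    def matches(self, state: DfaState) -> List[int]:
        return sorted(q for q, k in state if k == len(self.queries[q]))


def run(dfa: LazyDfa, events: List[Tuple[str, str]]):
    """Feed SAX-like ('start' | 'end', name) events; return (path, matched queries)."""
    results = []
    stack: List[DfaState] = []   # one saved state per open element
    state = dfa.start
    path: List[str] = []
    for kind, name in events:
        if kind == "start":
            stack.append(state)
            state = dfa.step(state, name)
            path.append(name)
            hit = dfa.matches(state)
            if hit:
                results.append(("/" + "/".join(path), hit))
        else:
            state = stack.pop()
            path.pop()
    return results


# events for <a><b><c/></b><b/></a>
events = [("start", "a"), ("start", "b"), ("start", "c"), ("end", "c"),
          ("end", "b"), ("start", "b"), ("end", "b"), ("end", "a")]
dfa = LazyDfa(["/a/b", "//b", "/a//c"])
print(run(dfa, events))
print(len(dfa.table), "transitions materialized")
```

    in this sketch the second <b> reuses the transition cached for the
    first one, so the table only grows when the stream shows a new
    (state, element name) pair. an eager construction would instead
    enumerate every reachable state up front, whether or not the
    document ever visits it.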

    given the modest hardware used in testing this algorithm, 5.4 Mb/s
    is pretty good. however, i suspect that at some point even this
    processing rate will be too slow, which makes me think that XPath
    expressions might need some kind of further simplification.

            
                    



    This archive was generated by hypermail 2.1.6 : Wed May 05 2004 - 10:37:47 PDT