DFA

From: Michael Gubanov (mgubanov@cs.washington.edu)
Date: Wed May 05 2004 - 12:00:51 PDT

  • Next message: Ankur Jain: "(no subject)"

    Paper: Processing XML Streams with Deterministic Automata

    Summary: The paper proves the feasibility of using eager/lazy DFA
    construction in practice
    for XML stream processing. It bounds the number of states DFA could have
    in the worst case in respect to the number of XPath expressions to be
    evaluated
    on the XML stream.

    Main Ideas:
    - For eager DFA the number of "*" is the only source of exponential blowup
    in the case of a single, linear XPath expression. For multiple expressions
    more important is the number of expressions containing "//" and it is
    indeed the source of the exponential growth.

    - Whereas, query trees converted to DFAs and used for XML stream filtering
    do not support predicates directly, it could be worked around easily
    by decomposing them into several XPath expressions.

    - Eager DFA has no more than k(1+n) states in the case with no wild-cards
    and there is an exponential upper bound in the case with when
    there are wild-cards.

    - The extra exponential blowup in the case with eager DFA and multiple
    XPath expressions is exactly the number of expressions with "//".

    - In the case with lazy DFA it is surprising that the number of states
    does not depend on the number of expressions at all, but depends on their
    depth.

    - The number of states in lazy DFA is limited by 1+G where G is a data guide

    of an XML stream.

    Relevance:
    I think it is an interesting and useful work. As Web-Services spread more
    widely, therefore making XML streams essential for data exchange, fast XML
    stream
    processing will become one of the main issues.


  • Next message: Ankur Jain: "(no subject)"

    This archive was generated by hypermail 2.1.6 : Wed May 05 2004 - 12:00:57 PDT