Review: Processing XML Streams with Deterministic Automata

From: Lucas Kreger-Stickles (lucasks@cs.washington.edu)
Date: Wed May 05 2004 - 11:41:52 PDT

  • Next message: Atri Rudra: "Review #6"
    Processing XML Streams with Deterministic Automata
    Review by Lucas Kreger-Stickles

        In their paper the authors present a system for evaluating XPath expressions through the use of DFAs.  The authors point out that while in the worst case DFAs for XPath expressions grow exponentially, this worst case is not often encountered.  They indicate that much of this exponential growth can be atributed to * and // operations in the XPath expressions.  The authors then go on to describe how the use of  a DFA which is constructed 'lazily', improves the practicality of such a technique.
          The authors then describe the mechanisms of their system.  First a set of querries are given and turned into a querry tree (which consisits of many XPath instructions), then this tree is converted into an NFA which is finally converted into a DFA.
        The authors indicate that processing XPath with DFAs is quite efficent but that it is the size that can cause problems.  It is in this context that the authors compare and contracst eager and lazy DFA construction.  While an eager DFA is constructed at the begining of compuation a lazy DFA is constructed on the fly as needed.  The hope is that the vast majority of states int eh DFA will never need to be evaluated, thus keepign the size of the DFA in check.
        Finaly, the authors conducted limited experiments to validate their technology and which showed a 'warm-up' phase (durring which the lazy DFA was adding states quite frequently) followed by an extended period of high troughput.
     

  • Next message: Atri Rudra: "Review #6"

    This archive was generated by hypermail 2.1.6 : Wed May 05 2004 - 11:41:57 PDT