Review : Processing XML Streams with Deterministic Automata

From: Joe Xavier (joexav@microsoft.com)
Date: Wed May 05 2004 - 10:20:34 PDT

  • Next message: Stavan Parikh: "Processing XML Streams with Deterministic Automata"

    Review : Processing XML Streams with Deterministic Automata - Green et al.
     
    The paper mainly deals with showing how to use Deterministic Finite Automata for evaluating a large number of XPath expressions on an XML stream. The main idea is to convert all XPath expressions into a single DFA and evaluate it on the input stream.
     
    XML Routing is mentioned as an application for efficiently processing XPath exps against a large throughput of XML. The paper first talks about the event-based processing model using a SAX parser.
    The next section deals with processing DFAs. The approach is to convert a large collection of XPath expressions into a single DFA by first converting to a Nondeterministic FA and then converting the NFA to a DFA. The main issue here is the size of the DFA generated which is analyzed in section 4. Two techniques are discussed - eager and lazy generation of DFAs.
    Eager DFAs - two theorems for determining the upper bound on the size of the DFAs are discussed. The first is for a single XPath expression and the second is for a set of XPath expressions.
    Lazy DFAs are constructed at run-time. Initially there's only a single state and each time there's a missing state it is computed and the transition is updated. Two results are presented giving uppre bounds on the number of states on in the lazy DFA that are specific to XML data. The first exploits the schema and the other the data guide.
    The section on experimental results is impressive in that the throughput is constant. The lazy DFA performs better.
     

     


  • Next message: Stavan Parikh: "Processing XML Streams with Deterministic Automata"

    This archive was generated by hypermail 2.1.6 : Wed May 05 2004 - 10:21:36 PDT