Processing XML Streams with Deterministic Automata

From: Stavan Parikh (stavan@cs.washington.edu)
Date: Wed May 05 2004 - 10:29:28 PDT


    Green et al. present a mechanism for processing XML streams using DFAs.

    The authors observe that XPath expressions can be converted to DFAs in a straightforward manner. Further, while eager conversion can lead to exponential growth in the number of states, lazy construction (building states only as the input requires them) keeps the growth manageable. They argue that on real data sets, even though the DFA can be exponential in the worst case, its size is bounded in practice and processing remains feasible. The most interesting result they present is that under lazy evaluation the size of the DFA is independent of the number of XPath expressions.
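    To make the lazy construction concrete, here is a minimal sketch in Python. It is my own illustration and not the authors' implementation: the names parse_xpath, LazyDFA and run are hypothetical, the input is assumed to be a list of ('start', tag) / ('end', tag) events, and only /, // and * steps are handled (no predicates or attributes). Each DFA state is a set of NFA states (query index, steps matched so far), and a transition is computed only the first time the stream asks for it.

```python
def parse_xpath(expr):
    """Split a simple path such as '/a//b/*' into (axis, name) steps.
    axis is 'child' for '/' and 'desc' for '//'."""
    steps, i = [], 0
    while i < len(expr):
        if expr.startswith('//', i):
            axis, i = 'desc', i + 2
        elif expr[i] == '/':
            axis, i = 'child', i + 1
        else:
            raise ValueError('expected / or // at position %d' % i)
        j = i
        while j < len(expr) and expr[j] != '/':
            j += 1
        steps.append((axis, expr[i:j]))
        i = j
    return steps


class LazyDFA:
    def __init__(self, queries):
        self.steps = [parse_xpath(q) for q in queries]
        # Start state: every query with zero steps matched.
        self.start = frozenset((q, 0) for q in range(len(self.steps)))
        self.table = {}  # (dfa_state, tag) -> dfa_state, filled on demand

    def step(self, state, tag):
        key = (state, tag)
        if key not in self.table:  # build this transition only when needed
            nxt = set()
            for q, k in state:
                steps = self.steps[q]
                if k == len(steps):
                    continue  # query already fully matched at the parent
                axis, name = steps[k]
                if axis == 'desc':
                    nxt.add((q, k))      # '//' may skip this element
                if name in ('*', tag):
                    nxt.add((q, k + 1))  # this element satisfies step k
            self.table[key] = frozenset(nxt)
        return self.table[key]

    def matches(self, state):
        return sorted(q for q, k in state if k == len(self.steps[q]))


def run(dfa, events):
    """Feed start/end events through the DFA; return (path, query index) hits."""
    stack, path, out = [dfa.start], [], []
    for kind, tag in events:
        if kind == 'start':
            stack.append(dfa.step(stack[-1], tag))
            path.append(tag)
            for q in dfa.matches(stack[-1]):
                out.append(('/' + '/'.join(path), q))
        else:  # 'end': return to the parent element's state
            stack.pop()
            path.pop()
    return out


dfa = LazyDFA(['/a/b', '//b', '/a/*/c'])
events = [('start', 'a'), ('start', 'b'), ('end', 'b'),
          ('start', 'd'), ('start', 'c'), ('end', 'c'),
          ('start', 'b'), ('end', 'b'), ('end', 'd'), ('end', 'a')]
print(run(dfa, events))
print(len(dfa.table), 'transitions materialized')
```

    In this sketch the transition table only ever holds entries for (state, tag) pairs that actually occur in the stream, which is the sense in which the construction is lazy; an eager construction would instead enumerate every reachable subset up front.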

    They evaluated their system on a single data set in comparison to XFilter. Since the main claim of the paper is that their system gives good throughput for distributing XML data to multiple destinations, the evaluation seems insufficient. They never explain why they compare to XFilter or why this comparison makes sense. Further evaluation on a variety of data sets would have been more convincing.

    Overall, I think that their work has merit. Their leveraging of a simple DFA scheme to parse XML data is a neat idea, and because DFAs are simple they would be easy to implement and build systems on. However, I would be more convinced that this is the right approach only after seeing more evaluation.

     


