(no subject)

From: Ankur Jain (ankur@cs.washington.edu)
Date: Fri May 07 2004 - 09:43:19 PDT

  • Next message: Stavan Parikh: "Query Evaluation Techniques for Large Databases"

    Processing XML Streams with Deterministic Automata - Green et al.

     

    The conventional way to evaluate a large number of XPath expressions on an
    XML stream was to construct NFA's. But inspite of heavy optimizations, the
    throughput achieved using these NFA's was orders of magnitude lower than
    what the authors had in mind. Converting the NFA's to DFA's leads to an
    exponential blowup of states and thus was a seemingly infeasible approach.
    The paper's most significant contribution comes here where they show that
    under some assumptions on the XML data, in lazily constructed DFA's, the
    number of states is manageable - and hence can used to efficiently solve the
    problem at hand. They follow up this insight with some solid analytical
    arguments and experimental results to make their approach convincing.

     

    If the intention is, as mentioned in the introduction, to make forwarding
    decisions based on XPath expressions, then this approach is understandable,
    otherwise if it is really just about collecting statistical information
    about the data stream, Monte-Carlo techniques would perhaps have made more
    sense. I was also wondering if such a large number of XPath queries on XML
    streams is indeed a reality even in the former scenario (perhaps this doubt
    is more to do with what exactly are XML routers, where are they deployed,
    what is their functionality).


  • Next message: Stavan Parikh: "Query Evaluation Techniques for Large Databases"

    This archive was generated by hypermail 2.1.6 : Fri May 07 2004 - 09:43:21 PDT