Review #2

From: Atri Rudra (atri@cs.washington.edu)
Date: Sun Apr 18 2004 - 19:28:24 PDT

  • Next message: Steven Balensiefer: "Relational Databases for Querying XML Documents"

    Relational Databases for Querying XML Documents: Limitations and Opportunities
    ------------------------------------------------------------------------------

    This paper presents techniques to convert XML documents to relational
    tuples, translate semi-structured queries over XML documents to SQL
    queries, and convert the results back to XML. As the authors argue, this
    approach has the merit of using techniques developed over 20 years
    vis-a-vis the relatively new techniques of semi-structured query languages.

    The approach is divided into three broad steps:

    (i) Conversion of DTDs into relational schemas (and the corresponding
    conversion of the XML documents into relational tuples). First the regular expressions
    in the DTD are simplified. This set of transformations can
    result in the loss of information about the relative orders (which the
    authors point out can be retained by adding extra information). However, I
    am not sure how representing the "+" operator by the "*" operator does not
    introduce extra information. The authors then consider converting the
    simplified DTDs to relational schemas by inlining techniques: that is,
    "pack" as many descendants of an element into a single tuple as possible.
    This methodology, termed Basic, is presented with techniques to handle
    set-valued attributes and recursion. This approach, however, results in
    many tables being created. The idea in the Shared technique is to create one
    table for each element node that is shared and to reuse it. The authors also
    consider the Hybrid technique which is sort of a middle ground between
    the Basic and Shared techniques. Evaluations of the two techniques are
    also presented (the Basic technique ran out of memory in many sample DTDs
    and is omitted from the results).

    (ii) Converting semi-structured queries to SQL. The authors give conversion
    techniques for simple path queries, simple recursive path queries and
    arbitrary path expressions (the last are converted into a bunch of
    simple recursive path queries).

    (iii) Converting results of SQL queries to XML. The authors consider
    some cases and give techniques for conversion. However, the techniques
    presented for queries which return complex XML elements are not
    very satisfactory.
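
    To make the three steps concrete, here is a minimal sketch in Python
    with sqlite3. It is not the authors' algorithm: the sample document, the
    table and column names, and the helper functions are all made up for
    illustration, and the schema is written by hand rather than derived from
    a DTD. It only shows the flavour of inlining a single-valued child into
    its parent's tuple, giving a set-valued child its own table, turning a
    simple path query into a join, and wrapping flat rows back into XML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document; element names are illustrative, not from the paper.
DOC = """
<library>
  <book><title>Databases</title>
    <author><name>Ann</name></author>
    <author><name>Bob</name></author>
  </book>
  <book><title>XML</title>
    <author><name>Cy</name></author>
  </book>
</library>
"""

def shred(conn, xml_text):
    """Step (i): one tuple per book with its single-valued child (title)
    inlined; the set-valued child (author*) goes into its own table."""
    conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE author "
        "(id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    for book in ET.fromstring(xml_text).iter("book"):
        cur = conn.execute(
            "INSERT INTO book (title) VALUES (?)", (book.findtext("title"),)
        )
        for author in book.findall("author"):
            conn.execute(
                "INSERT INTO author (parent_id, name) VALUES (?, ?)",
                (cur.lastrowid, author.findtext("name")),
            )

def path_to_sql():
    """Step (ii): the simple path book/author/name, with a predicate on
    the book title, becomes a join between the two tables."""
    return (
        "SELECT a.name FROM book b JOIN author a ON a.parent_id = b.id "
        "WHERE b.title = ? ORDER BY a.id"
    )

def to_xml(rows):
    """Step (iii): wrap flat result rows back into XML elements."""
    root = ET.Element("result")
    for (name,) in rows:
        ET.SubElement(root, "name").text = name
    return ET.tostring(root, encoding="unicode")

def run(title):
    """Shred the sample document, run the translated query, rebuild XML."""
    conn = sqlite3.connect(":memory:")
    shred(conn, DOC)
    rows = conn.execute(path_to_sql(), (title,)).fetchall()
    return to_xml(rows)
```

    Calling run("Databases") returns the author names of that book wrapped
    in a result element. The flat case is easy, as here; rebuilding nested
    results from flat rows takes more work, which is where step (iii)
    becomes awkward.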

    Overall, the main bottleneck seems to be in Step (iii). The authors
    mention some techniques which, if incorporated in relational systems,
    would make XML queries easier to handle. It would be interesting to
    see how Chris addresses these issues in his conversion work.

    The paper as a whole was not as satisfying a read as the Essence of
    XML paper, which may in turn be because this work is more of
    a hack while the other paper was a crisp and sound theoretical result :-)



    This archive was generated by hypermail 2.1.6 : Sun Apr 18 2004 - 19:28:26 PDT