RDBXML

From: Michael Gubanov (mgubanov@cs.washington.edu)
Date: Mon Apr 19 2004 - 11:17:03 PDT


    Paper: Relational Databases for Querying XML Documents: Limitations and
    Opportunities,
    J. Shanmugasundaram, K. Tufte, G. He, C. Zhang, D. DeWitt, J. Naughton

     

    Summary: The paper describes and evaluates new algorithms that
    automatically map XML DTDs to relational schemas. It then leverages the
    existing power of a relational engine by translating queries over the
    original XML data into SQL queries against the generated relational
    schema and converting the results back to XML.

     

    Main Ideas:

     

    The main challenge in the "conservative" approach the authors selected
    (round-tripping through an RDB) is building an effective mapping from
    an XML DTD to an RDB schema. The quality of this mapping strongly
    affects query performance and, ultimately, the ability to leverage
    existing RDB machinery (indexes, the query optimizer, etc.).

     

     

    - A technique for simplifying the DTD before conversion was proposed

    - The notion of a DTD graph was proposed and used as the basis for
      three DTD translation algorithms
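
    As a rough illustration of the two ideas above, the sketch below
    simplifies child lists and builds a DTD graph with in-degrees. The toy
    DTD, the function names, and the two simplification rules shown ("+"
    relaxed to "*", repeated children merged into "*") are assumptions made
    for this example and are not necessarily the paper's full rule set.

```python
from collections import defaultdict

def simplify_children(children):
    """Reduce a child list to one (name, occurrence) entry per child."""
    merged = {}
    for child, occurrence in children:
        if occurrence == "+":
            occurrence = "*"          # "one or more" is relaxed to "zero or more"
        # A child that is listed more than once is treated as repeatable.
        merged[child] = "*" if child in merged else occurrence
    return list(merged.items())

def build_dtd_graph(dtd):
    """Return (edges, in_degree) for a dict of element -> child list."""
    edges = defaultdict(list)
    in_degree = {element: 0 for element in dtd}
    for parent, children in dtd.items():
        for child, occurrence in simplify_children(children):
            edges[parent].append((child, occurrence))
            in_degree[child] += 1
    return dict(edges), in_degree

# Hypothetical toy DTD: occurrence is "1", "?", "*" or "+".
TOY_DTD = {
    "book": [("title", "1"), ("author", "1")],
    "article": [("title", "1"), ("author", "+")],
    "author": [("name", "1")],
    "name": [("firstname", "?"), ("lastname", "1")],
    "title": [], "firstname": [], "lastname": [],
}

edges, in_degree = build_dtd_graph(TOY_DTD)
```

    In this toy graph, title and author have in-degree two. Shared elements
    like these are the ones a mapping algorithm has to decide whether to
    inline or to store in their own relation.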

     

    Three new algorithms operating on the DTD graph were proposed and
    evaluated:

     - Basic inlining. Drawback: creates too many target relations, so
       translating an XML query requires generating a large number of SQL
       queries

     - Shared inlining. Drawback: creates far fewer relations than Basic,
       but this results in too many joins in the translated SQL query

     - Hybrid inlining: Shared plus inlining of additional elements, thus
       alleviating the drawbacks of Shared and Basic
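
    To make the difference between the three variants concrete, here is a
    minimal sketch of which elements would get their own relation under one
    reading of the rules: Basic gives every element a relation, Shared
    gives one to roots, to children reached through "*", and to elements
    with more than one parent, and Hybrid additionally inlines multi-parent
    elements that are not reached through "*". The toy DTD and function
    names are hypothetical, and recursion handling is left out entirely.

```python
# Hypothetical, already-simplified DTD: element -> [(child, occurrence)].
TOY_DTD = {
    "book": [("title", "1"), ("author", "1")],
    "article": [("title", "1"), ("author", "*")],
    "author": [("name", "1")],
    "name": [("firstname", "?"), ("lastname", "1")],
    "title": [], "firstname": [], "lastname": [],
}

def graph_stats(dtd):
    """Return (in_degree, starred): parent counts and children below a '*' edge."""
    in_degree = {element: 0 for element in dtd}
    starred = set()
    for children in dtd.values():
        for child, occurrence in children:
            in_degree[child] += 1
            if occurrence == "*":
                starred.add(child)
    return in_degree, starred

def basic_relations(dtd):
    # Assumed rule: every element gets a relation of its own.
    return set(dtd)

def shared_relations(dtd):
    # Assumed rule: roots, set-valued children, and multi-parent elements.
    in_degree, starred = graph_stats(dtd)
    return {e for e in dtd
            if in_degree[e] == 0 or in_degree[e] > 1 or e in starred}

def hybrid_relations(dtd):
    # Assumed rule: like Shared, but multi-parent elements that are not
    # below a '*' edge are inlined into each parent instead.
    in_degree, starred = graph_stats(dtd)
    return {e for e in dtd if in_degree[e] == 0 or e in starred}

for name, pick in [("basic", basic_relations),
                   ("shared", shared_relations),
                   ("hybrid", hybrid_relations)]:
    print(name, sorted(pick(TOY_DTD)))
```

    On this toy input the relation count drops from seven (Basic) to four
    (Shared) to three (Hybrid), because Hybrid folds the shared title
    element into both book and article.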

     

    - Semi-structured to SQL query conversion techniques for simple path
      queries, simple recursive path queries and arbitrary path expressions

     

    - Conversion of SQL query output to XML
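
    The last two items can be sketched end to end for the simplest case, a
    non-recursive path query. The example below assumes a hypothetical
    schema in which article and author are separate relations linked by a
    parent_id column and the name subelements are inlined as author
    columns. The SCHEMA mapping, table layout, and translate_path helper
    are illustrative inventions, not the paper's actual translation
    procedure.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping: relation-backed elements and the columns that hold
# their inlined descendants (keyed by the remaining path).
SCHEMA = {
    "article": {"columns": {"title": "title"}},
    "author": {"columns": {"name/firstname": "name_firstname",
                           "name/lastname": "name_lastname"}},
}

def translate_path(path):
    """Translate a simple path such as 'article/author/name/lastname' to SQL."""
    steps = path.split("/")
    relations = []
    # Leading steps that are relations become joins; the rest is a column.
    while steps and steps[0] in SCHEMA:
        relations.append(steps.pop(0))
    column = SCHEMA[relations[-1]]["columns"]["/".join(steps)]
    sql = f"SELECT {relations[-1]}.{column} FROM {relations[0]}"
    for parent, child in zip(relations, relations[1:]):
        sql += f" JOIN {child} ON {child}.parent_id = {parent}.id"
    return sql

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE author (id INTEGER PRIMARY KEY, parent_id INTEGER,
                         name_firstname TEXT, name_lastname TEXT);
    INSERT INTO article VALUES (1, 'XML and RDBMS');
    INSERT INTO author VALUES (1, 1, 'Ann', 'Lee'), (2, 1, NULL, 'Kim');
""")

sql = translate_path("article/author/name/lastname")
rows = conn.execute(sql + " ORDER BY author.id").fetchall()

# Tag the flat result rows back into XML.
result = ET.Element("result")
for (lastname,) in rows:
    ET.SubElement(result, "lastname").text = lastname
xml_text = ET.tostring(result, encoding="unicode")
print(sql)
print(xml_text)
```

    Each relation boundary crossed by the path costs one join, which is why
    the choice of inlining strategy shows up directly in the shape of the
    generated SQL.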

     

    Flaws: It would probably be worth taking a specific Internet/Intranet
    XML application and letting the Hybrid algorithm shine even brighter
    on that concrete example.

     

    Relevance: The problem is definitely urgent given today's number of
    XML-enabled Internet/Intranet applications and the rapidly growing
    number of Web services.


