RDBXML

From: Michael Gubanov (mgubanov@cs.washington.edu)
Date: Mon Apr 19 2004 - 11:17:03 PDT


Paper: Relational Databases for Querying XML Documents: Limitations and Opportunities, J. Shanmugasundaram, K. Tufte, G. He, C. Zhang, D. DeWitt, J. Naughton

Summary: The paper describes and evaluates new algorithms that automatically map XML DTDs to relational schemas. It then leverages the existing power of relational engines by translating queries over the original XML data into SQL queries against the generated relational schema and exporting the results back to XML.

Main Ideas:

The main challenge in the "conservative" approach the authors selected (round-tripping through an RDB) is building an effective mapping from an XML DTD to an RDB schema. The quality of this mapping strongly affects query performance and, ultimately, the ability to leverage existing RDB features (indexes, query optimizer, etc.).

- A technique for simplifying the DTD before conversion was proposed.
- The notion of a DTD graph was proposed and used as the basis for the three DTD translation algorithms.
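
To make the DTD graph idea concrete, here is a minimal sketch in Python. It is not the paper's data structure: the sample DTD, the dictionary layout, and the function name build_dtd_graph are all hypothetical, and occurrence operators are kept as edge labels rather than as separate graph nodes.

```python
from collections import defaultdict

# Hypothetical simplified DTD: element -> list of (child, occurrence),
# where occurrence is "" (exactly one), "?" (optional) or "*" (repeated).
DTD = {
    "book": [("title", ""), ("author", "*")],
    "article": [("title", ""), ("author", "*")],
    "author": [("name", "")],
    "title": [],
    "name": [],
}

def build_dtd_graph(dtd):
    """Return (edges, in_degree) for a DTD given as a child-list mapping."""
    edges = defaultdict(list)
    in_degree = {element: 0 for element in dtd}
    for parent, children in dtd.items():
        for child, occurrence in children:
            edges[parent].append((child, occurrence))
            # Count how many distinct parent edges point at each element.
            in_degree[child] = in_degree.get(child, 0) + 1
    return dict(edges), in_degree

edges, in_degree = build_dtd_graph(DTD)
```

The in-degree of each element is the piece of information the translation algorithms can use to decide whether an element is shared between several parents.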

Three new algorithms operating on the DTD graph were proposed and evaluated:

- Basic inlining. Drawback: creates too many target relations, so translating an XML query requires generating a large number of SQL queries.
- Shared inlining. Creates far fewer relations than basic inlining. Drawback: the translated SQL queries contain too many joins.
- Hybrid inlining. Shared inlining plus inlining of additional elements, which alleviates the drawbacks of both shared and basic inlining.
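
The difference between the three algorithms can be sketched as a choice of which elements get their own relation. The rules below are my simplified reading, not the paper's exact conditions: recursion in the DTD is ignored entirely, and the sample graph and the names in_degrees, below_star and relations are hypothetical.

```python
# Hypothetical DTD graph: parent -> list of (child, occurrence).
EDGES = {
    "book": [("title", ""), ("author", "*")],
    "article": [("title", ""), ("author", "*")],
    "author": [("name", "")],
    "title": [],
    "name": [],
}

def in_degrees(edges):
    """Number of parent edges pointing at each element."""
    degree = {element: 0 for element in edges}
    for children in edges.values():
        for child, _ in children:
            degree[child] = degree.get(child, 0) + 1
    return degree

def below_star(edges):
    """Elements that can repeat under a parent (reached through '*')."""
    return {child for children in edges.values()
            for child, occurrence in children if occurrence == "*"}

def relations(edges, strategy):
    """Elements that receive their own relation (recursion is ignored)."""
    if strategy == "basic":
        # Assumed rule: every element gets a relation.
        return set(edges)
    degree = in_degrees(edges)
    starred = below_star(edges)
    chosen = set()
    for element in edges:
        if degree[element] == 0 or element in starred:
            # Roots and repeated elements cannot be inlined.
            chosen.add(element)
        elif degree[element] > 1 and strategy == "shared":
            # Assumed rule: shared keeps multi-parent elements separate,
            # hybrid inlines them into each parent instead.
            chosen.add(element)
    return chosen
```

On this sample graph, basic yields five relations, shared yields four (title stays separate because it has two parents), and hybrid yields three (title is inlined into both book and article).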

- Techniques for converting semi-structured queries to SQL, covering simple path queries, simple recursive path queries, and arbitrary path expressions.
- Conversion of SQL query output to XML.
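
The last two points can be illustrated with a small round trip: a simple path query such as book/author/name becomes a join over the generated relations, and the rows are then tagged back into XML. This is only a sketch using Python's sqlite3 and xml.etree.ElementTree modules; the schema, the column names and the function rows_to_xml are hypothetical and are not taken from the paper.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: title is inlined into book, author is a
# separate relation because a book can have many authors.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (book_id INTEGER PRIMARY KEY, book_title TEXT);
    CREATE TABLE author (author_id INTEGER PRIMARY KEY,
                         parent_id INTEGER REFERENCES book(book_id),
                         author_name TEXT);
    INSERT INTO book VALUES (1, 'Databases');
    INSERT INTO author VALUES (1, 1, 'Ann');
    INSERT INTO author VALUES (2, 1, 'Bob');
""")

# Hand-written translation of the path book/author/name:
# each step that crosses a relation boundary becomes a join.
SQL = """
    SELECT b.book_title, a.author_name
    FROM book AS b JOIN author AS a ON a.parent_id = b.book_id
    ORDER BY b.book_id, a.author_id
"""
rows = conn.execute(SQL).fetchall()

def rows_to_xml(rows):
    """Wrap flat (title, name) rows in XML elements."""
    root = ET.Element("result")
    for title, name in rows:
        item = ET.SubElement(root, "author", {"book": title})
        item.text = name
    return ET.tostring(root, encoding="unicode")

xml_text = rows_to_xml(rows)
```

Here the inlined title costs no join at all, while the step from book to author costs one, which is the trade-off the inlining algorithms are trying to balance.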

Flaws: It would probably be worth taking a specific Internet/Intranet XML application and letting the colors of the Hybrid algorithm shine even brighter on that concrete example.

Relevance: The problem is definitely pressing given today's number of XML-enabled Internet/Intranet applications and the rapidly growing number of Web Services.


This archive was generated by hypermail 2.1.6 : Mon Apr 19 2004 - 11:17:02 PDT