From: Joe Xavier (joexav@microsoft.com)
Date: Mon May 03 2004 - 10:45:29 PDT
This paper does a good job of summing up the ideas behind Datalog-based
QO and compares "Information Manifold" and "Tsimmis", two information
integration systems.
The paper discusses information integration using views. The first
section provides the groundwork for these ideas with a discussion of
conjunctive queries (CQ), Datalog programs and their containment.
A CQ is a rule whose subgoals use extensional database predicates
(stored relations). The rest of the section deals with determining the
containment of CQs within each other. The different cases covered are:
1. regular conjunctive queries
2. CQs with negation
3. CQs with arithmetic comparisons
After some discussion of Datalog programs, the section spends some time
on containment of Datalog programs within CQs.
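To make the plain-CQ case concrete, here is a minimal sketch of the usual
containment-mapping test: Q1 is contained in Q2 if the variables of Q2 can
be mapped to terms of Q1 so that the head maps to the head and every
subgoal of Q2 lands on some subgoal of Q1. The representation (tuples for
atoms, uppercase-initial strings for variables), the function names and the
edge example are my own assumptions, not taken from the paper, and the
sketch ignores negation and arithmetic comparisons.

```python
from typing import Dict, List, Optional, Tuple

Atom = Tuple[str, Tuple[str, ...]]   # (predicate name, arguments)
Query = Tuple[Atom, List[Atom]]      # (head, body subgoals)

def is_var(term: str) -> bool:
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()

def extend(mapping: Dict[str, str], src: Tuple[str, ...],
           dst: Tuple[str, ...]) -> Optional[Dict[str, str]]:
    """Try to extend the mapping so that src maps onto dst position by position."""
    new = dict(mapping)
    for s, d in zip(src, dst):
        if is_var(s):
            if new.setdefault(s, d) != d:
                return None          # variable already mapped to something else
        elif s != d:
            return None              # constants must map to themselves
    return new

def contained_in(q1: Query, q2: Query) -> bool:
    """True if q1 is contained in q2 (a containment mapping q2 -> q1 exists)."""
    (h1, b1), (h2, b2) = q1, q2
    if len(h1[1]) != len(h2[1]):
        return False
    start = extend({}, h2[1], h1[1])  # the head of q2 must map to the head of q1
    if start is None:
        return False

    def search(i: int, mapping: Dict[str, str]) -> bool:
        if i == len(b2):
            return True               # every subgoal of q2 has a target in q1
        pred, args = b2[i]
        for p, a in b1:
            if p == pred and len(a) == len(args):
                nxt = extend(mapping, args, a)
                if nxt is not None and search(i + 1, nxt):
                    return True
        return False

    return search(0, start)

# Hypothetical example: nodes on a 2-cycle vs. nodes with an outgoing edge.
q1: Query = (("q", ("X",)), [("edge", ("X", "Y")), ("edge", ("Y", "X"))])
q2: Query = (("q", ("X",)), [("edge", ("X", "Z"))])
```

Here contained_in(q1, q2) holds (map Z to Y), while contained_in(q2, q1)
does not. The search is a brute-force backtracking one, which is fine for
queries this small.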
The next section talks about the relationship between query containment
algorithms and information integration, calling it 'synthesizing queries
from views'. Each view has a definition in terms of extensional database
(EDB) predicates, and the supposition is that these definitions are
conjunctive queries. If a query Q is expressed in EDB predicates, query
synthesis is the problem of finding all valid solutions S (expressions in
terms of the views) for Q. A solution S is valid if replacing the views
by their definitions gives an expansion query E that is equivalent to the
original query Q.
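The expansion step can be sketched as follows. This is only an
illustration under my own assumptions: the parent/gp predicates, the tuple
representation, the expand function and the "Fresh<n>" naming of renamed
variables are all hypothetical, not the paper's notation. Each view subgoal
in S is replaced by the view's body, with the view's head variables bound
to the subgoal's arguments and every other variable in the body renamed to
a fresh one.

```python
from itertools import count
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]          # (predicate name, arguments)
View = Tuple[Tuple[str, ...], List[Atom]]   # (head variables, body over EDB predicates)

def is_var(term: str) -> bool:
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()

def expand(solution_body: List[Atom], views: Dict[str, View]) -> List[Atom]:
    """Replace each view subgoal by its definition to build the expansion E."""
    fresh = count(1)
    expansion: List[Atom] = []
    for name, args in solution_body:
        head_vars, body = views[name]
        subst = dict(zip(head_vars, args))  # bind head variables to the subgoal's arguments
        for pred, pargs in body:
            new_args = []
            for t in pargs:
                if is_var(t) and t not in subst:
                    # Non-head variable: rename apart for this occurrence.
                    # Assumes no query already uses names of the form Fresh<n>.
                    subst[t] = f"Fresh{next(fresh)}"
                new_args.append(subst.get(t, t))  # constants pass through unchanged
            expansion.append((pred, tuple(new_args)))
    return expansion

# Hypothetical view: gp(A, C) :- parent(A, B), parent(B, C)
views: Dict[str, View] = {
    "gp": (("A", "C"), [("parent", ("A", "B")), ("parent", ("B", "C"))]),
}
# Candidate solution S: q(X, Y) :- gp(X, Z), gp(Z, Y)
solution_body: List[Atom] = [("gp", ("X", "Z")), ("gp", ("Z", "Y"))]
expansion = expand(solution_body, views)
```

For this S the expansion is a chain of four parent subgoals from X to Y.
Deciding whether S is valid would then come down to checking that E and Q
contain each other.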
The next section jumps into the actual architecture of information
integration systems. A common architecture is to have a number of data
sources wrapped by software that translates between each source's
app-specific details and the concepts shared across sources. Components
called mediators sit above these wrapped sources. Although mediators do
not hold actual data, they can be queried for a unified view of the
data. The paper then goes on to talk about two research projects using
mediators: Information Manifold and Tsimmis. The two systems differ in a
number of ways and these details are covered.
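As a rough picture of the wrapper/mediator layering described above, here
is a toy sketch. Everything in it (the Wrapper and Mediator classes, the
(name, phone) record shape, the two sample sources) is made up for
illustration and is not how Information Manifold or Tsimmis are actually
built. The point is only that wrappers translate source-specific records
into a shared form, and the mediator stores nothing but answers queries by
consulting the wrapped sources.

```python
from typing import Callable, Iterable, List, Tuple

Record = Tuple[str, str]  # assumed shared model: (name, phone)

class Wrapper:
    """Translates one source's native rows into the shared Record form."""
    def __init__(self, rows: Iterable[dict], translate: Callable[[dict], Record]):
        self._rows = list(rows)
        self._translate = translate

    def fetch(self) -> List[Record]:
        return [self._translate(r) for r in self._rows]

class Mediator:
    """Holds no data of its own; answers queries from its wrapped sources."""
    def __init__(self, sources: List[Wrapper]):
        self._sources = sources

    def query(self, name: str) -> List[Record]:
        # Collect matching records from every source, dropping duplicates.
        found = {rec for s in self._sources for rec in s.fetch() if rec[0] == name}
        return sorted(found)

# Two hypothetical sources with different native field names.
hr = Wrapper([{"emp": "ann", "tel": "555-0100"}],
             lambda r: (r["emp"], r["tel"]))
crm = Wrapper([{"full_name": "ann", "phone": "555-0199"},
               {"full_name": "bob", "phone": "555-0123"}],
              lambda r: (r["full_name"], r["phone"]))
mediator = Mediator([hr, crm])
ann_records = mediator.query("ann")
```

Querying the mediator for "ann" returns one record from each source, even
though neither record is stored at the mediator itself.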
The paper was a great read. I really liked how the formalism was
sprinkled throughout without losing the actual architectural details and
implementation aspects.