CSE 590DB Class Schedule

Week of | Monday | Wednesday | Friday
March 30 | Introduction | Relational Databases | Relational Databases
April 6 | Datalog and SQL | Object Oriented Databases | Object Oriented Databases
April 13 | Object Oriented Databases | Object Relational Databases (Ashutosh Tiwary) | Object Relational Databases (Ashutosh Tiwary)
April 20 | Semistructured Data (Marc Friedman) | Extracting Structure (Joshua Redstone) | Data Mining (Rachel Pottinger)
April 27 | Query Flocks (Dawn Werner) | Indexing in Databases vs. Indexing in IR Systems (Eric Selberg) | Storage/Relational Operators (Dan Fasulo)
May 4 | Query Optimization (Surajit Chaudhuri) | Query Optimization (Surajit Chaudhuri) | Query Optimization (Surajit Chaudhuri)
May 11 | Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans (Zack Ives) & GMAP: A Versatile Tool for Physical Data Independence (Sujay Parekh) | Sujay Parekh (cont.) & Wavelets for Cost Estimation (Mike Ernst) | Mike Ernst (cont.) & An Efficient, Cost-Driven Index Selection Tool for Microsoft SQL Server (Robert Chen)
May 18 | Data Integration (Alon Levy) | Data Integration paper (Omid Madani) | Strudel (Brian Michalowski)
May 25 | Memorial Day; no class | Other web site management systems: Araneus, Yat, WIRM, WebOQL (Rex Jakobovits) | William Cohen (see below)
June 1 | SIGMOD/PODS | SIGMOD/PODS | Expressive Power of Query Languages (Eric Anderson)


William Cohen's talk on May 29:

How to Answer Questions using Information from the Web: A Similarity-Based Database that Reasons with Structured Collections of Text

William Cohen, AT&T Labs--Research

Currently, information on the Web can only be accessed by browsing, or by keyword search. Ideally, one would like to use information on the Web to answer complex queries---queries that require deduction to answer. One way of accomplishing this is to translate several information sources into a single common knowledge base, and then query that knowledge base; a number of "knowledge integration" systems that work like this have been built. A drawback of this approach, however, is that the initial translation step is expensive in terms of human effort. In my talk I will propose a new way of representing knowledge that is midway between the representation used by a conventional database, and the representation used by a full-text search engine. Specifically, I will propose representing information with a collection of documents organized into relations, and argue that this scheme is more appropriate for representing the sort of loosely coupled, heterogeneous information sources typically found on the Web. I will then present a logic that uses this representation to efficiently approximate certain database operations by reasoning about the similarity of pairs of documents. Similarity is measured using the vector space model, a metric widely used in statistical information retrieval.
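
To make the last point of the abstract concrete, here is a minimal sketch of scoring pairs of short documents under the vector space model. It is not the system described in the talk: the tokenizer, the TF-IDF weighting formula, the function names (tokenize, tfidf_vectors, cosine, similarity_join), and the sample strings are all assumptions chosen for illustration, and the join simply scores every pair rather than using any efficient search strategy.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector (a dict term -> weight) per document."""
    token_lists = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Assumed weighting: log(tf + 1) * log(n / df); other variants exist.
        vec = {t: math.log(tf[t] + 1) * math.log(n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def similarity_join(left, right, k=3):
    """Naive similarity join: score every (left, right) pair, keep the k best."""
    vecs = tfidf_vectors(left + right)
    lv, rv = vecs[:len(left)], vecs[len(left):]
    scored = [
        (cosine(a, b), l, r)
        for l, a in zip(left, lv)
        for r, b in zip(right, rv)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical name columns from two sources that share no common key.
    left = ["AT&T Labs Research", "University of Washington"]
    right = ["AT&T Research Labs Inc", "Univ of Washington Seattle", "Microsoft Research"]
    for score, l, r in similarity_join(left, right, k=2):
        print(f"{score:.3f}  {l!r} ~ {r!r}")
```

The point of the sketch is that two relations can be joined on textual fields without first normalizing them to a shared key: rare shared terms contribute most to the score, so near-duplicate names rank above pairs that share only common words.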

Relevant on-line sites

A long paper, to appear in SIGMOD-98

A shorter summary