Database Group Meeting
Overview
This quarter, the Database group meeting will mostly be used to present
current research from the Database group at UW, with a few presenters
coming from outside UW.
Meetings will be held in CSE 605 Database Lab unless specified
otherwise. We meet from 3pm to 4pm.
The group meetings sponsored by
Yahoo! as part of the Yahoo! Database Talk Series are labeled with.
Mailing List
You can sign up for the mailing
list here. Send mail to that list at uw-db at cs.
Schedule
October 14, 2009 | |
October 21, 2009 | - Cancelled - |
October 28, 2009 | |
November 6, 2009 | |
November 11, 2009 | Veteran's Day |
November 18, 2009 | |
November 25, 2009 | Philip A. Bernstein, MSR |
December 2, 2009 | Prasang |
December 9, 2009 | 544 Presentations (starts at 1:30pm) |
Details
Week 1: Bridging the Gap Between Intensional and Extensional Query
Evaluation in Probabilistic Databases
Presenter: Abhay Kumar Jha
Abstract: There are two broad approaches to query
evaluation over probabilistic databases: 1) Intensional methods proceed by
manipulating expressions over symbolic events associated with uncertain tuples.
This approach is very general and can be applied to any query, but requires an
expensive post-processing phase, which involves some general-purpose
probabilistic inference. 2) Extensional methods, on the other hand, evaluate the
query by translating operations over symbolic events to a query plan;
extensional methods scale well, but they are restricted to safe queries. In
this paper, we bridge this gap by proposing an approach that can translate the
evaluation of any query into extensional operators, followed by some
post-processing that requires probabilistic inference. Our approach uses
characteristics of the data to adapt smoothly between the two evaluation strategies.
If the query is safe or becomes safe because of the data instance, then the
evaluation is completely extensional and inside the database. If the query/data
combination departs from the ideal setting of a safe query, then some
intensional processing is performed, whose complexity depends only on the
distance from the ideal setting.
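To make the two strategies in the abstract concrete, here is a minimal sketch in Python. It is not the method proposed in the talk; it only contrasts the two classical approaches on a toy example. The relations R and S, their probabilities, and the function names are all hypothetical, and tuples are assumed to be independent. The query is q = exists x, y: R(x) AND S(x, y), a standard example of a safe query.

```python
from itertools import product

# Hypothetical tuple-independent relations: tuple -> marginal probability.
R = {("a",): 0.5, ("b",): 0.8}
S = {("a", 1): 0.4, ("a", 2): 0.6, ("b", 1): 0.5}

def independent_or(probs):
    """Probability that at least one of several independent events occurs."""
    none = 1.0
    for p in probs:
        none *= 1.0 - p
    return 1.0 - none

def extensional(R, S):
    """Evaluate q with a safe plan: probabilities are combined operator by operator."""
    per_x = []
    for (x,), p_r in R.items():
        # Project out y: independent OR over the S-tuples that join with x.
        p_s = independent_or(p for (sx, _), p in S.items() if sx == x)
        # Join R(x) with the projected S: independent AND.
        per_x.append(p_r * p_s)
    # Project out x: independent OR over the distinct x values.
    return independent_or(per_x)

def intensional(R, S):
    """Build the lineage of q (a DNF over tuple events), then run inference on it.

    Inference here is brute-force enumeration of possible worlds, which is
    exponential in the number of tuples; it stands in for the general-purpose
    probabilistic inference step.
    """
    prob = {("R", t): p for t, p in R.items()}
    prob.update({("S", t): p for t, p in S.items()})
    events = list(prob)
    # One conjunct per pair of joining tuples.
    lineage = [(("R", (x,)), ("S", (sx, y)))
               for (x,) in R for (sx, y) in S if sx == x]
    total = 0.0
    for world in product([False, True], repeat=len(events)):
        present = dict(zip(events, world))
        if any(present[a] and present[b] for a, b in lineage):
            weight = 1.0
            for e in events:
                weight *= prob[e] if present[e] else 1.0 - prob[e]
            total += weight
    return total

print(extensional(R, S), intensional(R, S))
```

Both functions return the same probability on this instance because the query is safe. The extensional version touches each tuple a constant number of times, while the intensional version enumerates every possible world.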
Week 3: SciDB
Presenter: Emad Soroush
Abstract: Demo.
Week 5: Large-scale Information Extraction from the Web
Presenter: Nilesh Dalvi
Abstract: A significant portion of web pages embed interesting and valuable
semantic content suitable for structured representation. Traditional
information extraction techniques, however, fall short of achieving
high-quality extraction at Web scale. In this talk, I will outline some of the
work going on at Yahoo! Research on addressing the challenges of information
extraction at Web scale. I will focus on wrapper-based techniques, which
exploit the HTML structure of websites to extract the information of interest.
I will address two problems: (i) making wrappers more robust to changes in
websites, and (ii) enabling learning of wrappers from automatically obtained
noisy training data.
Week 7: TBA
Presenter: Wolfgang Gatterbauer
Abstract: TBA
Week 8: Hyder: A Transactional Indexed Record Manager for Shared Flash Storage
Presenter: Philip A. Bernstein,
Microsoft Research and Affiliate Professor at CSE, University of Washington
Abstract: An enormous
increase in the I/O rate to shared storage is made possible by the availability
of large flash storage chips and cheap high-speed network switches. Hyder is a
research project to develop a new transactional indexed-record manager based on
these technologies. It's a data-sharing system, where all compute servers have
direct access to shared flash storage and no direct-attached disk. Its main
feature is that it scales out without partitioning the database or application.
It is therefore well suited to a data center environment, where scale-out is
especially important and where specialized flash hardware and networking can be
cost-effective. The software architecture that makes this possible is radically
different from that of classical transactional record managers. It uses log-structured
record storage, sliding-window RAID, binary search trees, and optimistic
concurrency control. There is no locking, ARIES-style logging, or B-trees.
After a brief discussion of motivation, I will spend most of the talk
describing the architecture. This work is joint with Colin Reid, also at
Microsoft.
Students yet to present this academic year
Abhay, Emad, Wolfgang, Prasang, Nodira, Kate, Julie, Vibhor, YongChul, Kristi,
Marianne, Yingyi, Alexandra