Database Group Meeting

 

 

Overview

This quarter the Database group meeting will be used mostly for presenting current research and for hosting speakers from outside UW CSE.

 

Meetings will be held in CSE 605 Database Lab unless specified otherwise.

 

 The group meeting is sponsored by Yahoo! as part of the Yahoo! Database Talk Series.

 

Mailing List

You can sign up for the mailing list here. Send mail to that list at uw-db at cs.

 

Schedule

Date         Time     Presenter                        Title
Mon, Apr 6   3:00pm   Brian Cooper (Yahoo! Research)   PNUTS: Yahoo!'s Massive Scale Data Platform
Wed, Apr 15  2:30pm   Kristi Morton
Wed, Apr 22  2:30pm   Wolfgang Gatterbauer
Wed, Apr 29  2:30pm
Wed, May 6   2:30pm
Wed, May 13  2:30pm   Abhay Jha
Wed, May 20  2:30pm
Wed, May 27  2:30pm   Evan
Wed, Jun 3   2:30pm   Prasang
Wed, Jun 10  2:30pm   Marianne
Wed, Jun 17  2:30pm   Nodira

 

Details

PNUTS: Yahoo!'s Massive Scale Data Platform

Abstract:

I'll describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!'s web applications. When we set out to design PNUTS, our goal was to build a database system that could scale to thousands of servers, but still provide useful DBMS features like indexes, transactions, query optimization, views, and so on. Of course, to reach that scale you have to give up some of the richness of those features, and I'll talk about the tradeoffs that we have faced and the decisions we've made. PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It is a hosted, centrally managed, and geographically distributed service, and utilizes automated load-balancing and failover to reduce operational complexity. The first version of the system is currently serving in production. I'll describe the motivation for PNUTS and the design and implementation of its table storage and replication layers, and then present experimental results. I'll also discuss experiences building a real production system out of research ideas, and how trying to build a system that actually had to work in production changed our vision and research approach to the system.

 

Short Bio:

Brian Cooper is a research scientist at Yahoo! Research. Before that he was an assistant professor at Georgia Tech, and before that he was a PhD student at Stanford. His interests are in building distributed systems, and in particular, distributed systems that do database-style management and processing of data. At Yahoo! he works on building very large distributed data storage and processing systems. In previous lives he has worked on self-adaptive peer-to-peer systems, distributed streaming event processing, reliable distributed archival data storage, and XML indexing.

 

Speaker schedule:

http://reserve.cs.washington.edu/visitor/week.php?year=2009&month=04&day=06&area=5&room=1385