Organised by: Magda Balazinska
The Database Group meets on Mondays at 2.30pm-3.20pm in CSE 405, Allen Center.
This quarter's theme is Parallel Data Processing.
Upcoming talks are announced on uw-db@cs. Please sign up for the mailing list.
Date | Presenter | Talk Title |
Oct 03 | Paris | Overview of modern parallel data processing engines |
Oct 10 | Kristi | Parallel Data Processing Engines: GAMMA |
Oct 17 | Prasang | Parallel Data Processing Engines: BUBBA |
Oct 24 | Shengliang | Parallel Data Processing Engines: VOLCANO |
Oct 31 | Cancelled for SIGMOD | |
Nov 07 | Emad | Query Optimization in Parallel DBMSs |
Nov 14 -> Nov 18 | YongChul | Skew in Parallel DBMSs |
Dec 05 | Abhay | Theory in Parallel DBMSs |
Dec 12 | Nodira | Scheduling in Parallel DBMSs |
Nov 16 | Jingjing | Fault-tolerance in Parallel DBMSs |
Teradata, Greenplum, and Netezza architectures. Presentation based on online documentation.
The Gamma Database Machine Project, D. J. DeWitt et al., IEEE Transactions on Knowledge and Data Engineering, Volume 2, Issue 1, March 1990.
Data placement in Bubba, George Copeland et al., SIGMOD ’88.
The following is an overview paper of Bubba but we will not discuss it:
Prototyping Bubba, A Highly Parallel Database System, H. Boral et al., IEEE Transactions on Knowledge and Data Engineering, Volume 2, Issue 1, March 1990. http://dl.acm.org/citation.cfm?id=627396 http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=50903
Volcano - An Extensible and Parallel Query Evaluation System, G. Graefe, IEEE Transactions on Knowledge and Data Engineering, Volume 6, Issue 1, February 1994. http://dl.acm.org/citation.cfm?id=627558
Suggested papers are below. Please feel free to pick a better paper. There are many papers on this topic.
The following paper looks at shared-memory systems but introduces the key idea of two-phase optimization, so it is worth reading.
Optimization of parallel query execution plans in XPRS, Hong, W.; Stonebraker, M. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=183106&tag=1
The following paper optimizes for runtime instead of throughput: Query optimization for parallel execution, Sumit Ganguly et al., SIGMOD ’92.
Multi-dimensional resource scheduling for parallel queries, Minos N. Garofalakis and Yannis E. Ioannidis, SIGMOD ’96.
Or the following VLDB’97 paper by the same authors instead: Parallel Query Scheduling and Optimization with Time- and Space-Shared Resources.
A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins, Christopher B. Walton et al., VLDB’91.
Fault Tolerance Issues in Data Declustering for Parallel Database Systems, Leana Golubchik and Richard R. Muntz, Bulletin of the Technical Committee on Data Engineering, 1994.
Neil Immerman. Expressibility and Parallel Complexity. SIAM J. Comput. 18(3): 625-638 (1989)
Dan Suciu, Val Tannen. A Query Language for NC. PODS 1994: 167-178
Feel free to send comments to Prasang.