Large-scale parallel data analysis in the cloud
Week 1 - Fri, September 26th
SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets
Ronnie Chaiken, Bob Jenkins (Microsoft), Paul Larson (Microsoft Research, USA), Bill Ramsey, Darren Shakib, Simon Weaver (Microsoft), Jingren Zhou (Microsoft Research, USA). VLDB 2008.
Background: Pig-Latin, Dryad, MapReduce, etc.
Presenter:
Week 2 - Fri, October 3rd
DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language [CSENetID]
Yuan Yu, Michael Isard, Dennis Fetterly, and Mihai Budiu (Microsoft Research); Úlfar Erlingsson (Reykjavík University and Microsoft Research); Pradeep Kumar Gunda and Jon Currey (Microsoft Research). OSDI 2008.
Background: Dryad, MapReduce, BigTable, Pig, etc. Mostly Pig and MapReduce, though.
Presenter: Nicholas Murphy
Week 3 - Fri, October 10th
Automatic Optimization of Parallel Dataflow Programs
C. Olston, B. Reed, A. Silberstein and U. Srivastava.
2008 USENIX Annual Technical Conference, Boston, Massachusetts, June 2008.
Presenter: YongChul Kwon
Database as a service in the cloud
Week 4 - Fri, October 17th
Dynamo: Amazon's Highly Available Key-Value Store
Giuseppe DeCandia et al. SOSP 2007.
Related: Amazon's Web services, Facebook Cassandra, and Google MegaStore
Presenter: Nodira Khoussainova & Flavio Pfaffhauser
Week 5 - Fri, October 24th
PNUTS: Yahoo!'s Hosted Data Serving Platform
Brian Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein (Yahoo! Research), Phil Bohannon (Yahoo!), Hans-Arno Jacobsen (Yahoo! Research and University of Toronto), Nick Puz, Daniel Weaver, Ramana Yerneni (Yahoo! Research). VLDB 2008.
Related: SSDS, Amazon SimpleDB, Google App Engine, building a database on S3
Presenter: Tom Bergan
Impact of flash memory
Week 6 - Fri, October 31st
Flashing Up The Storage Layer
Ioannis Koltsidas, Stratis Viglas (University of Edinburgh). VLDB 2008.
Related: A Case for Flash Memory SSD in Enterprise Database Applications
Sang-Won Lee (Sungkyunkwan University), Bongki Moon (University of Arizona), Chanik Park (Samsung Electronics), Jae-Myung Kim (Altibase), Sang-Woo Kim (Sungkyunkwan University). SIGMOD 2008
Presenter: Michael J Cafarella & Christopher M Ré
Week 7 - Wed, November 5th
No meeting.
Combining computation and data management in the cloud
Week 8 - Wed, November 12th
Clustera: An Integrated Computation and Data Management System [CSENetID]
David DeWitt, Eric Robinson, Srinath Shankar, Erik Paulson, Jeffrey Naughton, Andrew Krioukov, Joshua Royalty (UW - Madison). VLDB 2008.
Presenter: Katherine Moore & Kristi Morton
Scientific data management in the cloud
Week 9 - Fri, November 21st
[SUBJECT TO CHANGE] Scalable Multi-Query Optimization for Exploratory Queries over Federated Scientific Databases
Dieter Van de Craen, Frank Neven (Hasselt University), Anastasios Kementsietsidis (IBM T.J. Watson Research Center), Stijn Vansummeren (Hasselt University). VLDB 2008
Presenter: Prasang Upadhyaya