Informal Summer Reading Group on Cloud Computing and Data Intensive Computing

Covering SOCC and SIGMOD 2010 papers


Thursday, June 24th
SOCC Keynotes 2 and 3.

Thursday, July 1st.

SOCC Keynote 1.

SIGMOD: Pregel: A System for Large-Scale Graph Processing Greg Malewicz,
Google, Inc.; Matthew Austern, Google, Inc.; Aart Bik, Google, Inc.;
James Dehnert, Google, Inc.; Ilan Horn, Google, Inc.; Naty Leiser,
Google, Inc.; Grzegorz Czajkowski, Google, Inc.

Thursday, July 8th.
SOCC: Characterizing Cloud Computing Hardware Reliability: Kashi
Vishwanath (Microsoft Research), Nachi Nagappan (Microsoft Research)

SOCC: A Self-Organized, Fault-Tolerant and Scalable Replication scheme
for Cloud Storage: Nicolas Bonvin (EPFL), Thanasis Papaioannou (EPFL),
Karl Aberer (EPFL)

Thursday, July 15th.

SOCC: Making Cloud Intermediate Data Fault-Tolerant: Steve Ko
(Princeton University), Imranul Hoque (University of Illinois at
Urbana-Champaign), Brian Cho (University of Illinois at
Urbana-Champaign), Indranil Gupta (University of Illinois at
Urbana-Champaign)

SOCC: Characterizing, Modeling, and Generating Workload Spikes for
Stateful Services: Peter Bodik (UC Berkeley), Armando Fox (UC
Berkeley), Michael Franklin (UC Berkeley), Michael Jordan (UC
Berkeley), David Patterson (UC Berkeley)

Thursday, July 22nd (Magda is out-of-town, YongChul is leading)

SOCC: Stateful Bulk Processing for Incremental Algorithms: Dionysios
Logothetis (UC San Diego), Christopher Olston (Yahoo! Research),
Benjamin Reed (Yahoo! Research), Kevin Webb (UC San Diego), Kenneth
Yocum (UC San Diego)

SOCC: Comet: Batched Stream Processing for Data Intensive Distributed
Computing: Bingsheng He (Microsoft Research), Mao Yang (Microsoft
Research), Zhenyu Guo (Microsoft Research), Rishan Chen (Beijing
University), Wei Lin (Microsoft Research), Bing Su (Microsoft
Research), Lidong Zhou (Microsoft Research)

FRIDAY, July 30th (Moving to Friday because Magda is out-of-town on Thursday)

SIGMOD: An Evaluation of Alternative Architectures for Transaction
Processing in the Cloud Simon Loesing, ETH Zurich; Tim Kraska, ETH
Zurich; Donald Kossmann, ETH Zurich

SIGMOD: Low Overhead Concurrency Control in Partitioned DBMSs
Evan Jones, MIT; Daniel Abadi, Yale; Samuel Madden, MIT

Thursday, August 5th

SOCC: Google Fusion Tables: Data Management, Integration and
Collaboration in the Cloud: Alon Halevy (Google), Hector Gonzalez
(Google), Jayant Madhavan (Google), Christian Jensen (Aalborg
University), Jonathan Goldberg-Kidon (MIT), Warren Shen (Google),
Rebecca Shapley (Google), Anno Langen (Google)

SIGMOD - Industrial: Google Fusion Tables: Data Management,
Integration and Collaboration in the Cloud Jonathan Goldberg-Kidon
(Google Inc.), Hector Gonzalez (Google Inc.), Alon Halevy (Google
Inc.), Christian Jensen (Google Inc.), Anno Langen (Google Inc.),
Jayant Madhavan (Google Inc.), Rebecca Shapley (Google Inc.)

SIGMOD - Industrial: OpenII: An Open Source Information Integration
Toolkit Len Seligman (MITRE), Peter Mork (The MITRE Corporation),
Alon Halevy (Google), Ken Smith (MITRE), Michael Carey (UC Irvine),
Kuang Chen (University of California at Berkeley), Chris Wolf (MITRE),
Jayant Madhavan (Google), Akshay Kannan (University of California at

Thursday, August 12th

SIGMOD: Efficient Parallel Set-Similarity Joins Using MapReduce. Rares
Vernica, University of California, Irvine; Michael Carey, UC Irvine;
Chen Li, Univ of California, Irvine and BiMaple

SIGMOD: The DataPath System: A Data-Centric Analytic Processing Engine
for Large Data Warehouses Subi Arumugam, U Florida; Alin Dobra, UFL;
Christopher Jermaine, Rice U.; Luis Perez, Rice University; Niketan
Pansare, Rice University

Thursday, August 19th

SOCC: RACS: A Case for Cloud Storage Diversity: Lonnie Princehouse (Cornell
University), Hussam Abu-Libdeh (Cornell University), Hakim
Weatherspoon (Cornell University)

SOCC: G-Store: A Scalable Data Store for Transactional Multi key
Access in the Cloud: Sudipto Das (UC Santa Barbara), Divyakant
Agrawal (UC Santa Barbara), Amr El Abbadi (UC Santa Barbara)


Thursday, August 26th

CANCELLED (paper not available): SIGMOD - Industrial: Experiences Evolving a New Analytical Platform:
What Works and What's Missing Jeff Hammerbacher (Cloudera)

SIGMOD - Industrial: Integrating Hadoop and Parallel DBMS
Yu Xu (Teradata), Pekka Kostamaa (Teradata), Like Gao (Teradata)

SIGMOD - Industrial: A Comparison of Join Algorithms for Log
Processing in MapReduce Spyros Blanas (University of Wisconsin),
Jignesh Patel (University of Wisconsin), Vuk Ercegovac, Jun Rao (IBM
Research), Eugene Shekita (IBM Almaden Research Center), Yuanyuan Tian
(IBM Almaden Research Center)

Thursday, September 2nd.

SIGMOD - Industrial: Ricardo: Integrating R and Hadoop Yannis Sismanis
(IBM Almaden), Sudipto Das (UC Santa Barbara), Rainer Gemulla (IBM
Almaden Research Center), Peter Haas (IBM Almaden Research Center),
Kevin Beyer (IBM Almaden Research Center), John McPherson (IBM Almaden
Research Center)

SIGMOD - Industrial: Data Warehousing and Analytics Infrastructure at
Facebook Ashish Thusoo (Facebook), Dhruba Borthakur (Facebook)

SIGMOD - Industrial: Extreme Scale with Full SQL Language Support in
Microsoft SQL Azure Nigel Ellis (Microsoft), Gopal Kakivaya, Dave
Campbell (Microsoft)


% --------------------------------------------------
% Papers that we will not cover but that I encourage you
% to skim and read if relevant to your work
% --------------------------------------------------
SOCC: Benchmarking Cloud Serving Systems with YCSB: Brian Cooper
(Yahoo! Research), Adam Silberstein (Yahoo! Research), Erwin Tam
(Yahoo! Research), Raghu Ramakrishnan (Yahoo!), Russell Sears
(Yahoo! Research)

SOCC: Automated Software Testing as a Service: George Candea (EPFL),
Stefan Bucur (EPFL), Cristian Zamfir (EPFL) [Position Paper]

SOCC: Fluxo: A System for Internet Service Programming by Non-expert
Developers: Emre Kiciman (Microsoft Research), Benjamin Livshits
(Microsoft Research), Madanlal Musuvathi (Microsoft Research), Kevin
Webb (University of California San Diego)

SOCC: Nephele/PACs: A Programming Model and Execution Framework for
Web-Scale Analytical Processing: Stephan Ewen (TU Berlin), Fabian
Hueske (TU Berlin), Daniel Warneke (TU Berlin), Dominic Battré (TU
Berlin), Volker Markl (TU Berlin), Odej Kao (TU Berlin)

SOCC: The Case For PIQL: A Performance Insightful Query Language:
Michael Armbrust (UC Berkeley), Nick Lanham (UC Berkeley), Stephen Tu
(UC Berkeley), Armando Fox (UC Berkeley), Michael Franklin (UC
Berkeley), David Patterson (UC Berkeley) [Position Paper]

SOCC: Towards Automatic Optimization of MapReduce Programs: Shivnath
Babu (Duke University) [Position Paper]

SOCC: Hermes: Clustering Users in Large-Scale E-mail Services: Thomas
Karagiannis (Microsoft Research), Christos Gkantsidis (Microsoft
Research), Dushyanth Narayanan (Microsoft Research), Antony Rowstron
(Microsoft Research)

SOCC: Defining Future Platform Requirements for e-Science Clouds:
Lavanya Ramakrishnan (Lawrence Berkeley National Lab), Keith
Jackson (Lawrence Berkeley National Lab), Shane Canon (Lawrence
Berkeley National Lab), Shreyas Cholia (Lawrence Berkeley National
Lab), John Shalf (Lawrence Berkeley National Lab) [Position Paper]

SOCC: An Operating System for Multicore and Clouds: Mechanisms and
Implementation: David Wentzlaff (MIT), Charles Gruenwald III (MIT
CSAIL), Nathan Beckmann (MIT CSAIL), Kevin Modzelewski (MIT CSAIL),
Adam Belay (MIT CSAIL), Lamia Youseff (MIT CSAIL), Jason Miller (MIT
CSAIL), Anant Agarwal (MIT CSAIL)

SOCC: Differential Virtual Time (DVT): Rethinking I/O Service
Differentiation for Virtual Machines: Mukil Kesavan (Georgia Institute
of Technology), Ada Gavrilovska (Georgia Institute of Technology),
Karsten Schwan (Georgia Institute of Technology)

SIGMOD: Automatic Contention Detection and Amelioration for
Data-Intensive Operations. John Cieslewicz, Columbia University;
Kenneth Ross, Columbia University; Kyoho Satsumi, Columbia University;
Yang Ye, Columbia University

SIGMOD: Indexing Multi-dimensional Data in a Cloud System. Jinbao Wang,
Harbin Institute of Technology; Hong Gao, Harbin Institute of
Technology; Sai Wu, National Univ. of Singapore; Beng Chin Ooi,
National University of Singapore

SIGMOD: Efficient Querying and Maintenance of Network Provenance at
Internet-Scale Wenchao Zhou, University of Pennsylvania; Micah Sherr,
University of Pennsylvania; Tao Tao, University of Pennsylvania;
Xiaozhou Li, University of Pennsylvania; Boon Thau Loo, University of
Pennsylvania; Yun Mao, University of Pennsylvania

SOCC: Lithium: Virtual Machine Storage for the Cloud: Jacob Hansen
(VMware), Eric Jul (Bell Labs, Dublin)

SOCC: Virtual Machine Power Metering and Provisioning: Aman Kansal
(Microsoft Research), Feng Zhao (Microsoft Research), Jie Liu
(Microsoft Research), Nupur Kothari (USC), Arka Bhattacharya (IIT

SOCC: Robust and Flexible Power-Proportional Storage: Hrishikesh Amur
(Georgia Institute of Technology), James Cipar (Carnegie Mellon
University), Varun Gupta (Carnegie Mellon University), Michael
Kozuch (Intel Corporation), Gregory Ganger (Carnegie Mellon
University), Karsten Schwan (Georgia Institute of Technology)