In this seminar we will read foundational papers on the OLAP data cube, which supports data analysis
by presenting aggregate values along different dimensions of a multidimensional database.
We will also study large-scale data analytics systems from academia and industry.
Week | Date | Topic | Paper(s) | Link | Slides | Presenter |
---|---|---|---|---|---|---|
| | Monday, 1/6 | Overview and Organization | | | | |
1 | Monday, 1/13 | Basics | A. Stonebraker's blog on data warehouses | paper A (restr), paper B (restr), paper C (restr) | slides (restr) | Laurel and Jeremy |
| | Monday, 1/20 | Martin Luther King, Jr.'s Birthday | | | | |
2 | Monday, 1/27 | Implementation and Index | Harinarayan et al.: A. Efficient implementation, SIGMOD 1996; B. Index selection, ICDE 1997 | paper A (restr), paper B (restr) | slides (restr) | Dominik and Shumo |
3 | Monday, 2/3 | Iceberg and Sparse Cubes | A. Fang et al.: Iceberg queries, VLDB 1998; B. (optional) Xin et al.: Iceberg cube, VLDB 2003; C. Ross and Srivastava: Sparse cubes, VLDB 1997 | paper A (restr), paper B (restr), paper C (restr) | slides 1 (restr), slides 2 (restr) | Eric and Stephen |
4 | Monday, 2/10 | Distributed Materialization and High Dimensional OLAP | A. Nandi et al.: Distributed cube materialization, ICDE 2011; B. Li et al.: Minimal cubing, VLDB 2004 | paper A (restr), paper B (restr) | slides (restr) | Paris and Daniel |
| | Monday, 2/17 | Presidents' Day | | | | |
5 | Monday, 2/24 | Interactive Analysis and Complex Data | Sarawagi and Sathe: A. Roll ups, VLDB 2001; B. (optional) Interactive analysis, SIGMOD 2000; C. Pedersen et al.: Complex data, 2001 | paper A (restr), paper B (restr), paper C (restr) | slides 1 (restr), slides 2 (restr) | Jennifer and Prasang |
6 | Monday, 3/3 | Data Analytics Systems (Academia) | A. Shark, SIGMOD 2013; B. Spark, HotCloud 2010; C. MAD, VLDB 2009 | paper A (restr), paper B (restr), paper C (restr) | slides 1 (restr), slides 2 (restr), slides 3 (restr) | Ryan (Shark), Emad (Spark), and Ben (MAD) |
7 | Monday, 3/10 | Data Analytics Systems (Industry) | LinkedIn: A. (Main) Web-scale analytics, VLDB Endow. 2012; B. (Optional) Big data ecosystem, SIGMOD 2013. Twitter: C. (Main) Logging infrastructure, VLDB Endow. 2012; D. (Optional) Query suggestion architecture, SIGMOD 2013 | paper A (restr), paper B (restr), paper C (restr), paper D (restr) | slides 1 (restr), slides 2 (restr) | Kanit (LinkedIn) and Shengliang (Twitter) |
To sign up for presentations, please send an email to Sudeepa Roy, sudeepa@cs.
Book chapter: Jensen et al.: Multidimensional Databases and Data Warehousing, 2010 paper (restr)
1. Feng et al.: Towards a unified architecture for in-RDBMS analytics, SIGMOD 2012 paper (restr)
2. Huang et al.: Cumulon: optimizing statistical data analysis in the cloud, SIGMOD 2013 paper (restr)
3. Melnik et al.: Dremel: Interactive Analysis of Web-Scale Datasets (Google), VLDB 2010 paper (restr) (covered in Autumn 13)