ESTIMATED TIME: 4 hours.
DUE DATE: Wednesday, 11/5, 11:59pm >
TURN IN INSTRUCTIONS: Turn in the file tune.sql to the Catalyst drop box.
POSTGRES, JDBC DRIVERS, CLASSPATH: You need the same setup as in project 2. See instructions there.
STARTER CODE: DatabaseGenerator.java, importDatabase.sql, TestQueries.java, Dowload them to your local directory. You should not modify any of them. You will examine importDatabse.sql and TestQueries.java.
FORCING POSTGRES TO OPTIMIZE DECENTLY: Find the file
postgresql.conf (it is is normally in the directory where your
postgres data file resides). Set
CREATE YOUR DATABASE
createdb project3
RUN THE STARTER CODE Edit TestQueries.java to update the username and password, then:
javac DatabaseGenerator.java javac TestQueries.java java DatabaseGenerator 100 psql -f importDatabase.sql project3 java TestQueries project3 1The numbers 100 and 1 represent the scale factors; normally you wouldn't change them, but if you want you may increase them test your solution more extensively.
If you get any error up to here, contact Michael or myself.
TESTING AND RUNNING YOUR SOLUTION All you have to do in this project is to create a file tune.sql; you will turn in only this file. After you create the file or update it, to test your solution do this:
dropdb project3 createdb project3 psql -f importDatabase.sql project3 psql -f tune.sql project3 java TestQueries project3 1You need to drop the database then re-create it because TestQueries performs some insertions/deletions, so the database is no longer the same.
R(K,A,B,C,D,E) S(K,A,B,C), T(K,D, E) -- S.B is a foreign key to T.K U(K,A,B)You don't care what the database means or contains, all you care about this database is that there are some users that access it extensively by running TestQueries.java. You are in charge of administrating the database and optimizing it for performance. As you saw, TestQueries.java takes several minutes to run: you need to improve that.
Create a file tune.sql where you put some star dust to sprinkle over the database that makes TestQueries run faster. The customers don't care what you do to the database, but after your star dust has settled, TestQueries will run much faster and your customers will love you.
Here are the rules.
explain select count(*) from R where A = 88;
An unpleasant surprise is postgres does not always cooperate with your instructor's lectures. Those lectures assume that SQL does certain optimizations, but postgres does not do them. I managed to find out how optimize horizontal partitions, and we will discuss this in class. (If you want to read yourself, you'll find it here http://www.postgresql.org/docs/8.3/interactive/ddl-partitioning.html, but don't worry about it, we'll discuss this in class). I did not figure out how to do vertical partitions, and the lack of optimization does affect performance. Here we will cheat a bit, details in class.