Homework 3: JDBC, Updates and Transactions

ESTIMATED TIME: 12 hours.

HOW TO TURN IN: Submit your files using this Catalyst Dropbox .  The due date is May 2, 11:45 pm.

SOFTWARE: Please Install Software that you will need for this homework.

JDBC DRIVERS: In order to connect to a database from Java, you need database-specific JDBC drivers on your classpath. You will need two JDBC drivers, one for Postgres and the other for SQL Server (follow the instructions on the Installing Software page to download them). After downloading these jar files, you need to modify your CLASSPATH and include them there. On Windows machines, go to Control Panel -> System -> Advanced. Click on the Environment Variables button and choose the CLASSPATH to edit. Make sure you include the complete paths for these two jar files. The path should be to the files themselves, not just the directory containing them. Alternatively, you can download the CSEP544 shell launcher script  and modify it  to reflect the location of these jar files on your machine.

POSTGRES: Instructions on setting up PostgreSQL are here.

STARTER CODE: Project2.java, Query.java Dowload them to your local directory. You will need to modify Query.java (you can modify Project2.java if you want to, but it's not needed for the project.)

RUNNING THE STARTER CODE: Create the 'CUSTOMER' database on PostgreSQL. Then run the following in a Windows shell.You may choose to work in a Java IDE like Eclipse instead of working from the command-line as shown below.

javac Project2.java
javac Query.java
java Project2 uid passwd
(The starter code does not authenticate the user, so uid and passwd are ignored: one of your jobs is to modify the starter code to do the authentication.) At this point you should see this:
 *** Please enter one of the following comands *** 
> search <movie title>
> plan [<plan id>]
> rent <movie id>
> return <movie id>
> fastsearch <movie title>
> quit
>
The command 'search' works (sort of). Try typing:
search Nixon
After a few seconds you should start getting movie titles containing 'Nixon', and their directors. (You don't yet get the actors: one of your jobs is to list the actors.)

PROJECT DESCRIPTION

Develop a simple video-on-demand application. This is a typical application using a database: the only thing we are missing is a Web interface. You will connect to two databases, one on Postgres, the other on SQL Server.

You sell video streams to customers. You have a contract with a content provider that has videos of all the movies in the IMDB database, and you resell this content to your customers. Your first task is to create a database of your customers. Next, your application allows the user to search the IMDB movie database (the starter code already does that), and to rent movies (which we assume are delivered by the content provider; we don't do this part in the project). Once a customers rents a movie, she can watch it as many times as she wants, until she decides to "return" it to your store. You need to keep track of which customers are currently renting which movies.

There are two important restrictions:

  1. Your contract with the content provider allows you to rent each movie to at most one customer at any one time: the movie needs to be first returned before you may rent it again to another customer (or the same customer).
  2. Your own business model imposes a second important restriction: your store is based on subscription (much like Netflix), that allows customers to rent up to a maximum number of movies. Once they reach that number you will deny them more rentals, until they return a movie. You offer a few different rental plans, each with its own monthly fee and maximum number of movies.

Task 1

Design and create the CUSTOMER database in Postgres.

What to turn in: a script file with CREATE TABLE and with INSERT statements for this database.

The database has the following entity sets:

E1. Customer: a customer has an id (integer), a login, a password, a first name and a last name.

E2. Plan. Each plan has a plan id (integer), a name ("Basic", "Rental Plus", "Super Access" or a new plan of your choice), the maximum number of rentals allowed ("Basic" allows one movie; "Rental plus" allows three; "Super Access" allows five), and the monthly fee (set this to a value of your choice for each).

E3. Rental: a "rental" entity represents the fact that a movie was rented by a customer. The movie is identified by a movie id (from the IMDB database). The rental has a status that can be open, or closed. When a customer first rents a movie, then you create an open entry in Rentals; when he returns it you updated it to closed (you don't delete it). Keeping the rental history helps you improve your business by doing data mining (but we don't do this in this class.)

In addition there are the following relationships:

R1. Each customer has exactly one rental plan

R2. Each rental refers to exactly one customer. (It also refers to a single movie, but that's in a different database, so we don't model that as a relationship.)

Create a script file with CREATE TABLE statements, and with INSERT statements that populate each table with a few tuples (say 2-8 tuples): you will turn in this script file. Write a separate script file with DROP TABLE statements: it's useful to run it whenever you find a bug in your schema or data (don't turn in this file).

Task 2

Write the java application, by completing the starter code.

What to turn in: the two java files Project2.java and Query.java and the corresponding class files.

The application is a simple command-line java program. A "real" application will have a Web interface instead, but that requires a lot of programming that is unrelated to database content of this course. Your java application needs to connect to two databases: the IMDB database on IPROJSRV, and your own CUSTOMER database on your local machine running PostgreSQL.

When your application is started it reads a user name and password from the command line. It validates them against the database, then retains the user id throughout the session. All rentals/returns are on behalf of this single customer: to change the customer you will quit the application and restart it with another customer. Much of this logic is already provided in the starter code.

Once the application is started, the user can select one of the following transactions. (We call each action a transcation: some of them you really need to write as SQL transactions, others are interactions with the database that do not require transactions.)

  1. The "search" transaction: the user types in a string, and you return:

    The starter code already returns the movies and directors: you still need to retrieve and print the actors and the availability status.

    In the starter code, the directors are obtained using a "dependent join" (sometimes called "nested loop join"). You should use the same technique for retrieving the actors: there is a more efficient way, but we'll implement it later.

    WHAT YOU WILL LEARN in 1: (a) how to run SQL queries from Java, (b) what a dependent join is, and how easy it is to embed it in an application, (c) how slow the dependent joins are.

  2. The "plan" transaction. Here the customer types in a plan id and you set his new plan to this plan id. How does the customers know what plan id's are available ? They type in "plan" without any plan id, and then you will list all available plans, their names, and their terms (maximum number of movies available for rental and monthly fees).

    WHAT YOU WILL LEARN in 2: simple database updates from Java.

  3. The "rent" transaction. The user types in a movie id, and you will "rent" that movie to the customer.

  4. The "return" transaction. The user types in the movie id to be returned. You update your records to mark the return.

    WHAT YOU WILL LEARN in 3 and 4, and also in 2: the typical interaction between the application and the database; basic data integration (you integrate CUSTOMER with IMDB); the need for transactions and rollback.

  5. The "fastsearch" function. Here you will fix the performance problem in the "search" function: "fastsearch" has exactly the same functionality as "search" (except that it doesn't have to return the availability status), but where all the dependent joins are replaced with regular joins, executed in the database engine.

    HINT: add ORDER BY to each of the three queries so you can merge-join the three streams efficiently.

    NOTE: this task requires a little bit more creativity (and more work) than each of the others, but it is still easy.

    NOTE 2: Depending on your setting, "fastsearch" may actually run slower than regular "search": if that happens for the right reason, then you are OK. In "fastsearch" you are running upfront three SQL queries, which are more complex that the single SQL query executed in "search": this initial step will take longer. Subsequently however, "fastsearch" no longer issues queries, and interates over all movies, actors, and directors exactly once, while search needs to issue two new SQL queries for each movie (not counting the computation of the availability status). Thus, the time it takes to the first answer may be longer in "fastsearch", but subsequent answers should be returned "fast".

    WHAT YOU WILL LEARN in 5: the database system is much better at performing the join than the java application.

In addition you have to provide a minimal amount of user-friendliness: at each iteration of the main loop you will print the current customer's name, and tell them how many additional movies they can rent (given their current plan and the number of movies that they have already rented). WHAT YOU WILL LEARN: give the user as much data as you can, but not more.

COMMENTS: