CSE 344 section 06 -- Topics for homework 5 ================================================= ABOUT JDBC In homework 5, you will extend a Java program that connects to a database using Java Database Connectivity, or JDBC. JDBC is part of the Java libraries; the docs on JDBC are here: http://download.oracle.com/javase/6/docs/technotes/guides/jdbc/ By using JDBC objects and methods, you can talk to your database server, sending SQL queries and updates as Java strings, and retrieving the results. JDBC does not know how to connect to any DBMS by itself; each DBMS vendor provides a "JDBC driver" as a Java library archive (.jar file), which you load into the Java VM so that JDBC can use it to connect to your database system. ==================== THE HW5 STARTER CODE The starter code for homework 5 consists of 4 items: - Query.java and VideoStore.java, the starter source files - postgresql-8.4-701.jdbc4.jar, the Postgres JDBC driver - dbconn.properties dbconn.properties is a Java "properties file", which is just a fancy name for a file with one key-value pair on each line, like so: # This is a comment. Comment lines must have # or ! as the first # non-whitespace character. firstKey = 123string secondKey=another string, which can have any punctuation prefix.thirdKey = Key prefixes are often used by convention; Java ignores them (The exact syntax can be found at: http://download.oracle.com/javase/6/docs/api/java/util/Properties.html ) Query.java loads the properties file into a java.util.Properties object. It uses the Properties object to get various configuration info: - videostore.jdbc_driver - the class name of the JDBC driver for Postgres - videostore.imdb_url - Where to find the IMDB database on Postgres (see below) - videostore.postgres_username - Username to log into Postgres - videostore.postgres_password - Password for Postgres imdb_url is a JDBC "connection URL", which tells JDBC how to find your database. The JDBC URL syntax is: jdbc:DRIVER_IDENTIFIER:DRIVER_SPECIFIC where DRIVER_IDENTIFIER identifies what JDBC driver to use, and DRIVER_SPECIFIC is some information interpreted by the driver alone. For Postgres, the DRIVER_IDENTIFIER is "postgresql". To connect to a local Postgres database, you give the database name for the DRIVER_SPECIFIC string, hence you will have a URL like this: jdbc:postgresql:YOUR_DATABASE_NAME ======================== RUNNING THE STARTER CODE First, we need to modify dbconn.properties with the right config info for our Postgres install, by changing imdb_url, postgres_username, and postgres_password properties. On my machine: videostore.imdb_url = jdbc:postgresql:imdb videostore.postgres_username = michaelr videostore.postgres_password = 123456 (123456 is not my real password, by the way :) If you run Postgres on your own computer, you will probably have "postgres" for the username, and the same password you used in homework 3. If you run it from the basement lab, you should use your CSE username, and the password can be anything as Postgres ignores it. Now start up Postgres if needed. Once that is done, you are ready to open a command line. First you need to change to the folder where you unzipped the starter code. Then you need to set the CLASSPATH variable, which Java uses to find pre-compiled classes. You need to have Java look in the current folder (a dot, ".") and the .jar file in the starter code ("postgresql-8.4-701.jdbc4.jar"). (The .jar file is in the current folder too, but Java customarily ignores it unless you specify it directly, or specify "./*" instead of plain ".".) On Windows, you do: set CLASSPATH=.;postgresql-8.4-701.jdbc4.jar (Note the semicolon ; separating the two entries.) On Linux or Mac OSX, you do: export CLASSPATH=.:postgresql-8.4-701.jdbc4.jar (Note the colon : separator.) Now that you've set the CLASSPATH, you can compile the source files: javac -g VideoStore.java Query.java and run the starter code: java VideoStore But the program errors out, saying it needs a username and password. We can just enter a dummy one at this point, because the starter program doesn't check usernames or passwords anyway: java VideoStore alice password And hooray! You get a list of possible commands. Type "quit" to quit. ========================================= CONNECTING TO A DATABASE SERVER WITH JDBC Connecting to a database is a two-step process. First, you make sure that the JDBC driver for your DBMS is loaded. In all recent versions of Java, adding the driver to the CLASSPATH is enough, but in very old versions, you need to explicitly reference the driver's top-level Java class to load it, as we do in Query.openConnection(): postgreSQLDriver = configProps.getProperty("videostore.jdbc_driver"); /* load other properties from configProps... */ Class.forName(postgreSQLDriver).newInstance(); Then we can use the JDBC Java.sql.DriverManager class to create the connection, giving the connection URL, username, and password: conn = DriverManager.getConnection(postgreSQLUrl, // database postgreSQLUser, // user postgreSQLPassword); // password This returns a java.sql.Connection object called `conn', which we keep around as an instance field of the Query object. =============== RUNNING A QUERY The search transaction, in Query.transaction_search(), runs several SQL queries; the first one is always the one referred to by the local variable searchSql. This builds a SQL command string `searchSql' by concatenating together various pieces. Then, it does: Statement searchStatement = conn.createStatement(); ResultSet movie_set = searchStatement.executeQuery(searchSql); First, we ask the Connection to create a Statement object that will run SQL against that connection. Then we call the Statement's executeQuery() method, passing in the SQL string `searchSql'. This returns a ResultSet object, which represents an iterator over the tuples of the result table for the SQL query in `searchSql'. To obtain data from the ResultSet, transaction_search() does this: while (movie_set.next()) { int mid = movie_set.getInt(1); System.out.println("ID: " + mid + " NAME: " + movie_set.getString(2) + " YEAR: " + movie_set.getString(3)); // ... } Initially, the ResultSet iterator points to "just before" the first result tuple. We use ResultSet.next() to advance to the next tuple; it returns true until there are no more tuples to read, then it returns false. To access the attributes of the current tuple, we use the getX() functions. getInt(N) returns the value of attribute N (numbering from left to right, starting at 1 - not 0!), assuming it is an integer that fits in the range of Java `int'. Similarly for getString(), getLong(), and so on. =========================================== PREPAREDSTATEMENT AND PARAMETERIZED QUERIES The first query run by search incorporates a variable parameter by interpolating its value through string concatenation: /* Interpolate the movie title into the SQL string */ String searchSql = SEARCH_SQL_BEGIN + movie_title + SEARCH_SQL_END; search() also demonstrates another way to work with variable parameters. Notice that the DIRECTOR_MID_SQL SQL code has a ? mark in it: private static final String DIRECTOR_MID_SQL = "SELECT y.* " + "FROM movie_directors x, directors y " + "WHERE x.mid = ? and x.did = y.id"; (Also notice the spaces after each line; remember that we are building this string by concatenating together other string literals, which *don't* include the newlines!) The ? denotes a parameter which will be substituted later, but the substitution will be done by Java itself, not by us through string interpolation. To use SQL strings with ? marks, we need to use a PreparedStatement object instead of a Statement object. In prepareStatements(), we ask the database connection to give us a PreparedStatement corresponding to DIRECTOR_MID_SQL: directorMidStatement = conn.prepareStatement(DIRECTOR_MID_SQL); Then, in transaction_search(), we can interpolate the parameter and then run the statement as follows: directorMidStatement.clearParameters(); directorMidStatement.setInt(1, mid); ResultSet director_set = directorMidStatement.executeQuery(); After clearing the old parameter set, if any, we tell the PreparedStatement to set its 1st parameter (again, 1-based) to our value `mid'. Java will take care of interpolating this parameter into the SQL code for us. (Actually, what usually happens is that the JDBC driver sends the parameter separately from the SQL, using features in the DBMS wire protocol.) Then we run the query by calling the executeQuery() method (but without a SQL string; we already supplied it when we created the PreparedStatement). In addition to parameters, PreparedStatements have another advantage over Statements - they can be reused without having to provide the SQL code all over again. This saves typing, obviously, and may also be faster for the DBMS, which doesn't have to compile the same SQL code over and over. ==================== ADDING OUR OWN QUERY Enough of seeing how other people write queries. Let's add our own query to Query.java. For convenience, we hijack the transaction printPersonalData(). This gets run each time the program's command-line prompt is shown. It's supposed to show some information about the logged-in customer, but because we have no customers yet, let's hijack it to show some information about a random movie. Add the following to the body: PreparedStatement pstmt = conn.prepareStatement("SELECT name, year FROM Movie WHERE id = ?"); pstmt.setInt(1, 147577); ResultSet results = pstmt.executeQuery(); while (results.next()) { System.out.println("name=" + results.getString(1)); System.out.println("year=" + results.getInt("year")); System.out.println(); } results.close(); All of this we have seen before, with two exceptions: - You can pass strings to ResultSet.getX(); this gets the attribute with the given name, as opposed to the one with the given column position. - After exiting the loop, we close() the ResultSet. It's generally a good practice to close any JDBC objects once we're done with them. It lets the DBMS free up the resources associated with that prepared SQL command, or that query result set, or that connection. ======================= DATABASE UPDATES IN SQL We haven't really covered how to make changes to a database using SQL; you've seen it in passing in class and section, and it was covered in homework 1. For reference, there are four types of database updates to existing tables: Inserting a single new tuple into a table: INSERT INTO R(x,y,z) VALUES ('the X value', 42, 32.7); * If attributes in table R aren't specified after R's name in the INSERT, their values default to null. * You don't need to have the attribute names in parentheses at all, if you give the attribute values in the same order as the names appear in CREATE TABLE. (sqlite .schema, postgres \dt) Inserting a query result set into a table: INSERT INTO R(x,y,z) SELECT a, b, c FROM [...]; The full query statement replaces the single VALUES expression. Updating existing tuples in a table: UPDATE R SET x = 'quick brown fox', z = 25.3 WHERE y = 42; The WHERE clause works as it does in a query, selecting the tuples in R whose values are to be updated per the SET clause. Deleting tuples from a table: DELETE FROM R WHERE y <> 1000; You can omit the WHERE clause from both UPDATE and DELETE statements. But you probably *really* don't want to do that, because then every tuple will be changed or deleted. ========================== DATABASE UPDATES FROM JAVA Last, and maybe most, we discuss how to run database updates from Java. There are methods in Statement and PreparedStatement for this purpose: Statement.executeUpdate(updateSQLString) PreparedStatement.executeUpdate() Instead of returning a iterator over matching tuples (because updates don't return a tuple set), these methods return the number of tuples that were affected by the update. They are otherwise similar to queries. N.B.: executeUpdate() will throw an exception if the SQL code returns a tuple set, while the earlier executeQuery() will throw if the SQL code *does not* return a tuple set!