Starting with PostgreSQL in the Windows Instructional Lab

Are you using PostgreSQL on your own computer?

If so, then some of these instructions may not apply to you. Make sure you have read the instructions on installing software for working from home, and in particular that you understand how to adapt these instructions to a personal PostgreSQL installation.

Introduction

To help you gain experience with multiple database management systems, we will use the open-source PostgreSQL DBMS along with SQL Server in CSE 444. Instead of running on a central server such as IISQLSRV, you will run Postgres on your personal computer with data stored in your private files.

For the second and third projects you'll run Postgres on the Windows computers in the undergrad labs. This page is a brief introduction; if you would like to learn more about Postgres, here is a more detailed tutorial. (We recommend skipping the tutorial's material on the SQL language, especially joins; refer to your book and the lecture notes instead.)

Getting a PostgreSQL command prompt

Unlike SQL Server, Postgres is not usually used through a graphical interface. Rather, the server and client tools are typically accessed from a command line. To get a command shell that is set up to run these tools on the Lab machines, download and run the Postgres shell launcher script, 444shell.cmd. This script sets the command search PATH variable to include the Postgres commands, and the Java classpath to include the Postgres and SQL Server JDBC drivers (needed for project 2).

To test that 444shell.cmd has set up the tools paths correctly, enter the command psql -?. You should see a usage message like the following:

psql is the PostgreSQL interactive terminal.

Usage:
  psql [OPTIONS]... [DBNAME [USERNAME]]
[[Continues...]]

The shell created by 444shell.cmd uses the MS-DOS/Windows command language, not the Unix sh/bash shell you may already know. Here is a tutorial on using DOS and the command shell; although it is quite out of date, much of it is still useful. Here is Microsoft's official documentation on the Windows shell.

Creating a data folder

Both SQL Server and Postgres store databases in ordinary files within the filesystem. On IISQLSRV, all the data files are located in a single directory which you cannot access. When you use Postgres, however, you will create a data folder just for yourself. To create the data files, do the following:

  1. Create the folder where you want to put the data files. This can be in your Unix home directory Z:\ , but access to that network drive may be slow. For faster access, you can put it on a flash drive, or on the local disk (C:) of the computer; however, any data you put pn the local disk is lost when you log out.
  2. Open a Postgres shell using the environment script above.
  3. Run the initdb command to create the data files:
    > initdb -D Z:\pgsql_data

Starting and stopping PostgreSQL

To start a Postgres server, run the pg_ctl start command by choosing Start→Run, giving the name of the data folder you created earlier:

> "C:\Program Files\PostgreSQL\8.4\bin\pg_ctl.exe" start -D Z:\pgsql_data

Remember: don't type this into the Postgres shell you got from 444shell.cmd, type it into the Start→Run menu.

This will pop up a shell window that displays server starting among other messages. When done, you can stop PostgreSQL either by hitting Ctrl-C in the shell window, or by using the pg_ctl stop command:

> "C:\Program Files\PostgreSQL\8.4\bin\pg_ctl.exe" stop -D Z:\pgsql_data

On Windows Vista and 7, you can also type these commands into the Start menu's search box, instead of the Start→Run box.

Creating a PostgreSQL database

Just as in SQL Server, your tables must be placed in a database, which you must create. To create a database named my_database, run the createdb command from a Postgres shell that you got from 444shell.cmd:

> createdb my_database

Unlike SQL Server, Postgres is case-sensitive when looking up database and table names. However, Postgres automatically lowercases all names given in SQL code, so the case-sensitivity only affects non-SQL code that uses the database name, such as:

In such code, always spell the database name exactly as you created it, respecting case.

Running queries with psql

To run SQL queries on SQL Server, you use SQL Server Management Studio. In Postgres, you use the psql utility. Once the Postgres server is running, you can run psql by opening a Postgres shell and typing

> psql my_database

where my_database is the name of the database you want to use. (If you omit the database name, psql defaults to accessing the database with the same name as your username.)

When psql opens, you will see a message like this:

psql (8.4.2)
Type "help" for help.

alice=#

The line alice=# is the prompt for SQL statements which are sent to the database server, or non-SQL commands interpreted by psql. Here, "alice" is the name of the database.

To exit psql, type \q and press Enter:

alice=# \q

C:\>

Note that there is no semicolon after \q; this is required because \q and other backslash-quoted commands are not SQL and are interpreted by psql, rather than the Postgres server.

Entering queries

Type in a SQL statement to run it. SQL statements can be split across multiple lines; to send the SQL statement to the server, end the statement with a semicolon and press Enter. Depending on the command, psql will either respond with a confirmation message:

alice=# DELETE FROM hw1_data
alice-# WHERE name='name';
DELETE 1

or display the results of the query in a table:

alice=# SELECT * FROM hw1_data;
  name  | discount | month | price
--------+----------+-------+-------
 bar1   | 15%      | apr   | 19
 bar8   | 15%      | apr   | 19
 gizmo3 | 15%      | apr   | 19
 gizmo7 | 15%      | apr   | 19
 mouse1 | 15%      | apr   | 19
 bar1   | 15%      | aug   | 19
 bar8   | 15%      | aug   | 19
 gizmo3 | 15%      | aug   | 19
 gizmo7 | 15%      | aug   | 19
 mouse1 | 15%      | aug   | 19
 bar1   | 33%      | dec   | 19
 bar8   | 33%      | dec   | 19
 gizmo3 | 33%      | dec   | 19
 [[Continues...]]
 (426 rows)

If the result table is too large to fit in the shell window, it will be shown one window-ful at a time; press Enter to go on to the next window, until the end.

If you make a mistake while typing in a query, you can use the up-arrow and down-arrow keys on the keyboard to move between previously entered lines, which you can then edit and resubmit.

Running queries from an SQL file

As in SQL Server Management Studio, you can use psql to run SQL code from an external file as well as from interactive input. This can be done with the \i psql command:

alice=# \i 'query.sql'

Note that psql follows Postgres in allowing backslash escape sequences in character strings. This means that an absolute pathname like D:\subdir\query.sql must be written either by doubling the backslashes, as in 'D:\\subdir\\query.sql', or turning them into forward slashes, as in 'D:/subdir/query.sql'.

Alternativeley, you can run psql with the query file directly from the shell:

> psql -f "query.sql" my_database

Saving query output to a file

You can send the output of queries to a file instead of (not in addition to) your console with the \o psql command:

alice=# \o 'query_output.txt'

Note that the SQL code of queries will not be saved to the file. To stop saving query output and send it to your console again, use the same command, but omit the filename:

alice=# \o

Copying data from a file into a table

You can import data from a file on the client computer into an existing database table using the \copy psql command:

alice=# \copy hw1_data from 'hw1_data.txt'

For the first homework, there is an SQL script that imports the data from the data file into a new table hw1_data. This script uses the \copy command internally.

Getting info about tables and database objects

In SQL Server Management Studio, you can view information about the columns, constraints, and indices on a table through the tree view on the left side of the Management Studio window. To get similar information in Postgres, you use the \d psql command:

alice=# \d hw1_data
           Table "public.hw1_data"
  Column  |         Type          | Modifiers
----------+-----------------------+-----------
 name     | character varying(50) |
 discount | character varying(50) |
 month    | character varying(50) |
 price    | character varying(50) |
 

Getting info about query plans

In Management Studio, you can get the estimated plan for a SQL query by choosing the Query → Display Estimated Execution Plan menu item. The equivalent in Postgres is the EXPLAIN statement of SQL, which produces a plain-text representation of the physical query plan. (Instead of EXPLAIN, SQL Server has the SET SHOWPLAN and SET STATISTICS families of statements, but the syntax is different.)

Here is an example of the use of EXPLAIN on a simple query:

alice=# EXPLAIN SELECT * FROM hw1_data;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on hw1_data  (cost=0.00..7.26 rows=426 width=17)
(1 row)

Management Studio also lets you view the actual plan for a query, by turning on the Query → Include Actual Execution Plan menu option before running the query. The equivalent function in Postgres is the EXPLAIN ANALYZE variation of the EXPLAIN statement:

alice=# EXPLAIN ANALYZE SELECT * FROM hw1_data;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on hw1_data  (cost=0.00..7.26 rows=426 width=17)
(actual time=0.011..0.183 rows=426 loops=1)
 Total runtime: 0.390 ms
(2 rows)