Getting Started with PostgreSQL

Introduction: Why PostgreSQL?

As you know, we will be using Microsoft SQL Server on IPROJSRV in this class. However, to reduce the load on IPROJSRV and to gain experience with multiple database systems, we will also be using the open-source PostgreSQL database system, running on your home machine with data stored in your private files.

You will have the option of using either SQL Server or PostgreSQL on the last problem in the second homework assignment. This document will tell you most of what you need to know to do this problem in PostgreSQL. Later, in the third homework, you will be required to use PostgreSQL to implement the back-end database for your program (the CUSTOMER database).

Getting a PostgreSQL command prompt

Unlike SQL Server, PostgreSQL is not usually used through a graphical interface. Rather, the server and client tools are typically accessed from a command line. You can get a command shell in Windows by running cmd.exe. The CSEP544 shell launcher script will also open a shell for you. Type psql -U postgres at the prompt and hit Enter. Here, postgres is the username of the database superuser. You will be asked for a password; enter the password you specified during installation. You should see a welcome message similar to the following:

Welcome to psql 8.3.7, the PostgreSQL interactive terminal.
[[Continues...]]
postgres=#

Enter \q to quit the PostgreSQL shell.


Creating a PostgreSQL database

Just as in SQL Server, your tables must be placed in a database. There are two ways to create a database in PostgreSQL; here we describe how to create one directly from the Windows shell, as opposed to doing so from SQL code. To create a database named my_database, do the following:

  1. Open a Windows shell.
  2. Use the createdb command to create the database:
    > createdb -U postgres my_database

Note that, unlike SQL Server, PostgreSQL is case-sensitive when looking up database and table names. However, PostgreSQL automatically lowercases all unquoted names given in SQL code, so for you, the case-sensitivity should only affect database names (since createdb is a shell command, not an SQL statement, and passes the name through unchanged). Be consistent in case when naming a PostgreSQL database; for example, if you ran createdb CUSTOMER, access the database using the name CUSTOMER exactly, respecting case.
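
The case-folding rule described above can be seen directly from a psql prompt. The following is a minimal sketch, assuming you are connected as a user allowed to create databases; the names Scratch_Db and Demo_Table are hypothetical and chosen only for illustration.

```sql
-- Unquoted names are folded to lowercase, so this creates a database
-- named scratch_db (Scratch_Db is a hypothetical name).
CREATE DATABASE Scratch_Db;

-- Double quotes preserve case. This creates a database named CUSTOMER,
-- the same name that "createdb CUSTOMER" produces from the shell.
CREATE DATABASE "CUSTOMER";

-- The same folding applies to table names (Demo_Table is hypothetical).
CREATE TABLE Demo_Table (id integer);
SELECT * FROM DEMO_TABLE;     -- works: both spellings fold to demo_table
SELECT * FROM "Demo_Table";   -- fails: no table with this exact mixed-case name
```

In short, as long as you never double-quote names in SQL code, case only matters for names that reach PostgreSQL from outside SQL, such as the argument to createdb or psql.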

Running queries with psql

To run SQL queries on SQL Server, you use SQL Server Management Studio (aka "SSMS" or "SqlWb"). In PostgreSQL, you use the psql utility. This command-line program doesn't have the fancy features of SSMS; however, psql is easy enough to use, provided you don't have so much data that a plain-text display of that data becomes unreadable. If you prefer a graphical interface for running queries, try pgAdmin III, located in the Start->Programs menu under "PostgreSQL 8.3" on the Windows Lab machines, or check out the PostgreSQL wiki's list of GUI query tools.

Running psql

You can run psql by opening a Windows shell and typing

> psql -U postgres my_database

where my_database is the name of the database you want to use. Enter the password when prompted. When psql opens, it will show the following message:

Welcome to psql 8.3.7, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

my_database=#

The line my_database=# is the prompt for SQL statements which are sent to the database server, or non-SQL commands interpreted by psql. Here, "my_database" is the name of the database; it may differ on your system. As the message suggests, you exit psql by typing \q and hitting Enter.

Entering queries

To run an SQL statement, just type it in. SQL statements can be split across multiple lines; to send the SQL statement to the server, end the statement with a semicolon and hit Enter. Depending on the command, psql will either respond with a confirmation message:

my_database=# DELETE FROM hw1_data WHERE name='name';
DELETE 1

or display the results of the query in a table:

my_database=# SELECT * FROM hw1_data;
  name  | discount | month | price
--------+----------+-------+-------
 bar1   | 15%      | apr   | 19
 bar8   | 15%      | apr   | 19
 gizmo3 | 15%      | apr   | 19
 gizmo7 | 15%      | apr   | 19
 mouse1 | 15%      | apr   | 19
 bar1   | 15%      | aug   | 19
 bar8   | 15%      | aug   | 19
 gizmo3 | 15%      | aug   | 19
 gizmo7 | 15%      | aug   | 19
 mouse1 | 15%      | aug   | 19
 bar1   | 33%      | dec   | 19
 bar8   | 33%      | dec   | 19
 gizmo3 | 33%      | dec   | 19
 [[Continues...]]
 (426 rows)

If the result table is too large to fit in the shell window, it will be shown one window-ful at a time; press Enter to go on to the next window, until the end.

If you made a mistake while typing in a query, you can use the up-arrow and down-arrow keys on the keyboard to move between previously entered lines, which you can then edit and resubmit.
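
To illustrate the earlier point that a statement may span several lines, here is a small sketch of a query against hw1_data, using the column names from the sample output above. The particular filter and sort order are assumptions made only for the example.

```sql
-- A single statement typed across several lines.
-- psql sends it to the server only after the terminating semicolon.
SELECT name, price
FROM hw1_data
WHERE month = 'apr'
ORDER BY name;
```

If you hit Enter before typing the semicolon, psql simply waits for more input; nothing runs until the statement is terminated.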

Running queries from an SQL file

As in SQL Server Management Studio, you can use psql to run SQL code from an external file as well as from interactive input. This can be done with the \i psql command:

my_database=# \i 'query.sql'

Note that psql follows PostgreSQL in allowing backslash escape sequences in character strings. This means that an absolute pathname like D:\subdir\query.sql must be written either by doubling the backslashes, as in 'D:\\subdir\\query.sql', or turning them into forward slashes, as in 'D:/subdir/query.sql'.

Alternatively, you can run psql with the query file directly from the shell:

> psql -U postgres -f "query.sql" my_database

Saving query output to a file

You can send the output of queries to a file instead of (not in addition to) your console with the \o psql command:

my_database=# \o 'query_output.txt'

Note that the SQL code of queries will not be saved to the file. To stop saving query output and send it to your console again, use the same command, but omit the filename:

my_database=# \o
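
Put together, a typical session using \o might look like the sketch below. The file name is the one used above; the two queries are assumptions chosen only to show where each result ends up.

```sql
-- Redirect query results to a file.
\o 'query_output.txt'

-- The result of this query goes to the file, not to the console.
SELECT name, price FROM hw1_data WHERE month = 'apr';

-- With no argument, \o sends output back to the console.
\o

-- This result appears on screen again.
SELECT count(*) FROM hw1_data;
```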

Copying data from a file into a table

You can import data from a file on the client computer into an existing database table using the \copy psql command:

my_database=# \copy hw1_data from 'hw1-data.txt'
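
Because \copy loads into an existing table, the table has to be created first. The sketch below assumes the column definitions reported by \d hw1_data later in this document, and assumes the input file is in the default text format (one row per line, columns separated by tabs). The comma-separated file hw1-data.csv is hypothetical, and the delimiter clause is written in the syntax accepted by the PostgreSQL 8.3 series.

```sql
-- The table must exist before \copy can load into it.
-- Column definitions follow the \d hw1_data output shown in this document.
CREATE TABLE hw1_data (
    name     varchar(50),
    discount varchar(50),
    month    varchar(50),
    price    varchar(50)
);

-- Default text format: columns separated by tab characters.
\copy hw1_data from 'hw1-data.txt'

-- Alternative (use instead of, not in addition to, the command above):
-- a comma-separated file; hw1-data.csv is a hypothetical file name.
\copy hw1_data from 'hw1-data.csv' with delimiter ','
```

Since \copy reads the file on the client side, the path is resolved relative to the directory from which you started psql.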

Getting info about tables and database objects

In SQL Server Management Studio, you can view information about the columns, constraints, and indices on a table through the tree view on the left side of the SSMS window. To get similar information in PostgreSQL, you use the \d psql command:

my_database=# \d hw1_data
           Table "public.hw1_data"
  Column  |         Type          | Modifiers
----------+-----------------------+-----------
 name     | character varying(50) |
 discount | character varying(50) |
 month    | character varying(50) |
 price    | character varying(50) |
 

Getting info about query plans

In SSMS, you can request that the estimated plan for a query be displayed by choosing the Query->Display Estimated Execution Plan menu item. The equivalent in PostgreSQL is the EXPLAIN statement of SQL, which produces a plain-text representation of the physical query plan. (EXPLAIN is not part of the SQL standard and does not appear in SQL Server. In SQL Server, the SET SHOWPLAN and SET STATISTICS families of statements provide similar functionality to EXPLAIN, but the syntax is awkward.)

Here is an example of the use of EXPLAIN on a simple query:

my_database=# EXPLAIN SELECT * FROM hw1_data;
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on hw1_data  (cost=0.00..7.26 rows=426 width=17)
(1 row)

SSMS also lets you view the actual plan for a query once the query has been executed, by turning on the Query->Include Actual Execution Plan menu option before running the query. The equivalent function in PostgreSQL is the EXPLAIN ANALYZE variation of the EXPLAIN statement:

my_database=# EXPLAIN ANALYZE SELECT * FROM hw1_data;
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Seq Scan on hw1_data  (cost=0.00..7.26 rows=426 width=17) (actual time=0.011..0.183 rows=426 loops=1)
 Total runtime: 0.390 ms
(2 rows)
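
One caveat: EXPLAIN ANALYZE actually executes the statement in order to measure it. That is harmless for a SELECT, but for a statement that modifies data you will usually want to undo the effect. A minimal sketch, assuming the hw1_data table from the earlier examples (the DELETE condition is only illustrative):

```sql
-- EXPLAIN ANALYZE really runs the statement, so wrap data-modifying
-- statements in a transaction and roll it back afterwards.
BEGIN;
EXPLAIN ANALYZE DELETE FROM hw1_data WHERE month = 'apr';
ROLLBACK;  -- the rows deleted while measuring are restored
```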