SQL
CSE 190 M (Web Programming) Spring 2008
University of Washington
Except where otherwise noted, the contents of this presentation are © Copyright 2008 Marty Stepp, Jessica Miller, and Amit Levy, and are licensed under the Creative Commons Attribution 2.5 License.
Database design
- database design : the act of deciding the schema for a database
- database schema: a description of what tables a database should have, what columns each table should contain, which columns' values must be unique, etc.
- some database design principles:
- keep it simple, stupid (KISS)
- eliminate redundancy, especially redundancy of lengthy data (strings)
First database design
- what's good and bad about this design?
- uses only one table, but contains redundancy
Second database design
- splitting data into two tables (linked by Student ID) avoids redundancy
- this is also called normalizing the database
- normalized tables are often linked by unique integer IDs
IMDb database
Actor
id | fname | lname | gender |
433259 | William | Shatner | M |
797926 | Britney | Spears | F |
831289 | Sigourney | Weaver | F |
... |
Movie
id | name | year |
112290 | Fight Club | 1999 |
209658 | Meet the Parents | 2000 |
210511 | Memento | 2000 |
... |
Cast
aid | mid | Role |
433259 | 313398 | Capt. James T. Kirk |
433259 | 407323 | Sgt. T.J. Hooker |
797926 | 342189 | Herself |
... |
- database name is
imdb
on server webster.cs.washington.edu
- other tables:
- Director (id, fname, lname)
- Movie_Director (did, mid)
- Movie_Genre (mid, genre)
Practice problem: Movie search
- Write a PHP script that connects to the
imdb
database on webster
and searches for all movies whose names match a given prefix, displaying them as an HTML table. Assume that the prefix is a query string parameter passed into the script.
- Consider modifying the code so that, if only one movie matches, it will print the IDs of all actors who acted in that movie. (This isn't very useful, but we'll improve it next time.)
Combining multiple tables: cross product
SELECT column(s) FROM table1, table2, ..., tableN;
SELECT * FROM Student, Grades;
- cross product : combines each row of first table with each row of second
- achieved in SQL by specifying multiple tables in
FROM
clause of SELECT
statement
- produces M * N rows, where table 1 has M rows and table 2 has N
- you probably don't want to do this (too much irrelevant data)
Cross product example
SELECT * FROM Student, Grades;
name | SID | email | SID | CID | grade |
142 | Bart | bart@fox.com | 142 | CPS116 | B- |
142 | Bart | bart@fox.com | 142 | CPS114 | B |
142 | Bart | bart@fox.com | 123 | CPS116 | B+ |
142 | Bart | bart@fox.com | 857 | CPS116 | A+ |
142 | Bart | bart@fox.com | 857 | CPS130 | A+ |
142 | Bart | bart@fox.com | 456 | CPS114 | C |
123 | Milhouse | milhouse@fox.com | 142 | CPS116 | B- |
123 | Milhouse | milhouse@fox.com | 142 | CPS114 | B |
... (24 rows returned) |
- much of the data is meaningless (e.g. Bart mixed with Lisa's courses)
- some columns might repeat (SID)
Joins
SELECT column(s) FROM table1, table2, ..., tableN WHERE condition(s);
SELECT column(s) FROM table1
JOIN table2 ON condition(s)
...
JOIN tableN ON condition(s);
SELECT * FROM Student, Grade WHERE Student.SID = Grade.SID;
- join : a relational database operation that combines records from two or more tables if they satisfy certain conditions
- often the rows are linked by key column values
Join example
SELECT * FROM Student, Grade WHERE Student.SID = Grade.SID;
name | SID | email | SID | CID | grade |
123 | Milhouse | milhouse@fox.com | 123 | CPS116 | B+ |
142 | Bart | bart@fox.com | 142 | CPS116 | B- |
142 | Bart | bart@fox.com | 142 | CPS114 | B |
456 | Ralph | ralph@fox.com | 456 | CPS114 | C |
857 | Lisa | lisa@fox.com | 857 | CPS116 | A+ |
857 | Lisa | lisa@fox.com | 857 | CPS130 | A+ |
table.column
disambiguates two columns with the same name
- an equivalent query:
SELECT * FROM Student JOIN Grade ON Student.SID = Grade.SID;
Filtering columns during a join
SELECT name, Grade.* FROM Student, Grade
WHERE Student.SID = Grade.SID;
name | SID | CID | grade |
Milhouse | 123 | CPS116 | B+ |
Bart | 142 | CPS116 | B- |
Bart | 142 | CPS114 | B |
Ralph | 456 | CPS114 | C |
Lisa | 857 | CPS116 | A+ |
Lisa | 857 | CPS130 | A+ |
- if a column name only exists in one table, it may be written by itself
- to specify all columns from a table, write
table.*
Giving names to tables
SELECT name, g.*
FROM Student s, Grade g
WHERE s.SID = g.SID;
name | SID | CID | grade |
Milhouse | 123 | CPS116 | B+ |
Bart | 142 | CPS116 | B- |
Bart | 142 | CPS114 | B |
Ralph | 456 | CPS114 | C |
Lisa | 857 | CPS116 | A+ |
Lisa | 857 | CPS130 | A+ |
- can give optional names to tables or columns, like a variable name in Java
Self-joins
SELECT *
FROM Student s1, Student s2
WHERE s1.SID < s2.SID;
name | SID | CID | grade |
Milhouse | 123 | CPS116 | B+ |
Bart | 142 | CPS116 | B- |
Bart | 142 | CPS114 | B |
Ralph | 456 | CPS114 | C |
Lisa | 857 | CPS116 | A+ |
Lisa | 857 | CPS130 | A+ |
- can give optional names to tables or columns, like a variable name in Java
Practice problem: Cast list for a movie
Write a PHP script that, when given a movie, shows the names of all female actors that appeared in it. (To do this, you will need to perform an SQL query with join operations.)