Assignment 1: SQL Queries

Objectives:

To understand and be able to write simple SQL queries, including updates.

Reading Assignments:

6.1 - 6.6, 7.1

Number of points:

100

Due date:

Wednesday, October 13th in class.

Tools for the assignment

For this assignment, you will use your SQL Server account on IISQLSRV. You should have an account, and you know your password from the first lecture.

As an alternative to SQL Server, you are allowed to use a different database, e.g. MySQL, Postgres, or SQL Server installed in your own machine. In that case you fist need to import the movie database in your database system: you can obtain the data from here here. There is no support for database systems other than IISQLSRV.

What to turn in:

You need to turn in a printout containing the SQL query for each of the questions below, at the end of class on the due date.

Homework Description

In this homework you are asked to write ten SQL queries on a relational movie database. This is a stripped down version of a dataset from the UCI KDD archive. The database exists already on IISQLSRV. The database consists of three tables: filmid is a key for FILMS and stagename is a key for ACTORS. CASTS.film refers to FILMS.filmid and CASTS.actor refers to FILMS.stagename.

  1. List all actors born in 1982, and the movies they cast in.
  2. List all actors who played in 'dead man walking'
  3. List all actors in descending order of the number of films they acted in.
  4. Find the film(s) with the largest cast. Find the film(s) with the smallest cast. In both cases, also return the size of the cast.
  5. The Bacon number of an actor is the length of the shortest path between the actor and Kevin Bacon in the "co-acting" graph. That is, Kevin Bacon has Bacon number 0; all actors who acted in the same film as KB have Bacon number 1; all actors who acted in the same film as some actor with Bacon number 1 have Bacon number 2, etc. Return all actors whose Bacon number is 2. Bonus: Suppose you write a single SELECT-FROM-WHERE SQL query that returns all actors whose Bacon number is infinity. How big is the query?
  6. Find the actors who acted in films of at least 15 distinct categories..
  7. Find all actors who acted only in films before 1960.
  8. Find the films with more women actors than men.
  9. A decade is a sequence of 10 consecutive years. For example 1965, 1966, ..., 1974 is a decade, and so is 1967, 1968, ..., 1976. Find the decade with the largest number of films.
  10. Create a new table with the following schema: ACTORS_DEBUT(stagename, lname, fname, debutfilmname, year), listing for each actor their debut film. You need to use either the INSERT SQL statement or the INTO clause of the SELECT statement. Since the names of the tables created by you need to be unique (two tables with the same name in the same database are not allowed), I recommend naming your table by your CSE account name, or some such thing.