Objectives:

  • To be able to translate from E/R diagrams to a relational database.
  • To understand functional dependencies and normal forms.

Assignment tools: SQLite (or any other SQL database) and this text file containing additional data.

Due date: Wednesday, May 17th by 11:00pm.
Turn in your solution using the assignment drop box linked from the main course web page.

What to turn in:
You can either draw your diagrams on paper and scan them or use software (e.g., PowerPoint) to create them. In either case, please only turn in PDF files. Indicate, with the name of each file, what problem it is for.

Problems

Problems 1–4 below are worth 4 points each. Problem 5, which is a bit more involved, is worth 8 points.

  1. Design an E/R diagram for geography that contains the following kinds of objects together with the listed attributes:
    • Countries: name, area, population, gdp ("gross domestic product")
    • Cities: name, population, longitude, latitude
    • Rivers: name, length
    • Seas: name, max depths
    Include also the following relationships between the geographical objects:
    • Each city belongs to exactly one country.
    • Each river crosses one or several countries.
    • Each river ends in a river or in a sea.
  2. Consider the following E/R diagram:
    E/R Diagram
    1. Write the SQL CREATE TABLE statements to represent this E/R diagram. Include all keys, foreign keys, and uniqueness constraints. (You do not need to execute the commands in a DBMS.)
    2. How did you represent the relationship "insures" from the E/R diagram? Why did you choose that your representation?
    3. Compare your representation of the relationships "drives" and "operates" in your schema and explain why they are different.
  3. Consider the following two relational schemas and sets of functional dependencies:
    1. R(A,B,C,D,E) with functional dependencies {D → B, CE → A}.
    2. S(A,B,C,D,E) with functional dependencies {A → E, BC → A, DE → B}.
    Decompose each relation into BCNF. Show all of your work and explain, at each step, which dependency violations you are correcting. (I.e., turn in a description of your decomposition steps, not just the final result.)
  4. A set of attributes is called closed (with respect to a given set of functional dependencies) if its closure is itself. I.e., X is closed if X+ = X.
    Consider a relation with schema R(A,B,C,D) and an unknown set of functional dependencies. For each closed attribute set below, give a set of functional dependencies that would produce that result.
    1. All sets of attributes are closed.
    2. The only closed sets are {} and {A,B,C,D}.
    3. The only closed sets are {}, {A,B}, and {A,B,C,D}.
  5. Mr. Frumble, a great childrens character that always gets into trouble, designed a simple database to record projected monthly sales in his small store. He came up with the following schema:
    Sales(name, discount, month, price)
    He inserted his data into the database, then he realized that there is something wrong with it: it was difficult to update! He hires you as a consultant to fix his data management problems. He gives you this file, mrFrumbleData.txt, and says: "Fix it for me!". Help him by normalizing his database. That will make his updates much easier.
    Unfortunately, you cannot sit down and talk to Mr. Frumble to find out what functional dependencies make sense in his business. Instead, you will reverse engineer the functional dependencies from his data instance. You can do so with the following steps:
    1. Using a DBMS of your choice (e.g., SQLite or SQL Server), create the Sales table and load the data from the Mr. Frumble's file into it. You don't need to turn in anything for this part.
    2. Find all functional dependencies in the database.
      This is a reverse engineering task, so expect to proceed in a trial and error fashion. Search first for the simple dependencies, such as name → discount. Then, try the more complex ones, such as (name, discount) → month, as needed. (Try to be clever in order not to check too many potential dependencies, but don't miss any!)
      To check each functional dependency, you will write a SQL query. Your challenge is to write this SQL query for every candidate functional dependency that you check, such that (1) the query's answer is always short (no more than ten lines or so) and (b) you can determine whether the FD holds or not by looking at the query's answer.
      For this point you should turn in all functional dependencies that you found and, for each of them, the SQL query that discovered it, together with the answer of the query.
    3. Decompose the table in BCNF.
      Turn in SQL commands to create the new tables, including any appropriate keys.
    4. Populate your BCNF tables from Mr. Frumble's data. For this you need to write SQL statements that fill in the tables you created at point iii from the table you created at point i. Each such statement can be of the form INSERT INTO X SELECT ... FROM Sales ..., where X is the name of a BCNF table.
      Turn in the SQL statements along with the tables' contents after loading them. (You can obtain the latter by running SELECT * FROM X.)