CSEP 544 Homework 2

Objectives:
To be able to translate from E/R diagrams to a relational database, and to understand functional dependencies and normal forms.
Due date:
Monday, January 27, 2014. Submit online using Catalyst. Turn in the following files: hw2-erdiagram.png, hw2-answers.txt, hw2-q2.sql, hw2-q5.sql, as explained below. Please do NOT zip the files, and be sure to check that you have submitted all 4 files.
Assignment Tools:
PostgreSQL
  1. [10 points] Design an E/R diagram for geography that contains the following kinds of objects together with the listed attributes: Model the following relationships between the geographical objects:

    Submit your diagram in a PNG image file called hw2-erdiagram.png. If you do not want to create a PNG, a PDF is also acceptable, but you must name it hw2-erdiagram.pdf.

  2. [15 points] Consider the following E/R diagram:
    [Figure: E/R diagram]
    1. Write the SQL CREATE TABLE statements to represent this E/R diagram. Include all keys, foreign keys, and uniqueness constraints. You do not need to run these commands in SQL.
    2. Which relation in your relational schema represents the relationship "insures" in the E/R diagram, and why did you choose that representation?
    3. Compare the representation of the relationships "drives" and "operates" in your schema, and explain why they are different.

    Turn in your SQL statements for the first part in a file called hw2-q2.sql. Submit your answers to the remaining parts in a file called hw2-answers.txt.
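
    As a rough illustration of how E/R relationships can map to tables, the sketch below uses Python's built-in sqlite3 module on a hypothetical diagram that is not the one in this question. The entities Advisor, Student, and Course, the many-to-one relationship between Student and Advisor, and the many-to-many relationship Enrolled are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Advisor (
    aid  INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Many-to-one: each student has at most one advisor, so the relationship
-- is stored as a foreign key column inside Student.
CREATE TABLE Student (
    sid        INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    advisor_id INTEGER REFERENCES Advisor(aid)
);
CREATE TABLE Course (
    cid   INTEGER PRIMARY KEY,
    title TEXT NOT NULL UNIQUE
);
-- Many-to-many: the relationship gets its own table, keyed by both entity keys.
CREATE TABLE Enrolled (
    sid INTEGER REFERENCES Student(sid),
    cid INTEGER REFERENCES Course(cid),
    PRIMARY KEY (sid, cid)
);
""")

conn.execute("INSERT INTO Advisor VALUES (1, 'Ada')")
conn.execute("INSERT INTO Student VALUES (10, 'Bo', 1)")

# A student pointing at a non-existent advisor should be rejected.
try:
    conn.execute("INSERT INTO Student VALUES (11, 'Cy', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

    The two relationships end up represented differently only because of their multiplicities; a different diagram may call for different choices.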

  3. [10 points] Consider the following two relational schemas and sets of functional dependencies:
    1. R(A,B,C,D,E) with functional dependencies D -> B, CE -> A.
    2. S(A,B,C,D,E) with functional dependencies A -> E, BC -> A, DE -> B.
    For each of the two schemas, decompose the relation, as necessary, into a collection of relations that are in BCNF. Show all of your work and explain which dependency violations your decompositions correct. Turn in a description of each decomposition step that shows the relation you are decomposing, the functional dependency you apply, and the two resulting relations.

    Put your answers in hw2-answers.txt.
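
    The mechanics of a single decomposition step can be sketched in Python as follows. This is a minimal sketch on a hypothetical relation T(X,Y,Z) with the single dependency X -> Y, not on either schema above. The helper names closure, bcnf_violations, and decompose_once are made up for illustration, and attributes are assumed to be single letters so that a string such as "XY" can stand for the set {X, Y}.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD if its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Listed FDs that are non-trivial and whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not set(schema) <= closure(lhs, fds)
    ]


def decompose_once(schema, lhs, fds):
    """Split schema on a violating left side: (lhs+, lhs plus everything outside lhs+)."""
    closed = closure(lhs, fds) & set(schema)
    return closed, set(lhs) | (set(schema) - closed)


schema = "XYZ"        # hypothetical relation T(X, Y, Z)
fds = [("X", "Y")]    # hypothetical dependency X -> Y

print(closure("X", fds))
print(bcnf_violations(schema, fds))
print(decompose_once(schema, "X", fds))
```

    Checking only the listed dependencies is enough to decide whether the original relation is in BCNF. After a split, the dependencies that hold on each resulting relation have to be worked out from closures before the check is repeated, which this sketch does not do.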

  4. [10 points] We say that a set of attributes X is closed (with respect to a given set of functional dependencies) if X+ = X. Knowing which attribute sets are closed gives us some information about the underlying functional dependencies.

    Consider a relation with schema R(A,B,C,D) and an unknown set of functional dependencies. For each description of the closed attribute sets below, give a set of functional dependencies that is consistent with it.

    1. All sets of attributes are closed.
    2. The only closed sets are {} and {A,B,C,D}.
    3. The only closed sets are {}, {A,B}, and {A,B,C,D}.
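
    To make the definition of a closed set concrete, the following sketch goes in the opposite direction from the question: it starts from a known set of dependencies and lists every closed attribute set. The three-attribute schema T(P,Q,R) with the single dependency P -> Q is a made-up example, the function names are hypothetical, and attributes are assumed to be single letters.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def closed_sets(schema, fds):
    """All subsets X of schema with X+ = X, written as strings ('' is the empty set)."""
    found = []
    for size in range(len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if closure(combo, fds) == set(combo):
                found.append("".join(combo))
    return found


# Hypothetical schema T(P, Q, R) with the dependency P -> Q.
print(closed_sets("PQR", [("P", "Q")]))
# prints ['', 'Q', 'R', 'PQ', 'QR', 'PQR']
```

    Here {P} and {P,R} are missing from the output because their closures pick up Q. A brute-force check of this kind can be used to test a candidate answer on a small schema.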
    Put your answers in hw2-answers.txt.

  5. [30 points] Mr. Frumble (a great character for small kids who always gets into trouble) designed a simple database to record projected monthly sales in his small store. He never took a database class, so he came up with the following schema:
    Sales(name, discount, month, price)

    He inserted his data into the database, then realized that there was something wrong with it: it was difficult to update. He hires you as a consultant to fix his data management problems. He gives you the file mrFrumbleData.txt and says: "Fix it for me!". Help him by normalizing his database. Unfortunately, you cannot sit down and talk to Mr. Frumble to find out what functional dependencies make sense in his business. Instead, you will reverse engineer the functional dependencies from his data instance. You should do the following steps:

    1. Create a table in the database and load the data from the provided file into that table; use SQLite or any other relational DBMS of your choosing. You don't need to turn in anything for this point.
    2. Find all functional dependencies in the database. This is a reverse engineering task, so expect to proceed in a trial-and-error fashion. Search first for the simple dependencies, say name → discount, then try the more complex ones, like name, discount → month, as needed. To check each functional dependency you have to write a SQL query. Your challenge is to write this SQL query for every candidate functional dependency that you check, such that (a) the query's answer is always short (say, no more than ten lines or so), and (b) you can determine whether the FD holds or not by looking at the query's answer. Try to be clever in order not to check too many dependencies, but don't miss potentially relevant dependencies.

      For this point you should turn in all functional dependencies that you found, and for each of them the SQL query that discovered it, together with the answer of the query.

    3. Decompose the table into BCNF, and create SQL tables for the decomposed schema. Create keys and foreign keys where appropriate.

      For this point turn in the SQL commands for creating the tables.

    4. Populate your BCNF tables from Mr. Frumble's data. For this you need to write SQL queries that load the tables you created in step 3 from the table you created in step 1.

      Here, turn in the SQL queries that load the tables, and the tables' contents after loading them (obtained by running SELECT * FROM Table).
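
    For reference, what it means for a functional dependency to hold on a data instance (the property checked in step 2) can be spelled out in a few lines of Python. This is only a sketch of the definition on a made-up instance with hypothetical columns item, color, and shelf. It is written in plain Python rather than SQL and says nothing about how to keep a query's answer short.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Map each lhs value that occurs with more than one rhs value to those rhs values."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        seen[key].add(tuple(row[a] for a in rhs))
    # The FD lhs -> rhs holds on this instance exactly when nothing is returned.
    return {k: v for k, v in seen.items() if len(v) > 1}


# Made-up instance; none of these values come from mrFrumbleData.txt.
rows = [
    {"item": "pen", "color": "blue", "shelf": 1},
    {"item": "pen", "color": "blue", "shelf": 2},
    {"item": "cup", "color": "red", "shelf": 1},
]

print(fd_violations(rows, ["item"], ["color"]))  # empty: item -> color holds here
print(fd_violations(rows, ["item"], ["shelf"]))  # 'pen' appears with two shelves
```

    An empty result only shows that the dependency holds on this particular instance, which is the sense in which dependencies are reverse engineered in this question.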

    Submit all answers and SQL queries for this question in hw2-q5.sql. Clearly indicate in comments which part you are answering; anything that is not a SQL statement should be commented out.

  6. [25 points] Consider the following relational schema:

    Employee(eid, name, office)
    Manager(eid, mid)

    Each employee has a unique key, eid. An employee may have several managers, who are, in turn, employees: both attributes eid and mid in Manager(eid, mid) are foreign keys to Employee.

    For each of the queries below, write it in the relational calculus, in datalog, and in the relational algebra. You should return three answers: (a) a relational calculus expression; (b) a query in datalog+negation; and (c) a relational algebra plan.

    1. Write a query that retrieves all employees that have two or more managers. Your query should return the eid's and the names.

    2. An independent employee is an employee without a manager. (For example, the CEO is independent.) Write a query that retrieves all independent employees; you should return their eid and their names.

    3. Retrieve the office of all managers of the employee called 'Alice'. If there are multiple employees called Alice, or if one of them has several managers, you have to return all their offices.

    4. Find all managers for which all the employees they manage share the same office. Your query should return their eid, their name, and the office where all the employees they manage are located.

    5. A manager is an employee who manages at least one other employee. A second-level manager is a manager who manages only managers. Write a query to return all second-level managers; your query should return their eid's and their names.
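
    As an illustration of how the three notations line up, the sketch below writes one small query in all three forms and evaluates the relational algebra plan over Python sets. The schema Book(isbn, title), Loan(isbn, member) and its data are hypothetical and unrelated to Employee and Manager; the helper functions are made up for the example and are not a general relational algebra engine.

```python
# Hypothetical relations, stored as sets of tuples.
Book = {(1, "Dune"), (2, "Emma"), (3, "Ulysses")}   # Book(isbn, title)
Loan = {(1, "ann"), (1, "bob"), (3, "ann")}         # Loan(isbn, member)


def project(rel, *cols):
    """Projection onto the given column positions (duplicates vanish in a set)."""
    return {tuple(t[c] for c in cols) for t in rel}


def join_on_first(r, s):
    """Join r and s on their first column; keep r's columns plus the rest of s."""
    return {a + b[1:] for a in r for b in s if a[0] == b[0]}


# Query: books that are not on loan to anyone.
#
# Relational calculus:
#   { (i, t) | Book(i, t) AND NOT EXISTS m . Loan(i, m) }
# Datalog with negation:
#   OnLoan(i)    :- Loan(i, m).
#   Answer(i, t) :- Book(i, t), NOT OnLoan(i).
# Relational algebra:
#   Book JOIN (PROJECT_isbn(Book) - PROJECT_isbn(Loan))
not_on_loan = project(Book, 0) - project(Loan, 0)
answer = join_on_first(Book, not_on_loan)
print(sorted(answer))
# prints [(2, 'Emma')]
```

    The negated datalog atom and the set difference in the algebra plan play the same role, and both are applied to a projection so that the variable under negation also appears in a positive atom.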