HW5: Relational Algebra and Database Design

  1. Consider the following relational schema:

    product(pid, pname, price)
    orders(oid, pid, cid, quantity, date)
    customer(cid, cname, city)
    

    Products have an ID (the primary key), a name, and a selling price.

    Customers have an ID (the primary key), a name, and the city where they live.

    Each order is for one product and one customer. The quantity is an integer strictly greater than zero, which indicates how many of the product was ordered.

    Draw a relational algebra query plan (tree) for the following query:

    SELECT pname, avg(quantity)
    FROM product p, orders o, customer c
    WHERE p.pid = o.pid 
    AND c.cid = o.cid
    AND p.price <= 10
    GROUP BY p.pid, pname
    HAVING count(*) > 100
    

    You may draw your diagram on paper or use a drawing tool such as draw.io. If you draw your diagrams on paper, be sure to submit high-quality pictures of it. If we can't read it, we can't grade it.

  2. Draw an Entity-Relationship diagram for a mapping application. Your diagram must contain at least the following list of entities, along with their attributes, as well as the given relationships. Be sure to choose the correct primary key(s) based on the information below.

    Entities:

    • countries: name, area, population, GDP ("gross domestic product")
      • a country's name uniquely identifies it
    • cities: name, population, longitude, latitude
      • two different cities may have the same name (e.g., there are 41 different cities in the U.S. called "Springfield")
      • a city is uniquely identified by the combination of its latitude and longitude
    • rivers: name, length
    • seas: name, depth
    • rivers and seas are uniquely identified by their name within the set of all water entities
      • e.g., if there were a river named "Ganges", it would be a unique water entity—no other river or sea could have that name

    Relationships:

    • each city belongs to exactly one country
    • each river crosses one or more countries
    • each country can be crossed by zero or more rivers
    • each river must end in either a river or a sea

    You may draw your diagram on paper or use a drawing tool such as draw.io. If you draw your diagrams on paper, be sure to submit high-quality pictures of it. If we can't read it, we can't grade it.

  3. Consider the following ER diagram.

    Translate the diagram into a series of SQL CREATE TABLE statements, including all key constraints (primary and foreign). Make sure your statements are syntactically correct by running them through SQLite. Then, paste them into your PDF writeup.

  4. Which relation in your schema represents the insures relationship in the ER diagram? Why is that your representation? (A sentence or two is sufficient.)

  5. Compare the representation of the relationship haunts to that of caresFor in your schema and explain why they are different. (A sentence or two is sufficient.)

  6. Consider a relation \(R(A, B, C, D, E)\) with functional dependencies \(D \to B\) and \(CE \to A\). Decompose \(R\) into BCNF. At each step, explain which bad FD you are correcting. For each original FD, explain which output table from your decomposition it applies to after normalization, or state that the FD does not apply to any output table.

  7. Same instructions as the previous problem, but for a relation \(S(A, B, C, D, E)\) with the three functional dependencies \(A \to E\) and \(BC\to A\) and \(DE \to B\).

  8. Consider a relation \(T(A, B, C, D)\) with a to-be-determined set of functional dependencies. We call a set of attributes \(X\) closed if \(X^+ = X\) (ie, if the closure of \(X\) is itself).

    For each scenario below, construct one example set of functional dependencies for \(T\) that is consistent with the scenario.

    • All subsets of \(\{A, B, C, D\}\) are closed.
    • The only closed subsets of \(\{A, B, C, D\}\) are \(\{\}\) and \(\{A, B, C, D\}\).
    • The only closed subsets of \(\{A, B, C, D\}\) are \(\{\}\), \(\{A, C\}\), and \(\{A, B, C, D\}\).
  9. Mr. Frumble (a great character for small kids who always get into trouble) designed a simple database to record the projected monthly sales in his small store. He never took a data management class, so he came up with the following schema:

    sales(name, discount, month, price)
    

    He quickly realized it was difficult to update. Fortunately, he has hired you to fix his data management problems! He gives you his file mrFrumbleData.csv and says: "Fix it for me!".

    You decide to help Mr. Frumble by normalizing his database. Unfortunately, he has already dashed off and you are unable to ask him what functional dependencies make sense in his business. Instead, you will reverse engineer them from his data instance.

    Start by loading Mr. Frumble's data into a SQLite table. Submit your CREATE TABLE statement but not your import statements.

    1. Find all non-trivial functional dependencies in his schema.

      This is a reverse engineering task, so expect to proceed in a trial and error fashion. We recommend first searching for simple dependencies (eg, name → discount), and only searching for more-complex dependencies (eg, name, discount → month) as needed. For each candidate FD you consider, you will need to write a SQL query to verify whether it does or does not hold.

      Write and submit a SQL query for each non-trivial functional dependency you find, along with a comment stating the FD it checks for.

      Also write and submit one SQL query for a functional dependency that does not hold in the dataset, along with a comment stating the FD and the fact that it does not hold.

    2. Decompose Mr. Frumble’s single table into Boyce-Codd Normal Form (BCNF). No need to show your work. Create SQL tables for the decomposed schema. Create keys and foreign keys where appropriate.

      Submit the SQL commands for creating the tables.

    3. Write and submit SQL commands that load the data from the original table into your new tables. In a SQL comment, report the number of rows in each of your new tables.

    Collect your answers to all subquestions of this question in a file called frumble.sql.

Submit your answers to problems 1 through 8 to the HW5 Written assignment on Gradescope. Submit your answers to problem 9 to the HW5 Programming assignment on Gradescope.