CSE 414 Homework 4: Relational Algebra, SQL, and Query Execution

Objectives:
To gain experience with relational algebra and to understand how SQL queries are translated into relational algebara.
Reading assignments:
2.4, 5.1, 5.2 (relational algebra); 15.1-15.6 (query execution)
Assignment tools:
None needed
Due date:
Wednesday, April 29, 2015, at 11:00 pm. Use the assignment dropbox link on the course home page to submit your work.
What to turn in:
hw4-answers.{txt|pdf|doc[x]}. You may scan and submit hand-drawn diagrams if you wish as long as the resulting files are not too large (a few MB at most, please). Be sure to label your work so we can find your solutoins to each problem.
  1. Which of the following identities are correct? In each case, either answer YES, or answer NO and give a counterexample:

    1. δ(δ(R)) = δ(R)

    2. ΠLL(R)) = ΠL(R), where L = a set of attributes

    3. ΠKL(R)) = ΠK,L(R), where K, L = sets of attributes

    4. σCC(R)) = σC(R), where C = a condition

    5. γL, agg(A)L, agg(A)(R)) = γL, agg(A)(R), where L = a set of group-by attributes, agg is an aggregate operator, and A is an attribute


  2. Consider the following relational schema, representing a database of documents and a set of keywords:

    Doc(did, docTitle)
    Occurs(did, wid)
    Word(wid, word)
    Keywords(word)

    For each of the queries below, show an equivalent relational algebra plan:

    1. SELECT z.word, count(*)
      FROM Doc x, Occurs y, Word z
      WHERE x.did = y.did and y.wid = z.wid
      GROUP BY z.wid, z.word
    2. SELECT x.did, x.docTitle
      FROM Doc x
      WHERE NOT EXISTS (SELECT *
                        FROM Occurs o, Word w, Keyword k
                        WHERE x.did = o.did AND o.wid = w.wid AND w.word != k.word );
      
  3. Consider the following relational schema:

    Employee(eid, name, office)
    Manager(eid, mid)

    Each employee has a unique key, eid. An employee may have several managers, who are, in turn, employee: both attributes eid and mid in Manager(eid, mid) are foreign keys to Employee.

    For each of the queries below, write it as a SQL query and show an equivalent relational algebra plan.

    1. Write a query that retrieves all employees that have two or more managers. Your query should return the eid's and the names.

    2. An independent employee is an employee without a manager. (For example, the CEO is independent.) Write a query that retrieves all independent employees; you should return their eid and their names.

    3. Retrieve the office of all managers of the employee called 'Alice'. If there are multiple employees called Alice, or if one of them has several managers, you have to return all their offices.

  4. Exercise 15.6.2 (based on textbook pp. 745). Suppose B(R) = 10,000 and T(R) = 500,000. Suppose there is an index on R.a and assume that V(R,a) = k for some number k. Give the cost of σa=0(R) as a function of k under the following circumstances. You should neglect the cost of any disk I/O operations needed to access the index itself.

    1. The index is clustered (the book says "the index is clustering"; it is the same thing).

    2. The index is not clustered.

    3. No index is used but R is clustered on attribute a.