Objectives:

  • To understand how SQL queries are translated into the relational algebra.
  • To master writing relational queries in a logic formalism, datalog.

Assignment tools: None

Due date: Thursday, November 9th, 2017 by 11:00pm.
Turn in your solution using the assignment drop box linked from the main course web page.

What to turn in:
One file, hw4-answers.{txt|rtf|pdf|doc|docx} (any one of those formats)

Setup

In this homework, we will work with the following relational schema:

  • Employee (eid, name, office) — each row gives the unique ID of the employee (eid) along with their name and address.
  • Manager (eid, mid) — each row says that the employee with ID mid is a manager of the employee with ID eid.

Both the eid and mid values in Manager should hold the IDs of employees that exist. That is, both columns eid and mid in Manager are foreign key references to Employee(eid).

Also note that each employee can have multiple managers, and each manager can manage multiple employees. This is why Manager has no primary key.

Problems

For each query described below, show how to find the answer using datalog and relational algebra. In other words, your answer to each question will have two parts, where part (a) is a datalog program and part (b) is a relational algebra expression, both of which express the same query.

Assume that your datalog programs are using set semantics. However, for relational algebra, we will use bag semantics.

  1. [3 points] Retrieve all employees that have two or more managers. Output their IDs and names.
  2. [3 points] Retrieve all employees that have no managers. (For example, the CEO of the company has no manager.) Output their IDs and names.
  3. [3 points] Retrieve the offices of all managers of employees named 'Alice'. (Note that there could be multiple employees named 'Alice', and each such employee could have multiple managers.)
  4. [3 points] Find all the managers with the property that every employee they manage is located in same office. Output the manager's ID and name and the office where all their employees are located.
  5. [3 points] Find all the "second-level managers", that is, those managers who manage at least one employee that is also a manager (i.e., that manages at least one other employee). Output each second-level manager's ID and name.

Optional: If you want to double check your answers, you can try executing them in DLV. (You will need to make up some facts about Employees and Managers.)

Extra Credit:

  • [1 point] Write a datalog query that finds all employees that report directly or indirectly to 'Alice'. That is, your query should return the employees managed by Alice, the employees managed by someone managed by Alice, and so on.