CSE 544 Homework 4

This homework tests your understanding of disk architecture, B trees, query execution, and query optimization. You may need to read the textbook, as well as other sources to answer some of the questions.  Some basic mathematic skills are required.

Solution

  1.  [25 points total ] The HomeWotron disk has the following characteristics:

Compute the following:

  1. [4 points] What is the disk’s capacity, expressed in Giga bytes (GB)? What is the maximum transfer rate, expressed in mega bytes per second (MB/s) ?

  2. [4 points]  Recall that the time needed to read a block of data consists of the seek time, the rotational latency, and the transfer time.  Compute the transfer time for one block. Compute the average rotational latency. Your result should consist of two numbers, both expressed in milliseconds (ms).

  3. [4 points]  Assume cylinders are numbered 0, 1, …, 4095.  Compute the seek time for a block on cylinder 1500 assuming that the head is positioned on cylinder 500 when the read command is issued. Your result should be a number, expressed in milliseconds.

  4. [4 points] Generalize the result above: compute the seek time for a block on cylinder x, assuming that the head is positioned on cylinder y when the read command is issued.  Your answer should be a mathematical function s(x,y), where both x and y are number between 0 and 4095 (it is OK to use if-then-else in a mathematical function: s(x,y) can be expressed using an if-then-else, but can also be expressed without one).  Note: to verify that your result is correct, compute s(1500,500): it should return the same number you gave at point c.

  5. [4 points] Compute the average seek time. For that, assume that both the block’s position x and the head’s initial position y are random numbers, uniformly distributed in {0, 1, …, 4095}. Your result should be a number, expressed in milliseconds.

  6. [5 points] The HomeWotron SX is the top of the line model: it comes with two heads instead of one. The heads can move independently. When a read command is issued, any of the two heads can be chosen to read the block. The HomeWotron SX controller can be configured in one of two modes. In the NonPartitioned mode, the head closest to the cylinder where the block is positioned is chosen to read the block. In the Partitioned mode, head 1 is used to read cylinders 0, …, M/2, while head 2 is used to read cylinders M/2+1,…,M-1 (where M is the number of cylinders, M=4096 in our case). If your goal is to minimize average seek time, assuming uniform distribution both for the block position and the heads’ positions, which of the two modes would you choose? Your answer should consist of two mathematical formulas showing the average seek time for both modes, and an indication which is smaller. (Hints. Work with M rather than 4096. Use integrals rather than summations: integrals are approximations, but are easier to work with.)

 

  1. [20 points] Consider the B+ tree of order d=2 (i.e. each internal node has between 2 and 4 keys) represented here or below. Show the tree after each of the following operations:

  1. Insert the key  230.

  2. Insert the key  155.

  3. Delete the key 10.

  4. Delete the key 8.

Your answer should show four B+ trees. Notice: each operation is applied to the tree resulting from the previous operation. That is, your final answer should represent the tree with 230, and 155 inserted and 10 and 8 deleted.

 

  1. [30points] Solve problem 12.4 from the textbook.


  2. [25 points] Assume that we have n+1 relations, with the following schemas:
    S(C1,C2,..., Cn), R1(A1,B1),..., Rn(An,Bn),
    Consider two join expressions:
        E = R1  B1=A2 R2 B2=A3 . . .Rn-1 Bn-1=An  Rn
        F = S  C1=A1 R1 C2=A2 R2 . . .Rn-1 Cn=An  Rn
    Assume that we restrict the joins plans only to join trees without cartesian products. We distinguish between a plan AB  and a plan BA. Answer the following::
       a. [5 points] How many left-linear join trees are there for query E.
       b. [5 points] How many bushy join trees are there for query E.
       c. [5 points] How many left-linear join trees are there for query F.
       d. [5 points] How many bushy join trees are there for query F
       e. [5 points] Compare the four numbers above.
    Your first four answers should consists of four mathematical formulas in n, such as n!, or nn ,or 5. You may want to consult the Catalan numbers, e.g. in the book Concrete Mathematics, by Graham, Knuth, and Patashnik, Example 4 on page 357.