CSE 444 Final

December 14, 2001

2:30pm – 4:20pm


Total number of points: 100

Total time: 1h 50’


Text Box: city

[30 points] An insurance company maintains a database whose schema is given by the E/R diagram below. Here premium refers to the annual premium (what the client pays annually the insurance company) and is the same every year; amount is a one-time payment (which the insurance company has to pay to the client).

a.      [5 points] Write SQL statements for creating the tables for the E/R diagram. Indicate keys and foreign keys. Assume the following domains for the attributes:

                                                  i.      all keys (name, pid, cid) are fixed-length, 10 character long

                                                ii.      city is variable length character field, max 30

                                              iii.      all others (years, amounts) are integers


b.     [5 points]. An insurance agent processes the claim with cid=03492 for a client. He would like to find out if there were any other claims for the same property submitted during the same year. Write a SQL query to find this out: your query should return all cid’s of such claims.


c.      [10 points] Write a SQL query that returns, for each year when some claim was filed, the number of claims and the total dollar amount for that year.


d.     [10 points] A “bad” policy is defined to be a policy for which there exists a period of ten continuous years in which the total amount of claims exceeded the premium over those ten years. Write a SQL query to retrieve all “bad” policies





[20 points] Consider three relations, R(A,B), S(B,C), T(C,D) and assume the following statistics:
B(R) = 10000, B(S) = 20000, B(T) = 30000, B(U) = 40000, 
where U = R        S.
Consider the following plan:

We consider evaluating the two joins in a pipeline, as follows:
Step 1. Partition R, then S into buckets, on B, and store the B-buckets on disk.
Step 2. Join each pair of B-buckets from R and S and partition the join tuples immediately on the C attributes (pipeline); store the C-buckets of U on disk.
Step 3. Partition T into buckets, on C, and store its C-buckets on disk.
Step 4. Join each pair of C-buckets from U and T.

Compute the following:

a.      [15 points] The minimum amount of memory, M needed for this plan:
M  =

In the remaining questions use the value for M that you gave at point a:

b.     [1 point]   The number of B-buckets =

c.      [2 points] The average size of each B-bucket;  for R =        for S =

d.     [1 point]   The number of C-buckets =

e.      [2 points] The average size of each C-bucket;  for U =         for T =

3.     [10 points] Answer the following questions.  If you answer “yes” you don’t have to justify your answer; if you answer “no” you need to give an example that justifies your answer.

a.      [5 points] Suppose attribute A is a key in the relation R.

                                                  i.      Is A a key in PABC(R) ?

                                                ii.      Is A a key in sC=55(R)  ?

                                              iii.      Is A a key in R  È  S  ?

                                              iv.      Is A a key in R – S ?

                                                v.      Is A a key in R        T ?

b.     [5 points]  Suppose R has attributes A and B and satisfies the functional dependency A  ® B.

                                                  i.      Does the relation P­ABC(R)
satisfy the functional depencency A ® B ?

                                                ii.      Does the relation sC=55(R)
satisfy the functional depencency A ® B ?

                                              iii.      Does the relation R  È  S
satisfy the functional depencency A ® B ?

                                              iv.      Does the relation R – S
satisfy the functional depencency A ® B ?

                                                v.      Does the relation R        T
satisfy the functional depencency A ® B ?


4.     [20 points] Compute the optimal plan for R        S          T         U using the dynamic programming algorithm, assuming the following:
B(R) = 500,  B(S) = 200,  B(T) = 600,  B(U) = 400
The size of a join is estimated as:  B(A          B) = 0.01 * B(A) * B(B)
The cost of a join is estimated to be the cost of the subplans plus the size of the intermediate result(s).


Text Box: <START T1>
<T1, X1, 1>
<START CKPT  ????>
<T2, X2, 2>
<T1, X1, 3>
<END T1>
<START CKPT  ????>
<T2, X2, 4>
<T3, X3, 5>
<END T2>
<T4, X4, 6>
<END T3>
<T5, X5, 7>

[20 points] After a system’s crash, the undo-log using non-quiescent checkpointing contains the following data:


a.      [5 points] What are the correct values of the three <START CKPT ????>  records ?  You have to provide three correct values for the three “????”s.

b.     [10 points] Assuming that the three <START CKPT ???>  records are correctly stored in the log, according to your answer at a., show which elements are recovered by the undo recovery manager and compute their values after recovery.

c.      [5 points]  Indicate what fragment of the log the recovery manager needs to read.