CSEP544 - Assignment 4

Objectives:
To understand query execution, recovery and concurrency control.
Number of points:
100
Due date:
Feb. 27th, 2007 at 6.00pm

Questions:
  1. [20 points] After a system’s crash, the undo-log using non-quiescent checkpointing contains the following data:

                                         

          a.      What are the correct values of the three <START CKPT ????>  records ?  You have to provide three correct values for the three “????”s.

          b.      Assuming that the three <START CKPT ???>  records are correctly stored in the log, according to your answer at a., show which elements are recovered by the undo recovery manager and compute their values after recovery.

                      c.        Indicate what fragment of the log the recovery manager needs to read.
  2.  
  3. [20 points] The SuperSQL database system stores its undo log file in a table, with the following schema:

              Log(N, T, E, V)

    where N is the entry number (0, 1, 2, 3, …), T is the transaction id, E is the element id, and V is the old value.  A log entry of the form <T, E, V> is simply represented by the tuple (N, T, E, V), where N is the entry number; here E > 0. The log entries <START T>, <COMMIT T>, and <ABORT T> are represented by a tuple (N, T, E, null), where E=-1 for START, E=-2 for COMMIT, and E=-3 for ABORT. For example, the log:

    <START T1>

    <T1 X1 55>

    <START T2>

    <T2 X2 99>

    <COMMIT T1>

    . . . .

    Is represented by the table:

    N

    T

    E

    V

    0

    T1

    -1

     

    1

    T1

    X1

    55

    2

    T2

    -1

     

    3

    T2

    X2

    99

    4

    T1

    -2

     

    . . .

     

     

     

    Recall that each transaction starts and ends at most once, i.e. a sequence <START T> … <COMMIT T> … <START T> … will not occur in the log. Moreover, any update by the transaction T will occur between its <START T> and <COMMIT T>, or between <START T> and <ABORT T> respectively.

    Write a SQL query that can be run during database recovery, after a system crash. The answer to your query should return a table with two attributes, E, and V, indicating which elements have to be updated with what values. You should include each element E at most once in your answer: otherwise it would be ambiguous how to update it.

  4. [20 points] Solve the following problems from the textbook:

3.1.   For each of the following schedules:

  1. r1(A); r2(A); r3(B); w1(A); r2(C); r2(B); w2(B); w1(C);
  2. r1(A); w1(B); r2(B); w2(C); r3(C); w­3(A);
  3. r1(A); r2(A); w1(B); w2(B); r1(B); r2(B); w2(C); w1(D);

Answer the following questions:

     i.  What is the precedence graph for the schedule?

     ii. Is the schedule confict-serializable? If so, what are all the equivalent serial schedules?

3.2.
Say that a transaction T precedes a transaction U in a schedule S if every action of T precedes every action of U in S. Note that if T and U are the only transactions in S, then saying T precedes U is the same as saying that S is the serial schedule (T, U). However, if S involves transactions other than T and U, then S might not be serializable, and in fact, because of the effect of other transactions, might not even be conflict-serializable. Give an example of a schedule S such that:

               i.In S, T1 precedes T2, and

              ii.S is confict-serializable, but

              iii.In every serial schedule conflict-equivalent to S, T2 precedes T1.

 3.3
Below are several sequences of events, including start events, where sti means that transaction Ti starts. These sequences represent real time, and the timestamp-based scheduler will allocate timestamp to transactions in the order of the starts. Tell what happens as each executes.

                a. st­1; st2; r1(A); r2(B); w2(A); co2; w1(B);

                b. st1; r1(A); st2; w2(B); r­2(A); co2; w1(B);

3.4. 
In the following sequences of events, we use Ri(X) to mean “transaction Ti starts, and its read set is the list of database elements X.” Also, V­means “Ti attempts to validate,” and Wi(X) means that “Ti finishes, and its write set was X.” Tell what happens when each sequence is processed by a validation-based scheduler.

    1.  R1(A, B); R2(B, C); V1; R3(C, D); V3; W1(A); V2; W2(A); W3(B);
    2. R1(A, B); R2(B, C); V1; R3(C, D); V3; W1(A); V2; W2(A); W3(D);

  1. [20 points] Consider a database with objects X and Y and assume that there are two transactions T1 and T2. Transaction T1 reads objects X and Y and then writes X. Transaction T2 reads objects X and Y and then writes objects X and Y. Now answer the following questions
    1. Give an example schedule with actions of transactions T1 and T2 on objects X and Y, which is not conflict serializable, and which results in a write-read conflict
    2. Give an example schedule with actions of transactions T1 and T2 on objects X and Y, which is not conflict serializable, and which results in a write-write conflict
    3. Give an example schedule with actions of transactions T1 and T2 on objects X and Y, which is not conflict serializable, and which results in a read-write conflict
  2. [20 points] You will use only the Movie table in the IMDB database on IPROSRV for this task.  You need to come up with as many SQL SELECT queries as it takes you to get query plans that include all of these features:

    1. the id index
    2. the name index
    3. the year index
    4. the name,year index
    5. the year,name index
    6. a nested loop join
    7. a hash join
    8. a merge join

Your queries may not use the OPTION or related keywords to force the use of a particular index or join type.  Turn in a numbered (corresponding to the numbers above) list of your queries; it should look something like this:

1 - select ... from movie ... where ...

...

2,5,6 - select ... from movie ... where ...

Note: In SQL Server 2005, you can see the execution plan for a query by going to Query >> Display Estimated Execution Plan (or CTRL+L).