CSEP 544 Homework 4

Objectives:: To understand query execution, recovery and concurrency control.
Reading Assignments:: Chapters: 17.2-17.4 and 18
Number of points:: 100
Due date:: May 9, 6:30 pm by the Catalyst Dropbox. You can also turn in a hardcopy in class on May 5 if you prefer handwritten solutions.

Questions:

[20 points] After a system’s crash, the undo-log using non-quiescent checkpointing contains the following data:



      a.      What are the correct values of the three <START CKPT ????> records ? You have to provide three correct values for the three “????”s.

      b.      Assuming that the three <START CKPT ???> records are correctly stored in the log, according to your answer at a., show which elements are recovered by the undo recovery manager and compute their values after recovery.

        c.      Indicate what fragment of the log the recovery manager needs to read.

[20 points] The SuperSQL database system stores its undo log file in a table, with the following schema:

Log(N, T, E, V)

where N is the entry number (0, 1, 2, 3, …), T is the transaction id, E is the element id, and V is the old value. A log entry of the form <T, E, V> is simply represented by the tuple (N, T, E, V), where N is the entry number; here E > 0. The log entries <START T>, <COMMIT T>, and <ABORT T> are represented by a tuple (N, T, E, null), where E=-1 for START, E=-2 for COMMIT, and E=-3 for ABORT. For example, the log:

. . . .

Is represented by the table:

N	T	E	V
0	T1	-1
1	T1	X1	55
2	T2	-1
3	T2	X2	99
4	T1	-2
. . .

Recall that each transaction starts and ends at most once, i.e. a sequence <START T> … <COMMIT T> … <START T> … will not occur in the log. Moreover, any update by the transaction T will occur between its <START T> and <COMMIT T>, or between <START T> and <ABORT T> respectively.

Write a SQL query that can be run during database recovery, after a system crash. The answer to your query should return a table with two attributes, E, and V, indicating which elements have to be updated with what values. You should include each element E at most once in your answer: otherwise it would be ambiguous how to update it.

[40 points] Solve the following problems from the textbook:
3.1. For each of the following schedules:
1. r₁(A); r₂(A); r₃(B); w₁(A); r₂(C); r₂(B); w₂(B); w₁(C);
2. r₁(A); w₁(B); r₂(B); w₂(C); r₃(C); w₃(A);
3. r₁(A); r₂(A); w₁(B); w₂(B); r₁(B); r₂(B); w₂(C); w₁(D);
Answer the following questions:

i. What is the precedence graph for the schedule?

ii. Is the schedule confict-serializable? If so, what are all the equivalent serial schedules?

3.2.
Say that a transaction T₁ precedes a transaction T₂ in a schedule S, if every action of T₁ precedes every action of T₂. Note that if the schedule consists only of the transactions T₁ and T₂ and T₁ precedes T₂, then S is the serial schedule T₁;T₂. However, if T₁ precedes T₂ in S and S involves transactions other than T₁ and T₂, then S may not be a serial schedule, and may not even be serializable.
1. Give an example of a schedule S where T₁ precedes T₂ and S is not conflict serializable.
2. Give an example of a schedule S such that: T₁ precedes T₂, S is confict-serializable, and T₂ precedes T₁ in every serial schedule conflict-equivalent to S.
3.3
Consider a timestamp based concurrency control manager: the manager assigns a unique time stamp to each transaction, maintains for each element X the three values RT(X), WT(X), C(X), and decides at each read/write request of a transaction T whether to grant, to rollback, to ignore, or to defer. Consider the partial schedules below (where st means start transaction). What does the concurrency control manager do for the last request in each schedule ?
1. st1; r1(A); st2; r2(B); w2(A); co2; w1(B);
2. st1; r1(A); st2; w2(B); r2(A); co2; w1(A);
3. st1; r1(A); st2; w2(B); r2(A); co2; w1(C);
4. st1; r1(A); st2; w2(B); r2(A); w1(A);
3.4.
Consider an optimistic concurrency control manager (based on validation). We denote R_i to mean "transaction T_istarts and performs its read phase", V_i(X; Y) to "T_i attempts to validate, its read set is X, its write set is Y", and W_i to mean "T_i performs its write phase and finishes". For each validation attempt below, indicate whether the validation, succeeds, fails due to a R-W conflict, or fails due to a W-W conflict.
1. R1; R2; V1(A,B; A); R3; V3(C,D; B); W1; V2(B,C; A); W2; W3;
2. R1; R2; V1(A,B; A); R3; V3(C,D; B); W1; W3; V2(B,C; A); W2;
[20 points] Indicate for each statement below if it is true or false. You do not have to justify your answer:
1. Every schedule that is possible under a timestamp based concurrency control scheduler is also possible under a multiversion concurrency control scheduler.
2. Every schedule that is possible under a multiversion concurrency control scheduler is also possible under a timestamp based concurrency control scheduler.
3. Suppose that every transaction has the form ST, RD(X), WT(Y), CO, for some elements X and Y. Then a 2-phase locking scheduler that uses shared locks for read operations and exclusive locks for write operations will never result in a deadlock.
4. Suppose that all transactions are either read-only (i.e. they have the form ST,RD(X),RD(Y ), . . . ,CO), or they consists of a single write (i.e. they have the form ST,WT(X),CO). Then a 2-phase locking scheduler that uses shared locks for read operations and exclusive locks for write operations will never result in a deadlock.