Distributed Transaction Application in Java

The purpose of this project is to gain an understanding of the interaction between various components of a TP application, and the implementation issues involved. Your goal is to construct a distributed application in Java that implements a travel reservation system.

The project is organized as a sequence of steps that iteratively add components and system structure, working toward the ultimate goal of a multiple client, multiple server, scalable system. The steps are not all of comparable difficulty, and therefore should not be used as weekly milestones. Effort required is also not proportional to the length of the specification, so long specs might be easier to implement than short ones.

Students will work in pairs. Many of the individual blocks and functions can be done in parallel. For example, locking and resource management (steps 1 and 2) can be developed separately.

Common interfaces will be provided for components interacting with the client so that a common client can attach to and use any project’s server. We will provide a basic test scripting interface to ensure minimal functionality. It is your responsibility to augment these tests (using the standard interfaces) to make certain that your service resists failure. If the test scripts turn out to be flexible enough, we may collect all student-written test scripts and apply them all to your project to see if it fails.

Your options for software development environment include the following:

There will be at least one mid term review of your progress. This review is only a checkpoint, and not a graded activity.


1. Build a lock manager package/class. The operations that must be supported include:

You need to handle deadlocks, probably by a timeout mechanism. A failed (deadlocked) lock operation should throw an exception. You should also be able to convert locks, e.g.,:

lock(xid, foo, read); 
/* read foo */ 
lock(xid, foo, write);
/* write foo ... you would include error checking and exception handling...*/

Keep in mind that other transactions may have read locks on foo, so deadlock is possible. The thingBeingLocked (foo) should probably be a string, and the lock objects should be stored in a hash table (don't forget to lock the hash table).

This component is independent of the rest of the project, so it makes sense to do it first, to give you experience with the Java environment while you get warmed up to TP concepts described in class. Lock managers are described in the textbook in Chapter 6, Section 6.2.

2. Build a simple Resource Manager.

The simple RM implements transactions. That is, it supports the methods start, commit, and abort, and all data access (read/write) operations are associated with a transaction. Initially, everything will be stored in one RM. Later, data will be partitioned into multiple RMs, where each RM stores some part of the data to be operated on.

You might choose to break this step into several pieces, and skip some intermediate parts. For example, atomicity (in this case, support for commit() and abort()) is more significant when there's a disk image involved, so you might choose to defer that until later.

The operations to be supported are outlined in the attached Java interface. The next two paragraphs will make more sense if you have this in front of you.

Assume that there is only one airline (so a flight identifier is an integer), only one type of car, only one type of hotel room, and only one day. Since there is only one type of things, there is only one price. The net effect of { addCars(T1, ‘San Diego’, 4, $52); addCars(T2, ‘San Diego’, 7, $54); } should leave 11 cars at either $52 or $54, not 7 cars at $54 and 4 cars at $52. We know these assumptions sacrifice verisimilitude, but they should enable implementation of the persistent database relatively quickly. It should be possible to query for which reservations the customer holds, and how much the customer should be charged. Don't bother with an account payment feature. The system will now look like: client<-><->RM.

The Technical Interface methods are defined to make it easier to test for faults. The shutDown() method implies that the RM should shut down gracefully. In this case, that means cleaning up its files, so that next time it starts up, it does not attempt to recover its state. The selfDestruct() method exists to allow failure generation between two disk writes. The idea is that it sets a counter of disk writes that will be executed successfully before the RM terminates. The RM will have to startup and recover from termination.

If you choose to implement it, atomicity in this step will be accomplished by shadowing as in step 4. Copy your memory image, update it, then set the pointer to the active memory image to the new one on commit. In step 4, this will be a disk image that will be copied, updated, and relinked (renamed). The only failure to handle is an abort. Since the memory image is lost when the process terminates, there doesn’t need to be any recover() method at this stage.

3. Isolation. Combine parts 1 and 2. That is, the RM should lock data appropriately for each transaction, and unlock data when the transaction commits or aborts. We will test this implementation using multiple clients and a single resource manager. You might experiment with different locking granularities at this stage. The system will now look like: client*<->RM. (* means arbitrarily many)

4. Durability. Add persistence and recovery to the Resource Manager. All state is stored on disk. The disk image is updated when a transaction commits. The RM must implement a recover() method to restore its state from the state on disk, and gracefully handle various exceptions, such as operations called with unknown (forgotten) transaction ids.

This will be accomplished using shadowing, described in the lecture notes on Database Recovery. See also, footnote on page 251 of text. There will be other references on shadowing.

The system will now look like: client*<->RM<->disk.

5. Implement a workflow controller. Workflow control is described in Section 2.4. The WC will be a front-end so that the eventual location (partitioning) of data on the RM’s is not exposed to the client. In other words, the client will ask the WC for a reservation on flights 435 and 534 and a rental car in St. Louis, and the WC will start a transaction, contact the RM for flight 435 and make a reservation, contact the RM for flight 534 and make a reservation, then contact the RM for cars in St. Louis and make a reservation. At the moment, this is done using just one RM, and there are more pieces to build before the WC can be used to do this.

The workflow controller will support the RM interface exported in step 2, and add the reserveItinerary(customer, flight_list, location, bCar, bRoom) method. A customer will have only one itinerary. This intinerary reservation is the sort of high level operation associated with a workflow controller, and can be implemented here. The parameters are: customer, the customer id from newCustomer; flight_list, a vector of integer flight numbers; location, the place where rooms or cars might be reserved; and bCar/bRoom, true if the customer wants a car/room reservation. All other methods are passed directly to the only RM that exists at this stage.

The workflow controller will be given the list of active RMs as command line arguments on startup.

6. Implement a Transaction Manager. The TM supports the following operations: start, commit, abort, enlist. The enlist method is called by an RM to tell the TM that it is involved in a transaction. The other methods are called by the workflow controller on behalf of the client. At this stage, the TM needs no persistence.

Since the TM exists behind the WC interface, no client interfaces will be provided for the TM.

The workflow controller will need to be given the hostname of the TM, in addition to the list of active RMs. The RMs will be given the hostname of the TM on startup.

7. Run multiple RMs. The TM will maintain a list for each active transaction of, which RMs are involved, and implement one phase commit. The WC will decide which pieces go where.

8. Modify the TM to store a list of which transactions committed, necessary for two phase commit below.

9. Implement two phase commit. At this stage, code the basics of two phase commit. That is, implement commit and abort under the assumption that messages never get lost or excessively delayed.

10. Now worry about what happens on failure. In particular, handle cases where messages get lost and ensure the RM’s can recover from being in the undecided state (in those cases where it’s technically feasible).

11. After you finish the first set of steps, we've thought about a handful of enhancments that could be made to make the project more enjoyable, or more applicable. This is certainly not an exhaustive list.

  1. Logging instead of shadowing. Shadowing has the advantage of simplicity and the disadvantage of poor performance, sort of like bubble sort. Logging allows better performance, but is tricky. This sub-project would be appropriate for someone literate in Java and interested in performance issues, and might involve not implementing shadowing at all.
  2. TM committed transaction list garbage control. The Transaction Manager keeps a list of committed transactions, so that a Resource Manager can connect to it after recovery and ask if a particular transaction was committed. Since storage is not infinite, implement a garbage control scheme for this list of committed transactions.