CSE 326, Winter 1999: Homework 6

Due 3/5/99

For all program or data structure design problems such as the two below you must provide pseudocode (see the manual)and an adequate explanation of the methods. It is often helpful to include small examples demonstrating the methods. Straight pseudocode with no additional documentation is not enough.

  1. (10 points) Imagine that you have been asked to to implement a new IRS database. There are 100,000,000 records, each of which take an average of 2,000 bytes. The records are keyed with a Social Security Number which is 4 bytes. The computer that you will be using has 2,048 byte pages and pages are addressed with 4 byte integers. Assume that the computer has 16 MB of memory that is usable for storing all or part of a search structure. For the B-tree, disk addresses are used for pointers, and a 2 byte integer is used to keep track of the length of the active area in the B-tree node. The leaves of the B-tree contain pointers to the actual records. We assume that the nodes, other than the root, of the B-tree are about 3/4-th full on average.
  2. (10 points) In the priority queue method for the N-body problem we have to maintain a max-priority queue together with the sum of all the values in the priority queue. This would seem like a simple task because every time we insert into the priority queue we can add the new value to the sum and every time we do a delete-max we can subtract the deleted value from the sum. Unfortunately, this does not work well in practice because the values are floating point numbers and there is wide variation in scale from the smallest entry in the priority queue to the largest entry in the priority queue. We need another method to maintain the sum that is not so sensitive to round-off error.

    My astronomer friends discovered such a method using leftist trees. The idea is to add a new field called sum to a leftist tree node that maintains the sum of all the nodes in the subtree rooted at that node. If these sums are maintained properly then round-off errors can be minimized. The proper way to maintain the sum at a non-leaf node p is to make sure that the sum is always computed as p.key + p.left.sum + p.right.sum where the sums were computed the same way for the children of p. Round-off error is minimized because addition is done by adding up the smallest numbers first, before adding in the larger numbers.

    Your assignment is to design recursive leftist merge operation that properly maintains the sum field. This function then can be used in your designs of insert and delete-max that also maintain the sum field properly.