CSE 373, Autumn 2002: Homework 5

Due 11/22/2002

For all program or data structure design problems such as the two below you must provide pseudocode (see the manual) and an adequate explanation of the methods. It is often helpful to include small examples demonstrating the methods. Straight pseudocode with no additional documentation is not enough.  Your pseudocode can contain English if needed.  Each problem should be answered on a separate sheets of paper so they can be graded by different TAs.  Your name should be at the top of every page.
 
  1. (10 points) In this problem you will design algorithms to handle lazy deletion for closed hashing using linear probing. Assume we have an on going process of insertions, deletions, and finds. Design four algorithms where each algorithm is briefly described in pseudocode. Assume the keys are non-negative integers which are hashed into a table T of size HSize. An empty table entry can be indicated using -2. Use -1 in a location to indicate that the entry formerly in the location has been deleted. Your hash table algorithms do not have to resize the hash table.
  2. (10 points) Imagine that you have been asked to to implement a new IRS database. There are 100,000,000 records, each of which take an average of 2,000 bytes. The records are keyed with a Social Security Number which is 4 bytes. The computer that you will be using has 2,048 byte pages and pages are addressed with 4 byte integers. Assume that a B-tree node completely as possible fills a page.  This may means that leaves will hold may hold more or less keys than internal nodes depending on what data is stored on the leaves. Assume that the computer has 16 MB of memory that is usable for storing all or part of a search structure. For the B-tree, disk addresses are used for pointers, and a 2 byte integer is used to keep track of the length of the active area in the B-tree node. The leaves of the B-tree contain pointers to the actual records. We assume that the nodes, other than the root, of the B-tree are about 3/4-th full on average.
  3. (10 points)  A d-heap is like a binary heap except that each node has d children instead of 2.  For a number of reasons a 3-heap can perform better than a binary heap.