CSE 373, Autumn 2002: Homework 5
Due 11/22/2002
For all program or data structure design problems such as the two below
you must provide pseudocode (see the manual) and an adequate explanation
of the methods. It is often helpful to include small examples demonstrating
the methods. Straight pseudocode with no additional documentation is not
enough. Your pseudocode can contain English if needed. Each
problem should be answered on a separate sheets of paper so they can be
graded by different TAs. Your name should be at the top of every
page.
-
(10 points) In this problem you will design algorithms to handle lazy deletion
for closed hashing using linear probing. Assume we have an on going process
of insertions, deletions, and finds. Design four algorithms where each
algorithm is briefly described in pseudocode. Assume the keys are non-negative
integers which are hashed into a table T of size HSize. An empty table
entry can be indicated using -2. Use -1 in a location to indicate that
the entry formerly in the location has been deleted. Your hash table algorithms
do not have to resize the hash table.
-
Find(x,T) which returns the index of where x is stored if it is in the
table T and -1 otherwise.
-
Insert(x,T)which inserts x into the table T with the precondition that
x is not currently in the table. Naturally, a location with a -1 in it
can be used to place a newly inserted item. Your algorithm should indicate
when the hash table T is full and x cannot be inserted.
-
Delete(x,T) which removes x from the table T using lazy deletion.
-
Rehash(T) which returns a new table of the same size HSize which contains
all the items of the old table, but has no items marked as deleted.
-
(10 points) Imagine that you have been asked to to implement a new IRS
database. There are 100,000,000 records, each of which take an average
of 2,000 bytes. The records are keyed with a Social Security Number which
is 4 bytes. The computer that you will be using has 2,048 byte pages and
pages are addressed with 4 byte integers. Assume that a B-tree node completely
as possible fills a page. This may means that leaves will hold may
hold more or less keys than internal nodes depending on what data is stored
on the leaves. Assume that the computer has 16 MB of memory that is usable
for storing all or part of a search structure. For the B-tree, disk addresses
are used for pointers, and a 2 byte integer is used to keep track of the
length of the active area in the B-tree node. The leaves of the B-tree
contain pointers to the actual records. We assume that the nodes, other
than the root, of the B-tree are about 3/4-th full on average.
-
Calculate the maximum number of children (M) that an internal node of the
B-tree can have. Calculate the maximum number of entries (L) a leaf of
the B-tree can have.
-
Calculate the height of the B-tree with that order so that all the 100,000,000
records can be accessed. About how many children does the root have.
-
Calculate how many levels of the B-tree can fit into the memory of the
computer. Several full levels and the portion of one level will fit into
memory.
-
Calculate how many disk accesses in the worst case are necessary to find
a specific record using this search structure. How many on average?
-
(10 points) A d-heap is like a binary heap except that each node
has d children instead of 2. For a number of reasons a 3-heap can
perform better than a binary heap.
-
Assuming a d-heap is stored in an array indexed 1 to Max, give a formulas
for computing the index of the d children of the node at index i.
Give a formula for computing the parent of a node at index i.
-
Design PercolateDown(i : integer, x : keyvalue) for the 3-heap.
-
Design PercolateUp(i : integer, x: keyvalue) for the 3 heap.