CSE 326 Summer 2003
Homework 7

NOTE: This homework is for your practice only.  It will not be turned in.

1. Calculate the amount of memory in bytes used by each of the following dictionary data types where:

• pointers require 4 bytes
• keys require 8 bytes
• data associated with each key requires 2 bytes
• there are 10,000 (key,data) pairs
• “ints” are 4 bytes

a.  sorted array

b.  BST

c.  AVL tree

d.  splay tree

e.  hash table – separate chaining, lambda=0.75

f.  hash table – closed hashing, lambda=0.4. The hash table array should store the keys and data themselves, not pointers to the keys or data.

g.  b-tree, where the branching factor is 5. Read about b-trees in Weiss section 4.7.

2. The Unix 'spell' command reads an input file and prints out all the words in the input file that are not in some reference dictionary. Suppose the reference dictionary contains 30,000 words, and we only wish to make one pass through the reference file. A simple strategy is to read the reference dictionary into a hash table and use the hash table to look for each input word as it is read.

If memory is limited, however, and the entire dictionary cannot be stored in a hash table, we can still get an efficient algorithm that almost always works. We declare array 'table', with type bool or boolean (initialized to false) of size tableSize. As we read in a word from the reference dictionary, we set table[hash(word)] = true. Which of the following statements are true? Justify your answers.

a) If a word hashes to a location with value false, then the word is not in the dictionary.

b) If a word hashes to a location with value true, then the word is in the dictionary.

Suppose we choose tableSize = 300,007.

c) Assume that an average word can be stored in 7 bytes, that a bool or boolean can be represented with one bit, and that the table of bools or booleans is packed as densely as possible (8 bool or boolean per byte). How much memory does the table require?

d) What is the probability that a misspelled word is NOT recognized as a misspelling by this algorithm? What is the probability that the algorithm mistakes a properly-spelled word as a misspelling?

3. You have a set of elements a through j and perform the following sequence of operations on a Disjoint Sets ADT:

find(d)

union(d, a)

union(b, c)

union(h, j)

find(c)

union(h, b)

find(j)

union(b, a)

a) Without weighted union or path compression, and choosing the root of a union by selecting the alphabetically smaller node, show the up-tree forest which results from the above operations. At what depth does node j end up?

b) With weighted union, but without path compression, and breaking ties on unions by selecting the alphabetically smaller node, at what depth does node j end up? It is not necessary to show the resultant up-tree forest again.

c) With weighted union and path compression, and breaking ties on unions by selecting the alphabetically smaller node, at what depth does node j end up? It is not necessary to show the resultant up-tree forest again.

4. Graphs: Weiss 9.5, 9.16, 9.41

5. Sorting: Weiss 7.17, 7.19, 7.20.   For 7.19, explain what a cutoff is.