CSE326 Spring 2002 Quiz Section
GROUP WORK: Scenarios for 5/23/02
1.
The
system you are designing requires a sort routine. Unfortunately, memory usage
is at such a high cost on this machine (say it is a small, embedded device)
that you cannot afford to have such (if any overhead) in your sorting
algorithm.
Which sorting routine would you choose?
a)
Quick
Sort
b)
Insertion
Sort
c)
Merge
Sort
d)
Heap
Sort
Discussion Points:
Quick Sort:
-
very
fast in practice
-
recursion
costs memory, but tail recursion or iterative quicksort could make this cost
acceptable
Insertion Sort:
-
simple
mathematical operations
-
only
constant memory overhead
-
no
recursion
-
is
O(n2) in the worst-case, but maybe our data comes presorted
-
if
n is small, this might still beat an n log n algorithm
Merge Sort:
-
uses
a lot of memory
-
if
done in-place, very efficient (guaranteed O(n log n)).
Heap Sort:
-
simple
mathematical operations (only compares, bit shifts, and increments)
-
guaranteed
O(n log n)
-
only
constant memory overhead
-
no
recursion
2.
You
need a data structure with the following functionality. Items are inserted,
looked up, and removed on a fairly
frequent basis. More importantly, every hour, in order to save space, you
remove the least used data from the structure. For example, if data A was
accessed only once in the past hour, you
might want to remove it.
Assuming it is up to you to decide what removing old
data means, which data structure is a good starting point for you to work from?
a)
AVL
Tree
b)
Sorted
List
c)
Splay
Tree
d)
Hash
Table
Discussion Points:
-
This
depends a lot on how we define what old data is. Is it what we accessed most recently or just data that is old.
-
Splay
Trees intrinsically place the least used data deeper in the tree. When cleaning, one could just delete all
nodes beneath a certain depth.
-
Hash
Tables allow for very efficient basic operations (insert, find, remove,
etc.) The only issue is removing the
last used items. Assuming each is
timestamped with the last time it was used, one could traverse across every
item in the table and remove what is “old.”
While this is costly, it’s a rare operation.
-
AVL
trees allow for efficient operation, but maintaining the balance after deleting
several nodes at once could be costly.
3.
We
propose a new operation to the dictionary ADT. RangeRestrict(x,y) should remove
all items in the dictionary whose
values are less than x or greater than y. For example, if the dictionary contains real numbers, after a call to
RangeRestrict, only the numbers in the closed interval [x,y] would be in the dictionary. You may assume that x and y
are in the dictionary.
Which data structure would provide an efficient
implementation for both the traditional dictionary operations as well as
RangeRestriction?
a)
Sorted
List
b)
Hash
Table
c)
Splay
Trees
d)
AVL
Trees
Discussion Points:
-
In
a tree with the binary search tree property, the following is true: the left
subtree of x and the right subtree of y are what you want to delete.
-
Balanced
trees offer fast implementation of the traditional dictionary operations.
-
With
a Splay Tree, one can do a Find(x), DeleteLeftSubtree(x), Find(y),
DeleteRightSubtree(y). Very efficient.
-
With
an AVL tree, recovering the balance of the tree after a RangeRestrict does not
seem easily accomplishable (unless we do a lazy removal).
-
Hashtables
offer fast basic operations, but a RangeRestrict would require traversing
across the entire table. This remove
could also seriously slow down probes if open addressing is used.