Sets, Maps, and BSTs
Two principles for understanding Binary Search Trees and how they affect the implementation of the data structure.
Kevin Lin, with thanks to many others.
Ask questions anonymously on Piazza. Look for the pinned Lecture Questions thread.

Tree Abstraction
[Figure: a tree whose nodes hold the values 0 through 6, annotated with the terms root, leaf (also a subtree), subtree, root of subtree, nodes, and values.]
Recursive Description: root and subtree.
Each subtree is itself a valid tree.
A tree with zero subtrees is a leaf.
Relative Description: node and value.
Each node has a value. A parent node is joined with an edge to each child node.
We often abuse the terminology a bit by saying things like, “each parent is the sum of its children”, when we really mean that each parent’s value is the sum of its children’s values.
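Both descriptions can be sketched in a few lines of Java. The class name TreeNode, its fields, and the tree shape built in main are hypothetical, chosen only for illustration; they are not taken from the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node class: each node has a value and a list of child subtrees.
class TreeNode {
    int value;
    List<TreeNode> children = new ArrayList<>();

    TreeNode(int value, TreeNode... kids) {
        this.value = value;
        for (TreeNode kid : kids) {
            children.add(kid);
        }
    }

    // Recursive description: a tree with zero subtrees is a leaf.
    boolean isLeaf() {
        return children.isEmpty();
    }

    // Counts nodes by treating each child as a valid tree in its own right.
    int size() {
        int total = 1;
        for (TreeNode child : children) {
            total += child.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // An illustrative shape only; the slide's exact arrangement is not assumed.
        TreeNode root = new TreeNode(3,
            new TreeNode(1, new TreeNode(0), new TreeNode(2)),
            new TreeNode(5, new TreeNode(4), new TreeNode(6)));
        System.out.println(root.size());    // 7
        System.out.println(root.isLeaf());  // false
        System.out.println(root.value);     // 3
    }
}
```

In this sketch, root is the root node (an object with children), while root.value is the root value (the number stored in it).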

?: What does the “root node” refer to? What does the “root value” refer to?

Feedback from the Reading Quiz

Analysis of an OrderedLinkedSet<Character>
OrderedLinkedSet is an implementation of Set using Linked Nodes.
Name an operation that takes worst case linear time, i.e. Θ(N).
A. size    B. contains    C. add    D. iterator    E. isEmpty
[Figure: an OrderedLinkedSet containing the keys A through G; its size is 7.]
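To see where the linear time comes from, here is a minimal sketch of an ordered linked set. The class below is an assumption made for illustration (the actual OrderedLinkedSet from the reading may differ, for example in whether it caches its size); it stores char keys rather than Character to keep the code short.

```java
// Hypothetical sketch of an ordered linked set of characters.
class OrderedLinkedSet {
    private static class Node {
        char key;
        Node next;
        Node(char key, Node next) { this.key = key; this.next = next; }
    }

    private Node head;  // node with the smallest key, or null if the set is empty
    private int size;   // cached, so size() is constant time in this sketch

    int size() { return size; }

    boolean isEmpty() { return size == 0; }

    // Walks the list from the front; visits up to N nodes in the worst case.
    boolean contains(char key) {
        for (Node p = head; p != null && p.key <= key; p = p.next) {
            if (p.key == key) return true;
        }
        return false;
    }

    // Inserts key at its sorted position; also up to N steps in the worst case.
    void add(char key) {
        if (head == null || key < head.key) {
            head = new Node(key, head);
            size++;
            return;
        }
        Node p = head;
        while (p.next != null && p.next.key <= key) {
            p = p.next;
        }
        if (p.key != key) {  // skip duplicates
            p.next = new Node(key, p.next);
            size++;
        }
    }

    public static void main(String[] args) {
        OrderedLinkedSet set = new OrderedLinkedSet();
        for (char c : "DAGBDFCE".toCharArray()) {
            set.add(c);
        }
        System.out.println(set.size());         // 7
        System.out.println(set.contains('G'));  // true, after walking the whole list
        System.out.println(set.contains('H'));  // false
    }
}
```

Searching for the largest key (G here) forces contains to pass every node, which is the worst case the question is asking about.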


Optimization: Express Lanes
Problem: Search is slow even if we spend extra time adding keys to their sorted position.
Solution 1: Add (random) express lanes. Data structure is called a Skip List (out of scope).
[Figure: the linked list of keys A through G with express lanes added.]
We won’t be discussing skip lists in depth, but hold onto this idea of creating shortcuts.

Optimization: Move Entry Point
Problem: Search is slow even if we spend extra time adding keys to their sorted position.
Solution 2: Move the entry pointer to the middle.
[Figure: the linked list of keys A through G with the entry pointer at D.]
All keys to the left of the entry point are less-than D (come before it in the alphabet) while all the keys to the right of the entry point are greater-than D.

Optimization: Move Entry Point, Flip Links
Problem: Search is slow even if we spend extra time adding keys to their sorted position.
Solution 2: Move the entry pointer to the middle. Flip the left links.
[Figure: the keys A through G with the entry pointer at D and the links to the left of D flipped.]
?: How does this change affect the worst-case search time?




?: How can we improve this optimization?

Optimization: Move Entry Point, Flip Links, Use Longer Hops
Problem: Search is slow even if we spend extra time adding keys to their sorted position.
Solution 2: Move the entry pointer to the middle. Flip the left links. Use longer hops.
[Figure: the keys A through G, entered at D, redrawn with longer hops.]
We saw this pattern of recursive subdivision in merge sort, and it’s here again!

?: What is the worst-case search time?

Order Theory
Based on the ordering given by the binary search tree to the left, fill in the tree to the right with valid symbols.
?: Binary search trees are related to OrderedLinkedSets. What do we know about the relationship between the square symbol and the triangle symbol?






Applying Order Theory
Say we have the following total order.

Assume that there are several other symbols not shown above.

In which of the five labeled nodes (A through E) can the pentagon symbol reside?

Binary Search Tree Invariant
Say we have the following total order.



Binary Search Tree Invariant. For every node X in the tree:
All keys in the left subtree ≺ X’s key.
All keys in the right subtree ≻ X’s key.
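One way to make the invariant concrete is a checker that carries a lower and an upper limit down the tree. This is a sketch under assumptions: the class BSTCheck, its Node type, and the use of String keys are hypothetical, and the example tree in main uses keys that appear later in the slides in an arrangement assumed for illustration.

```java
// Hypothetical invariant checker for a BST with String keys.
class BSTCheck {
    static class Node {
        String key;
        Node left, right;
        Node(String key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // Returns true if every key in the subtree lies strictly between lower and upper.
    // A null bound means "no limit on that side".
    static boolean isBST(Node x, String lower, String upper) {
        if (x == null) return true;
        if (lower != null && x.key.compareTo(lower) <= 0) return false;
        if (upper != null && x.key.compareTo(upper) >= 0) return false;
        // Going left tightens the upper limit; going right tightens the lower limit.
        return isBST(x.left, lower, x.key) && isBST(x.right, x.key, upper);
    }

    public static void main(String[] args) {
        Node good = new Node("dog",
            new Node("bag", new Node("alf", null, null), new Node("cat", null, null)),
            new Node("flat", new Node("elf", null, null), new Node("glut", null, null)));
        // "eyes" sits in dog's left subtree even though it comes after dog.
        Node bad = new Node("dog",
            new Node("bag", new Node("alf", null, null), new Node("eyes", null, null)),
            new Node("flat", null, null));
        System.out.println(isBST(good, null, null));  // true
        System.out.println(isBST(bad, null, null));   // false
    }
}
```

Note that checking each node only against its immediate children would wrongly accept the second tree, since eyes does come after its parent bag; the limits inherited from ancestors are what catch the violation.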
?: If we search a left subtree, how does that change the lower limit on the keys? The upper limit?




?: If we search a right subtree, how does that change the lower limit on the keys? The upper limit?

Key Comparison
Formally, the ordering must be complete, transitive, and antisymmetric.
Given distinct keys p and q:
Exactly one of p ≺ q and q ≺ p is true.
p ≺ q and q ≺ r imply p ≺ r.

One consequence of these rules: No duplicate keys allowed!
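In Java, the relation ≺ is usually expressed through the sign of compareTo (for types that implement Comparable) or through a Comparator. The helper name precedes and the class KeyOrder below are hypothetical, introduced only to illustrate the rules.

```java
import java.util.Comparator;

class KeyOrder {
    // p ≺ q, expressed with the sign of compareTo.
    static <K extends Comparable<K>> boolean precedes(K p, K q) {
        return p.compareTo(q) < 0;
    }

    public static void main(String[] args) {
        // Numbers and strings come with a built-in order.
        System.out.println(precedes(3, 5));          // true
        System.out.println(precedes("cat", "dog"));  // true
        System.out.println(precedes("dog", "cat"));  // false

        // For arbitrary objects, a Comparator supplies the order.
        // Ordering strings by length alone treats "cat" and "dog" as the same key,
        // so a tree built on this order could hold only one of them.
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        System.out.println(byLength.compare("cat", "dog"));  // 0
    }
}
```

The last line shows why the choice of comparison matters: two keys that compare as equal are duplicates as far as the tree is concerned, even if the objects themselves differ.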
?: What is the purpose of this formal definition of key comparison? How do we apply these rules to numbers vs. arbitrary objects?




?: How might we allow duplicate keys in our binary search tree in spite of these rules? What are the potential problems that arise?

Binary Search Tree Algorithms
Let’s take a look at some algorithms that rely on the Binary Search Tree Invariant.

Search Algorithm
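// Returns the node holding sk, or null if sk is not in the tree.
// Assumes Key implements Comparable<Key>, so compareTo stands in for ≺.
static BST contains(BST T, Key sk) {
  if (T == null)
    return null;                   // empty subtree: not found
  if (sk.equals(T.key))
    return T;
  else if (sk.compareTo(T.key) < 0)
    return contains(T.left, sk);   // sk ≺ T.key
  else
    return contains(T.right, sk);  // sk ≻ T.key
}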
If searchKey equals T.key, return.
If searchKey ≺ T.key, search T.left.
If searchKey ≻ T.key, search T.right.
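The three steps can also be written as a loop. The following is a minimal runnable sketch, not the course's implementation: the class BSTSearch, the String key type, and the arrangement of the example tree (dog at the root, bag and flat below it) are assumptions made for illustration.

```java
class BSTSearch {
    static class BST {
        String key;
        BST left, right;
        BST(String key, BST left, BST right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // Iterative search: each comparison discards one whole subtree.
    static BST contains(BST T, String sk) {
        while (T != null) {
            int cmp = sk.compareTo(T.key);
            if (cmp == 0) return T;             // searchKey equals T.key
            T = (cmp < 0) ? T.left : T.right;   // go left if ≺, right if ≻
        }
        return null;                            // fell off the tree: not found
    }

    // Builds the example tree in an assumed arrangement.
    static BST example() {
        return new BST("dog",
            new BST("bag", new BST("alf", null, null), new BST("cat", null, null)),
            new BST("flat", new BST("elf", null, null), new BST("glut", null, null)));
    }

    public static void main(String[] args) {
        BST root = example();
        System.out.println(contains(root, "elf").key);      // elf
        System.out.println(contains(root, "eyes") == null);  // true
    }
}
```

Searching for elf visits dog, then flat, then elf: one node per level of the tree.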
[Figure: a BST containing the keys dog, bag, flat, alf, cat, elf, and glut.]

Search Algorithm Analysis
What is the runtime to complete a search on a bushy BST in the worst case, where N is the number of nodes?
Θ(log N)
Θ(N)
Θ(N log N)
Θ(N²)
Θ(2^N)
We don’t yet have a formal definition for the concept of bushiness. Use the example as a visual aid.





?: What is the best case runtime?


Height of Tree
The number of edges on the longest path between the root node and any leaf.
A path is a connected sequence of edges that join parent-child nodes.
The height of this tree is 3.
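The definition translates into a short recursive method. This is a sketch with hypothetical names (TreeHeight, Node). Treating the empty tree as having height -1 is a convention assumed here so that a single node, with zero edges below it, comes out at height 0.

```java
class TreeHeight {
    static class Node {
        Node left, right;
        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    // Number of edges on the longest path from x down to a leaf.
    // Assumed convention: the empty tree has height -1.
    static int height(Node x) {
        if (x == null) return -1;
        return 1 + Math.max(height(x.left), height(x.right));
    }

    public static void main(String[] args) {
        Node leaf = new Node(null, null);
        // A chain of four nodes has three edges.
        Node chain = new Node(new Node(new Node(new Node(null, null), null), null), null);
        System.out.println(height(leaf));   // 0
        System.out.println(height(chain));  // 3
    }
}
```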
?: What’s the height of a tree that only consists of a single leaf? How many edges are between the root node and the (only) leaf node in this tiny tree?

Adding a New Key
Check if the tree already has the key. If found, do nothing. Else, create a new node and set the appropriate reference.
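// Returns the root of the tree after inserting ik; duplicates are ignored.
// Assumes Key implements Comparable<Key>, so compareTo stands in for ≺ and ≻.
static BST add(BST T, Key ik) {
  if (T == null)
    return new BST(ik);            // empty spot found: create the node here
  if (ik.compareTo(T.key) < 0)
    T.left = add(T.left, ik);      // ik ≺ T.key
  else if (ik.compareTo(T.key) > 0)
    T.right = add(T.right, ik);    // ik ≻ T.key
  return T;                        // hand the (possibly updated) subtree back
}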
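Here is a runnable sketch of the same idea with String keys. The class BSTAdd, the inorder helper, and the insertion order used in main are assumptions made for illustration; the order was picked so that dog ends up at the root.

```java
class BSTAdd {
    static class BST {
        String key;
        BST left, right;
        BST(String key) { this.key = key; }
    }

    static BST add(BST T, String ik) {
        if (T == null) return new BST(ik);
        int cmp = ik.compareTo(T.key);
        if (cmp < 0) T.left = add(T.left, ik);
        else if (cmp > 0) T.right = add(T.right, ik);
        return T;  // when cmp == 0 the key is already present and nothing changes
    }

    // Lists the keys in sorted order: left subtree, then the node, then right subtree.
    static String inorder(BST T) {
        if (T == null) return "";
        return inorder(T.left) + T.key + " " + inorder(T.right);
    }

    public static void main(String[] args) {
        BST root = null;
        for (String k : new String[] {"dog", "bag", "flat", "alf", "cat", "elf", "glut"}) {
            root = add(root, k);
        }
        root = add(root, "eyes");
        root = add(root, "cat");  // duplicate: ignored
        System.out.println(root.right.left.right.key);  // eyes
        System.out.println(inorder(root).trim());
        // alf bag cat dog elf eyes flat glut
    }
}
```

The new key eyes goes right of dog, left of flat, and right of elf, where it finds an empty spot. Because add returns the subtree it was given, the caller's assignment (T.left = ... or T.right = ...) sets the reference without any special case for a missing child.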
[Figure: the BST after adding eyes; it now contains dog, bag, flat, alf, cat, elf, eyes, and glut.]
You might sometimes see code that exhibits “arm’s-length recursion.” Consider these two unnecessary base cases.

if (T.left == null)
  T.left = new BST(ik);
else if (T.right == null)
  T.right = new BST(ik);

?: How does the code given in the slide handle the arm’s-length recursion scenario?

Removing an Existing Key
There are three cases for removing an existing key, based on the number of children.
Key has no children.
Key has one child.
Key has two children.

In each case, our goal is to maintain the Binary Search Tree Invariant!

Removing: Leaf Case
Remove the node with the value glut.


Just remove the reference from the parent node.
?: Give pseudocode for removing a leaf from a tree.

Removing: One Child
Remove the node with the value flat.
What simple modification can we make to the tree to remove the value flat?
[Figure: the BST with keys dog, bag, flat, alf, eyes, cat, and elf; flat has a single child, elf.]


Use the Binary Search Tree Invariant: every key in a subtree under flat must be greater than dog, so flat’s only child can take flat’s place as the right child of dog.
?: Suppose the subtree of flat was a right child, instead of a left child (elf). Why does this answer still work?




?: Give pseudocode for removing a node with one child from a tree.

Key Removal Challenge
Delete the root node with value k.
[Figure: a BST rooted at k, with the keys a, b, d, e, f, g in its left subtree and m, p, r, v, x, y, z in its right subtree.]

Removing: Two Children
Remove the root node with the value dog.

We need to find a replacement root node. It must be ≻ all keys in the left subtree and ≺ all keys in the right subtree.
?: Does the node with the value bag work?

The predecessor (cat) or successor (elf). These nodes have either 0 or 1 children.
This procedure is also sometimes called “Hibbard Deletion”.
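The three removal cases can be combined into one recursive method. The sketch below is one possible way to write it, not the definitive version: the class BSTRemove and its helpers are hypothetical, it always picks the successor (the predecessor would work equally well), and it copies the successor's key into the node instead of relinking nodes.

```java
class BSTRemove {
    static class BST {
        String key;
        BST left, right;
        BST(String key) { this.key = key; }
    }

    static BST add(BST T, String ik) {
        if (T == null) return new BST(ik);
        int cmp = ik.compareTo(T.key);
        if (cmp < 0) T.left = add(T.left, ik);
        else if (cmp > 0) T.right = add(T.right, ik);
        return T;
    }

    static BST build(String... keys) {
        BST root = null;
        for (String k : keys) root = add(root, k);
        return root;
    }

    // Leftmost node of a non-empty subtree, i.e. its smallest key.
    static BST min(BST T) {
        while (T.left != null) T = T.left;
        return T;
    }

    // Returns the root of the tree after removing rk (no change if rk is absent).
    static BST remove(BST T, String rk) {
        if (T == null) return null;
        int cmp = rk.compareTo(T.key);
        if (cmp < 0) {
            T.left = remove(T.left, rk);
        } else if (cmp > 0) {
            T.right = remove(T.right, rk);
        } else {
            // Zero or one child: hand the other subtree (possibly null) to the parent.
            if (T.left == null) return T.right;
            if (T.right == null) return T.left;
            // Two children: copy the successor's key here, then remove the
            // successor from the right subtree (it has no left child).
            BST successor = min(T.right);
            T.key = successor.key;
            T.right = remove(T.right, successor.key);
        }
        return T;
    }

    public static void main(String[] args) {
        BST root = build("dog", "bag", "flat", "alf", "cat", "elf", "glut", "eyes");
        root = remove(root, "glut");               // leaf case
        System.out.println(root.right.right == null);  // true
        root = remove(root, "flat");               // one-child case
        System.out.println(root.right.key);        // elf
        root = remove(root, "dog");                // two-children case
        System.out.println(root.key);              // elf
        System.out.println(root.right.key);        // eyes
    }
}
```

The leaf case and the one-child case share the same two lines: returning the other subtree to the parent works whether that subtree is empty or not.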

?: Why are the predecessor and successor nodes guaranteed to have 0 or 1 children?





?: Give pseudocode for removing a node with two children from a tree.

Key Removal Challenge
Delete the root node with value k.
Replace with either g or m.
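The two candidates can be found mechanically: the predecessor is the largest key in the left subtree and the successor is the smallest key in the right subtree. The sketch below uses hypothetical names (RootReplacement, build), and the insertion order in main is an assumption, chosen only to produce a BST with k at the root that holds the same keys as the slide.

```java
class RootReplacement {
    static class BST {
        char key;
        BST left, right;
        BST(char key) { this.key = key; }
    }

    static BST add(BST T, char ik) {
        if (T == null) return new BST(ik);
        if (ik < T.key) T.left = add(T.left, ik);
        else if (ik > T.key) T.right = add(T.right, ik);
        return T;
    }

    // Inserts the characters of keys one by one, left to right.
    static BST build(String keys) {
        BST root = null;
        for (char c : keys.toCharArray()) root = add(root, c);
        return root;
    }

    // Rightmost node of a non-empty subtree: its largest key.
    static BST max(BST T) {
        while (T.right != null) T = T.right;
        return T;
    }

    // Leftmost node of a non-empty subtree: its smallest key.
    static BST min(BST T) {
        while (T.left != null) T = T.left;
        return T;
    }

    public static void main(String[] args) {
        BST root = build("kevbgpyadfmrxz");       // assumed insertion order
        System.out.println(max(root.left).key);   // g (predecessor of k)
        System.out.println(min(root.right).key);  // m (successor of k)
    }
}
```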

[Figure: the tree after replacing k with g; g is now the root and f has moved up into g’s former position.]

Implementing Sets and Maps

Binary Search Tree Set
Think of the BST below as representing a Set.
{mo, no, sumomo, uchi, momo}
[Figure: a BST containing the keys sumomo, momo, mo, no, and uchi.]
?: How would we represent a count of words, i.e. a map?




?: Give an insertion order that would result in the particular tree shown in the slide.

Binary Search Tree Map
To represent maps, just have each BST node store key-value pairs.
Key       Count
sumomo    1
momo      2
mo        2
no        2
uchi      1
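A minimal sketch of such a map follows. The class BSTMap and its methods are hypothetical, and the word sequence in main is made up solely so that the counts match the table above.

```java
class BSTMap {
    private static class Node {
        String key;
        int value;
        Node left, right;
        Node(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    private Node root;

    // Associates key with value, replacing any existing value for that key.
    void put(String key, int value) {
        root = put(root, key, value);
    }

    private Node put(Node x, String key, int value) {
        if (x == null) return new Node(key, value);
        int cmp = key.compareTo(x.key);
        if (cmp < 0) x.left = put(x.left, key, value);
        else if (cmp > 0) x.right = put(x.right, key, value);
        else x.value = value;  // key already present: update in place
        return x;
    }

    // Returns the value stored for key, or null if the key is absent.
    Integer get(String key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if (cmp == 0) return x.value;
            x = (cmp < 0) ? x.left : x.right;
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical input, chosen to reproduce the counts in the table.
        String[] words = {"sumomo", "mo", "momo", "mo", "momo", "no", "no", "uchi"};
        BSTMap counts = new BSTMap();
        for (String w : words) {
            Integer c = counts.get(w);
            counts.put(w, c == null ? 1 : c + 1);
        }
        System.out.println(counts.get("momo"));  // 2
        System.out.println(counts.get("uchi"));  // 1
        System.out.println(counts.get("kaki"));  // null
    }
}
```

Only the keys take part in comparisons; the values ride along in the nodes. That is why looking up a key follows a single path, while looking for a particular value has no ordering to exploit.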
?: What is the runtime for finding a particular value?

Implementer’s Design Decision Hierarchy
Abstract Data Type: Set, Map
Which ADT is the best fit?

Data Structure: Binary Search Tree
Which data structure offers the best performance for our input/workload?

Implementation Details: Linked Nodes
How do we maintain invariants?
For every node X in the tree:
All keys in the left subtree ≺ X’s key.
All keys in the right subtree ≻ X’s key.
As ADT implementers, we always have to keep our invariants in mind when thinking through a problem.

?: How does the Binary Search Tree Invariant affect the implementation of contains, add, and remove?