Binary Heaps
Table of contents
- Complete binary trees
- Priority queues
- Heaps
- Binary heap invariants
- Binary heap operations
- Max-heap demos
- Tree representations
- Array representation
- Heap operations
- Heap constraints
- Heap indexing
Complete binary trees
When we introduced 2-3 trees, we first proposed the following invariant: each node must have either 0 or 2 children. (A binary tree with this property is known as a full binary tree.) But we also came up with a worst-case, spindly binary tree structure that maintained this invariant.
This led us to consider a different hypothesis: that unbalanced growth leads to worst-case height trees. This hypothesis setup the foundation for 2-3 trees.
But what if we turned back time and didn’t give up here? Consider the binary trees depicted below. Let’s develop an invariant that includes the balanced trees and excludes unbalanced trees.
One such invariant is the definition of complete binary trees.
- Complete tree
- A tree where every level (except possibly the last) is completely filled, and the last level is filled left to right.
Unfortunately, it’s hard to maintain the binary search tree invariant together with the completeness invariant. Consider how we might maintain the completeness invariant while adding the item A to the following BST.
Note that this isn’t possible with rotation alone! Some of the nodes have to move quite a distance: the leaf D, which was previously the right child of C, becomes the overall root of the entire tree as a result of inserting A!
If we want completeness, then we can’t have the binary search tree invariant. But without the BST invariant, we can’t efficiently implement a set or map abstract data type—all of our search trees relied on a search tree invariant to help us decide, at each node, whether we would expect to find the target value in the left/middle/right subtree.
Priority queues
The priority queue is an abstract data type where elements are organized according to their associated priority values. A max-oriented priority queue serves highest-priority elements first while a min-oriented priority queue serves lowest-priority elements first. A priority queue might be used to help triage emergency room requests. Rather than helping patients first-come, first-served like in a regular queue, we might instead use a priority queue to ensure patients with the most severe ailments (or most time-sensitive needs) are treated first even if someone else arrived before them.
The min-oriented priority queue MinPQ
interface supports 3 important operations: adding new elements to the priority queue, returning the minimum priority element, and removing the minimum priority element. Unlike sets and maps, duplicate elements are allowed, and multiple elements may be added to the priority queue with the same priority value. In case there are two elements with the same priority values, ties can be broken arbitrarily so any element with the minimum priority value could be returned or removed on the next call to those methods. For simplicity, our demos, examples, visualizations in lessons and sections will treat the value of the number or character as its priority.
add(int element)
- Add the given
element
to the priority queue. min()
- Returns (without removing) the element with the highest priority.
removeMin()
- Removes and returns the element with the highest priority.
We can implement the MinPQ
interface using data structures that we’ve already introduced earlier.
UnorderedLinkedListMinPQ
- In an unordered linked list, insertion is fast since we can add the element to the front of the list in constant time. Finding the minimum is slow since we need to check the entire linked list.
OrderedLinkedListMinPQ
- In an ordered linked list where elements are stored by descending priority value, the opposite is true. Finding the minimum is fast since it’s just the first element in the list. Insertion is slow since we need to find the exact position where the element belongs in the ordered list.
AVLTreeMinPQ
- A binary search tree organizes elements in sorted order by their priority values. The lowest-priority element is stored in the leftmost node in the tree while the highest-priority element is stored in the rightmost node in the tree.
Question 1
Using intuition, what is the order of growth of the runtime for AVLTreeMinPQ
operations?
- Θ(1)
- Θ(log N)
- Θ(N)
Heaps
Binary heap invariants
A binary min-heap is a complete binary tree that maintains the min-heap invariant.
- Min-heap invariant
- The priority of each node in the heap is less-than or equal to the priorities of its children.
Question 1
Which of these trees are valid binary min-heaps?
Binary heap operations
By definition, the lowest-priority element in a min-heap is always stored in the overall root. To implement the min
method, just return the root item. Implementing removeMin
is a bit trickier.
removeMin
- Swap the root with the last leaf.
- Remove the last leaf.
- Sink the new root to its proper place, promoting the lesser child.
The sink operation swaps the current node with one of its children, “sinking” the current node down the heap.
- Sink
- Swap a node down the tree until it is lesser than both of its children. Promote the lesser child. Break ties arbitrarily.
Likewise, we can define a swim operation to swap the current node with its parent.
- Swim
- Swap a node up the tree until its parent is lesser than itself.
We can define add
and maintain the binary heap invariants using the swim operation.
add
- Add the item in the next open leaf position (from the left) on the bottom level.
- Swim to restore heap invariants.
Max-heap demos
Tree representations
One of the main advantages of using heaps to implement priority queues is in the tree representation. We can represent trees in different ways.
- Node representation
- Store items together with structure. Only parent to child references, so much harder to implement swim.
- Array representation
- Store items separate from structure. If the tree is complete, both parent to child and child to parent indices can be computed for sink and swim.
In the real-world, binary heaps are never implemented using the node representation. When discussing “binary heaps” with other computer scientists, array representation is the default assumption.
Array representation
Heap operations
Use the Min Heap visualization and insert the items in level-order from left-to-right: 0, 5, 1, 8, 8, 6, 2.
Question 1
Select all values that, when inserted into this heap, will result in the maximum number of swaps.
- -3
- -1
- 1
- 3
- 6
- 9
Question 2
When removing the minimum-priority element from this heap, which value will be temporarily placed at the root of the heap?
Question 3
When removing the minimum-priority element from this heap, how many levels will the temporary root node need to sink?
Heap constraints
Suppose this binary min-heap stores the lowest-priority element in the overall root node and the second-lowest element in its left child. Assume all elements and priority values are distinct.
Question 1
Select all nodes where the third-lowest-priority element could be found.
- A
- B
- C
- D
- E
- F
- G
Question 2
Select all nodes where the highest-priority element could be found.
- A
- B
- C
- D
- E
- F
- G
Heap indexing
Consider the following binary min-heap of integers with __
denoting blank spaces in the array. Coincidentally, the value of the integer lines up with its index in the array representation of the heap. Note that this implementation is using the zero-indexed array representation.
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, __, __, __, __]
Question 1
Give the index representing the minimum value.
Question 2
Give the index representing the last leaf node on the bottommost level.
Question 3
Give the index representing the left child of the value 3
.
Question 4
Give the index representing the right child of the value 3
.
Question 5
Give the index representing the parent of the value 10
.