Priority Queues
Designing and analyzing priority queues.
Priority queues are a fundamental building block for many real-world algorithms. For instance, if we only need to show the first 10 autocomplete matches in a specific order, a priority queue can be more efficient than sorting the entire list of all matches. Or, if we’re implementing a map navigation algorithm, a priority queue allows dynamic updates to the next-shortest path as the algorithm discovers new roads. Priority queues may also be used for content moderation.
Online social media platforms facilitate the development of social relationships between users by allowing users to create and share content with each other. Most social media platforms rely on algorithms to personalize the content feed presented to each user. This user-generated content requires moderation, or methods for managing content shared between users on the platform.
In this project, we will compare 4 implementations of priority queues to simulate a content moderation queue that might be used in a social media platform where content is generated continuously. By the end of this project, students will be able to:
- Design and implement multiple data structures to solve complex problems.
- Analyze and compare implementation runtimes and optimizations.
In the next project, we’ll also use your priority queues as a building block for shortest paths.
Can I work with someone else on this project?
Although this project requires an individual submission, you may find it useful to discuss overarching ideas with your classmates. Our primary rule is that we ask that you do not claim to be responsible for work that is not yours. If you get some help from someone else or from an online resource, cite it. I believe that there is a lot of value in learning from others so long as you do not deprive yourself (or others) of the opportunity to learn.
We are comfortable doing this because each submission in this class comes in the form of a video that you record. Your video is a demonstration of everything that you learned throughout the process of working on an assignment. Our goal is for students to support each other and find community through this course. The real advantage of taking a course on-campus at a university is to be able to connect with others who share common interests in learning.
What am I submitting at the end of this project?
Satisfactory completion of the project requires one Canvas submission including: a video-recorded individual presentation, some kind of visually-organizing structure, such as slides or a document and a link to your src/main/java/minpq
folder on GitLab that addresses all the green callouts.
The project instructions contain a lot of details to provide context, clarify common confusions, and help students get started. Your Canvas submissions only need to address tasks that are described in green callouts like this one.
Your video presentation should meet the following requirements:
- Your presentation should not be much longer than 8 minutes (9 minutes maximum) and should include your voiceover. (Your video is appreciated but not necessary.)
- Your video presentation should include some kind of visually-organizing structure, such as slides or a document.
- After submitting to Canvas, add a submission comment linking to your slides or document, a link to your
src/main/java/minpq
folder on GitLab, and a list of your citations and references used. Remember tocommit
andpush
to Gitlab and check that the associated status for your latest Pipeline is a green checkmark!
Students are expected to follow Washington state law on the Student Conduct Code for the University of Washington. In this course, students must:
- Indicate on assignment submissions any assistance received, including materials distributed in this course.
- Not receive, generate, or otherwise acquire any substantial portion or walkthrough for an assignment.
- Not aid, assist, attempt, or tolerate prohibited academic conduct in others.
In a canvas comment turned in with your video presentation, you must cite any sources you used, including lecture materials, section materials, and collaboration with peers. If you used any kind of computer technology to help prepare your project deliverable, include the queries and/or prompts.
Submitted work that is not consistent with sources may be subject to the student conduct process.
Priority queue interface
The MinPQ
interface represents a priority queue that affords access to minimum-priority elements. Priority values are extrinsic to each element: rather than relying on a compareTo
method, priority values are specified as arguments to the add
and changePriority
methods.
void add(E element, double priority)
- Adds an element with the given priority value if the element is not already in this priority queue.
boolean contains(E element)
- Returns true if the given element is in this priority queue.
E peekMin()
- Returns the element with the minimum priority value.
E removeMin()
- Returns and removes an element with the minimum priority value.
void changePriority(E element, double priority)
- Updates the given element’s associated priority value.
int size()
- Returns the number of elements in this priority queue.
The interface also defines two default methods that rely on implementations of the above methods.
void addOrChangePriority(E element, double priority)
- Adds an element with the given priority value if it is not already present. Otherwise, updates the priority value of the existing element.
if (!contains(element)) { add(element, priority); } else { changePriority(element, priority); }
boolean isEmpty()
- Returns true if this priority queue contains no elements. Returns whether
size() == 0
.
A MinPQ
cannot contain duplicate elements. However, different elements can have the same priority value. peekMin
and removeMin
may return any element with the minimum priority value.
Here’s a small test example showing how to use the MinPQ
interface with the DoubleMapMinPQ
reference class included in the project. Try this example yourself by writing a new test case in the MinPQTests class. You can write additional test cases like this to assist in debugging.
@Test
void compareSimple() {
MinPQ<String> reference = new DoubleMapMinPQ<>();
reference.add("1", 1.0);
reference.add("2", 2.0);
reference.add("3", 3.0);
reference.add("4", 4.0);
reference.add("5", 5.0);
reference.add("6", 6.0);
MinPQ<String> testing = createMinPQ();
testing.add("1", 1.0);
testing.add("2", 2.0);
testing.add("3", 3.0);
testing.add("4", 4.0);
testing.add("5", 5.0);
testing.add("6", 6.0);
// Call methods to evaluate behavior.
reference.changePriority("3", 0.0);
reference.changePriority("1", 7.0);
while (!reference.isEmpty()) {
System.out.println(reference.removeMin());
}
testing.changePriority("3", 0.0);
testing.changePriority("1", 7.0);
while (!testing.isEmpty()) {
System.out.println(testing.removeMin());
}
// Assert that the different PQ's are equivalent
assertTrue(sameRemoveOrder(reference, testing));
}
Why not just implement Java's PriorityQueue interface?
The java.util
standard library includes a binary heap PriorityQueue
class. Here are three reasons why we can’t implement Java’s PriorityQueue
class in this project.
PriorityQueue
is a class- We can’t implement the
PriorityQueue
class because it’s a class, not an interface. Although the Java developers designedSet
,Map
, andList
as interfaces, priority queues are so commonly associated with binary heaps that the Java developers broke the pattern and definedPriorityQueue
as a class. PriorityQueue
relies oncompareTo
- By default,
PriorityQueue
uses elements’compareTo
method to define the priority of an element. For example, the priority of the string “A” is less than “B” because “A” precedes “B” in the alphabet. This makesPriorityQueue
great for sorting objects, but not great for applications like shortest paths. PriorityQueue
disaffords changing priority value- The fact that
PriorityQueue
uses elements’compareTo
methods reveals a hidden disaffordance. To change the priority of an object inPriorityQueue
, we need to remove the element from the priority queue and then re-insert it. This workaround is not only inconvenient, but also inefficient and prone to bugs.
PriorityNode
Java collections typically only specify a single data type as in ArrayList<String>
. But a MinPQ
needs to keep track of each element as well as its associated priority value. We could do this by creating two lists: an ArrayList<T>
for elements and an ArrayList<Double>
for each element’s priority value. However, this approach requires us to ensure the state of both lists are always the same, which introduces additional complexity that makes the code harder to maintain and more brittle or susceptible to future bugs.
The PriorityNode
class includes two fields representing an element
together with its priority
value so that it can be used in a Java collection.
List<PriorityNode<String>> elements = new ArrayList<>();
elements.add(new PriorityNode<>("example", 0));
Two PriorityNode
objects are considered equal if and only if their elements are equal. Priority values are not checked for equality.
How will this property of PriorityNode equality help you implement MinPQ?
MinPQ
does not allow duplicate elements, but does allow duplicate priority values. When using Java collections such as a List
, methods like List.contains
or List.remove
will call the objects’ equals
method to check for equality. The following contains
call will return true
, and the remove
call will successfully remove the priority node even though their priority values are different.
elements.contains(new PriorityNode<>("example", 1));
elements.remove(new PriorityNode<>("example", 2));
Reference implementation
The project code includes a working DoubleMapMinPQ
. This implementation is called “double map” because it combines the runtime benefits of two maps to solve the problem efficiently. The two maps work together to help achieve sublinear (logarithmic or better) runtime for every operation.
NavigableMap<Double, Set<E>> priorityToElement
- Associates each unique priority value to all the elements with that priority value. Returning a minimum-priority element involves finding the set of minimum-priority elements and picking any element from that set.
Map<E, Double> elementToPriority
- Associates each element with its priority value in order to speed-up
contains
andchangePriority
.
The state of both maps is synchronized across all methods. Any change to one data structure also requires a change to the other data structure.
Design and implement
Design and implement 3 implementations of the MinPQ
interface.
Make sure to pass all the tests
for UnsortedArrayMinPQ
, HeapMinPQ
, and OptimizedHeapMinPQ
, push your code to Gitlab, and check that the associated status for your latest Pipeline is passing. If the implementations do not pass all the test cases, explain in your video what you think could be causing the problem.
UnsortedArrayMinPQ
Elements are added to an ArrayList
in any order. Most operations may need to scan over the entire list.
HeapMinPQ
A standard binary heap priority queue that delegates all method calls to java.util.PriorityQueue
. In other words, each MinPQ
operation is implemented by calling the underlying PriorityQueue
. This class contains only one field assigned to an instance of PriorityQueue
. Our solution only uses 1-3 additional lines of code for most methods.
Add the compareSimple
test shown above during the description of the Priority queue interface, into MinPQTests
. Then, trace through the test compareSimple
for the HeapMinPQ
implementation with a focus on justifying how your code for changePriority
produces the correct behavior. You can justify changePriority
by tracing the code for that method once you reach that method in your trace of compareSimple
.
OptimizedHeapMinPQ
A optimized binary heap priority queue supported by a HashMap
that associates each element with its array index to speed-up contains
and changePriority
. In other words, our heap, represented through an array, holds priority nodes organized by priority value. The HashMap
acts as an “address book” for the indices of each priority node, helping us locate nodes within the heap.
Use this MinPQ.java class as a reference.
- Identify methods in the reference class that are most similar to our interface.
- Adapt the code to implement our interface. Make sure all tests pass before optimizing the code.
- Optimize the code by adding a
HashMap
synchronized to the state of the elements in the array.
Recall from the Binary Heaps lesson that “skipping” index 0 in the array representation of a heap can help simplify expressions for finding the left child, right child, and parent nodes in a heap. You may find it helpful to skip index 0 in your implementation by adding null
to the front of your representative ArrayList
.
Trace through the test compareSimple
for the OptimizedHeapMinPQ
implementation with a focus on justifying how your code for changePriority
produces the correct behavior. You can justify changePriority
by tracing the code for that method once you reach that method in your trace of compareSimple
.
Analyze and compare
Asymptotic analysis
Give a big-theta bound for the worst-case runtime of the removeMin
and changePriority
methods for each implementation, including DoubleMapMinPQ
, with respect to N, the size of the priority queue. Explain the worst case and the runtime of each implementation in a couple sentences while referencing the code. Please use the assumptions we provide below for the runtime analysis.
For all array-backed data structures, ignore the time it would take to resize the array.
java.util.PriorityQueue
is implemented using an array-representation binary heap that
provides O(log(n)) time for the enqueuing and dequeuing methods (
offer
,poll
,remove()
andadd
); linear time for theremove(Object)
andcontains(Object)
methods; and constant time for the retrieval methods (peek
,element
, andsize
).
HashSet
and HashMap
are implemented using resizing separate-chaining (linked list structure) hash tables where the number of buckets is always similar to the size. Assume that the hash function evenly distributes elements across the buckets in the underlying array. The Java implementations include a further optimization.
- Tree bucket optimization
- Let us now add the tree bucket optimization, already existing in Java, into our asymptotic analysis. When the size of a bucket exceeds 8 elements, the separate chain is automatically converted from a linked list to a red-black tree.
Explain the impact of tree bucket optimization assuming an even distribution of elements across the underlying array. Does the tree bucket optimization help, hurt, or not affect the asymptotic analysis given our assumptions?