Due Wednesday, August 7 at 11:59pm.
Submit by pushing your code to GitLab and submitting your writeup to Gradescope. If you intend to submit late, fill out this late submission form when you submit.
In this assignment, will practice writing more JUnit tests, and you will implement data structures and sorting algorithms for future use.
You will use these files from your prior assignments:
src/main/java/datastructures/dictionaries/ArrayDictionary.java
src/main/java/datastructures/dictionaries/ChainedHashDictionary.java
src/main/java/datastructures/sets/ChainedHashSet.java
src/main/java/datastructures/lists/DoubleLinkedList.java
If you have chosen a new partner for this assignment, choose either of your submissions from prior assignments and verify that these are functioning properly.
You will be modifying the following files:
src/main/java/datastructures/priorityqueues/ArrayHeapPriorityQueue.java
src/main/java/misc/Sorter.java
src/test/java/datastructures/priorityqueues/TestArrayHeapPriorityQueue.java
src/test/java/misc/TestSorter.java
Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):
src/main/java/datastructures/priorityqueues/IPriorityQueue.java
src/test/java/datastructures/priorityqueues/IntPair.java
src/main/java/datastructures/dictionaries/IDictionary.java
src/main/java/analysis/experiments/*
src/main/java/analysis/utils/SortingAlgorithms.java
Here are some baseline expectations you should meet in all projects:
Follow the course collaboration policies.
DO NOT use or import any classes from
java.util.*
. There are only two exceptions to this rule:
You may import and use the following classes:
java.util.Iterator
java.util.NoSuchElementException
java.util.Objects
java.util.Arrays
You may import and use anything from java.util.*
within your testing code.
DO NOT make modifications to instructor-provided code (unless told otherwise). If you need to temporarily change our code for debugging, make sure to change it back afterwards.
Task: Set up Project 3 in your IDE and make sure all the tests for
DoubleLinkedList
, ArrayDictionary
,
ChainedHashSet
and ChainedHashDictionary
pass.
Import your homework repo from GitLab. See the instructions from Project 0 if you need a reminder on how to do this.
Copy your DoubleLinkedList.java
, ArrayDictionary.java
,
ChainedHashDictionary.java
, and ChainedHashSet.java
files from
Project 2 to this new one.
Copy any extra tests you wrote to TestDoubleLinkedList.java
or
TestChainedHashDictionary.java
Finally, make sure everything works.
Try running the following tests and make sure all the tests pass.
TestDoubleLinkedList.java
TestArrayDictionary.java
TestChainedHashDictionary.java
TestChainedHashSet.java
Try running SanityCheck.java
, and try running Checkstyle. Checkstyle
should still report the same 5 errors with SanityCheck.java
as it did in
the previous projects.
ArrayHeapPriorityQueue
and
Sorter.topKSort(...)
Task: add tests for ArrayHeapPriorityQueue
and
Sorter.topKSort(...)
to TestArrayHeapPriorityQueue
, and
TestSorter
.
In parts 1c and 1d, you will be asked to implement an ArrayHeapPriorityQueue
and the top-\(k\) sorting algorithm.
After you finish reading the specifications for these classes below, start by adding tests for these two classes. If you look through the source code of these two test classes, you'll notice that they're almost completely empty! It's now your responsibility to write comprehensive tests for these two classes.
We recommend that you write at least some tests before you implement
ArrayHeapPriorityQueue
and Sorter.topKSort
. Doing so will help you
better understand what your methods need to do (before you jump into coding), as well as
provide a reference to check how well your code is working. Additionally, it may feel better
to write your tests first and fix bugs as you implement the code—the alternative would
be to implement all the code first, then write tests, only to realize that there is actually
much more work remaining in fixing bugs. And of course, there's no real downside to having
more tests, so feel free to add more after implementing the code—you'll probably come
up with more test cases after you start implementing the code, anyways.
Note: This testing portion is a step up in difficulty compared to the
DoubleLinkedList.delete
testing from Project 1, so make sure not to take this
portion lightly, and make use of the resources we provide, including the tips below, your
partner, and TAs at office hours. Writing a good test suite for your code
will take time and effort, and the scoring for this assignment will reflect
that—the testing and regular code implementation portions of this assignment
will have comparable weighting.
Just like we did in Project 1 part 1's TestDoubleLinkedListDelete.java
, we will
grade your tests by running them against many different buggy versions of
ArrayHeapPriorityQueue
and Sorter.topKSort(...)
. In order to help
us do so efficiently, we ask that you keep all tests in
TestArrayHeapPriorityQueue
and TestSorter
fast. Every test
you write should take at most 1 second to run. Tests that take longer than this may time out
and fail during grading. As a rule of thumb, your tests shouldn't need to add more
than a thousand elements to your data structures. You may add as many tests as you want to
these two files, however.
ArrayHeapPriorityQueue
TopKSort
ArrayHeapPriorityQueue
Task: complete the ArrayHeapPriorityQueue
class.
Notes:
In lecture, we covered how to implement and work with 2-heaps: a heap where each node has up to 2 children.
For this assignment, you must implement a 4-heap: a heap where each node has up to four children. You will need to modify the equations you were given in lecture in order to successfully do so.
Additionally, you must implement this as being indexed from 0, rather than some other arbitrary index.
Make sure you're implementing a min-heap, not a max-heap.
Unlike all the other data structures we've implemented so far, your heap will not attempt to handle null entries. (There's no sensible way to compare a null against a non-null element).
Your heap should throw an exception when given a null element—see the
IPriorityQueue
interface for more details.
The data structure you'll be implementing has some additional functionality: we'll be
able to remove and replace elements in the heap. In exchange, the heap will not
support duplicate elements according to equals
.
We recommend leaving the implementation of these remove
,
replace
, and contains
methods until after you've implemented add
and removeMin
. The code for these methods may require some changes to
code used by add
and removeMin
, but it will probably be
easier to think about these methods after you've
implemented the basic code for add
and removeMin
.
In particular, you'll need to track the indices of each element in the heap by
using a dictionary to map from element to index in the heap. (Take a moment to
consider why we want to do this, and what the alternative is.) Tracking the indices
means that we'll need to update our dictionary of indices every time the heap
changes from adding, removing, or percolating; fortunately, adding this logic is
relatively painless if you implement add
and removeMin
using our provided method stubs for the helper methods percolateUp
,
percolateDown
, and swap
.
Note: this dictionary of element to index is one reason why we do not support duplicate elements in our heap—it would imply us keeping track of two identical keys in our dictionary pointing to different indices.
Here are some more tips for each method:
contains
: Since we're tracking the indices of each element in the
heap, implementing this is as simple as calling containsKey
.
Make sure that contains
runs in approximately constant time.replace
: This method replaces an element in the heap with a new
element. When this method gets called, you'll need to figure out whether the
new item needs to percolate upwards or downwards in the heap, and update the
heap accordingly.
replace
should have \(\mathcal{O}(log(n))\) runtime.
void replace(oldItem, newItem):
throw exception if necessary
index = dictionary.remove(oldItem)
heap[index] = newItem
dictionary.put(newItem, index)
percolate(index)
void percolate(index):
if index is first in heap:
percolateDown(index)
else:
parent = get parent index
if heap[index] < heap[parent]:
percolateUp(index)
else if heap[index] > heap[parent]:
percolateDown(index)
remove
: The code for this method should actually be fairly simple
after you've implemented replace
, since you can simply reuse its
functionality. This method will end up looking somewhat similar to
removeMin
: you just need to replace the input item with the
last item in the heap, and then use the percolation logic from
replace
to determine how to fix the heap invariants.
remove
should have \(\mathcal{O}(log(n))\) runtime.
void remove(item):
throw exception if necessary
index = dictionary.get(item)
if index is last index in heap:
remove last element in heap
else:
swap(index, last index in heap)
remove last element in heap
percolate(index)
In lecture, we assumed that every element has some numerical priority, and focused on comparing each element by that priority.
In practice, the exact "priority" an element has is an implementation detail of the items the client gives us, and isn't a value we can see directly.
What you should do instead is to compare items using their compareTo
method. The ArrayHeapPriorityQueue
is specifically defined so that all
items MUST implement the Comparable
interface—if they
don't, the code won't compile.
Sorter.topKSort(...)
Task: complete the Sorter.topKSort(...)
method.
Your Sorter.topKSort(k, list)
method is responsible for sorting and returning
the top (largest) \(k\) elements from a list containing \(n\) comparable elements. You can
find the Sorter
class inside the misc
package.
One naive way of implementing this algorithm would be to implement some sorting algorithm such as quicksort or merge sort, sort the list then return the last \(k\) elements.
However, this would take \(\mathcal{O}(n\log(n))\) time to run! We can do better: your implementation must run in \(\mathcal{O}(n\log(k))\) time.
Notes:
There is an additional test that runs when you push to Gitlab, called
testIsNLogK
. This test counts operations in your code to make sure the
number of comparisons is properly \(\mathcal{O}(n\log(k))\). Your code may also time out
if it takes more than a second to run, to mirror the timeout during grading.
There are pre- and post-conditions defined in the method header for
topKSort
. Per the definition of "precondition," you should assume the
preconditions are true when you write your code, and your may do anything if the
preconditions are not met. Additionally, be aware the postcondition defines an
additional condition that must be true at the end of the method.
topKSort
should not mutate its input list. (We normally wouldn't
document this, since it should be assumed by default that a method will not have
major side effects such as mutating its inputs, but we're doing it here for
emphasis.)
Because you will be putting values from the IList
parameter into your
heap, yourtopKSort
will not support sorting lists with any duplicate values
(according to the values' equals
method).
Hint 1: your method should construct exactly two data structures: a single
ArrayHeapPriorityQueue
, and a single DoubleLinkedList
to store
the output to return.
Hint 2: you should need to loop over your input list exactly once.
Task: Complete a write-up containing answers to the following questions.
You and your partner will work together on this write-up: your completed write-up MUST be in PDF format. You will submit it to Gradescope. Log into Gradescope with your @uw.edu email address. When you submit, mark which pages correspond to the questions we ask, and afterwards, the partner who uploaded the PDF must add the other partner as a group member on Gradescope. Do not have each member submit individually. A video showing how to do this can be found here. We may deduct points if you do not correctly submit as a group or do not properly assign your pages/problems.
For each of the experiments, answer the bolded questions (and questions in the orange boxes) in your write-up. Just like before, a plot will automatically be generated to display the results of the experiments; include PNGs of the plots inside your write-up PDF.
The post-experiment analysis portions will be graded based on the clarity of your explanations and whether they match up with your plot.
topKSort
In this experiment, we're running topKSort
and producing two plots for the runtime.
Experiment1
has \(k\) as a fixed value and plots the runtime as \(n\) varies, and
Experiment2
has \(n\) as the fixed value and plots the runtime as \(k\) varies.
In this experiment, we'll run our topKSort
against other implementations that
use insertion sort and merge sort to find the top \(k\) elements.
makeDoubleLinkedList
method, but the order is the same for
all 3 sorting tests. Is it sorted? Reverse-sorted? Random? Some other pattern?topKSort
In this experiment, we'll again run our topKSort
against other sorting methods,
but this time, we'll look at quick sort and merge sort, and with a different input
ordering.
makeRandomDoubleLinkedList
method, but the order is the same for
all 3 sorting tests. Is it sorted? Reverse-sorted? Random? Some other pattern?topKSort
Task: Submit a response to the feedback survey.
After finishing the project, take a couple minutes to complete this individual feedback survey on Canvas. (Each partner needs to submit their own individual response.)
The following deliverables are due on Wednesday, August 7 at 11:59pm.
Submit by pushing your code to GitLab and submitting your writeup to Gradescope. If you intend to submit late, fill out this late submission form when you submit.
Before the deadline, be sure to double-check that...