Due Thursday, May 16 at 11:59pm.
Submit by pushing your code to GitLab. If you intend to submit late, fill out this late submission form when you submit.
In this part of the assignment, will practice writing more JUnit tests, and you will implement data structures and sorting algorithms that you will be using in the second part.
You will use these files from your prior assignments:
src/main/java/datastructures/concrete/dictionaries/ArrayDictionary.java
src/main/java/datastructures/concrete/dictionaries/ChainedHashDictionary.java
src/main/java/datastructures/concrete/ChainedHashSet.java
src/main/java/datastructures/concrete/DoubleLinkedList.java
If you have chosen a new partner for this assignment, choose either of your submissions from prior assignments and verify that these are functioning properly.
You will be modifying the following files:
src/main/java/datastructures/concrete/ArrayHeap.java
src/main/java/misc/Sorter.java
src/test/java/datastructures/TestArrayHeap.java
src/test/java/misc/TestSorter.java
Task: Set up HW5 in your IDE and make sure all the tests for double linked list, hash dictionary, and hash set pass.
Import your homework repo from GitLab. See the instructions from HW0 if you need a reminder on how to do this.
Copy your DoubleLinkedList.java
,
ArrayDictionary.java
, ChainedHashDictionary.java
, and
ChainedHashSet.java
files from HW4 to this
new one.
Copy any extra tests you wrote to TestDoubleLinkedList.java
or
TestChainedHashDictionary.java
Finally, make sure everything works.
Try running the following tests and make sure all the tests pass.
TestDoubleLinkedList.java
TestArrayDictionary.java
TestChainedHashDictionary.java
TestChainedHashSet.java
Try running SanityCheck.java
, and try running checkstyle.
Checkstyle should still report the same 5 errors with SanityCheck.java
as it did in the previous projects.
ArrayHeap
and
Sorter.topKSort(...)
Task: add tests for ArrayHeap
and
Sorter.topKSort(...)
to TestArrayHeap
, and
TestSorter
.
In parts 1c and 1d, you will be asked to implement an ArrayHeap and the top-\(k\) sorting algorithm.
After you finish reading the specifications for these classes below, start by adding tests for these two classes. If you look through the source code of these two test classes, you'll notice that they're almost completely empty! It's now your responsibility to write comprehensive tests for these two classes. This is a step up from previous assignments, but after working with JUnit tests for each assignment so far, writing these tests shouldn't be anything you can't overcome (with your partner and other course resources to help).
We recommend that you write at least some tests before you implement ArrayHeap
and Sorter.topKSort
. Doing so will help you better understand what your methods need to do (before you jump into coding), as well as provide a reference to check how well your code is working when you run the tests as you write the code. And of course, there really aren't any downsides to writing more tests, so feel free to add more tests after implementing the code. It's likely that you'll have more ideas of what to test after or as you implement the code, anyways.
Just like we did in Homework 2 part 1's TestDoubleLinkedListDelete.java
, we will grade your tests by running them against many
different buggy versions of ArrayHeap
and Sorter.topKSort(...)
.
In order to help us do so efficiently, we ask that you keep all tests in TestArrayHeap
and TestSorter
short.
Every test you add to these two files should have a timeout of a second or less.
(You can add as many tests as you want to these two files, however).
As a general rule of thumb, any tests in these two files should be
manipulating heaps or lists that at most contain several hundred items.
Basically, we plan on running the tests you add in TestArrayHeap
multiple times, and don't want them to take forever to run.
Start small: you don't need to come up with every edge case immediately—it will probably be easier to write the general case tests first. For example, make sure that removeMin
decreases the size, or that peekMin
returns the correct value after inserting a few times.
Cover your bases—make sure every method has at least one test.
Revisit the testing categories from homework 2 part 1b / lecture slides, and try to make tests for any categories that seem applicable to the code in this assignment.
Test that the correct exceptions are thrown in cases defined by the method documentation.
Be sure to test your classes both on large amounts of data and small amounts of data in order to test many possible scenarios and combinations of data structure states and parameters.
When testing your ArrayHeap
, you may notice that there's no easy way to iterate through all its data. The only way to verify that your heap works (does the correct percolations to maintain valid heap properties) is by calling removeMin
repeatedly and checking that it returns the correct values. Whether your input sequence is random, sorted, reverse-sorted, or anything else, after removeMin
ing, the data should all come out in sorted order. So, one way to test the heap is to make sure each value from removeMin
is greater than or equal to the return value of the previous removeMin
call (according to compareTo
).
You may find that using only removeMin
is somewhat restrictive. As an alternative, you can use reflection (see HW4 explanation on testing private fields) to access the internal array inside your ArrayHeap
class. The provided TestArrayHeap
class has a helper method that will return the internal array given an ArrayHeap
. After calling it, you can check whether the heap is valid according to the min-heap properties (each node is 'less than equal to' all of its children, and 'greater than equal to' its parent).
If you want to be more thorough, you can make a helper method to check the heap property at every single index in the heap. This helper method will likely be similar to TestDoubleLinkedList.assertListValidAndMatches
from homework 2.
(added 5/12) As a reminder: DO NOT include reflection in tests that will be graded (so be sure to comment them out, place them in separate file, or remove them before your final submissions. Reflection (and really all the methods mentioned other than the first) is tied heavily to your own implementation, meaning that it is unlikely to work with other implementations of the class, including as the ones we use during grading.
To test your Sorter.topKSort
, consider the fact that there are two real parameters: K and the input list. To be thorough, consider all the edge cases for K and the input list, and test combinations of both.
As a reminder, you may use classes from
java.util.*
when writing unit tests. You may find
Collections.sort(...)
to be particularly helpful for generating expected output. You may need to write some extra logic to convert between an IList
and
java.util.List
, however.
As always, you may not use anything from
java.util.*
for anything outside of your tests,
unless explicitly told otherwise.
ArrayHeap
Task: complete the ArrayHeap
class.
Notes:
In lecture, we covered how to implement and work with 2-heaps: a heap where each node has up to 2 children.
For this assignment, you must implement a 4-heap: a heap where each node has up to four children. You will need to modify the equations you were given in lecture in order to successfully do so.
Additionally, you must implement this as being indexed from 0, rather than some other arbitrary index.
Make sure you're implementing a min-heap, not a max-heap.
Unlike all the other data structures we've implemented so far, your heap will not attempt to handle null entries. (There's no sensible way to compare a null against a non-null element).
Your heap should throw an exception when given a null element—see the
IPriorityQueue
interface for more details.
The data structure you'll be implementing has some additional functionality: we'll be able to remove and replace elements in the heap. In exchange, the heap will not support duplicate elements.
We recommend leaving the implementation of these remove
and
replace
methods until after you've implemented add
and removeMin
. The code for remove
and
replace
will require some changes to add
and
removeMin
, but those changes will be difficult to make before you've
implemented the basic code for add
and removeMin
.
In particular, you'll need to track the indices of each element in the heap by
using a dictionary to map from element to index in the heap. (Take a moment to
consider why we want to do this, and what the alternative is.)
You probably want to declare this dictionary as a field of type IDictionary<T, Integer>.
Tracking the indices means that
we'll need to update our dictionary of indices every time the heap changes from
adding, removing, or percolating; fortunately, adding this logic is relatively
painless if you implement add
and removeMin
using our
provided method stubs for the helper methods percolateUp
,
percolateDown
, and swap
.
Here are some more tips for each method:
contains
: Since we're tracking the indices of each element in the
heap, implementing this is as simple as calling containsKey
.
Make sure that contains
runs in approximately constant time.replace
: This method replaces an element in the heap with a new
element. When this method gets called, you'll need to figure out whether the
new item needs to percolate upwards or downwards in the heap, and update the
heap accordingly.
replace
should run in approximately log(n) time.
void replace(oldItem, newItem):
throw exception if necessary
index = dictionary.remove(oldItem)
heap[index] = newItem
dictionary.put(newItem, index)
percolate(index)
void percolate(index):
if index is first in heap:
percolateDown(index)
else:
parent = get parent index
if heap[index] < heap[parent]:
percolateUp(index)
else if heap[index] > heap[parent]:
percolateDown(index)
remove
: The code for this method should actually be fairly simple
after you've implemented replace
, since you can simply reuse its
functionality. This method will end up looking somewhat similar to
removeMin
: you just need to replace the input item with the
last item in the heap, and then use the percolation logic from
replace
to determine how to fix the heap invariants.
remove
should run approximately log(n) time.
void remove(item):
throw exception if necessary
index = dictionary.remove(item)
if index is last in heap:
heap[index] = null
else:
last = get last item in heap
place last at index
dictionary.put(last, index)
heap[last index] = null
percolate(index)
In lecture, we assumed that every element has some numerical priority, and focused on comparing each element by that priority.
In practice, the exact "priority" an element has is an implementation detail of the items the client gives us, and isn't a value we can see directly.
What you should do instead is to compare items using their
.compareTo(...)
method. The ArrayHeap
is
specifically designed so that all items MUST implement the
Comparable
interface—if they don't, the
code won't compile.
Sorter.topKSort(...)
Task: complete the Sorter.topKSort(...)
method.
Your Sorter.topKSort(k, list)
method is responsible for
returning the top \(k\) elements from a list containing \(n\) comparable
elements. You can find the Sorter
class inside the
misc
package.
One naive way of implementing this algorithm would be to implement some sorting algorithm such as quicksort or mergesort, sort the list then return the last \(k\) elements.
However, this would take \(\mathcal{O}(n\log(n))\) time to run! We can do better: your implementation must run in \(\mathcal{O}(n\log(k))\) time.
Notes:
Make sure your topKSort
method works with both
small and large input lists!
Because you will be putting values from the IList parameter into your heap, your topKSort will not support sorting lists with any duplicate values.
Hint 1: your method should construct exactly two data structures:
a single ArrayHeap
, and a single DoubleLinkedList
to store the output to return.
Hint 2: you should need to loop over your input list exactly once.
Reminder (5/14): there is a secret test that runs when you push, called testIsNLogK. This test counts operations in your code to make sure the runtime is properly n * log(k). Recently, we also added a timeout of 1 second so this test now more accurately mirrors grading (and tests efficiency by counting and timing your code).
Task: Complete a writeup containing answers to the following questions.
Your writeup will be graded partially on whether or not you produced a reasonable answer, and partially based on the clarity of your explanations and analysis. You and your partner will work together on this writeup.
Note: your completed writeup MUST be in PDF format. You will not submit this writeup via Canvas—instead, submit it to Gradescope. Log into Gradescope with your @uw.edu email address. When you submit, make sure to mark which pages correspond to the questions we ask, and after submission the partner who submitted the assignment must add the other partner. That is, please only turn in one submission per group and make sure both group members are assigned to that submission. A video showing how to do this can be found here.
You will run a variety of experiments and report back on the resulting behavior. Be sure to take some time to read the provided code and understand what's going on.
Note: Some experiments may take up to ~30 min to run.
You can find the experiments in the analysis.experiments
package.
Run the experiment code. Each Experiment*.java
file should
generate a CSV file in the experimentdata
folder.
We recommend using plot.ly to generate a
plot
from your CSV, but if you prefer you can also use Microsoft Excel, Google Sheets, or
any other
analysis tool of your choice. Generate a plot of the data, and include this
plot as a image inside your writeup.
If the CSV file contains multiple result columns, plot them all
in the same chart.
If you're using plot.ly, here are some basic instructions:
You will be graded based on whether you produce a reasonable plot. In particular, be sure to give your plot a title, label your axes (with units), include a legend if appropriate, and relabel the columns to something more descriptive than "TestResult".
You should also notice that in your plots, experiment 1 results in a flat linear shape, but experiment 2 results in a curve. Why are they different and where do those shapes come from? You might find it helpful to reference general knowledge of complexity classes in your response and the axes of the two graphs.
The answer to this question doesn't need to be very long; a couple sentences is enough.
The following deliverables are due on Thursday, May 16 at 11:59pm.
Submit by pushing your code to GitLab. If you intend to submit late, fill out this late submission form when you submit.
Before the deadline, be sure to double-check that...
TestArrayHeap
and
TestSorter
classes are completed.ArrayHeap
class and your Sorter.topKSort(...)
method are both fully implement and passes all of your tests.