In this homework, you will implement a hash dictionary (also known as a hash map) and a hash set. We will be using these two data structures extensively in the next project.
This entire project is due on Wednesday, May 8 at 11:59pm.
You will use these files from your prior assignments
src/main/java/datastructures/concrete/dictionaries/ArrayDictionary.java
src/main/java/datastructures/concrete/DoubleLinkedList.java
If you have chosen a new partner for this assignment, choose either of your submissions from HW2 and verify that these are functioning properly.
You will be modifying the following files:
src/main/java/datastructures/concrete/dictionaries/ArrayDictionary.java
src/main/java/datastructures/concrete/dictionaries/ChainedHashDictionary.java
src/main/java/datastructures/concrete/ChainedHashSet.java
Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):
src/test/java/datastructures/dictionaries/BaseTestDictionary.java
src/test/java/datastructures/dictionaries/TestArrayDictionary.java
src/test/java/datastructures/dictionaries/TestChainedHashDictionary
src/test/java/datastructures/TestChainedHashSet.java
src/main/java/datastructures/interfaces/IDictionary.java
src/main/java/datastructures/interfaces/ISet.java
src/main/java/analysis/experiments/*
Here's another video overview.
Here are some baseline expectations we expect you to meet in all projects:
Follow the course collaboration policies
DO NOT use any classes from
java.util.*
. There are only two exceptions to this rule:
You may import and use the following classes:
java.util.Iterator
java.util.NoSuchElementException
java.util.Objects
java.util.Arrays
You may import and use anything from java.util.*
within your testing code.
DO NOT make modifications to instructor-provided code (unless told otherwise). If you need to temporarily change our code for debugging, make sure to change it back afterwards.
Clone the starter code from GitLab and open the project in your IDE. See the instructions from HW1 if you need a reminder on how to do this.
Copy your DoubleLinkedList.java
and ArrayDictionary.java
files from HW2 to this new one.
The copied code will not immediately compile. This
is ok: we modified IDictionary
so it requires
you to implement one additional method.
Copy your DoubleLinkedList
delete
tests from HW2 and paste
them directly into TestDoubleLinkedList.java
.
Finally, make sure everything works.
Try running TestDoubleLinkedList
and make sure the tests run.
Try running SanityCheck.java
, and try running Checkstyle.
Checkstyle should still report the same 5 errors with SanityCheck.java
as it did with HW2.
ArrayDictionary
Task: Implement ArrayDictionary.iterator()
.
Take a look at IDictionary
. You may notice that the
interface now requires all subclasses to implement one additional
method: iterator()
.
By implementing this method, we can now (efficiently) iterate over all key-value pairs in our dictionaries! For example, we can now do this:
IDictionary<String, Integer> foo = makeTheDictionary();
for (KVPair<String, Integer> pair : foo) {
String key = pair.getKey();
Integer value = pair.getValue();
// Do something with key and value
System.out.println(key + " : " + value);
}
The above program should print out every key-value pair contained within
foo
.
Notes:
Your iterator()
method may return the key value pairs
in any order you wish. The easiest approach would likely be to
return them in the order your pairs are listed in your internal
array.
You may assume the user will not modify the dictionary while you're iterating over it.
You may NOT create any temporary data
structures such as a temp IList
when implementing
your iterator. We want our iterators to be efficient, and having
to copy the contents of our dictionary to some other data
structure at any point is suboptimal.
You may, however, create new KVPair
objects. (They
do not count as "temporary" data structures).
We have provided you with unit tests for your iterators. You can add more tests if you want.
If you need examples on how to implement an iterator, see
DoubleLinkedList
and the stubs in
ChainedHashDictionary
. Your iterator will look something like this:
private static class ArrayDictionaryIterator<K, V> implements Iterator<KVPair<K, V>> {
// Add any fields you need to store state information
public ArrayDictionaryIterator(/* possible constructor arguments */) {
// Initialize the iterator
}
public boolean hasNext() {
// Implement hasNext
throw new NotYetImplementedException();
}
public KVPair<K, V> next() {
// Return the next KVPair in the dictionary
throw new NotYetImplementedException();
}
}
Make sure to make the class private and static, and make sure you copy how the generics are being defined correctly.
The IDictionary
interface requires that dictionaries
return KVPair
objects. However, your internal dictionary
uses private Pair
objects. You will need to convert
between the two within your iterator—see the method header comments
in KVPair
to understand how to instantiate new
KVPair
objects.
ChainedHashDictionary
Task: Complete the ChainedHashDictionary
class.
In this task, you will implement a hash table that uses separate chaining as its collision resolution strategy.
Notes:
ChainedHashDictionary
takes as input a load
factor lambda at which to resize; for the other constructor, it's up to you to pick a
reasonable default value for lambda.ArrayDictionary
for your internal chains/buckets.ChainedHashDictionary
receives a null key, use a hashcode of 0 for
that key.iterator()
method doesn't need to yield the
pairs in any particular order.Correctly implementing your iterator will be tricky—don't leave it to the last minute!
Try to finish the other methods in ChainedHashDictionary
as soon as possible so
you can move on to implementing iterator()
.
The comments in ChainedHashDictionary
will contain a few hints on how to
implement this method.
In the class when we covered separate chaining hash tables we used LinkedList
as
the chaining data structure. In this task, instead of LinkedList
, you will use your
ArrayDictionary
(from HW2) as the chaining data structure.
When you first create your chains
array, it will contain null pointers. As
key-value pairs are inserted in the table, you need to create the chains
(ArrayDictionary
s) as required. Let's say you created an array of size 5 (you can
create array of any size), and you inserted the key-value pair ("a", 11).
Dictionary<String, Integer> map = new ChainedHashDictionary<>();
map.put("a", 11);
Your hash table should something like the following figure. In this example, the key "a"
lands in index 2, but if might be in a different index depending on your table size. Also, in
this example, ArrayDictionary
(chain) is of size 3, but you can choose a different
size for your ArrayDictionary
.
Now, suppose you inserted a few more keys:
map.put("f", 13);
map.put("c", 12);
Your internal hash table should now look like the figure below. In this example, keys "a" and "f" both hash to the same index (2).
Hash function: Do not try to implement your own hash function. Use the hash
method hashCode()
that Java provides for all classes: so to get the hash of a key,
use keyHash = key.hashCode()
. This method returns an integer, which can be
negative or greater than chains.length
. How would you handle this?
Resizing strategy: Recall that operations on a hash table slow down as the
load factor increases. So you need to decide when to resize (increase) your internal array, by
how much, and actually do the resizing. When resizing your ArrayDictionary
, you
just copied over item from the old array to the new one. Here, how would you move items from one
hash table to another?
getOrDefault
on your IDictionary
s
The IDictionary
interface includes a convenient getOrDefault
method
which will be useful in future assignments. This method either gets the value of a key in
the IDictionary
or returns the given default value if the key is not inside.
IDictionary
provides a default implementation of this method that simply calls
containsKey
; however, this method can be implemented more efficiently inside
ArrayDictionary
and ChainedHashDictionary
, where you have access
to the data structures' internals. (You can also see AVLDictionary
for an
example implementation.) While implementing this method on your data structures is not
strictly required, we recommend doing so, since it will make it easier for your to pass any
efficiency tests later.
ChainedHashSet
Task: Complete the ChainedHashSet
class.
In section c, you implemented the dictionary ADT with a hash table. You can also implement
the set ADT using hash tables. Recall that in sets, the data item to store is just a key, not a
key-value pair. In sets, the operation of interest is contains(key)
, which returns
whether a key is part of the set. Hash tables provide an efficient implementation for this
operation.
Notes:
IDictionary<KeyType, Boolean>
to store items (keys) in your
HashSet
. Since there are no key-values pairs (only keys) in sets, we will
completely ignore the values in your dictionary: use a placeholder boolean whenever
necessary.In this homework assignment, we won't be grading the tests that you write. But, to be thorough and to foster good habits, we strongly encourage you to write additional tests for your code, since they may help you spot different bugs or make you more confident in the correctness of your implementation. (Also remember that we have 'secret' tests that you will be graded on in addition to the provided tests, so it's in your best interest to test more cases.)
For this assignment, you should focus in particular on edge cases. Whenever you see conditional logic in your code (if statements, loop conditions, etc.) you should consider writing a test to check its edge cases. To make sure your tests are truly comprehensive, you should make sure that your test cases end up running every possible conditional branch in your code.
Although you can see the inner workings of your own code, it may sometimes be difficult to write code to check the actual state of your data structures by calling their public methods. Instead, you can test the state of your data structures by accessing your private fields directly; see the blue box below for more details.
Task: Complete a writeup containing answers to the following questions.
Your writeup will be graded partially on whether or not you produced a reasonable answer, and partially based on the clarity of your explanations and analysis. You and your partner will work together on this writeup.
Note: your completed writeup MUST be in PDF format. You will not submit this writeup via Canvas—instead, submit it to Gradescope. Log into Gradescope with your @uw.edu email address. When you submit, make sure to mark which pages correspond to the questions we ask, and after submission the partner who submitted the assignment must add the other partner. That is, please only turn in one submission per group and make sure both group members are assigned to that submission. A video showing how to do this can be found here.
You will run a variety of experiments and report back on the resulting behavior. Be sure to take some time to read the provided code and understand what's going on.
Notes:
WARNING: Unable to attach Serviceability Agent. You can try again with
escalated privileges.
Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 |
sudo tee /proc/sys/kernel/yama/ptrace_scope
You can find the experiments in the analysis.experiments
package.
Run the experiment code. Each Experiment*.java
file should
generate a CSV file in the experimentdata
folder.
We recommend using plot.ly to generate a
plot
from your CSV, but if you prefer you can also use Microsoft Excel, Google Sheets, or
any other
analysis tool of your choice. Generate a plot of the data, and include this
plot as a image inside your writeup.
If the CSV file contains multiple result columns, plot them all
in the same chart.
If you're using plot.ly, here are some basic instructions:
You will be graded based on whether you produce a reasonable plot. In particular, be sure to give your plot a title, label your axes (with units), include a legend if appropriate, and relabel the columns to something more descriptive than "TestResult".
You should have seen some interesting patterns in the plots you made! In this part of the writeup, we want to hear about why you think these things happened in the lens of what we're doing in lecture.
The answers to these questions don't need to be very long; a couple sentences for each is enough.
Task: Submit answers to the following questions.
Each group member should answer and submit these questions independently. These are graded credit/no credit, but we still expect some level of detail regarding your experience. Submit your individual writeups by filling out the form on Canvas.
The following deliverables are due on Wednesday, May 8 at 11:59pm.
Before submitting, be sure to double-check and make sure that...
ArrayDictionary
class is fully repaired and
passes all tests.
ChainedHashDictionary
class is fully
implemented and passes all tests.ChainedHashSet
class is fully
implemented and passes all tests.Submit by pushing your code to GitLab. If you intend to submit late, fill out this late submission form when you submit.
Both partners should turn in their individual writeup by the deadline via Canvas on this page.