Homework 4: Testing, ChainedHashMap, and ChainedHashSet

Overview

In this homework, you will write some black box test cases and implement a hashmap and a hashset. We will be using these two data structures extensively in the upcoming project.

You will use these files from your prior assignments

main.java.datastructures.concrete.dictionaries.ArrayDictionary.java
main.java.datastructures.concrete.DoubleLinkedList.java

If you have chosen a new partner for this assignment, choose either of your submissions from HW2 and verify that these are functioning properly.

You will be modifying the following files:

main.java.datastructures.concrete.dictionaries.ArrayDictionary.java
main.java.datastructures.concrete.ChainedHashSet.java
main.java.datastructures.concrete.dictionaries.ChainedHashDictionary.java
test.java.datastructures.TestAvlTree.java

Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):

test.java.datastructures.TestChainedHashSet.java
test.java.datastructures.dictionaries.TestChainedHashDictionary
test.java.datastructures.dictionaries.TestArrayDictionary.java
main.java.datastructures.interfaces.IDictionary.java
main.java.datastructures.interfaces.ISet.java
main.java.datastructures.concrete.BST.java

This is a team assignment. This entire project is due on Tuesday October 30 11:59pm.

When working on this assignment, we expect you meet these baseline expectations at all times:

Expectations

Here are some baseline expectations we expect you to meet:

Follow the course collaboration policies
DO NOT use any classes from java.util.*. There are only two exceptions to this rule:
1. You may import and use java.util.Iterator and java.util.NoSuchElementException.
2. You may import and use anything from java.util.* within your testing code.
DO NOT modify instructor-provided code (unless told otherwise)

Notes:

The gitlab pipeline status for this homework reflects the status of all the gradle jobs. That means, you will see a green checkmark next to your commit only if your code

compiles successfully (compile job),
passes all the tests (test jobs), and
passes style check (stylecheck job).

If any one of these jobs/tests fail, you will see a red cross next to your commit. (In HW2, the overall pipeline status only reflected status of the compile job.)

You can see the status of individual jobs by visiting the pipeline view and can see the output of a job by clicking on the job. Here is a gif that shows how to navigate to see status for individuals jobs in a pipeline.

(a) Initial Setup

Import the project from GitLab in your IDE. See the instructions from HW2 if you need a reminder on how to do this.
Copy your DoubleLinkedList.java and ArrayDictionary.java files from HW2 to this new one.

The copied code will not immediately compile. This is ok: we modified IDictionary so it requires you to implement one additional method.
Copy any of the delete tests you wrote from TestDeleteFunctionality.java and TestDeleteStress.java to the new TestDoubleLinkedList.java.

Be sure not to overwrite the entire TestDoubleLinkedList class: we gave you a slightly modified one containing a few extra tests.
Finally, make sure everything works.

Try running TestDoubleLinkedList and make sure the tests run.

Try running SanityCheck.java, and try running checkstyle. Checkstyle should still report the same 5 errors with SanityCheck.java as it did with HW2.

(b) Testing

Task: Write black box test cases for an isValidAVL() method

Recall from the class, black box tests are test cases that you write when you do not have access to the implementation, but you know the expected behavior of the code. In this part of the homework, your task is to write black box tests to test 11 different implementations of isValidAVL() method. Out of the 11 implementation, only 1 is correct and other 10 are buggy implementations. At least one of your tests should catch (fail for) the buggy implementations and all of your tests should pass for the correct implementation.

Take a look at TestAvlTree to see how you'll be writing tests for this task. TestAvlTree contains two sample tests. The tests that you'll write should be similar to these sample tests, except that you would change the BST tree, testing different BSTs. For this task you only need to modify test.java.datastructures.TestAvlTree.java.

Note on scoring and testing:

There are 11 implementations that we'll test with your tests. You earn 1 point if your tests catch a buggy implementation, i.e., if one of your tests fail when we run your tests against a buggy implementation.
But if your tests (one or more tests) fail when we run your tests against a correct implementation, you lose all points for this part of the homework. If tests fail for a correct implementation, those are just bad tests.
To check how well your tests do when run against the 11 implementations, push your commit to gitlab and check the gitlab pipeline job IsValidAVL. The console output of the pipeline shows the result of running your tests against our implementations. The test scripts prints your score (out of 11) for your tests. If you try to run any of your tests in TestAvlTree locally, they will fail because the isValidAVL() method we provided in BST always returns false. On the gitlab runner side, we replace isValidAVL() with one of the 11 implementations and then run all your tests. So to run your tests in TestAvlTree, push commit to gitlab and check the runner output.
What to test? Try testing smaller tree, larger trees, null trees, trees with negative values, etc.

(c) Repair ArrayDictionary

Task: implement ArrayDictionary.iterator().

Take a look at IDictionary. You may notice that the interface now requires all subclasses to implement one additional method: iterator().

By implementing this method, we can now (efficiently) iterate over all key-value pairs in our dictionaries! For example, we can now do this:

IDictionary<String, Integer> foo = makeTheDictionary();
for (KVPair<String, Integer> pair : foo) {
    String key = pair.getKey();
    Integer value = pair.getValue();

    // Do something with key and value
    System.out.println(key + " : " + value);
}

The above program should print out every key-value pair contained within foo.

Notes:

Your iterator() method may return the key value pairs in any order you wish. The easiest approach would likely be to return them in the order your pairs are listed in your internal array.
You may assume the user will not modify the dictionary while you're iterating over it.
You may NOT create any temporary data structures such as a temp IList when implementing your iterator. We want our iterators to be efficient, and having to copy the contents of our dictionary to some other data structure at any point is suboptimal.

You may, however, create new KVPair objects. (They do not count as "temporary" data structures).
We have provided you with unit tests for your iterators. You can add more tests if you want.
If you need examples on how to implement an iterator, see DoubleLinkedList and the stubs in ChainedHashDictionary. Also see the ArrayList iterator we implemented together at the start of the quarter.
The IDictionary interface requires that dictionaries return KVPair objects. However, your internal dictionary uses private Pair objects. You will need to convert between the two within your iterator – see the method header comments in KVPair to understand how to instantiate new KVPair objects.
When creating the iterator, you will need to create a private static inner class to ArrayDictionary. Your class should look roughly like this:
```
private static class ArrayDictionaryIterator<K, V> implements Iterator<KVPair<K, V>> {
    // ...
}
```
Make sure to make the class private and static, and make sure you copy how the generics are being defined correctly.

(d) Implement ChainedHashDictionary

Task: complete the ChainedHashDictionary class.

In this task, you will implement a hash table that uses separate chaining as its collision resolution strategy.

Notes:

You may implement any resizing strategy covered in lecture.
Use ArrayDictionary for your internal chains/buckets.
If your ChainedHashDictionary receives a null key, use a hashcode of 0 for that key.
Your iterator() method doesn't need to yield the pairs in any particular order.

Correctly implementing your iterator will be tricky – don't leave it to the last minute! Try and finish the other methods in ChainedHashDictionary as soon as possible so you can move on to implementing iterator().

The comments in ChainedHashDictionary will contain a few hints on how to implement this method.

Hints (Added 10/28)

In the class when we covered separate chaining hash tables we used LinkedList as the chaining data structure. In this task, instead of LinkedList, you will use your ArrayDictionary (from HW2) as the chaining data structure.

When you first create your chains array, it will contain null pointers. As key-value pairs are inserted in the table, you need to create the chains (ArrayDictionary) as required. Let's say you created an array of size 5 (you can create array of any size), and you inserted the key-value pair ("a", 11).

Dictionary<String, Integer> map = new ChainedHashDictionary<>();
map.put("a", 11);

Your hash table should something like the following figure. In this example, the key "a" lands in index 2, but if might be in a different index depending on your table size. Also, in this example, ArrayDictionary (chain) is of size 3, but you can choose a different size for your ArrayDictionary.

ChainedHashDictionary internal state 1

Now, suppose you inserted a few more keys:

map.put("f", 13);
map.put("c", 12);

Your internal hash table should now look like the figure below. In this example, keys "a" and "f" both hash to the same index (2).

ChainedHashDictionary internal state 2

Hash function: Do not try to implement your own hash function. Use the hash method hashCode() that Java provides for all classes: so to get the hash of a key, use keyHash = key.hashCode(). This method returns an integer, which can be negative or greater than chains.length. How would you handle this?

Resizing strategy: Recall that operations on a hash table slow down as the load factor increases. So you need to decide when to resize (increase) your internal array, by how much, and actually do the resizing. When resizing your ArrayDictionary, you just copied over item from the old array to the new one. Here, how would you move items from one hash table to another?

(e) Implement ChainedHashSet

Task: complete the ChainedHashSet class.

In task (d) you implemented the Dictionary ADT with a hash table. You can also implement the Set ADT with hash tables. Recall that in Sets, the data item to store is just a key, not a key-value pair. In Sets, the operation of interest is contains(key) that is to know whether a key is part of the Set. And hash tables provide an efficient implementation for this Set operation.

Notes:

To avoid code duplication, we will use an internal dictionary of type IDictionary<KeyType, Boolean> to store items (keys) in your HashSet. Since there are not key-values pairs (only keys) in Sets, we will completely ignore the values in your dictionary: use a placeholder boolean whenever necessary.
Your code for this class should be very simple: your inner dictionary should be doing most of the work.

(f) Complete writeup

Task: submit answers to the following questions.

Each group member should answer and submit these questions independently. Submit your individual writeups (in plaintext .txt) at this canvas page. Use this template to write your individual writeup.

Questions about the homework:
1. What is your name and your partner's name?
2. How was the homework? Did you enjoy it?
3. Do you have any suggestions for improving the homework?
Questions about your partner (skip if you worked solo):
1. Which parts of the homework did you work on, and which parts did your partner program?
2. How was your partnership? Do you feel you and your partner contributed equally to this homework?

Deliverables

The following deliverables are due on Tuesday October 30 11:59pm.

Before submitting, be sure to double-check and make sure that...

Your TestAvlTree class contains tests and they pass for all implementations that we test with your tests.
Your ArrayDictionary class is fully repaired and passes all tests.
Your ChainedHashDictionary class is fully implemented and passes all tests.
Your ChainedHashSet class is fully implemented and passes all tests.
You ran the checkstyle plugin within Eclipse and fixed all style errors.

The code submission is via GitLab. Make sure all your changes are pushed to GitLab when you are ready to submit. We'll take a snapshot of all repositories before the deadline. If you intend to submit late, fill this late submission form before the deadline.

Both partners should turn in their individual writeup (f) by the deadline, and submit it via canvas.

CSE 373, Autumn 2018: Homework 4: Testing, ChainedHashMap, and ChainedHashSet

Overview

Expectations

Notes:

(a) Initial Setup

(b) Testing

(c) Repair ArrayDictionary

(d) Implement ChainedHashDictionary

Hints (Added 10/28)

(e) Implement ChainedHashSet

(f) Complete writeup

Deliverables