CSE 373, Winter 2019: Homework 3: Testing, ChainedHashDictionary, and ChainedHashSet

Overview

In this homework, you will implement a hash dictionary (also known as a hash map) and a hash set. We will be using these two data structures extensively in the next project.

This entire project is due on Friday February 8 Tuesday February 12 at 11:59pm.

You will use these files from your prior assignments

  • main.java.datastructures.concrete.dictionaries.ArrayDictionary.java
  • main.java.datastructures.concrete.DoubleLinkedList.java

If you have chosen a new partner for this assignment, choose either of your submissions from HW2 and verify that these are functioning properly.

You will be modifying the following files:

  • main.java.datastructures.concrete.dictionaries.ArrayDictionary.java
  • main.java.datastructures.concrete.ChainedHashSet.java
  • main.java.datastructures.concrete.dictionaries.ChainedHashDictionary.java

Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):

  • test.java.datastructures.TestChainedHashSet.java
  • test.java.datastructures.dictionaries.TestChainedHashDictionary
  • test.java.datastructures.dictionaries.TestArrayDictionary.java
  • main.java.datastructures.interfaces.IDictionary.java
  • main.java.datastructures.interfaces.ISet.java

Here's another video overview.

Expectations

Here are some baseline expectations we expect you to meet in all projects:

  • Follow the course collaboration policies

  • DO NOT use any classes from java.util.*. There are only two exceptions to this rule:

    1. You may import and use java.util.Iterator and java.util.NoSuchElementException.

    2. You may import and use anything from java.util.* within your testing code.

  • DO NOT make modifications to instructor-provided code (unless told otherwise). If you need to temporarily change our code for debugging, make sure to change it back afterwards.

Section a: Initial setup

  1. Clone the starter code from GitLab and open the project in your IDE. See the instructions from HW0 if you need a reminder on how to do this.

  2. Copy your DoubleLinkedList.java and ArrayDictionary.java files from HW2 to this new one.

    The copied code will not immediately compile. This is ok: we modified IDictionary so it requires you to implement one additional method.

  3. Copy your DoubleLinkedList tests from HW2. You may replace the file provided with the one from HW2, since the one included in HW3 has some outdated test names. Make sure to also copy over the delete tests you wrote from TestDeleteFunctionality.java and TestDeleteStress.java to the new TestDoubleLinkedList.java.

  4. Finally, make sure everything works.

    Try running TestDoubleLinkedList and make sure the tests run.

    Try running SanityCheck.java, and try running Checkstyle. Checkstyle should still report the same 5 errors with SanityCheck.java as it did with HW2.

Section b: Repair ArrayDictionary

Task: implement ArrayDictionary.iterator().

Take a look at IDictionary. You may notice that the interface now requires all subclasses to implement one additional method: iterator().

By implementing this method, we can now (efficiently) iterate over all key-value pairs in our dictionaries! For example, we can now do this:

IDictionary<String, Integer> foo = makeTheDictionary();
for (KVPair<String, Integer> pair : foo) {
    String key = pair.getKey();
    Integer value = pair.getValue();

    // Do something with key and value
    System.out.println(key + " : " + value);
}

The above program should print out every key-value pair contained within foo.

Notes:

  1. Your iterator() method may return the key value pairs in any order you wish. The easiest approach would likely be to return them in the order your pairs are listed in your internal array.

  2. You may assume the user will not modify the dictionary while you're iterating over it.

  3. You may NOT create any temporary data structures such as a temp IList when implementing your iterator. We want our iterators to be efficient, and having to copy the contents of our dictionary to some other data structure at any point is suboptimal.

    You may, however, create new KVPair objects. (They do not count as "temporary" data structures).

  4. We have provided you with unit tests for your iterators. You can add more tests if you want.

  5. If you need examples on how to implement an iterator, see DoubleLinkedList and the stubs in ChainedHashDictionary. Your iterator will look something like this:

    private static class ArrayDictionaryIterator<K, V> implements Iterator<KVPair<K, V>> {
        // Add any fields you need to store state information
    
        public ArrayDictionaryIterator(/* possible constructor arguments */) {
            // Initialize the iterator
        }
    
        public boolean hasNext() {
            // Implement hasNext
            throw new NotYetImplementedException();
        }
    
        public KVPair<K, V> next() {
            // Return the next KVPair in the dictionary
            throw new NotYetImplementedException();
        }
    }

    Make sure to make the class private and static, and make sure you copy how the generics are being defined correctly.

  6. The IDictionary interface requires that dictionaries return KVPair objects. However, your internal dictionary uses private Pair objects. You will need to convert between the two within your iterator—see the method header comments in KVPair to understand how to instantiate new KVPair objects.

Section c: Implement ChainedHashDictionary

Task: complete the ChainedHashDictionary class.

In this task, you will implement a hash table that uses separate chaining as its collision resolution strategy.

Notes:

  • You may implement any resizing strategy covered in lecture.
  • Use ArrayDictionary for your internal chains/buckets.
  • If your ChainedHashDictionary receives a null key, use a hashcode of 0 for that key.
  • Your iterator() method doesn't need to yield the pairs in any particular order.

Correctly implementing your iterator will be tricky—don't leave it to the last minute! Try to finish the other methods in ChainedHashDictionary as soon as possible so you can move on to implementing iterator().

The comments in ChainedHashDictionary will contain a few hints on how to implement this method.

Additional hints

In the class when we covered separate chaining hash tables we used LinkedList as the chaining data structure. In this task, instead of LinkedList, you will use your ArrayDictionary (from HW2) as the chaining data structure.

When you first create your chains array, it will contain null pointers. As key-value pairs are inserted in the table, you need to create the chains (ArrayDictionarys) as required. Let's say you created an array of size 5 (you can create array of any size), and you inserted the key-value pair ("a", 11).

Dictionary<String, Integer> map = new ChainedHashDictionary<>();
map.put("a", 11);

Your hash table should something like the following figure. In this example, the key "a" lands in index 2, but if might be in a different index depending on your table size. Also, in this example, ArrayDictionary (chain) is of size 3, but you can choose a different size for your ArrayDictionary.

ChainedHashDictionary internal state 1

Now, suppose you inserted a few more keys:

map.put("f", 13);
map.put("c", 12);

Your internal hash table should now look like the figure below. In this example, keys "a" and "f" both hash to the same index (2).

ChainedHashDictionary internal state 2

Hash function: Do not try to implement your own hash function. Use the hash method hashCode() that Java provides for all classes: so to get the hash of a key, use keyHash = key.hashCode(). This method returns an integer, which can be negative or greater than chains.length. How would you handle this?

Resizing strategy: Recall that operations on a hash table slow down as the load factor increases. So you need to decide when to resize (increase) your internal array, by how much, and actually do the resizing. When resizing your ArrayDictionary, you just copied over item from the old array to the new one. Here, how would you move items from one hash table to another?

A remark about testing private fields

In the previous project, you may have noticed that it can be difficult to verify the correctness of internal fields in your data structures. We want this internal state to be private to maintain the abstraction in our ADTs (and in other classes as well), and in theory, the state of these private fields shouldn't matter as long as the public interface works properly. However, in practice, we often know that a bad internal state will lead to errors later—for example, a bad linked list pointer always has the potential to cause issues—but it may take a lengthy or specific sequence of operations to actually cause that error in a test.

In these cases, we would often rather find some way to directly inspect this private state. There are a few methods of testing internal state, each with pros and cons:

  • Just test internal state indirectly. This is what we did in the previous assignment; in many cases, it's the most natural method, since the public interface is all that matters in the end, but it can be difficult if the private state is complex, as with many data structures.
  • Move your tests so that they can access the internal state, for example, by moving the tests into the class being tested. This is a fairly simple solution, but increases clutter in the code. Additionally, in this class in particular, we need to keep your main code and tests separate for grading purposes.
  • Increase visibility of your internal state, essentially making your state not internal. Again, simple, but it removes abstraction and defeats the purpose of having different access levels in the first place. This also will not work with our test-grading methods.
  • Use reflection. Reflection refers to the act of a program inspecting or changing its own code at runtime. Java's reflection API allows us to access private fields or methods of an object by name, allowing us to bypass their private nature. However, using reflection can make our tests harder to write and read, and doesn't work well with IDEs and compiler type checks (since it only works at runtime).

We recommend using this last method: reflection. However, you need to be careful about where you use it.

  • DO NOT use reflection in tests that will be graded. Reflection (and really all the methods mentioned other than the first) is tied heavily to your own implementation, meaning that it is unlikely to work with other implementations of the class, including as the ones we use during grading.
  • DO NOT use reflection outside of tests. None of the assignments in this class require usage of reflection.

We've included a helper method in BaseTest.java for this assignment that can return the contents of a private field of an object given the field's name and type. We've also included an example usage of the method in TestChainedHashDictionary.java, since it's not obvious how to pass it a type. You may find it useful if for example you wish to add a test to check that your chains array is resizing properly.

Section d: Implement ChainedHashSet

Task: complete the ChainedHashSet class.

In section c, you implemented the dictionary ADT with a hash table. You can also implement the set ADT using hash tables. Recall that in sets, the data item to store is just a key, not a key-value pair. In sets, the operation of interest is contains(key), which returns whether a key is part of the set. Hash tables provide an efficient implementation for this operation.

Notes:

  • To avoid code duplication, we will use an internal dictionary of type IDictionary<KeyType, Boolean> to store items (keys) in your HashSet. Since there are no key-values pairs (only keys) in sets, we will completely ignore the values in your dictionary: use a placeholder boolean whenever necessary.
  • Your code for this class should be very simple: your inner dictionary should be doing most of the work.

Section e: Complete writeup

Task: submit answers to the following questions.

Each group member should answer and submit these questions independently. These are graded credit/no credit, but we still expect some level of detail regarding your experience. Submit your individual writeups (in plaintext .txt) at this Canvas page. Use this template to write your individual writeup.

  1. Questions about the homework:
    1. What is your name and your partner's name?
    2. How was the homework? Did you enjoy it? Whats parts did you enjoy or not enjoy?
    3. Do you have any suggestions for improving the homework?
  2. Questions about your partner (skip if you worked solo):
    1. Which parts of the homework did you work on, and which parts did your partner program?
    2. How was your partnership? Do you feel you and your partner contributed equally to this homework?

Deliverables

The following deliverables are due on Friday February 8 Tuesday February 12 at 11:59pm.

Before submitting, be sure to double-check and make sure that...

  • Your ArrayDictionary class is fully repaired and passes all tests.
  • Your ChainedHashDictionary class is fully implemented and passes all tests.
  • Your ChainedHashSet class is fully implemented and passes all tests.
  • You ran the checkstyle plugin within your IDE and fixed all style errors.

Submit by pushing your code to GitLab. If you intend to submit late, fill out this late submission form when you submit.

Both partners should turn in their individual writeup (section e) by the deadline via Canvas at this page.