Homework 2: Part 1

Summary

Due Friday January 25 at 11:59pm.

Submit by pushing your code to GitLab. If you intend to submit late, fill out this late submission form when you submit.

In this half of the assignment, you will implement the data structures you will be using on the second half.

You will be modifying the following files:

main.java.datastructures.concrete.DoubleLinkedList.java
test.java.datastructures.TestDeleteFunctionality.java
test.java.datastructures.TestDeleteStress.java
main.java.datastructures.concrete.dictionaries.ArrayDictionary.java

Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):

test.java.datastructures.TestDoubleLinkedList.java
main.java.datastructures.interfaces.IList.java
test.java.datastructures.TestArrayDictionary.java
main.java.datastructures.interfaces.IDictionary.java
main.java.analysis.experiments.*

In particular, the experiments are a portion of the writeup due with part 2 of this project, but rely on the data structures in this portion. Note that the most important information for the programming assignments (the expected input and output, and any special cases) tend to be found in the files themselves. This page serves as an general guide to the assignment; the details tend to be found more often within the files themselves.

Where are the files?

The structure of our projects are much more complex than the assignments in CSE 142 and 143; it might be hard to even find the files at first! Let's take a look at how you would find the file test.java.datastructures.TestDeleteFunctionality.java to start.

First, all of the packages and classes we handle start off in the src directory of the provided project folder.
The test at the start of the package statement signifies that you need to go into the test folder of the src directory. This is the folder where all the tests should live.
Similarly, we will go into the java directory, due to the .java.
datastructures will be the last directory to go into before you can open the file, which is...
TestDeleteFunctionality.java!

Note that you've interacted with files like this before; for example, when using a Scanner in CSE 142, you had to start off your code with

import java.util.Scanner;

Which tells the compiler that the Scanner object exists within the java.util package. Similarly,

import java.util.*;

will import all classes within the java.util package (but note that this doesn't extend to other related packages; java.util.stream might seem to imply that it is a 'subpackage' or something of the sort, but it is simply named that way to signify that it is related to that package).

We recommend reading through this entire assignment page before you begin writing code. We also have a video overview for this part with some additional tips for after you read the assignment, but before you start coding.

Part 1a: Implement DoubleLinkedList

Task: complete the DoubleLinkedList class.

A doubly-linked list is a similar to the singly-linked lists you studied in CSE 143 except in two crucial ways: your nodes now have pointers to both the previous and next nodes, and your linked list class has now have pointers to both the front and back of your sequence of list node objects.

Visually, the singly linked lists you studied in CSE 143 look like this:

Singly linked list diagram

Doubly-linked lists containing the same data will look like this:

Doubly linked list diagram

Your implementation should:

Be generic (e.g. you use generics to let the users store objects of any types in your list)
Implement the IList interface. This means you will be using this file extensively as a template for what your code will do.
Be as asymptotically efficient as possible.
Contain exactly as many node objects as there are items in the list. (If the user inserts 5 items, you should have 5 nodes in your list).

Warning: correctly implementing a doubly-linked list will require you to pay careful attention to edge cases. Some tips and suggestions:

Think carefully about the end cases (front and back) and what should happen when the list is empty or nearly empty.
Write pseudocode for your methods before writing code. Avoid immediately thinking in terms of listnode manipulation – instead, come up with a high-level plan and write helper methods that abstract your node manipulations. Then, flesh out how each helper method will work.

Or to put it another way, figure out how to refactor your code before you start writing it. Your code will be significantly less buggy that way.

What is an iterator?

When implementing DoubleLinkedList, you will also need to implement an iterator for the class.

You should have studied iterators in CSE 143, and we should have (briefly) covered them in lecture, but here is a review of what iterators are in case you need it:

An iterator object is a kind of object that lets the client efficiently iterate over a data structure using a foreach loop.

Whenever we do something like:

for (String item : something) {
    // ...etc...
}

...Java will internally convert that code into the following:

Iterator<String> iter = something.iterator();
while (iter.hasNext()) {
    String item = iter.next();
    // ...etc...
}

When you call iter.next() for the first time, the iterator will return the first item in your list. If you call iter.next() again, it'll return the second item. Once the user calls iter.next() enough time and encounters the last item in the list, calling iter.next() once again should throw an NoSuchElementException.

The iter.hasNext() method will return true if calling iter.next() will safely return a value, and false otherwise.

You can see an example of this expected behavior within your tests.

Notice that the iterator is behaving somewhat similar to a Scanner, except that it's iterating over a data structure instead of a String or file.

In practice, iterators can also be used to safely modify the object they're iterating over. We will not be implementing this functionality in this class: you should assume the client will never modify a data structure while they're iterating over it.

Part 1b: Write missing tests

Task: Modify TestDeleteFunctionality, and TestDeleteStress and add in tests for the delete method.

In part 1b, you will practice writing some unit tests using JUnit.

Start by skimming through TestDoubleLinkedList.java and familiarize yourself with the tests we have given you. Since this is the first assignment, we've given you most of the tests you need, except for a few. Can you see what tests are missing?

There are no tests for the DoubleLinkedList.delete(...) method! Your job here is to write tests for this method.

A few additional notes:

To help facilitate grading, we ask that you split your tests into two groups. All tests that check and make sure your delete(...) behaves correctly and matches the IList specification should be placed within the TestDeleteFunctionality file.

Please keep all tests within TestDeleteFunctionality short. Every test you add to this file should have a timeout of a second or less. (You're encouraged to add as many tests as you want, however.)

Add any stress tests to TestDeleteStress. Stress tests are tests that either (a) focus on testing to make sure your code is efficient or (b) focus on testing to make sure your code is correct when it's asked to handle large amounts of data.

Please see the existing stress tests in TestDoubleLinkedList to get a sense of what sorts of timeouts and list sizes you should be using.
Your stress tests do not need to be particularly complicated. We'll be assessing your tests by copying them into many different projects that contain deliberately buggy implementations of delete(...). Most of these implementations will contain actual bugs that can be caught by your TestDeleteFunctionality tests; only one or two of them will be functionally correct yet grossly inefficient.
You may find the following documents from the resources page to be particularly useful:
1. JUnit tutorial
2. Tips on testing code

Part 1c: Implement ArrayDictionary

Task: complete the ArrayDictionary class.

Your ArrayDictionary class will internally keep track of its key-value pairs by using an array containing Pair objects.

Dictionary<String, Integer> foo = new ArrayDictionary<>();
foo.put("a", 11);
foo.put("b", 22);
foo.put("c", 33);

Your internal array should look like the following:

ArrayDictionary internal state 1

Now, suppose the user inserted a few more keys:

foo.put("d", 44);
foo.remove("b");
foo.put("a", 55);

Your internal array should now look like the below. Notice that we've updated the old key-value pair for "a" to store the new value. We've also removed the key-value pair for "b".

ArrayDictionary internal state 2

This means you will need to scan through the internal array when retrieving, inserting, or deleting elements. If your array is full and the user inserts a new key, create a new array that is double the size of the old one and copy over the old elements. Minor design decisions, like the initial capacity of the array, are left up to you; choose something that reasonable and adjust if it seems necessary.

Once completed, the design and logic of your ArrayDictionary should feel very similar to the ArrayIntList objects you previously studied in CSE 143.

There is one general optimization we will have you implement. Because the values in the dictionary are inherently unordered, we can use this to our benefit in the remove method. Instead of shifting over all the elements as you would normally need to do to remove from an array, you should instead just replace the value stored at the index containing the element to be removed to be the last pair currently in the ArrayDictionary. Here is an example of what your internal representation may look like before, during, and after a single call to dict.remove("a").

ArrayDictionary internal state during remove

This seems inefficient...?

If you finished early...

If you finished early, you may want to start on the writeup portion of this project (see the homework 2, part 2 spec for more details). While the writeup is not due this week, it contains questions that focus exclusively on material from part 1.

You may find working on the writeup to be a good way of helping you determine whether your data structures are designed soundly.

Deliverables

The following deliverables are due on Friday January 25 at 11:59pm.