CSE 373, Spring 2019: HW2 Part 1

Summary

Due Wednesday, April 17 at 11:59pm.

Submit by pushing your code to GitLab. If you intend to submit late, fill out this late submission form when you submit.

In this half of the assignment, you will implement the data structures you will be using for the rest of the quarter in later projects. For this reason, it's very important in this part of the project to make sure you work out all of the potential bugs you might have in your code—oversights in this part may end up causing bugs for the entire rest of the course.

You will be modifying the following files:

  • src/main/java/datastructures/concrete/DoubleLinkedList.java
  • src/test/java/datastructures/TestDoubleLinkedListDelete.java
  • src/main/java/datastructures/concrete/dictionaries/ArrayDictionary.java

Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):

  • src/test/java/datastructures/TestDoubleLinkedList.java
  • src/test/java/datastructures/BaseTestDoubleLinkedList.java
  • src/main/java/datastructures/interfaces/IList.java
  • src/test/java/datastructures/dictionaries/TestArrayDictionary.java
  • src/test/java/datastructures/dictionaries/BaseTestDictionary.java
  • src/main/java/datastructures/interfaces/IDictionary.java
  • src/main/java/analysis/experiments/*

WARNING: For all assignments, you should modify only the files we explicitly list at the top of the assignment page as needing modifications. All changes in other files will be ignored during grading. If you accidentally change one of these files, your code may not compile when we're grading, in which case we may not grade your code, and you'll get a zero on the assignment. If you need to temporarily change one of these files for debugging, make sure you revert your changes afterwards.

Also, absolutely DO NOT change the gitlab-ci.yaml config file. This file affects what the GitLab runners run; you have no reason to change it, and if you do, you may lose the ability to get feedback from the runners. Also don't change the build.gradle file, since your IDE needs it to check project dependencies.

Note that the most important information for the programming assignments (the expected input and output, and any special cases) tends to be found in the files themselves. This page serves as a general guide to the assignment.

Where are the files?

The structure of our projects is much more complex than that of the assignments in CSE 142 and 143; it might be hard to even find the files at first! Let's take a look at how you would find the file test.java.datastructures.TestDoubleLinkedListDelete.java to start.

  1. First, all of the packages and classes we handle start off in the src directory of the provided project folder.
  2. The test at the start of the package statement signifies that you need to go into the test folder of the src directory. This is the folder where all the tests should live.
  3. Similarly, we will go into the java directory, due to the .java.
  4. datastructures will be the last directory to go into before you can open the file, which is...
  5. TestDoubleLinkedListDelete.java!

Note that you've interacted with files like this before; for example, when using a Scanner in CSE 142, you had to start off your code with

import java.util.Scanner;

This tells the compiler that the Scanner class exists within the java.util package. Similarly,

import java.util.*;

will import all classes within the java.util package (but note that this doesn't extend to other related packages; java.util.stream might seem to imply that it is a 'subpackage' or something of the sort, but it is simply named that way to signify that it is related to that package).
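As a small illustration of the point above, the sketch below uses classes from both java.util and java.util.stream, so it needs an import for each package. The class name ImportDemo and the method lengths are hypothetical and are not part of the provided project.

```java
import java.util.*;                  // covers java.util.List, java.util.Arrays, ...
import java.util.stream.Collectors;  // NOT covered by java.util.*; needs its own import

// Hypothetical example class; not part of the assignment's starter code.
class ImportDemo {
    // Returns the length of each word, using classes from both packages.
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("a", "bb", "ccc"))); // prints [1, 2, 3]
    }
}
```

Removing the second import should cause a compile error on Collectors, even though the wildcard import is still present.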

We recommend reading through this entire assignment page before you begin writing code. We also have a video overview for this part with some additional tips for after you read the assignment, but before you start coding.

Part 1a: Implement DoubleLinkedList

Task: Complete the DoubleLinkedList class.

A doubly-linked list is similar to the singly-linked lists you studied in CSE 143 except in two crucial ways: your nodes now have pointers to both the previous and next nodes, and your linked list class now has pointers to both the front and back of your sequence of list node objects.

Visually, the singly linked lists you studied in CSE 143 look like this:

Singly linked list diagram

Doubly-linked lists containing the same data will look like this:

Doubly linked list diagram
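To make the two-way pointers concrete, here is a minimal sketch of what a doubly-linked node could look like. All names here (SketchNode, SketchLinks, link) are hypothetical; the node class in the starter code may be named and structured differently, and this is not a complete list implementation.

```java
// Hypothetical node class for illustration; the starter code's node may differ.
class SketchNode<T> {
    T data;
    SketchNode<T> prev; // null for the first node in the list
    SketchNode<T> next; // null for the last node in the list

    SketchNode(T data) {
        this.data = data;
    }
}

class SketchLinks {
    // Links b directly after a, updating the pointers in both directions.
    static <T> void link(SketchNode<T> a, SketchNode<T> b) {
        a.next = b;
        b.prev = a;
    }

    public static void main(String[] args) {
        SketchNode<String> front = new SketchNode<>("x");
        SketchNode<String> back = new SketchNode<>("y");
        link(front, back);
        System.out.println(front.next.data); // prints y
        System.out.println(back.prev.data);  // prints x
    }
}
```

Keeping both assignments together in one helper is one way to avoid updating next while forgetting prev.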

Your implementation should:

  1. Be generic (i.e., use generics to let users store objects of any type in your list)
  2. Implement the IList interface. This means you will be using this file extensively as a template for what your code will do.
  3. Be as asymptotically efficient as possible.
  4. Contain exactly as many node objects as there are items in the list. (If the user inserts 5 items, you should have 5 nodes in your list).

Warning: correctly implementing a doubly-linked list will require you to pay careful attention to edge cases. Some tips and suggestions:

  • Think carefully about the end cases (front and back) and what should happen when the list is empty or nearly empty.
  • Write pseudocode for your methods before writing code. Avoid immediately thinking in terms of list node manipulation – instead, come up with a high-level plan and write helper methods that abstract your node manipulations. Then, flesh out how each helper method will work.

    Or to put it another way, figure out how to refactor your code before you start writing it. Your code will be significantly less buggy that way.

  • Keep in mind the differences between objects and primitives (int, double, etc). This will come up in two ways: one, you'll need to remember that changing an object might change the reference in another place (Check the Week 1 QuickCheck for an example), and two, you'll need to remember to use == to compare equality for primitives and nulls, and use .equals() for object comparisons. Tip: you may use java.util.Objects to handle your equality checks with possibly-null values instead of juggling the == and .equals() yourself. Here's documentation for the relevant method.
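The equality tip above can be illustrated with a short sketch. Objects.equals is a real method in java.util.Objects; the class EqualityDemo and the helper sameItem are hypothetical names used only for this example.

```java
import java.util.Objects;

// Hypothetical example class; not part of the assignment's starter code.
class EqualityDemo {
    // Null-safe equality: true when both are null, or when a.equals(b).
    static boolean sameItem(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String s1 = new String("hi");
        String s2 = new String("hi");
        System.out.println(s1 == s2);             // false: two distinct objects
        System.out.println(sameItem(s1, s2));     // true: same contents
        System.out.println(sameItem(null, null)); // true
        System.out.println(sameItem(null, s1));   // false, with no NullPointerException
    }
}
```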

What is an iterator?

When implementing DoubleLinkedList, you will also need to implement an iterator for the class.

You should have studied iterators in CSE 143, and we should have (briefly) covered them in lecture, but here is a review of what iterators are in case you need it:

An iterator object is a kind of object that lets the client efficiently iterate over a data structure using a foreach loop.

Whenever we do something like:

for (String item : something) {
    // ...etc...
}

...Java will internally convert that code into the following:

Iterator<String> iter = something.iterator();
while (iter.hasNext()) {
    String item = iter.next();
    // ...etc...
}

When you call iter.next() for the first time, the iterator will return the first item in your list. If you call iter.next() again, it'll return the second item. Once the user has called iter.next() enough times to reach the last item in the list, calling iter.next() once more should throw a NoSuchElementException.

The iter.hasNext() method will return true if calling iter.next() will safely return a value, and false otherwise.

You can see an example of this expected behavior within your tests.

Notice that the iterator behaves somewhat similarly to a Scanner, except that it's iterating over a data structure instead of a String or file.

In practice, iterators can also be used to safely modify the object they're iterating over. We will not be implementing this functionality in this class: you should assume the client will never modify a data structure while they're iterating over it.
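The hasNext()/next() contract described in this section can be sketched with an iterator over a plain array. This is only an illustration under that assumption: the class name ArrayWalker is hypothetical, and an iterator for a linked list would track a node rather than an index.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical iterator over an array, shown only to illustrate the contract.
class ArrayWalker<T> implements Iterator<T> {
    private final T[] items;
    private int index; // position of the next item to return

    ArrayWalker(T[] items) {
        this.items = items;
        this.index = 0;
    }

    @Override
    public boolean hasNext() {
        // True exactly when next() can safely return a value.
        return index < items.length;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T item = items[index];
        index++;
        return item;
    }
}
```

Note that next() checks hasNext() itself, so a client that forgets to call hasNext() still gets the documented exception instead of an out-of-bounds error.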

Part 1b: Write missing tests

Task: Add tests for the delete method to TestDoubleLinkedListDelete.

In part 1b, you will practice writing some unit tests using JUnit.

Start by skimming through TestDoubleLinkedList.java and familiarize yourself with the tests we have given you. Since this is the first assignment, we've given you most of the tests you need, except for a few. Can you see what tests are missing?

There are no tests for the DoubleLinkedList.delete(...) method! Your job here is to write tests for this method.

You are responsible for writing tests that ensure your delete method is fully working. The ability to create a comprehensive set of tests for a given piece of code is a foundational skill in programming. To do this you'll want to think about all the different ways your code could be broken. Often it is helpful to think of these possible bugs in categories:

  • Expected Behavior: the obvious or most common uses of your code should function correctly
  • Edge Cases: your code should function properly at the edges of expected behavior such as interacting with first or last elements.
  • Empty Case: your objects should behave properly when they are initially empty or become empty due to usage
  • Invalid Input: your code should properly protect itself from improper usage, such as bad inputs or operations that are not valid in the object's current state
  • Scale: your code should be efficient enough to perform appropriately when tasked with larger sets of data or invocations

To help you out this first time, here are some cases you will need to write tests for specific to your delete method. Turn each of the following cases into its own JUnit test method.

Whenever you want to check the state of the list, you'll need to call its methods and use one of the JUnit assert methods to check whether our expected value matches the actual value the list returns. When implemented as intended, each of the cases below (except for the invalid input case) should have at least one assert statement. It's also important to think about how each test case affects different aspects of the state of the list to ensure that your tests check all the necessary methods of the list.

Note: this is not a comprehensive list of cases, so you should add more tests as you think of them.

[redacted]

A few additional notes:

Part 1c: Implement ArrayDictionary

Task: Complete the ArrayDictionary class.

Your ArrayDictionary class will internally keep track of its key-value pairs by using an array containing Pair objects.

IDictionary<String, Integer> foo = new ArrayDictionary<>();
foo.put("a", 11);
foo.put("b", 22);
foo.put("c", 33);

Your internal array should look like the following:

ArrayDictionary internal state 1

Now, suppose the user inserted a few more keys:

foo.put("d", 44);
foo.remove("b");
foo.put("a", 55);

Your internal array should now look like the below. Notice that we've updated the old key-value pair for "a" to store the new value. We've also removed the key-value pair for "b".

ArrayDictionary internal state 2

This means you will need to scan through the internal array when retrieving, inserting, or deleting elements. If your array is full and the user inserts a new key, create a new array that is double the size of the old one and copy over the old elements. Minor design decisions, like the initial capacity of the array, are left up to you; choose something that seems reasonable and adjust if necessary.

Once completed, the design and logic of your ArrayDictionary should feel very similar to the ArrayIntList objects you previously studied in CSE 143.
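The resizing step described above (allocate an array twice as large, then copy the old elements over) can be sketched as follows. The sketch uses a String array rather than Pair objects, and the names GrowDemo and doubled are hypothetical; it assumes the old array has a capacity of at least 1.

```java
import java.util.Arrays;

// Hypothetical helper, shown on a String[] instead of an array of Pair objects.
class GrowDemo {
    // Returns a new array with twice the capacity of 'old'.
    // Existing elements keep their indices; the new slots are null.
    // Assumes old.length >= 1 (doubling a zero-length array yields length 0).
    static String[] doubled(String[] old) {
        return Arrays.copyOf(old, old.length * 2);
    }

    public static void main(String[] args) {
        String[] grown = doubled(new String[] {"a", "b"});
        System.out.println(grown.length); // prints 4
        System.out.println(grown[1]);     // prints b
        System.out.println(grown[2]);     // prints null
    }
}
```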

There is one general optimization we will have you implement. Because the key-value pairs in the dictionary are inherently unordered, we can use this to our benefit in the remove method. Instead of shifting over all the elements as you would normally need to do to remove from an array, you should overwrite the pair at the index of the element being removed with the last pair currently in the ArrayDictionary. Here is an example of what your internal representation may look like before, during, and after a single call to dict.remove("a").

ArrayDictionary internal state during remove
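The remove optimization described above can be sketched on a plain String array. This is an illustration under stated assumptions, not the required implementation: the names SwapRemoveDemo and removeAt are hypothetical, the caller is assumed to have already found the index of the element, and size is assumed to be at least 1.

```java
// Hypothetical helper, shown on a String[] instead of an array of Pair objects.
class SwapRemoveDemo {
    // Removes the element at 'index' from the first 'size' slots of 'items'
    // by overwriting it with the last element, then returns the new size.
    // Assumes 0 <= index < size. No elements are shifted.
    static int removeAt(String[] items, int size, int index) {
        items[index] = items[size - 1]; // move the last element into the gap
        items[size - 1] = null;         // clear the slot that is no longer used
        return size - 1;
    }

    public static void main(String[] args) {
        String[] items = {"a", "b", "c", "d", null};
        int size = removeAt(items, 4, 0);
        System.out.println(size);     // prints 3
        System.out.println(items[0]); // prints d
    }
}
```

The order of the two assignments matters when the removed element is itself the last one: the slot is first overwritten with itself and then cleared, so the same code handles that case.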

This seems inefficient...?

You may have noticed that the implementation we've described above does not feel very efficient – it would take \(\mathcal{O}(n)\) time to look up a key/value pair.

This is true! We need dictionaries to do interesting things but also have not covered how to implement truly efficient dictionaries yet. We've compromised by having you implement a basic one for now.

You'll implement more efficient dictionaries later this quarter, as a part of your second partner programming project.

Part 1d: Complete group writeup

Task: Complete a writeup containing answers to the following questions.

Your writeup will be graded partially on whether or not you produced a reasonable answer, and partially based on the clarity of your explanations and analysis. You and your partner will work together on this writeup.

Note: your completed writeup MUST be in PDF format. You will not submit this writeup via Canvas—instead, submit it to Gradescope here. Log into Gradescope with your @uw.edu email address. When you submit, make sure to mark which pages correspond to the questions we ask, and after submission the partner who submitted the assignment must add the other partner. That is, please only turn in one submission per group and make sure both group members are assigned to that submission. A video showing how to do this can be found here.

  1. You will run a variety of experiments and report back on the resulting behavior. Since this is the first assignment, we have written the experiments for you so all you need to do is run them. You will be asked to write your own experiments in future projects, so be sure to take some time to read the provided code and understand what's going on.

    You can find the experiments in the analysis.experiments package. For each of the four experiments, answer the following:

    1. Briefly, in one or two sentences, summarize what this experiment is testing. (Note: some lines include comments about what they do, but not all. Treat this as an exercise in reverse-engineering unknown code.) When reading these experiments, you can generally overlook "main" (which contains a lot of Java-specific shorthand) and pay closer attention to the "test" methods that you'll find below. A comment will describe the calls that main is making.

    2. Predict what you think the outcome of running the experiment will be, in a few sentences.

      There is no right or wrong answer here, as long as you thoughtfully explain the reasoning behind your hypothesis.

    3. Run the experiment code. Each Experiment*.java file should generate a CSV file in the experimentdata folder (it is possible that the 4th experiment will produce a warning; this can safely be ignored, as long as the CSV file is generated as described). We recommend using plot.ly to generate a plot from your CSV, but if you prefer you can also use Microsoft Excel, Google Sheets, or any other analysis tool of your choice. Generate a plot of the data, and include this plot as an image inside your writeup. If the CSV file contains multiple result columns, plot them all in the same chart.

      If you're using plot.ly, here are some basic instructions:

      1. Click the 'Import' button and load your csv for the experiment data you want to graph.
      2. Click on 'Trace' for each time you want to add a new Line to your graph.
      3. Select whatever type of graph you want (line, probably?), and also select what data should be the X and Y axes.
      4. Repeat adding 'Trace's for each test result.
      5. Screenshot the graph when you're done.

      You will be graded based on whether you produce a reasonable plot. In particular, be sure to give your plot a title, label your axes (with units), include a legend if appropriate, and relabel the columns to something more descriptive than "TestResult".

    4. How do your results compare with your hypothesis? Why do you think the results are what they were?

  2. You should have seen some interesting patterns in the plots you made! In this part of the writeup, we want to hear about why you think these things happened, through the lens of what we're doing in lecture.

    The answers to these questions don't need to be very long; a couple sentences for each is enough.

    1. In your experiment 1 results, you should see that removing from the front is generally faster. Why is this the case? Feel free to explain in terms of the algorithm that is happening, the Big Oh notation, or anything else that is relevant.
    2. In experiment 2, you should see that get() is much slower than the iterator() to traverse your DoubleLinkedList—why is that? Similarly, why are the runtimes for the iterator() and the for-each loop so similar?
    3. In experiment 3, you should see that get() works very quickly for two areas of your DoubleLinkedList: the front and back. It should get slower as the indices you get are closer to the middle. Why is that? State the complexity class (choose from: constant, linear, quadratic, logarithmic, exponential, etc.) of the runtime of getting from the front, getting from the back, and getting from the middle to support your answer.
    4. In experiment 4, you should notice that the memory usage for DoubleLinkedList increases by a constant amount as the input list size increases. This is in contrast to ArrayDictionary, which increases in memory usage by about the same amount on average, but sometimes has spikes in memory usage. What is the cause of the spikes in the data for ArrayDictionary, and why doesn't DoubleLinkedList spike in memory usage in the same way?

What is a CSV file?

A CSV file is a kind of text file formatted to contain tabular data. Each row of data goes on a separate line, and the cells in a row are separated by commas. (CSV stands for "comma-separated values".)

Since CSV files are just regular text files, you can open and view them using any text editor such as Eclipse. This makes CSV files a fairly popular way of exchanging data—they're easy to generate programmatically, yet are understood by a wide variety of tools (such as Excel or Google Sheets).
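To show how simple the format is, here is a minimal sketch that splits CSV text into rows and cells. The names CsvDemo and parse and the column name InputSize are hypothetical, and the sample values are made up for illustration, not taken from any experiment. The sketch assumes no quoted cells or commas inside cells, which real CSV files may contain.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not part of the assignment's starter code.
class CsvDemo {
    // Splits CSV text into rows of cells: one row per line, cells split on commas.
    // Assumes no quoted cells and no commas inside a cell.
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        for (String line : text.split("\n")) {
            rows.add(line.split(","));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data: a header row followed by two data rows.
        List<String[]> rows = parse("InputSize,TestResult\n100,12\n200,25");
        System.out.println(rows.size());    // prints 3
        System.out.println(rows.get(1)[0]); // prints 100
        System.out.println(rows.get(2)[1]); // prints 25
    }
}
```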

Deliverables

The following deliverables are due on Wednesday, April 17 at 11:59pm.

Submit by pushing your code to GitLab. If you intend to submit late, fill out this late submission form when you submit.

Before submitting, be sure to double-check that:

  • You are submitting your completed TestDoubleLinkedListDelete, DoubleLinkedList, and ArrayDictionary classes, which all behave as expected.
  • You ran checkstyle and fixed all issues it reported with the files you edited.
  • You and your partner have submitted your group writeup on Gradescope, and are both assigned to the same submission.