Due Thursday, July 18 at 11:59pm.
In this half of the assignment, you will run some code that uses the data structures you implemented in the previous part, and then analyze the resulting behavior. This will generally involve the following steps: reading the experiment code, writing down your predictions, running the experiment, and analyzing the resulting plot.
The hypothesis/predict-based-on-the-code portions of your write-up will be graded based on completion, so just write down your honest thoughts before running the experiment. Please submit all plots generated along with your answers. The post-experiment analysis portions will be graded based on the clarity of your explanations and whether they match up with your plot.
You and your partner will work together on this write-up. Your completed write-up MUST be in PDF format. Submit your write-up to Gradescope here as a group. Make sure to log in to Gradescope with your @uw.edu email address. When you submit, mark which pages correspond to the questions we ask, and afterwards, the partner who uploaded the PDF must add the other partner as a group member on Gradescope. Do not have each member submit individually. A video showing how to do this can be found here. We may deduct points if you do not correctly submit as a group or do not properly assign your pages/problems.
You won't be writing any new code in this part of the project; instead, you'll be running the experiment files in src/main/java/analysis/experiments/.
Additionally, here are a few more files that you might want to review while completing the assignment (note that this is just a starting point, not necessarily an exhaustive list):
src/main/java/datastructures/lists/DoubleLinkedList.java
src/main/java/datastructures/dictionaries/ArrayDictionary.java
The results of these experiments depend on your code from the previous part, so it's important to fix any bugs in that code before doing the final runs of the experiments. You will receive feedback on your part 1 code soon after its final due date (usually over the following weekend), and you should make any required fixes based on that feedback. In the meantime, however, you may still run the experiments: if your data structures passed all the provided tests, you can assume that their behavior is reasonably close to the expected behavior.
When you run most of the experiment files, a new window should open that plots the results of the experiment as they run. (Only experiment 8 is different; it will finish running all code first before producing the plot). For many of the experiments, the x-axis represents the size of the data structure (see the experiment notes for more details), and the y-axis represents time taken to run the code.
The window should look something like this:
To save these files, right-click anywhere on the plot and an options menu should appear. Select 'Save as' > 'PNG...' to save the file.
Note: we've set PNG files to be ignored in your git repos to reduce the storage needed when we download all your repositories for grading. If you want to share these images, you will have to use other means.
There are a couple more useful options for the plots, such as renaming and hiding results. To access the menus, right click and select 'Series Properties...' to be brought to the menu shown below. Here you can hide results that grow too quickly, allowing you to view the other results better. You can also rename their legend entries here.
Feel free to also explore the other menu options. If you wish to rename the plot titles or axes labels, you may do so from the 'Chart Properties' window.
Note: to zoom in on a specific region of your plot, click and drag right and down to select the region. If you want to reset the zoom, drag any direction that is not a combination of right and down. You can also zoom through the right-click menu.
For each of the experiments, answer the bolded questions (and questions in the orange boxes) in your write-up. Also include the PNGs of the plots inside your write-up PDF.
Note: As a reminder, you may ignore Checkstyle errors in this code, since this isn't code that you'll be submitting. If the errors bother you though, we've updated our Checkstyle configuration to remove the Checkstyle errors in the experiment files for this assignment, but you may need to re-download the configuration file if you have it saved locally.
In this first experiment, we'll check the actual runtimes of getting indices out of a DoubleLinkedList, compared to using the iterator. First, open up the code for the experiment and read through the three test methods that the experiment will be calling. Based on the code, predict the complexity class (constant, logarithmic, linear, \(n\log(n)\), quadratic, exponential) of the runtime of each test method as listSize increases, and record it in your write-up.
Now, run the experiment file. The main method in each experiment will open a plot window and plot the output of each test method when called with each of the input values from the listSizes array.
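To make the contrast concrete, here is a hedged sketch of the two access patterns being compared. It uses java.util.LinkedList as a stand-in for your DoubleLinkedList, and the list size here is illustrative; this is not the experiment's actual code:

```java
import java.util.LinkedList;
import java.util.List;

public class GetVsIterator {
    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 20_000; i++) {
            list.add(i);
        }

        // Pattern 1: index-based access. Each get(i) must walk node-by-node
        // from one end of the linked list, so each call is linear on its own.
        long start = System.nanoTime();
        long sum1 = 0;
        for (int i = 0; i < list.size(); i++) {
            sum1 += list.get(i);
        }
        long indexTime = System.nanoTime() - start;

        // Pattern 2: iterator-based access. The iterator remembers its
        // position, so each step is constant time.
        start = System.nanoTime();
        long sum2 = 0;
        for (int x : list) {
            sum2 += x;
        }
        long iterTime = System.nanoTime() - start;

        System.out.println(sum1 == sum2);         // both loops visit every element
        System.out.println(indexTime > iterTime); // index loop is far slower
    }
}
```

Both loops compute the same sum; the difference the experiment measures is purely in how the elements are reached.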
get
This experiment will check the runtime of getting various indices of your doubly-linked list. Again, read through
the code for each test method of the experiment and predict the complexity class (constant, logarithmic, linear, \(n\log(n)\), quadratic, exponential) of the runtime of
each as listSize
increases.
remove order
This experiment plots the runtime of removing all key-value pairs from your array dictionary, but with two different orderings. Read through each test method, and describe the order in which each one removes elements from the dictionary, in terms of the locations of the removed elements in the dictionary's array. Then, predict the complexity class (constant, logarithmic, linear, \(n\log(n)\), quadratic, exponential) of the runtime of each as listSize increases.
Note that we deliberately avoid referring to "cases" in this section, since talking about cases is weird/confusing when the dictionary size is changing. The underlying intuition is very similar, however: for each list size, one of the two removal orderings ends up significantly slower than the other, to the point that we can see it has worse asymptotic growth.
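As a rough illustration of why removal order can matter for an array-backed structure, here is a sketch using java.util.ArrayList as a stand-in. Your ArrayDictionary may handle removals differently (e.g., by swapping instead of shifting), so treat this only as a demonstration of the shifting pattern, not as the experiment's actual code:

```java
import java.util.ArrayList;
import java.util.List;

public class RemoveOrder {
    public static void main(String[] args) {
        int n = 20_000;

        // Removal from the back: nothing after the removed slot, so no
        // elements need to shift; each removal is constant time.
        List<Integer> a = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            a.add(i);
        }
        long start = System.nanoTime();
        while (!a.isEmpty()) {
            a.remove(a.size() - 1);
        }
        long backTime = System.nanoTime() - start;

        // Removal from the front: every remaining element shifts left by
        // one slot on each removal, so each removal is linear.
        List<Integer> b = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            b.add(i);
        }
        start = System.nanoTime();
        while (!b.isEmpty()) {
            b.remove(0);
        }
        long frontTime = System.nanoTime() - start;

        System.out.println(frontTime > backTime); // front removals dominate
    }
}
```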
put order
This experiment plots the runtime of adding values to a BST dictionary (sorted based on the keys) with three different key orderings. Read through each test method, and describe the order in which each one adds keys to the dictionary. Then, predict the complexity class (constant, logarithmic, linear, \(n\log(n)\), quadratic, exponential) of the runtime of each as dictionarySize increases.
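To see why insertion order matters for an unbalanced BST, here is a self-contained sketch that counts how many comparisons each insertion performs. This toy BST, the sizes, and the step counting are illustrative assumptions, not the project's BST dictionary:

```java
import java.util.Random;

public class PutOrder {
    static final class Node {
        int key;
        Node left, right;
        Node(int k) { key = k; }
    }

    // Insert into an unbalanced BST, returning the number of nodes visited.
    static long insert(Node root, int key) {
        long steps = 0;
        Node cur = root;
        while (true) {
            steps++;
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); return steps; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(key); return steps; }
                cur = cur.right;
            }
        }
    }

    public static void main(String[] args) {
        int n = 5_000;

        // Sorted keys: every new key is the largest so far, so each insert
        // walks the entire right spine; the tree degenerates into a chain.
        Node sortedRoot = new Node(0);
        long sortedSteps = 0;
        for (int k = 1; k < n; k++) {
            sortedSteps += insert(sortedRoot, k);
        }

        // Random keys: the tree stays (probabilistically) balanced, so each
        // insert only walks a logarithmic-depth path on average.
        Random rng = new Random(42);
        Node randomRoot = new Node(rng.nextInt());
        long randomSteps = 0;
        for (int k = 1; k < n; k++) {
            randomSteps += insert(randomRoot, rng.nextInt());
        }

        System.out.println(sortedSteps > 10 * randomSteps); // sorted order is far costlier
    }
}
```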
In this experiment, we'll plot the memory usage of DoubleLinkedList
and ArrayDictionary
objects. Comparing these two isn't very useful in practice, since we would almost never need to compare an
implementation of a list to an implementation of a dictionary, but we may as well plot them both at the same
time for convenience.
Both test methods simply construct a DoubleLinkedList
or ArrayDictionary
of the input
size, respectively; predict the complexity class (constant, logarithmic, linear, \(n\log(n)\), quadratic, exponential) of the memory usage of each as the size
increases.
Recall that the master theorem allows us to easily calculate the big-\(\Theta\) runtime of recursive code. Using the master theorem, predict the big-\(\Theta\) runtimes for both tests. Next, compare your predicted big-\(\Theta\) runtimes with the actual runtime plot of the experiment. Do these results reflect what you predicted, and why or why not?
Now that you've had some master theorem practice, let's look at how you would apply it to a real example: a method to reverse lists. We have two methods, reverse1 and reverse2; both do the same thing but work slightly differently.
Give a recurrence formula for the runtime of each of the reverse methods. Do not worry about finding the exact constants for the non-recursive term. (For example, if the running time is \(A(n) = 4A(n/3) + 25n\), you need to get the \(4\) and the \(3\) right but you don't have to worry about getting the \(25\) right.) Then, use the master theorem to predict the big-\(\Theta\) runtimes. Again, compare your predicted big-\(\Theta\) runtimes to the actual runtimes. Do the results match what you expected, and why or why not?
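As a quick refresher on the mechanics, here is the sample recurrence from above worked through (this is only the illustrative example, not either reverse method): for

\[ A(n) = 4A(n/3) + 25n, \]

we have \(a = 4\), \(b = 3\), and \(f(n) \in \Theta(n^1)\). Since \(\log_b a = \log_3 4 \approx 1.26 > 1\), the recursive work dominates the non-recursive term, so the master theorem gives

\[ A(n) \in \Theta\!\left(n^{\log_3 4}\right). \]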
There are no additional questions for this part.
The experiments in this section are a little different from the others—the test functions only run a single time each and populate an array rather than outputting a single value.
The goal of the experiment is to demonstrate that the memory usage of code that runs multiple times (in this case, code in a for loop) changes depending on how the memory is used or saved.
Note: some of these experiments will use a nontrivial amount of your computer's memory (~500MB). If the code takes more than a few minutes to run or your computer runs out of memory, you should be able to produce a sufficient plot by turning down the max size a bit.
In this half of the experiment, we'll first compare the memory usage of two different functions that create arrays in a loop. Read through the code for each and predict the complexity class (constant, logarithmic, linear, \(n\log(n)\), quadratic, exponential) of the memory usage for each as maxSize increases.
(Note that the freq
parameter is used solely for determining when to save values to plot, and has no
relation to
the memory usage.)
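As a hedged illustration of the general pattern (these two functions are hypothetical, not the experiment's actual test functions): whether each iteration's allocation stays reachable determines how the total memory grows.

```java
public class LoopMemory {
    // Every array allocated in the loop stays reachable through the saved
    // array, so total reachable memory grows with maxSize.
    static Object[][] keepAll(int maxSize) {
        Object[][] saved = new Object[maxSize][];
        for (int i = 0; i < maxSize; i++) {
            saved[i] = new Object[1000];
        }
        return saved;
    }

    // The variable is overwritten each iteration, so each previous array
    // becomes unreachable and eligible for garbage collection.
    static Object[] keepLast(int maxSize) {
        Object[] last = null;
        for (int i = 0; i < maxSize; i++) {
            last = new Object[1000];
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println(keepAll(100).length);  // 100 arrays retained
        System.out.println(keepLast(100).length); // only the final 1000-slot array
    }
}
```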
So, what's the actual asymptotic growth of the memory usage of that second test, then? In this half of the experiment, we'll play around a bit more with Java and its garbage collection to find out. We won't ask you to do any prediction this time around, since Java's garbage collector is not a main learning objective of this course, nor is it something we expect you to know about.
It may still help to read over the test functions, though; note that the first test function for this half is
actually just the second test function from the previous half; then, tests 2 and 3 are similar functions, but
with an extra System.gc
call included at different points. You may have noticed this in the code
comments from the previous half as well, but this method makes a request to Java to perform garbage collection
(although Java has no obligation to comply). The inclusion of this method call should bring the memory usage
much closer to our expected behavior.
Consider the following method, and predict the complexity class (constant, logarithmic, linear, \(n\log(n)\), quadratic, exponential) of its memory usage as its input increases:
public static Object[] test3(int n) {
    Object[] output = new Object[10];
    Object[] arr = output;
    for (int i = 0; i < n; i++) {
        arr[0] = new Object[10];  // link a new array onto the end of the chain
        arr = (Object[]) arr[0];  // advance to the newly created array
    }
    return output;                // output still references the head of the chain
}
Now repeat for this method:
public static Object[] test4(int n) {
    Object[] output = new Object[10];
    for (int i = 0; i < n; i++) {
        output[0] = new Object[10];     // link a new array
        output = (Object[]) output[0];  // reassign output itself to the new array
    }
    return output;                      // returns only the last array in the chain
}
After finishing the project, take a couple of minutes to complete this individual feedback survey on Canvas.