Project 3, part 1: Heaps, sorting, and testing

Due

In this part of the assignment, you will implement data structures and sorting algorithms you will be using on the second and third parts, and will practice writing more jUnit tests.

Part 1a: Before you start

Before you start, you should...

  1. Import the project 3 project into Eclipse. Download the project files and import it using the same process you did for projects 1 and 2.

  2. Copy your DoubleLinkedList.java, ArrayDictionary.java, ChainingHashMap.java, and ChainingHashSet.java files from the previous project into this one as well as all new tests you wrote. The files should be copied to the same packages.

  3. Make sure you are familiar with the contents of this new project.

    Note: the one new package we've added is the search package, which contains several of the pieces we'll need to finish our search engine. You will be asked to modify code in only the search.analyzers package. You will need to use (but do not need to modify) the classes in the search.models package.

    As before, do NOT modify any files you were not told to modify.

Part 1b: Implement tests for ArrayHeap and Searcher.topKSort(...)

Task: add tests for ArrayHeap and Searcher.topKSort(...) to TestArrayHeapFunctionality, TestTopKSortFunctionality, and TestSortingStress.

In parts 1c and 1d, you will be asked to implement an ArrayHeap and the top-\(k\) sorting algorithm.

After you finish reading the specifications for these classes below, start by adding tests for these three classes. If you look through the source code of these three test classes, you'll notice that they're almost completely empty! It's now your responsibility to write comprehensive tests for these two classes.

We will grade the quality of your tests by running them against many different buggy versions of ArrayHeap and Searcher.topKSort(...). In order to help us do so efficiently, we ask that you...

  1. Keep all tests in TestArrayHeapFunctionality and TestTopKSortFunctionality short. Every test you add to these two files should have a timeout of a second or less. (You can add as many tests as you want to these two files, however).

    As a general rule of thumb, any tests in these two files should be manipulating heaps or lists that at most contain several hundred items.

  2. Add any expensive stress tests (tests that manipulate a large amount of data or take a long time to complete) to the TestSortingStress file. This file can contain stress tests for both ArrayHeap and Searcher.topKSort(...).

    As a general rule of thumb, any tests in this file should be manipulating heaps or lists containing several hundred thousand items.

Notes:

  1. As a reminder, you may use classes from java.util.* when writing unit tests. You may find Collections.sort(...) to be particularly helpful: you can use it to help you automatically determine if your output is in the correct order. You may need to write some extra logic to convert between an IList and java.util.List, however.

    As always, you may not use anything from java.util.* for anything outside of your tests, unless explicitly told otherwise.

  2. Be sure to test your classes both on very large amounts of data and very small amounts of data.

  3. Be sure to test to make sure you throw the correct exceptions if the user uses your classes incorrectly!

Part 1c: Implement ArrayHeap

Task: complete the ArrayHeap class.

Notes:

  1. In lecture, we covered how to implement and work with 2-heaps: a heap where each node has up to 2 children.

    For this assignment, you must implement a 4-heap: a heap where each node has up to four children. You will need to modify the equations you were given in lecture in order to successfully do so.

  2. It is up to you whether you want to have your heaps implement 1-indexing (e.g. whether you want to leave the element located at the 0-th index empty or not).

  3. Make sure you're implementing a min-heap, not a max-heap.

  4. Unlike all the other data structures we've implemented so far, your heap will not attempt to handle null entries. (There's no sensible way to compare a null against a non-null element).

Part 1d: Implement Searcher.topKSort(...)

Task: complete the Searcher.topKSort(...) method.

Your Searcher.topKSort(k, list) method is responsible for returning the top \(k\) elements from a list containing \(n\) comparable elements. You can find the Searcher class inside the misc package.

One naive way of implementing this algorithm would be to implement some sorting algorithm such as quicksort or mergesort, sort the list then return the last \(k\) elements.

However, this would take \(\mathcal{O}(n\log(n))\) time to run! We can do better: your implementation must run in \(\mathcal{O}(n\log(k))\) time.

Notes:

  1. Make sure your topKSort method works with both small and large input lists!

  2. Hint 1: the only extra data structure you will need to use is a single ArrayHeap.

  3. Hint 2: you should need to loop over your input list exactly once.

Deliverables

The following deliverables are due on .

A single person from your partnership should turn in a zip file containing the entire contents of your src folder on Canvas.

Please follow the above instructions EXACTLY. In particular, DO NOT rename your "src" folder before zipping it or modify the folder structure in any way.

Before submitting, be sure to double-check and make sure that...