Link

Heap

Composing and testing a complex data structure from scratch.1

Learning Goals

  • Composing programs at multiple levels of abstraction.

    An important programming skill when designing and developing novel data structures is to learn how to plan appropriately. One important subtlety about planning is that it occurs at multiple levels of abstraction. As we saw with the Implementer’s Design Decision Hierarchy, there are invariants that need to be planned at a high level. However, planning also occurs when we design algorithms to implement lower-level methods.

  • Write timing tests for your programs.

    The past few homeworks have stressed testing as a skill. This assignment continues with that trend by introducing runtime as a requirement that needs to be validated with your own tests. To encourage this, the autograder will not only be limited in the verbosity of feedback but also in the number of opportunities that you have for feedback.

  • Integrate multiple data structures and algorithms to solve a problem with complex requirements.

    Due to the specific runtime requirements of this heap implementation, our solution will need to creatively combine different data structures and algorithms to design a better solution. Source code for heaps is widely available on the internet (which you’re welcome to reference so long as they’re cited as comments), but what’s tricky is learning how to identify, select, and integrate ideas from different sources.

Table of contents

  1. Getting Started
  2. ExtrinsicMinPQ API
  3. NaiveMinPQ
  4. ArrayHeapMinPQ
  5. Testing
  6. Submission

Getting Started

Pull the skeleton repository to get the heap assignment.

Warning
To reinforce the theme of learning to write automated tests for your own code, this homework will only allow 5 total submissions to the autograder. This will be enforced by the velocity limiter. Any further submissions will receive 0 points. You can mark a particular submission for grading by clicking “Submission History” in any of your Gradescope submissions.

ExtrinsicMinPQ API

Your task for this homework assignment will be to develop an array-based implementation of the ExtrinsicMinPQ interface. The ExtrinsicMinPQ interface similar to the MaxPQ interface that we discussed in lecture, except instead of optimizing for access to the maximum priority item, we optimize for access to the minimum priority item.

  • public void add(T item, double priority): Adds an item of type T with the given priority.
  • public T getSmallest(): Returns the item with smallest priority.
  • public T removeSmallest(): Removes and returns the item with smallest priority.
  • public int size(): Returns the number of items.
  • public boolean contains(T item): Returns true if the PQ contains the given item.
  • public void changePriority(T item, double priority): Sets the priority of the given item to the given value.

Furthermore, the priority is extrinsic to the object. That is, rather than relying on some sort of comparison function to decide which item is less than another, we assign a priority value using the add or changePriority functions.

In addition, this priority queue can only contain unique items. There cannot be two copies of the same item, though it’s possible for two different items to be assigned the same priority value. In this case, you can break ties arbitrarily.

NaiveMinPQ

For reference, we have provided NaiveMinPQ, a simple and slow implementation of the ExtrinsicMinPQ interface.

contains, getSmallest, and removeSmallest use built-in methods from ArrayList and Collections to do brute force searches over the entire list of items. This takes time proportional to the length of the list, or O(N).

Note
This implementation does not throw the correct exception for add when adding duplicate items, since that would make the class too slow to be usable for comparing to our ArrayHeapMinPQ.

ArrayHeapMinPQ

Your task is to create a class, ArrayHeapMinPQ, that implements the ExtrinsicMinPQ interface subject to the following rules.

  • One of your instance variables must be a min-heap represented as either an array or ArrayList. It is OK to have additional private variables.
  • You may only import from the java.util, org.junit libraries. You can also use the Stopwatch and StdRandom classes from the Princeton libraries.
  • You may not add additional public methods beyond what is expected by the ExtrinsicMinPQ interface. Private and package-private helper methods are allowed.

In addition, there are several asymptotic runtime requirements.

  • getSmallest, contains, size and changePriority must run in O(log N) time.
  • add and removeSmallest must run in O(log N) time except for the rare resize operation.

For reference, when making 1000 queries on a heap of size 1,000,000, our solution is about 300 times faster than the NaiveMinPQ. Iterating over your entire min-heap for any of these methods will take linear time!

About half of the points for this homework will be allocated towards evaluating the correctness of ArrayHeapMinPQ. The other half will be allocated towards evaluating the efficiency of ArrayHeapMinPQ, but these tests will only provide credit if the implementation is fully correct.

Tips

If you intend to use an array representation for your min-heap, make sure to increase the array size by a multiplicative factor when resizing. Otherwise, add will be linear time on average, not log time. To prevent the autograder from running out of memory, the array should never be more than approximately three-quarters empty.

We have also provided a PrintHeapDemo class that can print out an array in heap format. You’re welcome to adapt this code into your own. It might be useful for debugging. The TimingTestDemo class shows how you can use the System.currentTimeMillis method or the edu.princeton.cs.algs4.Stopwatch class to time your code.

You might find the MinPQ.java class from the textbook to be a helpful reference. We don’t recommend copying and pasting code directly into your implementation. While it might seem like this will save you time and effort, it will end up being more work than just building everything yourself from scratch.

As a reminder, your class may use other data structures.

Testing

Add your tests to a new class, ArrayHeapMinPQTest.

We will not provide any skeleton tests nor autograder messages for this homework. You will be responsible for writing your own tests and ensuring the correctness of your code. While you need to submit your tests to the autograder, they won’t be graded.

You may not share tests with any other students—all code in your possession, including tests, should be your own work. You can discuss different edge cases and share ideas about test cases, but test code falls under the same academic honesty policy as any other homework code.

Tips

Write tests as you implement each ExtrinsicMinPQ method. It is often helpful to write the tests before writing the implementation because writing a test helps with planning for the task you’re trying to solve. This helps ensure that you are building a solid foundation since each method may call previously-implemented methods.

We encourage you to primarily write tests that evaluate correctness based on the outputs of methods that provide output (e.g. getSmallest and removeSmallest). This is in contrast to tests that try to directly test the states of instance variables. For example, to test changePriority, you might use a sequence of add operations, a changePriority call, and then finally check the output of a removeSmallest call. This is usually a better idea than iterating over your private variables to see if changePriority is setting some specific instance variable.

If you really want to write tests that require inspecting the internal behavior of your program, then you’ll need to add them to the ArrayHeapMinPQ class itself. For example, suppose you want to verify that your array is [1, 2, 4, 5, 3] after adding the items 5, 4, 3, 2, 1. In this case, a hypothetical testAdd54321 method would need to be part of the ArrayHeapMinPQ class since the ArrayHeapMinPQTest can’t access private variables in the ArrayHeapMinPQ class. We don’t recommend doing this.

Don’t forget edge cases! Consider how the heap could look before inserting or removing items, and ensure that your implementation handles all possible cases. For example, consider what happens when sinking a node when its priority is greater than both of its children.

After you have confidence that your implementation passes basic tests, one possibility for comprehensive testing is to perform randomized testing and comparing the outputs of your implementation against the provided NaiveMinPQ. By randomly choosing to add, remove, and get items, we can simulate real-world workloads and check to see that our complex-but-fast ArrayHeapMinPQ matches the results of a simple-but-slow NaiveMinPQ. Make sure that your priority values are unique: duplicate priority values can be returned in any arbitrary order, and the order returned by the NaiveMinPQ might not match the order returned by your ArrayHeapMinPQ.

Submission

Commit and push your changes to GitLab before submitting your homework to Gradescope.

  1. Josh Hug. 2019. Project 2AB: Extrinsic PQ and KDTree. In CS 61B: Data Structures, Spring 2019. https://sp19.datastructur.es/materials/proj/proj2ab/proj2ab