Link Search Menu Expand Document

ArrayIntList

Table of contents

  1. Letter Inventory
  2. Collections
  3. Class invariants
  4. Pre/post conditions
  5. Exceptions

Letter Inventory

Natural language processing (NLP) is an interdisciplinary area of computer science focused on programming computers to work with natural languages used by humans everyday. Since all data in computers are ultimately represented as numbers, a key challenge in NLP is vectorization: the process of turning a natural language text into a meaningful numeric representation.

A letter inventory is a data type that represents a character-by-character count vectorization for letters in the English alphabet. Each letter inventory keeps track of how many a’s, how many b’s, etc. are contained in a given string. For example, the letter inventory for the string “Washington State” corresponds to [aaeghinnosstttw]. The case of the letter is ignored, so ‘s’ and ‘S’ are exactly the same.

Letter Inventory

Oct 5
ArrayIntList
BJP 10.1, 15.1, 15.2
  1. Define classes with an array (storing primitive data types or strings) as a field.
  2. Define methods that reassign field values while maintaining class invariants.
  3. Document pre/post conditions and throw exceptions to report problems to clients.
Oct 6
SectionArrayIntList
Oct 7
Stacks and Queues
BJP 14
  1. Define static methods that accept multiple objects as parameters.
  2. Loop over each item in a queue using the add, remove, and size methods.
  3. Loop over each item in a stack, storing popped items in another queue or stack.
Oct 8
SectionStacks and Queues
Oct 9
Algorithm Analysis
BJP 13.2
  1. Apply the runtime analysis process to formally describe an algorithm’s runtime.
  2. Describe an example where a queue or stack would be preferred over a list.
  3. Explain why ArrayList does not implement the Queue interface.

We can use letter inventories to autocorrect text as the user types!

Collections

Letter inventories are a specific type of collection. A collection is an object that stores elements (data). Almost every problem that we’ll solve in this course involves defining algorithms that use collections to store and manipulate data.

To help us learn how to design a letter inventory, we’ll study how Java implements a popular collection type called ArrayList by building a simplified ArrayIntList class. An ArrayIntList can be used to store a list (ordered sequence) of integer data.

// An ArrayIntList stores a list of integers
public class ArrayIntList {
    private int[] elementData;
    private int size;

    // Constructs an empty list
    public ArrayIntList() {
        elementData = new int[10];
        size = 0;
    }

    // Appends the given value to the end of the list
    public void add(int value) {
        elementData[size] = value;
        size += 1;
    }
}

Internally, the ArrayIntList class maintains two fields: an int[] elementData representing the list of integers and an int size representing the current number of elements in the list.

Class invariants

So what’s the difference between an ArrayIntList and a raw int[]? Lists disagree with two key features of arrays.

  1. Array elements can be stored at any index.
  2. Array size is fixed and cannot be changed after construction.

These two features are part of what make arrays so useful as building blocks for programs. But they also inconvenience client programmers in two ways.

  1. Since array elements can be stored at any index, there can be empty indices between elements.
  2. Since array size is fixed, full arrays cannot store any more elements. (A larger array needs to be constructed, and all of the elements need to be copied over.)

Lists help clients store ordered sequences of elements without having to worry about these inconveniences. Our study of ArrayIntList will primarily focus on the first point about the location of array elements. Unlike int[], ArrayIntList enforces a class invariant (a relationship that must always hold) between the values in elementData and the size field.

ArrayIntList class invariant
The i-th item in the list is always stored at elementData[i].
  • Before any public method call, the class invariant must hold for external correctness.
  • In the implementation, the class invariant can be temporarily broken so long as it is later restored.
  • After any public method call, the class invariant must hold for external correctness.

Pre/post conditions

A getter method for ArrayIntList elements can be written in a few lines.

// Returns the integer at the given index in the list
public int get(int index) {
    return elementData[index];
}
Give an example of a problematic input index to the get method.

Calling get(-8) or get(100000) would cause an ArrayOutOfBoundsException.

A more subtle example: any index greater than or equal to the size of the list but still within the bounds of the underlying elementData array. get would return whatever “garbage” value was stored in the vacant array index.

To tell the client about these errors, we’ll introduce a documentation format called pre/post conditions.

Preconditions
Conditions that the client needs to follow in order for the method to behave as intended.
Postconditions
The intended behavior of a method assuming preconditions hold.
// pre : 0 <= index < size()
// post: Returns the integer at the given index in the list
public int get(int index) {
    return elementData[index];
}

While these comments are helpful for the client, it would be better to add validation checks in the code to prevent unintended behavior from occurring when the preconditions are not met. Here are two approaches.

Returning a dummy value
We can return a specific dummy value to notify the client. For example, some implementations of indexOf (returns the index of a given value in a list) return -1 if the value is not contained in the list. Since the client knows that the only valid indices that should be returned from indexOf are in the range 0 <= index < size(), -1 signifies a false result. If implemented this way, this behavior should be clearly documented!
Why might it be bad for get to return -1 when given an invalid index?

The client might have actually added the value -1 as an element to the list earlier, so they wouldn’t know if the -1 returned from get represented the dummy value or the value that they added to the list.

Throwing an exception
The implementer can report a problem to the client by throwing an exception. Most of the time, we prefer this approach.

Exceptions

We can check preconditions by adding an if statement. In order to stop the program within the if statement, throw an exception.

// pre : 0 <= index < size() (throws IllegalArgumentException if not)
// post: Returns the integer at the given index in the list
public int get(int index) {
    if (index < 0 || index >= size) {
        throw new IllegalArgumentException();
    }
    return elementData[index];
}

In this example, we chose to throw new IllegalArgumentException(). There are many different types of exceptions, each of which informs the client of a different problem. Don’t worry about which one to use: for the purpose of this course, we’ll walk you through the exceptions that need to be thrown when they become relevant.

Note that we also documented this exception by adding it to our method comment.