ArrayIntList
Table of contents
Letter Inventory
Natural language processing (NLP) is an interdisciplinary area of computer science focused on programming computers to work with natural languages used by humans everyday. Since all data in computers are ultimately represented as numbers, a key challenge in NLP is vectorization: the process of turning a natural language text into a meaningful numeric representation.
A letter inventory is a data type that represents a character-by-character count vectorization for letters in the English alphabet. Each letter inventory keeps track of how many a’s, how many b’s, etc. are contained in a given string. For example, the letter inventory for the string “Washington State” corresponds to [aaeghinnosstttw]. The case of the letter is ignored, so ‘s’ and ‘S’ are exactly the same.
Letter Inventory
- Oct 5
- ArrayIntList
- BJP 10.1, 15.1, 15.2
- Define classes with an array (storing primitive data types or strings) as a field.
- Define methods that reassign field values while maintaining class invariants.
- Document pre/post conditions and throw exceptions to report problems to clients.
- Oct 6
- SectionArrayIntList
- Oct 7
- Stacks and Queues
- BJP 14
- Define static methods that accept multiple objects as parameters.
- Loop over each item in a queue using the
add
,remove
, andsize
methods. - Loop over each item in a stack, storing popped items in another queue or stack.
- Oct 8
- SectionStacks and Queues
- Oct 9
- Algorithm Analysis
- BJP 13.2
- Apply the runtime analysis process to formally describe an algorithm’s runtime.
- Describe an example where a queue or stack would be preferred over a list.
- Explain why
ArrayList
does not implement theQueue
interface.
We can use letter inventories to autocorrect text as the user types!
Collections
Letter inventories are a specific type of collection. A collection is an object that stores elements (data). Almost every problem that we’ll solve in this course involves defining algorithms that use collections to store and manipulate data.
To help us learn how to design a letter inventory, we’ll study how Java implements a popular collection type called ArrayList
by building a simplified ArrayIntList
class. An ArrayIntList
can be used to store a list (ordered sequence) of integer data.
// An ArrayIntList stores a list of integers
public class ArrayIntList {
private int[] elementData;
private int size;
// Constructs an empty list
public ArrayIntList() {
elementData = new int[10];
size = 0;
}
// Appends the given value to the end of the list
public void add(int value) {
elementData[size] = value;
size += 1;
}
}
Internally, the ArrayIntList
class maintains two fields: an int[] elementData
representing the list of integers and an int size
representing the current number of elements in the list.
Class invariants
So what’s the difference between an ArrayIntList
and a raw int[]
? Lists disagree with two key features of arrays.
- Array elements can be stored at any index.
- Array size is fixed and cannot be changed after construction.
These two features are part of what make arrays so useful as building blocks for programs. But they also inconvenience client programmers in two ways.
- Since array elements can be stored at any index, there can be empty indices between elements.
- Since array size is fixed, full arrays cannot store any more elements. (A larger array needs to be constructed, and all of the elements need to be copied over.)
Lists help clients store ordered sequences of elements without having to worry about these inconveniences. Our study of ArrayIntList
will primarily focus on the first point about the location of array elements. Unlike int[]
, ArrayIntList
enforces a class invariant (a relationship that must always hold) between the values in elementData
and the size
field.
- ArrayIntList class invariant
- The
i
-th item in the list is always stored atelementData[i]
.
- Before any public method call, the class invariant must hold for external correctness.
- In the implementation, the class invariant can be temporarily broken so long as it is later restored.
- After any public method call, the class invariant must hold for external correctness.
Pre/post conditions
A getter method for ArrayIntList
elements can be written in a few lines.
// Returns the integer at the given index in the list
public int get(int index) {
return elementData[index];
}
Give an example of a problematic input index to the get method.
Calling get(-8)
or get(100000)
would cause an ArrayOutOfBoundsException
.
A more subtle example: any index greater than or equal to the size of the list but still within the bounds of the underlying elementData
array. get
would return whatever “garbage” value was stored in the vacant array index.
To tell the client about these errors, we’ll introduce a documentation format called pre/post conditions.
- Preconditions
- Conditions that the client needs to follow in order for the method to behave as intended.
- Postconditions
- The intended behavior of a method assuming preconditions hold.
// pre : 0 <= index < size()
// post: Returns the integer at the given index in the list
public int get(int index) {
return elementData[index];
}
While these comments are helpful for the client, it would be better to add validation checks in the code to prevent unintended behavior from occurring when the preconditions are not met. Here are two approaches.
- Returning a dummy value
- We can return a specific dummy value to notify the client. For example, some implementations of
indexOf
(returns the index of a given value in a list) return -1 if the value is not contained in the list. Since the client knows that the only valid indices that should be returned fromindexOf
are in the range0 <= index < size()
, -1 signifies a false result. If implemented this way, this behavior should be clearly documented!
Why might it be bad for get to return -1 when given an invalid index?
The client might have actually added the value -1 as an element to the list earlier, so they wouldn’t know if the -1 returned from get
represented the dummy value or the value that they added to the list.
- Throwing an exception
- The implementer can report a problem to the client by throwing an exception. Most of the time, we prefer this approach.
Exceptions
We can check preconditions by adding an if
statement. In order to stop the program within the if
statement, throw an exception.
// pre : 0 <= index < size() (throws IllegalArgumentException if not)
// post: Returns the integer at the given index in the list
public int get(int index) {
if (index < 0 || index >= size) {
throw new IllegalArgumentException();
}
return elementData[index];
}
In this example, we chose to throw new IllegalArgumentException()
. There are many different types of exceptions, each of which informs the client of a different problem. Don’t worry about which one to use: for the purpose of this course, we’ll walk you through the exceptions that need to be thrown when they become relevant.
Note that we also documented this exception by adding it to our method comment.