Chapter 7
Arrays

Copyright © 2005 by Stuart Reges

7.1 Introduction

The sequential nature of files severely limits the number of interesting things you can easily do with them. The algorithms you have examined so far have all been sequential algorithms: algorithms that can be performed by examining each data item once in sequence. There is an entirely different class of algorithms that can be performed when you have random access of the data, when you can access the data items as many times as you like and in whatever order you like.

This chapter examines a new object that provides this random access--the array. The discussion of array applications begins with some sequential algorithms that are enhanced by arrays.

7.2 Arrays

Arrays, like files, are structured objects that have individual components. The array structure, however, allows random access. You can access any component at any time, no matter what you have accessed already.

An array is a collection of data values that have the same type. These different components are indexed by integers. The indexes differentiate the different components. This is similar to the way post office boxes are set up. Each post office has a series of boxes that all store the same thing: mail. The boxes are indexed with numbers so that you can refer to an individual box by using a description like "post office box 884."

You might, for example, have a series of temperature readings that you want to store. You could keep them in a series of different variables:

        double temperature1;
        double temperature2;
        double temperature3;
But you can simplify this scheme by using an array:

        double[] temperature = new double[3];
This declares a variable called temperature of type "double[]" (an array of doubles). Arrays are objects, so they need to be constructed by a call on "new". The right-hand side of this declaration tells Java to construct an array of doubles that is 3 long. They are all initialized to 0.0:

                                    +-------+
                                [0] |  0.0  |
                    +---+           +-------+
        temperature | +-+--->   [1] |  0.0  |
                    +---+           +-------+
                                [2] |  0.0  |
                                    +-------+
As the picture above indicates, the variable temperature is not itself the array. Instead, it stores a reference to the array. Notice that the lowest index of the array is 0. This is a convention that goes back to the C programming language and is referred to as "0 based indexing".

Once declared, we can refer to individual elements of this array using the square bracket notation:

        temperature[0] = 75.2;
        temperature[1] = 68.4;
        temperature[2] = 70.3;
which would modify the array to have the following values:

                                    +-------+
                                [0] |  75.2 |
                    +---+           +-------+
        temperature | +-+--->   [1] |  68.4 |
                    +---+           +-------+
                                [2] |  70.3 |
                                    +-------+
As noted above, the type of this variable is "double[]". In general, you can add empty square brackets after any Java type to form a new type (an array of those type of values). For example, the following are legal types:

        int
        double
        char
        String
Which means that the following are legal types:

        int[]: an array of ints
        double[]: an array of double values
        char[]: an array of characters
        String[]: an array of Strings
In fact, you can apply this idea indefinitely to form what are known as multidimensional arrays:

        int: one int
        int[]: a one-dimensional array of ints
        int[][]: a 2-dimensional grid of ints
        int[][][]: a 3-dimensional collection of ints
        ...
Arrays are always indexed using integers with a starting index of 0, but when constructed, you specify how many elements you would like in this particular array. For example, to have a list of 100 temperatures, we could have instead declared our variable as follows.

        double[] temperature = new double[100];
Java always allocates a contiguous block of memory for each array, which is what allows you to quickly access any element. If you executed the declaration above and put some values into the components of temperature, it would look something like this.

                             +-------+
                        [0]  |  74.3 |
                             +-------+
                        [1]  |  85.2 |
                             +-------+
                        [2]  | 102.3 |
            +---+            +-------+
temperature | +-+--->   [3]  |  99.8 |
            +---+            +-------+
                        [4]  | 101.7 |
                             +-------+
                      [...]  |  ...  |
                             +-------+
                       [99]  |  84.6 |
                             +-------+
Sometimes we draw the array sideways:

                         [0]     [1]     [2]     [3]     [4]    [...]    [99]
            +---+     +-------+-------+-------+-------+-------+-------+-------+
temperature | +-+---> |  74.3 |  85.2 | 102.3 |  99.8 | 101.7 |  ...  |  84.6 |
            +---+     +-------+-------+-------+-------+-------+-------+-------+
The key point is that this represents a contiguous block of memory that can be indexed using a specific array index. For example, temperature[3] refers to the value with index 3, which, because of zero-based indexing, turns out to be the fourth value in the array (in the picture above, the box with the value 99.8 in it).

You are not restricted to simple literal values inside of the brackets. You can use any integer expression. This allows you to combine arrays with loops, which greatly simplifies the code you write. For example, suppose we want to find the average of the values in our array. We could do so by finding the sum and dividing by the length.

        double sum = 0.0;
        sum += temperature[0];
        sum += temperature[1];
        sum += temperature[2];
        sum += temperature[3];
        ...
        sum += temperature[99];
        double average = sum/100;
Obviously this code is very tedious and redundant. The only thing that changes in the various summing statements is the specific index we are accessing. We can use a for loop to capture this redundancy:

        double sum = 0.0;
        for (int i = 0; i < 100; i++)
            sum += temperature[i];
        double average = sum/100;
This is a very concise way to access all elements of the array. The code works when the array has a length of 100, but you can imagine the array having a different length. We can improve this code by referring to the length field of the array. Every array keeps track of its length and we can ask for it by putting ".length" after the name of an array variable (temperature.length in our example). Notice that this is similar to how we ask a String for its length, although there we have to include parentheses as in str.length(). This is one of those unfortunate inconsistencies in Java that you just have to remember.

Using the length field, we can improve the code above as follows.

        double sum = 0.0;
        for (int i = 0; i < temperature.length; i++)
            sum += temperature[i];
        double average = sum/temperature.length;
This code provides a template that you will see often with array processing code: a for loop that starts at 0 and that continues while the loop variable is less than the length of the array, doing something with [i] in the body of the loop.

It is possible to refer to an illegal index of an array, in which case Java throws an exception. For example, if we were to accidentally use "<=" in the for loop above instead of "<", then the loop would attempt to access temperature[100], which is not a legal element of our array. If your code makes such an illegal reference, Java will halt your program with an ArrayIndexOutOfBoundsException.

7.2.1 Program Example--Tally

Suppose you have an input file of integers between 0 and 10 and you want to compute how many of each number are in the input file. To do so, you must first recognize that you are doing eleven separate tallies. You are tallying the occurrences of the number 0, the number 1, the number 2, the number 3, and so on up to and including the number 10. You will more easily solve the overall task if you first solve the task of performing one tally. To tally the occurrences of the number 0, you simply count how often they occur in the input file.

	set count to 0.
	while (more numbers to read) {
	    read a number.
	    if (the number is 0)
		increment total.
        }
	report the value of count.
To solve the task of tallying all eleven numbers, you can use an array with one component for each of the numbers you are tallying. Because the simple tally involves a single int variable, you will want your array to have int components. That way you will have a different int variable for each of the eleven tallying tasks. Here is the declaration you would use for the array.

	int[] count = new int[11];
Here is what count looks like initially:

                   [0]     [1]     [2]     [3]     [4]    [...]    [10]
      +---+     +-------+-------+-------+-------+-------+-------+-------+
count | +-+---> |   0   |   0   |   0   |   0   |   0   |  ...  |   0   |
      +---+     +-------+-------+-------+-------+-------+-------+-------+
You could solve the problem by processing the file once looking for 0's, then on a second pass looking for 1's, and then a third pass looking for 2's, and so on. This is highly inefficient, however. It is more efficient to do all the tallying operations in parallel. Thus, our pseudocode becomes:

	set all counts to 0.
	while (more numbers to read) {
	    read a number.
 	    increment the count for the component corresponding to number.
        }
	report the value of each count.
How do we increment the appropriate component of total depending on the number read in? We use the number read in to index the component we are interested in. Thus, instead of saying

        increment the count for the component corresponding to number.
You can approach Java more by saying

        increment count[number].
One last point you should consider is what happens if number is out of range. What if the file contains a number that is less than 0 or greater than 10? You would obtain an array index out of bounds. There are many ways to account for this problem. One simple solution is to keep track of how many illegal values appear. Thus, the line above is replaced by:

	if (number is in range)
	    increment count[number].
	else
	    increment number of illegal values.
This requires introducing a new variable that tallies the number of illegal values. You would initialize the variable to 0 before the loop.

	set all counts to 0.
        set illegal count to 0.
	while (more numbers to read) {
	    read a number.
	    if (number is in range)
	        increment count[number].
	    else
	        increment number of illegal values.
        }
	report the value of each count.
        report the number of illegal values.
This now can be fairly easily translated into Java. The program introduces a class constant for the maximum value to be tallied.

// This program reads a series of values between 0 and MAX from the user and // reports the frequency of occurrence of each value. import java.io.*; public class Tally { public static final int MAX = 10; // maximum legal value public static void main(String[] args) throws FileNotFoundException { Scanner input = new Scanner(new File("tally.dat")); tally(input); } public static void tally(Scanner input) { int[] count = new int[MAX + 1]; int numBad = 0; while (input.hasNextInt()) { int next = input.nextInt(); if (next >= 0 && next <= MAX) count[next]++; else numBad++; } System.out.println("Value\tOccurrences"); for (int i = 0; i < count.length; i++) System.out.println(i + "\t" + count[i]); System.out.println("Other\t" + numBad); } } Given an input file like the following.

2 8 0 7 4 12 0 2 1 8 7 1 7 4 4 6 79 5 8 5 1 2 7 4 3 2 4 2 7 0 4 5 4 10 2 0 275 93 1 5 4 5 6 8 1 2 6 -999 9 5 7 0 64 3 4 2 3 10 2 4 9 0 5 3 -36 9 1 0 6 2 2 1 6 6 1 3 8 6 The program would produce the following output.

Value Occurrences 0 7 1 8 2 11 3 5 4 10 5 7 6 7 7 6 8 5 9 3 10 2 Other 7

7.3 Variable length data

So far we have discussed an array that is in some sense full: all of its elements were in use. This is not always the case. Often we build an array with a maximum in mind, but only fill it up partially.

Suppose we want to use an array to process an input file of integers where the various input lines have different lengths. Some lines might have many integers on them and some might have very few. Because the lines differ in length, we don't know how big to make the array. Therefore, the best we can do is to set some arbitrary maximum for the length of a line and make the array that large.

Suppose that we assume that no input line has more than 30 integers on it. Then we can define an array of length 30 as follows:

        int[] numbers = new int[30];
To solve this problem we define not only an array for the integers, but also an integer variable to keep track of how many elements of the array are currently being used. We can call this variable "length". If we were to read the following input line:

        18 203 14 9 8 7
We would expect the variables numbers and length to look like this.

                          [0]   [1]   [2]   [3]   [4]   [5]   [6]  [...]  [29]
            +---+       +-----+-----+-----+-----+-----+-----+-----+-----+-----+
    numbers | +-+--->   |  18 | 203 |  14 |  9  |  8  |  7  |  0  | ... |  0  |
            +---+       +-----+-----+-----+-----+-----+-----+-----+-----+-----+

            +---+
    length  | 6 |
            +---+
The integers from the input line appear in the first six cells of this array. Because six cells of the array are in use, we store the value 6 in the memory cell for length. The elements 6 through 29 are 0 in this picture, but they could really have any value at all. By keeping track of the length, we can be sure to use only elements of the array that are currently active.

Let's examine a program that uses this data structure. Consider the problem of reversing the integers on each line of a file. Why does this problem require an extra structure? With a file you can read the numbers from a line in sequential order. You can't, however, read them backwards. An array has no such limitations. To reverse the line you read it into an array and then manipulate the array. You should think of this as a transfer of information. The numbers from the input line are transferred from the file structure to the array structure. Once transferred, you can manipulate the numbers using the array and not worry about the file until it is time to process another line. Thus, your pseudocode description would be:

        while (more lines to process) {
            read next input line.
            transfer line from file to array.
            write array contents in reverse order.
        }
Let's assume that as in the program examples in the last chapter, we will read each line of the file into a String and use that String to construct a Scanner for that particular input line. To transfer numbers from the Scanner for a particular input line into the array, we'd do the following.

        while (more numbers to read)
            read a number into the next array component.
To improve this pseudocode we need to specify what we mean by the next array component. Because we want to store the first number in component [0], the next in component [1], and so on, we need an integer variable to keep track of this. We also need to keep track of the length. Fortunately, we can easily combine these two tasks.

        set length to 0.
        while (more numbers to read) {
            read a number into numbers[length].
            increment length.
        }
Some novices reverse the order of the two lines inside the while loop and that makes a certain amount of sense. It seems like you'd want to increase your length first and then store the value in the array. But remember that arrays use zero-based indexing. That means that length is always one ahead of where you last stored something. In other words:

        store the first value in position [0] and remember a length of 1
        store the second value in position [1] and remember a length of 2
        store the third value in position [2] and remember a length of 3
        store the fourth value in position [3] and remember a length of 4
        ...
As a result, it makes sense to store first and then increment length rather than the other way around.

Writing the contents of the array in reverse order is a simple operation. We run our loop backwards using i-- instead of i++. Below is a complete program that puts all of this together.

// This program takes an input file of numbers and prints the numbers with // each line reversed in order. import java.io.*; public class ReverseLines { public static final int MAX_NUMBERS = 30; // max # of numbers on a line public static void main(String[] args) throws FileNotFoundException { Scanner input = new Scanner(new File("reverse.dat")); processFile(input); } // processes the file line by line, reversing each line public static void processFile(Scanner input) { while (input.hasNextLine()) { Scanner data = new Scanner(input.nextLine()); reverseLine(data); } } // reads the numbers in the given Scanner, writing them in reverse order public static void reverseLine(Scanner data) { int[] numbers = new int[MAX_NUMBERS]; int length = 0; while (data.hasNextInt()) { numbers[length] = data.nextInt(); length++; } for (int i = length - 1; i >= 0; i--) System.out.print(numbers[i] + " "); System.out.println(); } } Given the following input file.

18 203 14 9 8 7 2 3 4 5 6 8 4 12 14 19 23 43 23 33 34 43 23 98 76 36 57 37 28 38 54 58 39 29 39 18 42 19 23 40 702 83 4 2 3 1 0 4 2 The program would produce the following output.

7 8 9 14 203 18 5 4 3 2 4 8 6 39 29 39 58 54 38 28 37 57 36 76 98 23 43 34 33 23 43 23 19 14 12 702 40 23 19 42 18 83 2 4 0 1 3 2 4

7.4 Limitations of arrays

You should be aware of some general limitations of arrays.

There is one distinct advantage of arrays being objects, which is that you can pass them as parameters to another method and that other method will be able to change the contents of the array.

7.5 Initializing arrays

Java has a special syntax for initializing an array when you know exactly what you want to put into it. For example, you could write the following code to initialize an array of ints and an array of Strings.

        int[] numbers = new int[5];
        numbers[0] = 13;
        numbers[1] = 23;
        numbers[2] = 480;
        numbers[3] = -18;
        numbers[4] = 75;

        String[] words = new String[6];
        words[0] = "hello";
        words[1] = "how";
        words[2] = "are";
        words[3] = "you";
        words[4] = "feeling";
        words[5] = "today";
This works, but it's a rather tedious way to declare these arrays. Java provides a shorthand:

        int[] numbers = {13, 23, 480, -18, 75};
        String[] words = {"hello", "how", "are", "you", "feeling", "today"};
You use the curly braces to enclose a series of values that will be stored in the array. Order is important. The first value will go into index 0, the second value will go into index 1 and so on. Java counts how many values you include and constructs an array of just the right size. It then stores the various values into the appropriate spots in the array.

This is one of only two places where Java will construct an object without an explicit call on new. The other place we saw this was with String literals where Java constructs String objects for you without you having to call new. Both of these are conveniences for programmers because these tasks are so common that the designers of the language wanted to make it easy to do them.

7.6 A Longer Program Example: Hours Worked

Let's look at a more complex program example that involves using arrays. Suppose that you have an input file that has data indicating how many hours an employee has worked with each line of the input file indicating the hours worked for a different week. Each week has 7 days, so there could be up to 7 numbers listed on each line. We generally consider Monday to be the start of the work week, so let's assume that each line lists hours worked on Monday followed by hours worked on Tuesday, and so on, and ending with hours worked on Sunday. But we'll allow the lines to have fewer than 7 numbers if someone worked fewer than 7 days. Below is a sample input file that we'll call "hours.txt":

8 8 8 8 8 8 4 8 4 8 4 4 8 4 8 4 8 3 0 0 8 6 4 4 8 8 0 0 8 8 8 8 4 8 4 Let's write a program that reads this input file, reporting totals for each row and each column. The totals for each row will tell us how many hours the person has worked each week. The totals for each column will tell us how many hours the person worked on Mondays versus Tuesdays versus Wednesdays, and so on.

It's pretty obvious that the number 7 is likely to show up in several places in this program, so it makes sense to define a class constant that we can use instead of the magic number:

        public static final int DAYS = 7;  // # of days in a week
Our main method can be fairly short, opening the input file we want to read and calling a method to process the file:

        public static void main(String[] args) throws FileNotFoundException {
	    Scanner input = new Scanner(new File("hours.txt"));
	    processFile(input);
        }
We saw in chapter 6 that for line-oriented data, we can generally use the following pattern as a starting point for file processing code:

	while (input.hasNextLine()) {
	    String text = input.nextLine();
            process text.
	}
This loop allows us to process one line at a time. But each line has internal structure because there are up to 7 numbers on each line. So in this case it makes sense to construct a Scanner for each input line and to write a separate method for processing a line of data:

	while (input.hasNextLine()) {
	    String text = input.nextLine();
	    Scanner data = new Scanner(text);
	    call method to process data.
	}
So what are we supposed to do to process a line of data? One thing we have to do is to add up the hours worked during the week and report that number. If that's all we had to do, then we could solve this as a simple file-processing problem and avoid arrays altogether. But we also have to add up columns, reporting the total hours worked on Monday, Tuesday, and so on.

This is a perfect application for an array. We are, in effect, solving seven different cumulative sum tasks. We want to sum up all hours worked on Monday, all hours worked on Tuesday, and so on. So having an array of 7 integers will allow us to calculate these column totals. So our pseudocode becomes:

	int[] total = new int[DAYS];
	while (input.hasNextLine()) {
	    String text = input.nextLine();
	    Scanner data = new Scanner(text);
	    call method to process data.
            update total to include this week's data.
	}
        report total.
Remember that in cumulative sum we always initialize our sum to 0. When we construct an int array, Java initializes the variables to 0. So our total array will store 7 integers each with value 0:

                       [0]     [1]     [2]     [3]     [4]     [5]     [6]
          +---+     +-------+-------+-------+-------+-------+-------+-------+
    total | +-+---> |   0   |   0   |   0   |   0   |   0   |   0   |   0   |
          +---+     +-------+-------+-------+-------+-------+-------+-------+
So how do we process the data from an input line in such a way that we can report the sum across and also add the values into this total array? The first line of the sample input file is "8 8 8 8 8". There are several approaches we could take, but if we're going to be manipulating an array to keep track of the total for each column, then it makes sense to create a second array that represents the data for the current week. In effect, we transfer the data from the input file into an array. Once we have the data in array form, we can do all sorts of manipulations on it.

In our pseudocode above we indicate that we should "call a method to process data" (the Scanner that contains the next line's worth of data). So let's have that method transfer the data from the Scanner into an array. It should take the Scanner as a parameter and should return a reference to a new array that it constructs. Thus, our pseudocode becomes:

	int[] total = new int[DAYS];
	while (input.hasNextLine()) {
	    String text = input.nextLine();
	    Scanner data = new Scanner(text);
	    int[] next = transferFrom(data);
            do any processing necessary on next.
            update total to include this week's data.
	}
        report total.
The code necessary to transfer data from the Scanner into an array is not particularly complex, but it makes sense to separate this kind of operation into another method to keep the file processing method short and easy to read. We will write the code for transferFrom shortly, but we are actually fairly close to finishing this pseudocode.

We have just three pieces left to fill in. Once we have transfered the data from an input line into a Scanner, what do we do with it? We have to report the total hours for this particular week and we have to add this week's hours into our cumulative sum for the columns. Neither of these operations is particularly complex, but we can again simplify our file-processing code by introducing methods that perform most of the details associated with these tasks. We might often find ourselves wanting to add up the numbers in an array, so it makes sense to have a method called sum that we can use to add up this row. We might also find ourselves wanting to add one array to another (the array equivalent of saying sum += next), so it also makes sense to make that a separate method.

The final element to fill in for our pseudocode is reporting the total. This is something that also can be easily put in a separate method. With these three methods in mind, we can finish translating our pseudocode into actual code:

	int[] total = new int[DAYS];
	while (input.hasNextLine()) {
	    String text = input.nextLine();
	    Scanner data = new Scanner(text);
	    int[] next = transferFrom(data);
	    System.out.println("Total hours = " + sum(next));
	    addTo(total, next);
	}
	System.out.println();
	print(total);

7.6.1 The transferFrom Method

The transferFrom method is given a Scanner as a parameter and is supposed to construct a new array that has the numbers from the Scanner. Each input line has at most 7 numbers, but it might have fewer. As a result, it makes sense to use a while loop that tests whether there are more numbers left to read from the Scanner.

        construct an array of 7 integers.
	while (data.hasNextInt()) {
            process data.nextInt().
        }
In this case, processing the next integer from the input means storing it in the array. The first number should go into position 0, the second number in position 1, the third number in position 2 and so on. That means that we need some kind of integer counter that goes up by one each time through the loop.

        construct an array of 7 integers.
        start i at 0.
	while (data.hasNextInt()) {
            store data.nextInt() in position i of the array.
            increment i.
        }
This is now fairly easy to translate into actual code.

	int[] result = new int[DAYS];
	int i = 0;
	while (data.hasNextInt()) {
	    result[i] = data.nextInt();
	    i++;
	}
	return result;
It may seem odd to have a method return an array, but it makes perfect sense to do so in Java. Arrays are objects, so what is returned is a reference to the array. We have to be careful to specify this properly in our header. We are returning something of type int[], so the header would look like this:

        public static int[] transferFrom(Scanner data)

7.6.2 The sum Method

This is the simplest of the four array-processing methods that we need to write. We simply want to add up the numbers stored in the array. This is a classic cumulative sum problem and we can use a for loop to accomplish the task.

        public static int sum(int[] numbers) {
	    int sum = 0;
	    for (int i = 0; i < numbers.length; i++)
	        sum += numbers[i];
	    return sum;
        }
In the for loop test, we could have used the class constant DAYS. I have instead used the length field of the array itself because this is such a general purpose operation that we might use it for other programs.

7.6.3 The addTo Method

The idea for this method is to write the array equivalent of:

        sum += next;
We will be given two arrays, one with the total hours and one with the next week's hours. Let's consider where we'll be in the middle of processing the sample input file. The first three lines of input are as follows:

        8 8 8 8 8
        8 4 8 4 8 4 4
        8 4 8 4 8
Suppose we have properly processed the first two lines and have just read in the third line for processing. Then our two arrays will look like this:

                       [0]     [1]     [2]     [3]     [4]     [5]     [6]
          +---+     +-------+-------+-------+-------+-------+-------+-------+
    total | +-+---> |  16   |  12   |  16   |  12   |  16   |   4   |   4   |
          +---+     +-------+-------+-------+-------+-------+-------+-------+

                       [0]     [1]     [2]     [3]     [4]     [5]     [6]
          +---+     +-------+-------+-------+-------+-------+-------+-------+
     next | +-+---> |   8   |   4   |   8   |   4   |   8   |   0   |   0   |
          +---+     +-------+-------+-------+-------+-------+-------+-------+
It would be nice if we could just say:

        total += next;
Unfortunately, you can't use operations like "+" and "+=" on arrays. But those operations can be performed on simple integers and these arrays are composed of simple integers. So we basically just have to tell the computer to do 7 different "+=" operations on the individual array elements. This can be easily written with a for loop.

        public static void addTo(int[] total, int[] next) {
	    for (int i = 0; i < DAYS; i++)
	        total[i] += next[i];
        }

7.6.4 The print Method

The final method we have to write involves printing out the column totals (the totals for each day of the week). The basic code involves a simple for loop, as in:

	for (int i = 0; i < DAYS; i++)
	    System.out.println(total[i]);
But we don't want to simply write out 7 lines of output each with a number on it. It would be nice to give some information about what the numbers mean. We know that total[0] represents the total hours worked on various Mondays and total[1] represents the total hours worked on Tuesdays, and so on, but somebody reading our output might not know. So it would be helpful to label the output with some information about which day each total goes with. We can do so by defining our own array of String literals:

	String[] dayNames = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"};
Using this array, we can improve our for loop to print out the name of each day along with its total. We can also call the sum method to write out the overall total hours worked after the for loop.

    public static void print(int[] total) {
        String[] dayNames = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"};
	for (int i = 0; i < DAYS; i++)
	    System.out.println(dayNames[i] + " hours = " + total[i]);
	System.out.println("Total hours = " + sum(total));
    }

7.6.4 The Complete Program

Putting all the pieces together, we end up with the following complete program.

// This program reads an input file with information about hours worked and // produces a list of total hours worked each week and total hours worked for // each day of the week. Each line of the input file should contain // information for one week and should have up to 7 integers representing how // many hours were worked each day (Monday first, then Tuesday, through // Sunday). import java.io.*; public class Hours { public static final int DAYS = 7; // # of days in a week public static void main(String[] args) throws FileNotFoundException { Scanner input = new Scanner(new File("hours.txt")); processFile(input); } // processes an input file of data about hours worked by an employee public static void processFile(Scanner input) { int[] total = new int[DAYS]; while (input.hasNextLine()) { String text = input.nextLine(); Scanner data = new Scanner(text); int[] next = transferFrom(data); System.out.println("Total hours = " + sum(next)); addTo(total, next); } System.out.println(); print(total); } // constructs an array of DAYS integers initialized to 0 and transfers // data from the given Scanner into the array in order; assumes the // Scanner has no more than DAYS integers public static int[] transferFrom(Scanner data) { int[] result = new int[DAYS]; int i = 0; while (data.hasNextInt()) { result[i] = data.nextInt(); i++; } return result; } // returns the sum of the integers in the given array public static int sum(int[] numbers) { int sum = 0; for (int i = 0; i < numbers.length; i++) sum += numbers[i]; return sum; } // adds the values in next to their corresponding entries in total public static void addTo(int[] total, int[] next) { for (int i = 0; i < DAYS; i++) total[i] += next[i]; } // prints information about the totals including total hours for each // day of the week and total hours overall public static void print(int[] total) { String[] dayNames = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}; for (int i = 0; i < DAYS; i++) System.out.println(dayNames[i] + " hours = " + total[i]); System.out.println("Total hours = " + sum(total)); } } When executed on the sample input file:

8 8 8 8 8 8 4 8 4 8 4 4 8 4 8 4 8 3 0 0 8 6 4 4 8 8 0 0 8 8 8 8 4 8 4 It produces the following output:

Total hours = 40 Total hours = 40 Total hours = 32 Total hours = 25 Total hours = 16 Total hours = 16 Total hours = 32 Mon hours = 43 Tue hours = 32 Wed hours = 36 Thu hours = 40 Fri hours = 34 Sat hours = 8 Sun hours = 8 Total hours = 201

7.7 Arrays in the Java class libraries

You will find arrays appearing throughout the Java class libraries. In fact, there is a special class called Arrays that has a collection of utility programs for manipulating arrays. One particularly handy method in the Arrays class is the method sort. For example, given the following variable declarations:

        int[] numbers = {13, 23, 480, -18, 75};
        String[] words = {"hello", "how", "are", "you", "feeling", "today"};
We could make the following calls:
        Arrays.sort(numbers);
        Arrays.sort(words);
If you then examine the values of the arrays, you will see that the numbers appear in numerically increasing order and the words appear in alphabetically increasing order. Below is a complete program that demonstrates this. The Arrays class is part of the java.util package, so you will see that the program includes an import from java.util.

import java.util.*; public class SortingExample { public static void main(String[] args) { int[] numbers = {13, 23, 480, -18, 75}; String[] words = {"hello", "how", "are", "you", "feeling", "today"}; System.out.println("Before sorting:"); display(numbers, words); Arrays.sort(numbers); Arrays.sort(words); System.out.println("After sorting:"); display(numbers, words); } public static void display(int[] numbers, String[] words) { for (int i = 0; i < numbers.length; i++) System.out.print(numbers[i] + " "); System.out.println(); for (int i = 0; i < words.length; i++) System.out.print(words[i] + " "); System.out.println(); System.out.println(); } } The program produces the following output: Before sorting: 13 23 480 -18 75 hello how are you feeling today After sorting: -18 13 23 75 480 are feeling hello how today you Arrays appear in many other places in the Java class libraries. For example, the String class has a method called toCharArray that will construct an array of characters that stores the individual characters of the array. There is a corresponding constructor for the String class that allows you to construct a String from a character array. As an example, the code below converts a word into a corresponding character array, then sorts the characters in the array, then recombines the characters to make a string:

        public static String sorted(String s) {
            char[] letters = s.toCharArray();
            Arrays.sort(letters);
            String result = new String(letters);
            return result;
        }
This might seem like a strange method to define, but it can be very helpful for solving word puzzles known as anagrams. An anagram of a word is a word that has the same letters but in a different order. For example, the following words are all anagrams of each other: pears, spear, pares, reaps, spare. Using the method above, all of these words would be converted to "aeprs". So to solve an anagram, you can simply go through a dictionary, converting every word to its sorted form and see if it matches the sorted form of the anagram you are working with.

7.8 Arrays of Objects

All of the arrays we have looked at so far have stored primitive values like simple integer values, but as section 7.2 pointed out, you can have arrays of any Java type. Arrays of objects behave slightly differently because of the fact that objects are stored as references rather than as data values.

Consider, for example, the following statement:

        Point[] points = new Point[3];
This declares a variable called points that refers to an array of length 3 that stores references to Point objects. The call on new constructs the array, but it doesn't construct any actual Point objects. Instead it constructs an array of length 3 each element of which can store a reference to a Point. When Java constructs the array, it initializes these array elements to a special value known as "null" that indicates that the variable does not currently point to any object.

                         [0]     [1]     [2]
            +---+     +-------+-------+-------+
     points | +-+---> |   /   |   /   |   /   |
            +---+     +-------+-------+-------+
If we want to have some actual Point objects, they have to be constructed separately with calls on new, as in:

        Point[] points = new Point[3];
        points[0] = new Point(3, 7);
        points[1] = new Point(4, 5);
        points[2] = new Point(6, 2);
After these lines of code execute, we would have individual Point objects that the various array elements refer to:

                                [0]     [1]     [2]
                   +---+     +-------+-------+-------+
            points | +-+---> |   |   |   |   |   |   |
                   +---+     +---+---+---+---+---+---+
                                 |       |       |
                +----------------+       |       +------------------+
                |                        |                          |
                V                        V                          V
    +--------------------+    +--------------------+    +--------------------+
    |   +----+    +----+ |    |   +----+    +----+ |    |   +----+    +----+ |
    | x |  3 |  y |  7 | |    | x |  4 |  y |  5 | |    | x |  6 |  y |  2 | |
    |   +----+    +----+ |    |   +----+    +----+ |    |   +----+    +----+ |
    +--------------------+    +--------------------+    +--------------------+
Notice that we need four different calls on new because there are four objects to be constructed: the array itself and the three individual Point objects. We could also use the curly brace notation for initializing the array in which case we don't need the explicit call on new to construct the array itself:

        Point[] points = {new Point(3, 7), new Point(4, 5), new Point(6, 2)};
We have seen a reference an array of objects ever since we wrote the "hello world" program of chapter 1. Whenever we define a main method, we are required to include as its parameter "String[] args", which is an array of String objects. This array is initialized by Java itself if the user provides what are known as "command line arguments" when Java is invoked. For example, normally a Java class called DoSomething would be started from the command interface by a command like this:

        java DoSomething
The user has the option to type extra arguments, as in:

        java DoSomething temperature.dat temperature.out
In this case the user has specified two extra arguments that are file names that the program should use (e.g., the names of an input and output file). If the user types these extra arguments when starting up Java, then the "String[] args" parameter to main will be initialized to an array of length 2 storing these two strings:

                      [0]     [1]
         +---+     +-------+-------+
    args | +-+---> |   |   |   |   |
         +---+     +---+---+---+---+
                       |       |
              +--------+       +-------+
              |                        |
              V                        V
    +-------------------+    +-------------------+
    | "temperature.dat" |    | "temperature.out" |
    +-------------------+    +-------------------+

7.9 ArrayLists

Arrays in general and arrays of objects in particular are so useful that Java has a special class called an ArrayList that takes care of a lot of the low-level details for you. As its name implies, an ArrayList uses an array as its underlying data structure. It can be used to store variable length data which means that, as described in section 7.3, each ArrayList object keeps track of an array along with its current length.

You build up an ArrayList by constructing it and adding values to it:

        ArrayList list = new ArrayList();
        list.add("hello");
        list.add("world");
        list.add("this is fun");
This code constructs an empty ArrayList and then adds three Strings to the list. ArrayLists can be printed with a simple println, as in:

        System.out.println(list);
or:

        System.out.println("list = " + list);
If we execute this println after executing the code above that adds three Strings to the list, we get the following output:

        list = [hello, world, this is fun]
The ArrayList class also provides methods for adding values in the middle of the list or removing values from the list. These operations preserve the order of other list elements, shifting values left or right as appropriate. To indicate positions within the list, we use the Java convention of indexing starting at 0 for the first value, 1 for the second value and so on. For example, given the list above, consider the effect of inserting a value at index 1:

        list.add(1, "middle value");
        System.out.println("now list = " + list);
The following output is produced:

        now list = [hello, middle value, world, this is fun]
So now the list has a total of four values. Consider what happens if we now remove the value at position 0 and then remove the value at position 1:

        list.remove(0);
        list.remove(1);
        System.out.println("and now list = " + list);
The output is as follows:

        and now list = [middle value, this is fun]
This result is a little surprising. We asked the list to remove the value at position 0 and then to remove the value at position 1. You might imagine that this would get rid of the Strings "hello" and "middle value" since they were at positions 0 and 1, respectively, before this code was executed. But you have to remember that an ArrayList is a dynamic structure where values are moving around and shifting into new positions. The first call on remove does remove the String "hello" because it's the value currently in position 0. But once that value is removed, everything else shifts over. So the String "middle value" moves to the front (to position 0) and the String "world" shifts into position 1. So when the second call on remove is performed, Java removes "world" from the list because it is the value that is in position 1 at that point in time.

As another example, consider the following code that creates an ArrayList and stores several words in it:

        ArrayList words = new ArrayList();
        words.add("four");
        words.add("score");
        words.add("and");
        words.add("seven");
        words.add("years");
        words.add("ago");
        System.out.println("words = " + words);
This code produces the following output:

        words = [four, score, and, seven, years, ago]
Suppose that we want to insert a String containing a single asterisk in front of each word ("*"). We expect to double the length of the String because each of these words will have an asterisk in front of it if our code works properly. Here is a first attempt that makes sense intuitively:

        for (int i = 0; i < words.size(); i++)
            words.add(i, "*");
        System.out.println("after loop words = " + words);
Notice that to get the number of elements in the ArrayList we call a method "size". This is an example of an unfortunate inconsistency in the Java class libraries. We have seen a total of three different conventions now for the same kind of task. Suppose, for example, that you have a variable called x. Then:

It would have been much simpler if the folks at Sun had decided on a single convention and used it consistently, but unfortunately they didn't and so we just have to remember when to use which convention.

So let's get back to the for loop. It has the usual array-style for loop with an index variable i that starts at 0 and goes up by one each time. In this case it is inserting an asterisk at position i each time through the loop. The problem is that the loop never terminates. If you're patient enough you will find that it actually does terminate but only after using up all available memory.

The problem is once again the fact that the ArrayList is a dynamic structure where things move around to new positions. Let's think about this carefully to see what is going on. Initially we have this list with the String "four" in position 0:

        [four, score, and, seven, years, ago]
The first time through the loop, we insert an asterisk at position 0. This shifts the String "four" one to the right, so that it is now in position 1.

        [*, four, score, and, seven, years, ago]
Then we come around the for loop and increment i to be 1. And we insert an asterisk at position 1. But notice that the word "four" is currently at position 1, which means that this second asterisk also goes in front of the word "four", shifting it into position 2:

        [*, *, four, score, and, seven, years, ago]
We go around the loop again, incrementing i to be 2 and once again inserting an asterisk at that position, which is once again in front of the word "four":

        [*, *, *, four, score, and, seven, years, ago]
This continues indefinitely because we keep inserting asterisks in front of the first word in the list. The for loop test compares i to the size of the list, but because the list is growing, the size keeps going up. So this process continues until we exhaust all available memory.

To fix this loop, we have to realize that inserting an asterisk at position i is going to shift everything one to the right. So on the next iteration of the loop, we will want to deal with the position two to the right, not one to the right. So we can fix the loop simply by changing the update part of the for loop to add 2 to i instead of adding 1 to i:

        for (int i = 0; i < words.size(); i += 2)
            words.add(i, "*");
        System.out.println("after loop words = " + words);
When we execute this version of the code, we get the following output:

    after loop words = [*, four, *, score, *, and, *, seven, *, years, *, ago]
The ArrayList is defined in terms of a generic class called Object (explained in more detail in chapter 9). As a result, we can store many different kinds of data in an ArrayList. We can store String values, as described above, but we can also store Point objects or any other kind of object. One down side of this generic nature of the ArrayList is that Java can get confused when we try to get values back out of an ArrayList. The ArrayList has a method called "get" that allows you to get a value at a specific index. The return type for the "get" method is defined as Object (the generic type). That's because the ArrayList itself is defined generically so that it can be used for any kind of Object. As a result, we often have to use a type cast to let Java know what kind of object is coming out of the list.

Let's look at an example of this. Suppose you initialize an ArrayList as follows:

        ArrayList names = new ArrayList();
        names.add("Erica Kane");
        names.add("Adam Chandler");
        names.add("Maria Santos");
Suppose we want to add up the lengths of these names using a loop:

        int sum = 0;
        for (int i = 0; i < names.size(); i++)
            sum += names.get(i).length();
        System.out.println("Total of name lengths = " + sum);
This code makes logical sense. We know that we put three String values into this list and so we can get each value in turn and call its length method, adding them up as we do so. But this code does not compile. The problem is that even though we know that the ArrayList has Strings in it, Java doesn't know that. From Java's point of view these are generic objects and so it has no way of knowing that these objects will have a length method. For example, what if the ArrayList had been filled up with Point objects? They don't have a length method, so the call on length wouldn't make sense.

The way to avoid this problem is to cast the result of the call on "get" to String. This informs the Java compiler that the value coming out of the ArrayList will be of type String. It's a way of saying, "Trust me, Java, the value that comes out of this ArrayList will be of type String." Java is not a particularly trusting language, so in fact Java will take your word for it temporarily, but checks to make sure that your claim ends up being true. If during executing Java finds that something other than a String comes back, then it throws a ClassCastException.

So this is the line of code that is causing the problem:

        sum += names.get(i).length();
And we want to fix it by casting the result of the call on "get" to a String:

        sum += (String)names.get(i).length();
Even this doesn't work because of precedence issues. The "dot" has higher precedence than the cast, so Java tries to call the length method anyway before applying the cast. So we'd need an extra set of parentheses to force the cast to be performed before Java tries to call the length method:

        sum += ((String)names.get(i)).length();
This compiles correctly. This kind of expression is complicated enough that often people choose to separate the casting part of this from the call on length:

        String s = (String)names.get(i);
        sum += s.length();
The ArrayList class is part of the java.util package, so to include it in a program, you would have to include an import declaration.

Below is a table of some of the most common ArrayList operations. A more complete list can be found in the online Java documentation.

Useful Methods in the ArrayList Class
Method Description Example
add(Object value) adds value at the end of the list list.add("end");
add(int index, Object value) adds value at the given index, shifting subsequent values right list.add(1, "middle");
clear() makes the list empty list.clear();
get(int index) gets the value at the given index (return type is Object) list.get(1)
indexOf(Object value) returns the index of the first occurrence of the given value in the list (-1 if not found) list.indexOf("hello")
remove(int index) removes the value at the given index, shifting subsequent values left list.remove(1);
set(int index, Object value) replaces the value at given index with given value list.set(2, "hello");
size() returns the current number of elements in the list list.size()

7.10 Programming problems

  1. Java's type int has a limit on how large of an integer it can store. This limit can be circumvented by representing an integer as an array of digits. Write an interactive program that adds two integers of up to 50 digits each.

  2. Personal mailing labels can prove quite useful. Write a program that reads a five-line address from an input file and produces an output file with the address repeated 50 times in three columns.

  3. Write a program that reads an input file of numbers and reports the average of the numbers as well as how far each number is from the average.

  4. Write a program that plays a variation of the game of Mastermind with a user. For example, the program can use pseudorandom numbers to generate a 4-digit number. The user should be allowed to make guesses until the user gets the number correct. Clues should be given to the user indicating how many digits of the guess are correct and in the correct place, and how many are correct but in the wrong place.