CSE143 Notes for Monday, 4/4/11

I said that we were beginning a week-long discussion of how to use the most important structures in the Java Collections Framework. In other words, we're going to learn how to be clients of the collections classes.

I first discussed the built-in ArrayList class. Remember that we're studying the ArrayIntList class as a way to understand the built-in ArrayList class. We know that it is possible to construct arrays that store different types of data, such as int[], double[], and String[].
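
As a minimal sketch of arrays with different element types (the class name, variable names, and values here are made up for illustration and are not from lecture):

```java
import java.util.Arrays;

public class ArrayTypesDemo {
    // Builds a one-line summary of three arrays with different element types.
    public static String describe() {
        int[] counts = new int[3];            // elements default to 0
        double[] temps = {98.6, 72.5};        // array initializer syntax
        String[] words = {"four", "score"};
        return Arrays.toString(counts) + " " + Arrays.toString(temps)
                + " " + Arrays.toString(words);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```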

But arrays are a special case in Java. If they didn't already exist, we couldn't easily add them to Java. Instead, Java now allows you to declare generic classes and generic interfaces. For example, the ArrayList class is similar to an array. Instead of declaring ordinary ArrayList objects, we declare ArrayList<E> where E is some type (think of E as being short for "Element type"). The "E" is a type parameter that can be filled in with the name of any class.

For example, suppose we want an ArrayList of Strings. We describe the type as:

        ArrayList<String>
When we construct an ArrayIntList, we say:
        ArrayIntList list = new ArrayIntList();
Imagine replacing both occurrences of "ArrayIntList" with "ArrayList<String>" and you'll see how to construct an ArrayList<String>:
        ArrayList<String> list = new ArrayList<String>();
Once you have declared an ArrayList<String>, you can manipulate it with the kinds of calls we have made on our ArrayIntList but using Strings instead of ints:

        ArrayList<String> list = new ArrayList<String>();
        list.add("four");
        list.add("score");
        list.add("seven");
        list.add("years");
        list.add("what was next?");
        list.add("ago");
        list.add(2, "and");
        list.remove(5);
        System.out.println("list = " + list);
        System.out.println(list.indexOf("seven"));
which produces this output:

        list = [four, score, and, seven, years, ago]
        3
All of the methods we have seen with ArrayIntList are defined for ArrayList: the appending add, add at an index, remove, size, get, etc. So we could write the following loop to print each String from an ArrayList<String>:

        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }
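
The fragments above leave out the import and the enclosing class. As a sketch of one way to assemble them into a complete program (the class name StringListDemo and the helper method buildList are hypothetical, chosen only for this example):

```java
import java.util.ArrayList;

public class StringListDemo {
    // Rebuilds the list from the notes, one call at a time.
    public static ArrayList<String> buildList() {
        ArrayList<String> list = new ArrayList<String>();
        list.add("four");
        list.add("score");
        list.add("seven");
        list.add("years");
        list.add("what was next?");
        list.add("ago");
        list.add(2, "and");   // insert at index 2, shifting later values right
        list.remove(5);       // removes "what was next?", which is now at index 5
        return list;
    }

    public static void main(String[] args) {
        ArrayList<String> list = buildList();
        System.out.println("list = " + list);
        System.out.println(list.indexOf("seven"));
        // index-based traversal using size and get
        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }
    }
}
```
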
Then I discussed the idea of interfaces. An interface is a description of a set of behaviors. For example, all of the behaviors we have just discussed that are included in the ArrayList<E> class are also included in an interface known as List<E>. The List<E> interface says that a list has to have an appending add method, a method to add at an index, a method to remove at an index, a get method, a size method, an indexOf method, and so on.

But there are different ways to implement a list. We have been looking at how to implement it using an array. Next week we will look at how to implement it using something called a linked list. To make your programs flexible, you should declare your variables, parameters, fields, and method return types using interfaces. So instead of saying:

        ArrayList<String> list = new ArrayList<String>();
you should instead say:
        List<String> list = new ArrayList<String>();
With this declaration, the variable list is more flexible. It can store a reference to any kind of list, not just an ArrayList.

I mentioned that the best analogy I have for interfaces is that they are similar to how we use the concept of certification. You can't claim to be a certified doctor unless you have been trained to do certain specific tasks. Similarly, to be a certified teacher you have to know how to behave like a teacher, to be a certified nurse you have to know how to behave like a nurse, and so on. In Java, if you want to claim to be a certified List<E>, then you have to provide all of the methods that the List<E> interface requires. I then mentioned that this idea is used throughout the collections classes in Java (the java.util package). It is stressed by Joshua Bloch, the author of Effective Java. Joshua Bloch was the primary architect of the collections framework and has influenced much of Sun's work.

In the collections framework, Bloch was careful to define data structure abstractions with interfaces. For example, there are interfaces for List, Set and Map, which are abstractions that we'll be discussing this week. See also the class web page (under "useful links"). His item 52 is to "Refer to objects by their interfaces." He says, "you should favor the use of interfaces rather than classes to refer to objects. If appropriate interface types exist, parameters, return values, variables and fields should all be declared using interface types." This last sentence was in bold face in the book, indicating how important Bloch thinks this is, and I've reproduced that here. He goes on to say that, "The only time you really need to refer to an object's class is when you're creating it."

For now, this will mostly be a style issue for us. In a few weeks we will look at how interfaces are actually defined. For now, just realize that we will require you to use interface types for defining variables, fields, parameters, and method return types.
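
As a rough illustration of why interface types make code more flexible, the sketch below declares a parameter as List<String> so that the same method works with two different list implementations. The class name, the helper totalLength, and the use of java.util.LinkedList as the second implementation are assumptions made for this example, not code from lecture.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class InterfaceDemo {
    // Hypothetical helper: accepts any List<String>, not just an ArrayList.
    public static int totalLength(List<String> words) {
        int total = 0;
        for (int i = 0; i < words.size(); i++) {
            total += words.get(i).length();
        }
        return total;
    }

    public static void main(String[] args) {
        // Only the constructor calls mention a specific class.
        List<String> a = new ArrayList<String>();
        List<String> b = new LinkedList<String>();
        a.add("four");
        a.add("score");
        b.add("four");
        b.add("score");
        System.out.println(totalLength(a));
        System.out.println(totalLength(b));
    }
}
```

Both calls give the same answer because totalLength relies only on methods that the List<E> interface promises.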

I then spent a little time discussing the issue of primitive data versus objects. Even though we can construct an ArrayList<E> for any class E, we can't construct an ArrayList<int> because int is a primitive type, not a class. To get around this problem, Java has a set of classes that are known as "wrapper" classes that "wrap up" primitive values like ints to turn them into objects. It's very much like taking a candy and putting a wrapper around it. For the case of ints, there is a class known as Integer that can be used to store an individual int. Each Integer object has a single data field: the int that is wrapped up inside.

Java also has quite a bit of support that makes a lot of this invisible to programmers. If you want to put int values into an ArrayList, you have to remember to use the type ArrayList<Integer> rather than ArrayList<int>, but otherwise Java does a lot of things for you. For example, you can construct such a list and add simple int values to it:

        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(18);
        numbers.add(34);
In the two calls on add, we are passing simple ints as arguments to something that really requires an Integer. This is okay because Java will automatically "box" the ints for us (i.e., wrap them up in Integer objects). We can also refer to elements of this list and treat them as simple ints, as in:

        int product = numbers.get(0) * numbers.get(1);
The calls on numbers.get return references to Integer objects and normally you wouldn't be allowed to multiply two objects together. In this case Java automatically "unboxes" the values for you, unwrapping the Integer objects and giving you the ints that are contained inside.
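
To make the boxing and unboxing visible, here is a sketch that writes the conversions out by hand using Integer.valueOf and intValue, next to the automatic version. The class and method names are made up for this example, and the explicit form is only an approximation of what Java does behind the scenes.

```java
import java.util.ArrayList;
import java.util.List;

public class BoxingDemo {
    // Relies on automatic boxing and unboxing, as in the notes.
    public static int autoProduct() {
        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(18);
        numbers.add(34);
        return numbers.get(0) * numbers.get(1);
    }

    // Spells out the wrapping and unwrapping explicitly.
    public static int explicitProduct() {
        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(Integer.valueOf(18));   // box the int by hand
        numbers.add(Integer.valueOf(34));
        // unbox each Integer by hand before multiplying
        return numbers.get(0).intValue() * numbers.get(1).intValue();
    }

    public static void main(String[] args) {
        System.out.println(autoProduct());
        System.out.println(explicitProduct());
    }
}
```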

Every primitive type has a corresponding wrapper class: Integer for int, Double for double, Character for char, Boolean for boolean, and so on.
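
The same pattern applies to the other wrapper classes. A small sketch, with a made-up class name and made-up values:

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperDemo {
    // Stores a double, a char, and a boolean in lists of their wrapper types.
    public static String describe() {
        List<Double> temps = new ArrayList<Double>();
        List<Character> letters = new ArrayList<Character>();
        List<Boolean> flags = new ArrayList<Boolean>();
        temps.add(98.6);      // boxed as a Double
        letters.add('x');     // boxed as a Character
        flags.add(true);      // boxed as a Boolean
        double t = temps.get(0);     // unboxed back to primitives
        char c = letters.get(0);
        boolean b = flags.get(0);
        return t + " " + c + " " + b;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```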

Then I discussed the for-each loop. Suppose we have an array that we have constructed using the array initializer syntax:

        int[] data = {18, 4, 97, 3, 4, 18, 72, 4, 42, 42, -3};
If we wanted to print these numbers, we could use a standard traversal loop:
        for (int i = 0; i < data.length; i++) {
            System.out.println(data[i]);
        }
This approach works, but there is a simpler way to do this. If all you want to do is to iterate over the values of an array one at a time, you can use what is called a for-each loop:

        for (int n : data) {
            System.out.println(n);
        }
We generally read the for loop header as, "For each int n in data...". The choice of "n" is arbitrary. It defines a local variable for the loop. I could just as easily have called it "x" or "foo" or "value". Notice that in the for-each loop, I don't have to use any bracket notation. Instead, each time through the loop Java sets the variable n to the next value from the array. I also don't need to test for the length of the array. Java does that for you when you use a for-each loop.

There are some limitations of for-each loops. You can't use them to change the contents of the array. If you assign a value to the variable n, you are just changing a local variable inside the loop. It has no effect on the array itself.
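
A short sketch of this limitation, using hypothetical method names: the for-each version leaves the array unchanged, while the index-based version actually doubles each element.

```java
import java.util.Arrays;

public class ForEachLimitDemo {
    // Assigning to n changes only the loop variable; data is left untouched.
    public static void doubleWithForEach(int[] data) {
        for (int n : data) {
            n = n * 2;
        }
    }

    // Assigning through an index changes the array itself.
    public static void doubleWithIndex(int[] data) {
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] * 2;
        }
    }

    public static void main(String[] args) {
        int[] data = {18, 4, 97};
        doubleWithForEach(data);
        System.out.println(Arrays.toString(data));   // still [18, 4, 97]
        doubleWithIndex(data);
        System.out.println(Arrays.toString(data));   // now [36, 8, 194]
    }
}
```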

As with arrays, we can use a for-each loop for ArrayLists, so we could say:

        for (String s : list) {
            System.out.println(s);
        }
which produces this output for the list we defined earlier:

        four
        score
        and
        seven
        years
        ago
I then reminded people that you can also traverse a list using an iterator. For our ArrayIntList class, we had a special class called ArrayIntListIterator for the iterator. With a structure like an ArrayList, we use an interface. Not surprisingly, the interface is called Iterator<E>. So we can also traverse our list of strings this way:

        Iterator<String> i = list.iterator();
        while (i.hasNext()) {
            System.out.println(i.next());
        }
Finally, I mentioned that we will be looking at a kind of structure known as a Set. There is an interface Set<E>. For now, we will construct all of our sets using the TreeSet<E> class. For example, to make a set of integers using our array of data, we can say:

        int[] data = {18, 4, 97, 3, 4, 18, 72, 4, 42, 42, -3};
        Set<Integer> s = new TreeSet<Integer>();

        for (int n : data) {
            s.add(n);
        }
        System.out.println("set = " + s);
This produces the following output:

        set = [-3, 3, 4, 18, 42, 72, 97]
There are two major differences between a set and a list. Sets don't allow duplicates. So the duplicate values like 42 and 4 in the array appear just once in the set. Sets also don't allow the client to control the order of elements. The TreeSet class keeps things in sorted order. So the numbers will always be in that order. If you want to control the order, then you should use a list instead.
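
To see both differences side by side, the sketch below loads the same array into a list and into a TreeSet. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetVsListDemo {
    // Keeps every value, in the order the client added it.
    public static List<Integer> toList(int[] data) {
        List<Integer> result = new ArrayList<Integer>();
        for (int n : data) {
            result.add(n);
        }
        return result;
    }

    // Drops duplicates and keeps the values in sorted order.
    public static Set<Integer> toSet(int[] data) {
        Set<Integer> result = new TreeSet<Integer>();
        for (int n : data) {
            result.add(n);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] data = {18, 4, 97, 3, 4, 18, 72, 4, 42, 42, -3};
        System.out.println("list = " + toList(data));
        System.out.println("set = " + toSet(data));
    }
}
```

The list keeps all 11 values in insertion order, while the set ends up with the 7 distinct values in sorted order.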

Sets have many of the same methods that lists do. You can add to a set, get its size, ask for an iterator, and use it with a for-each loop. But a set doesn't have a notion of indexing, so you can't remove at an index. Instead you remove a specific value. You also can't get the value at a specific index. Instead you use an iterator or a for-each loop.
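
A minimal sketch of these set operations follows. The class name, method name, and values are assumptions for illustration. Note that remove takes the value to remove, not an index.

```java
import java.util.Set;
import java.util.TreeSet;

public class SetOpsDemo {
    // Exercises add, remove by value, contains, size, and a for-each loop.
    public static String describe() {
        Set<Integer> s = new TreeSet<Integer>();
        s.add(18);
        s.add(4);
        s.add(97);
        s.add(3);
        s.remove(97);            // removes the value 97; sets have no indexes
        int sum = 0;
        for (int n : s) {        // visits the remaining values in sorted order
            sum += n;
        }
        return s + " size=" + s.size() + " sum=" + sum
                + " has4=" + s.contains(4);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```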

I said that this would be much clearer in section when we practice writing code that manipulates sets. Chapter 13 also has a useful table of set operations.


Stuart Reges
Last modified: Tue Apr 5 10:13:23 PDT 2011