CSE143 Notes for Friday, 5/25/12

I returned to our case study using the IntList class that we've been discussing all quarter. I reminded people that I'm discussing these classes as a way to understand the ArrayList<E> and LinkedList<E> classes and the List<E> interface that are part of the collections framework in the java.util package. Our versions use simple ints, but they have the same kind of methods and are implemented in a very similar manner to the others.

I first gave a recap of what we've seen. We've been looking at an array-based class called ArrayIntList that has a number of operations for manipulating a list of integers. We discussed a variation of ArrayIntList called LinkedIntList that uses a linked list instead of an array to store the data. We saw that we can capture the "int list" abstraction by defining an IntList interface that both classes implement. So we end up with a generic IntList interface and we have two specific implementations: ArrayIntList and LinkedIntList.

I began by showing a new version of the IntList interface with some extra operations:

        public interface IntList {
            public int size();
            public int get(int index);
            public int indexOf(int value);
            public boolean isEmpty();
            public boolean contains(int value);
            public void add(int value);
            public void add(int index, int value);
            public void addAll(IntList other);
            public void remove(int index);
            public void removeAll(IntList other);
            public void set(int index, int value);
            public void clear();
        }
We expect that each of our implementations will provide all of this functionality. As we go to implement these different operations, we'll find that some of them are quite different. For example, when we write the "get" method to return a value at a particular index, for the array we'll be able to just ask for elementData[index], but for the linked list, we'll have to start at the beginning of the list and keep doing some kind of "current = current.next" operation to position ourselves to the right spot in the list. So the implementations of "get" will be very different in the two classes.
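
To make the contrast concrete, here is a minimal sketch of the two get methods. The class names ArrayGetSketch and LinkedGetSketch, the ListNode helper, and the fixed capacity of 100 are assumptions made for illustration; this is not the code from the course handouts, and bounds checking is left out.

```java
// Hypothetical, stripped-down classes: only add and get are shown.
class ArrayGetSketch {
    private int[] elementData = new int[100]; // fixed capacity, no resizing in this sketch
    private int size;

    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    // Array version: a single direct index into the array.
    public int get(int index) {
        return elementData[index];
    }
}

class LinkedGetSketch {
    // Hypothetical node class for this sketch.
    private static class ListNode {
        int data;
        ListNode next;

        ListNode(int data) {
            this.data = data;
        }
    }

    private ListNode front;

    public void add(int value) {
        if (front == null) {
            front = new ListNode(value);
        } else {
            ListNode current = front;
            while (current.next != null) {
                current = current.next;
            }
            current.next = new ListNode(value);
        }
    }

    // Linked version: start at the front and advance index times.
    public int get(int index) {
        ListNode current = front;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.data;
    }
}
```

Both classes return the same answers for the same sequence of calls; only the amount of work behind each call to get differs.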

But what about a method like addAll? It is supposed to add all of the values from one list to another list. We'll probably build it on top of low-level operations like add. So perhaps we'll write the same code for each class.

How do we eliminate redundancy? Should we change the interface to an abstract class instead? Someone said we want both and I said that's a good idea. Because IntList is an interface, anyone can implement their own version of IntList in any way they want. If we were to change it to an abstract class, then we'd be forcing people to extend our class. You only get one inheritance relationship, so it would be very annoying for someone to tell you that it has to be used to extend a particular abstract class.

Instead, we have both. We keep IntList as an interface, but we also introduce an abstract class that we can use to factor out common code between our two implementations:

              AbstractIntList----(implements)----> IntList
               /          \
              /            \
        ArrayIntList  LinkedIntList
This is a very flexible approach. The abstract class allows us to eliminate any redundant code for our two implementations. Anyone who wants to can extend our abstract class as well to take advantage of that common code. But they can also do something completely different with no connection to our abstract class as long as they implement the IntList interface. This pattern is so useful that you'll find it used throughout the collections framework. For example, there is a Map interface that has two implementations called TreeMap and HashMap, and each of the implementations extends a class called AbstractMap. There is a similar structure for sets and lists.
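
The following is a small sketch of the interface-plus-abstract-class pattern, cut down to a handful of methods. The names MiniIntList, MiniAbstractIntList and MiniArrayIntList are hypothetical stand-ins, and the choice of isEmpty and contains as the shared methods is an assumption about what could reasonably be factored out, not a description of the actual AbstractIntList.

```java
// Hypothetical reduced interface (the real IntList has more operations).
interface MiniIntList {
    int size();
    int indexOf(int value);
    boolean isEmpty();
    boolean contains(int value);
    void add(int value);
}

// Shared code lives here, written only in terms of other interface methods.
abstract class MiniAbstractIntList implements MiniIntList {
    public boolean isEmpty() {
        return size() == 0;
    }

    public boolean contains(int value) {
        return indexOf(value) >= 0;
    }
}

// A concrete class supplies only the operations that depend on the storage.
class MiniArrayIntList extends MiniAbstractIntList {
    private int[] elementData = new int[100]; // fixed capacity in this sketch
    private int size;

    public int size() {
        return size;
    }

    public int indexOf(int value) {
        for (int i = 0; i < size; i++) {
            if (elementData[i] == value) {
                return i;
            }
        }
        return -1;
    }

    public void add(int value) {
        elementData[size] = value;
        size++;
    }
}
```

A linked version would extend MiniAbstractIntList in the same way and inherit isEmpty and contains unchanged, while a client with its own superclass could still implement MiniIntList directly.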

Then I turned to the question of how we would implement an operation like addAll to include in the AbstractIntList class. How do we do that? The idea is to get values from the second list to add them to "this" list. How do we get those values? Someone said we'd call get. We could do so, writing code along these lines:

        for (int i = 0; i < other.size(); i++)
            add(other.get(i));
This will work, but there is a problem with it. The idea is that we're writing one version of the code that works for both implementations. This will work fine for the array-based implementation, but it will end up being very slow for the linked list implementation.

Think about how get will be implemented for the linked list. We'll start a variable current at the front of the list and we'll move along until we get to the right spot. This is fairly quick when you want to get to something towards the front of the list. But what happens when you have a really long list? You might find yourself asking the list for the element at index 1000. That takes a lot of work (moving current over 1000 times). Then you'd ask it for the value at index 1001, which requires again a lot of work (moving current over 1001 times).
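
One rough way to see how this work adds up is to count the "current = current.next" steps directly. The sketch below uses a hypothetical CountingLinkedList with a hops counter (and a back reference so that building the list is cheap); none of these names come from the course code. Calling get(i) for every i from 0 to n-1 performs 0 + 1 + ... + (n-1) = n(n-1)/2 steps in total.

```java
// Hypothetical linked list that counts how many node-to-node moves get performs.
class CountingLinkedList {
    private static class ListNode {
        int data;
        ListNode next;

        ListNode(int data) {
            this.data = data;
        }
    }

    private ListNode front;
    private ListNode back; // kept only so that add is cheap in this sketch
    private int size;
    long hops;             // total "current = current.next" steps made by get

    void add(int value) {
        ListNode node = new ListNode(value);
        if (front == null) {
            front = node;
        } else {
            back.next = node;
        }
        back = node;
        size++;
    }

    int size() {
        return size;
    }

    int get(int index) {
        ListNode current = front;
        for (int i = 0; i < index; i++) {
            current = current.next;
            hops++;
        }
        return current.data;
    }
}

class HopDemo {
    // Builds a list of n values, reads every index with get, and reports the hops.
    static long hopsForGetLoop(int n) {
        CountingLinkedList list = new CountingLinkedList();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        for (int i = 0; i < list.size(); i++) {
            list.get(i);
        }
        return list.hops;
    }
}
```

For n = 1000 the count is 499,500 steps, and doubling n to 2000 gives 1,999,000, roughly four times as many.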

In fact, writing the code this way will turn addAll into an O(n^2) operation for the linked list implementation. We obviously don't want it to be that slow if we can avoid it. So how do we access the values more efficiently? Someone mentioned iterators. But what kind of iterator? I showed people Java's generic iterator interface:

        public interface Iterator<E> {
            public boolean hasNext();
            public E next();
            public void remove();
        }
Someone said that we want an Iterator<Integer>. But that means we would need to add something to the IntList interface that indicates that we have a method that we can call to produce an iterator. The convention in Java is to call the method "iterator":

        public interface IntList {
            ...
            public Iterator<Integer> iterator();
        }
If we assume this is part of the interface, then we can write the following code:

        public void addAll(IntList other) {
            Iterator<Integer> i = other.iterator();
            while (i.hasNext())
                add(i.next());
        }
Then I reminded people about the foreach loop. Sun added this with Java 5. It has a simpler syntax than the while loop above. There is an interface in Java that is known as the Iterable interface. Saying that you implement the Iterable interface means that you can produce an iterator:

        public interface Iterable<E> {
            public Iterator<E> iterator();
        }
So we modified the header for the IntList interface to specify that it also implements the Iterable interface (remember that you use the "extends" keyword when one interface is being related to another interface):

        public interface IntList extends Iterable<Integer> {
            ...
        }
With this change, we were able to use a foreach loop for a method like addAll:

        public void addAll(IntList other) {
            for (int i : other)
                add(i);
        }
This version works exactly the same as the previous version, because the foreach loop is implemented using iterators.
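
As a self-contained illustration of that equivalence, the sketch below defines a tiny class that implements Iterable<Integer> and then walks it both ways. TinyIntList and ForEachDemo are hypothetical names invented for this example, and the iterator here does not support remove.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical read-only list; implementing Iterable is what enables foreach.
class TinyIntList implements Iterable<Integer> {
    private final int[] data;

    TinyIntList(int... values) {
        data = values.clone();
    }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int position = 0;

            public boolean hasNext() {
                return position < data.length;
            }

            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return data[position++];
            }

            public void remove() {
                // removal is not supported in this sketch
                throw new UnsupportedOperationException();
            }
        };
    }
}

class ForEachDemo {
    // foreach version: the compiler calls iterator(), hasNext() and next() for us.
    static int sumWithForEach(TinyIntList list) {
        int sum = 0;
        for (int n : list) {
            sum += n;
        }
        return sum;
    }

    // Explicit version: the same calls written out by hand.
    static int sumWithIterator(TinyIntList list) {
        int sum = 0;
        Iterator<Integer> i = list.iterator();
        while (i.hasNext()) {
            sum += i.next();
        }
        return sum;
    }
}
```

Both methods visit the values in the same order and produce the same sum; in each case the Integer returned by next is unboxed to an int.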

We then spent a few minutes looking at the implementation of the removeAll method. This method takes an IntList as a parameter and removes all values from this list that appear in the other list. Here is the code:

        public void removeAll(IntList other) {
            Iterator<Integer> i = iterator();
            while (i.hasNext())
                if (other.contains(i.next()))
                    i.remove();
        }
This method works by calling a method of the iterator class called remove. The idea is that while you are iterating over a list, you might want to remove the last value you saw. It's like saying, "I didn't like that last value...get rid of it."

The iterator's remove method is supposed to remove the value that was most recently returned by a call on next. As a result, it's not legal to call remove two times in a row or to call remove before next has been called. I asked people to think about how this would be implemented. Someone said we could use a boolean variable that keeps track of whether it's okay to remove. When the iterator is first constructed, it is not okay to call remove. Once you call next, then remove becomes okay. But then when you call remove, okay goes back to false, to prevent two calls on remove in a row. This is exactly how it is implemented.
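
The same rules can be observed with the standard library. The sketch below uses java.util.ArrayList (not our IntList classes), whose iterator throws IllegalStateException when remove is called at an illegal time; the class and method names RemoveRulesDemo, removeBeforeNextFails and removeTwiceFails are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class RemoveRulesDemo {
    // Returns true if calling remove() before any call on next() is rejected.
    static boolean removeBeforeNextFails() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        Iterator<Integer> i = list.iterator();
        try {
            i.remove();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns true if a second remove() in a row is rejected.
    static boolean removeTwiceFails() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        Iterator<Integer> i = list.iterator();
        i.next();
        i.remove(); // legal: removes the 1 that next just returned
        try {
            i.remove(); // illegal: nothing has been returned since the last remove
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```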

We then spent a few minutes looking at the new version of the ArrayIntList class. We saw that the iterator method constructs an object of type ArrayIterator. ArrayIterator is a private inner class. Because it is defined inside the ArrayIntList class, each instance of the class has access to the ArrayIntList object that constructs it. This allows it to call methods like size to decide what value hasNext should return.
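
A minimal sketch of how such an inner class might look is given below. It is not the handout code: the outer class name SketchArrayIntList, the field names position and removeOK, and the fixed capacity of 100 are all assumptions made for illustration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical, cut-down array list with an inner iterator class.
class SketchArrayIntList implements Iterable<Integer> {
    private int[] elementData = new int[100]; // fixed capacity in this sketch
    private int size;

    public int size() {
        return size;
    }

    public int get(int index) {
        return elementData[index];
    }

    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    // Removes the value at the given index, shifting later values left.
    public void remove(int index) {
        for (int i = index; i < size - 1; i++) {
            elementData[i] = elementData[i + 1];
        }
        size--;
    }

    public Iterator<Integer> iterator() {
        return new ArrayIterator();
    }

    // Inner class: each ArrayIterator can call methods of the list that built it.
    private class ArrayIterator implements Iterator<Integer> {
        private int position = 0;         // index of the next value to return
        private boolean removeOK = false; // true only right after a call on next

        public boolean hasNext() {
            return position < size(); // size() of the enclosing list
        }

        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            int result = get(position);
            position++;
            removeOK = true;
            return result;
        }

        public void remove() {
            if (!removeOK) {
                throw new IllegalStateException();
            }
            // Qualified call: a plain remove(...) would refer to this iterator's own remove.
            SketchArrayIntList.this.remove(position - 1);
            position--;
            removeOK = false;
        }
    }
}
```

Because ArrayIterator is an inner (non-static) class, the unqualified calls to size and get refer to the enclosing list object, which is what lets hasNext decide its answer.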

I mentioned that I have posted under the handouts tab a new version of the LinkedIntList handout and that we would briefly discuss it in section.


Stuart Reges
Last modified: Fri May 25 11:19:50 PDT 2012