CSE143 Notes for Monday, 4/5/21

I said that I wanted to finish up our discussion of the ArrayIntList class. We ended with a pretty good version of the class to serve as a guide for the first homework, but I mentioned that there is at least one important method that we were missing.

I asked people to consider the situation where a client wants to replace a value at a particular location. The only option we have given the client is to remove the value and then to add a new value back in its place. This requires shifting values twice, which can be very inefficient if the list is long and if the change occurs towards the front of the list. So I said that we would include a method called set that can be used to replace the value at a given index:

        public void set(int index, int value) {
            elementData[index] = value;
        }
Of course, we have to indicate the precondition on the index and we have to check the index to make sure it is legal. We introduced a private method called checkIndex that performs the check for us:

    // pre : 0 <= index < size()
    // post: replaces the integer at the given index with the given value
    public void set(int index, int value) {
        checkIndex(index);
        elementData[index] = value;
    }
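The notes do not show checkIndex itself. As a rough sketch of what such a helper might look like, the class below (IndexCheckedList is a hypothetical name, and the choice of IndexOutOfBoundsException is an assumption rather than something taken from the course code) keeps just enough state to show why the check matters:

```java
// Minimal sketch: a cut-down list with just enough state to show the check.
// The class name and the exception type thrown by checkIndex are assumptions.
class IndexCheckedList {
    private int[] elementData = new int[10];  // fixed capacity for the sketch
    private int size = 0;

    // post: appends the given value (no capacity check in this sketch)
    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    // pre : 0 <= index < size()
    // post: returns the integer at the given index
    public int get(int index) {
        checkIndex(index);
        return elementData[index];
    }

    // pre : 0 <= index < size()
    // post: replaces the integer at the given index with the given value
    public void set(int index, int value) {
        checkIndex(index);
        elementData[index] = value;
    }

    // post: throws an IndexOutOfBoundsException if the index is not legal
    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
    }
}
```

Note that the check compares against size, not elementData.length. In a list holding one value, set(1, 3) refers to a slot that exists in the array but is not part of the list, so without the check the call would succeed silently.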
I mentioned that the new version also has a method called clear that returns the list to being empty by resetting the size to 0:

    // post: list is empty
    public void clear() {
        size = 0;
    }
In the earlier version we had a method called addAll that added all of the values in a second ArrayIntList at the end of this ArrayIntList. I spent some time discussing how to write the corresponding removeAll method.

I asked for suggestions and someone suggested going through every value of the other list and removing it from the first list as long as the first list still contains it. That approach would work, but I said that I wanted to consider doing it the other way because we eventually want to get to an efficient version and that approach won't lead there.

So instead the pseudocode version of what we want to do is:

        for (all values in this list) {
            if (the other list contains this value) {
                remove this value
            }
        }
We translated this into corresponding code:

        for (int i = 0; i < size; i++) {
            if (other.contains(elementData[i])) {
                remove(i);
            }
        }
This version didn't work. I ran a testing program that produced this output:

        original values:
            list1 = [1, 2, 3, 4, 5, 2, 2, 3, 4, 4, 4, 4, 5, 6]
            list2 = [2, 4, 6, 8]
        after the call list1.removeAll(list2):
            list1 = [1, 3, 5, 2, 3, 4, 4, 5]
            list2 = [2, 4, 6, 8]
        list1 should be = [1, 3, 5, 3, 5]
After the call on removeAll, list1 still contains values it shouldn't: a 2 and two 4s are left over. What happened? We puzzled over it a bit and someone mentioned that it was skipping values. Because we are calling remove, we are shifting values to the left. For example, suppose that i is equal to 5 and we are removing the value at that index. We shift a new value into index 5 when we do that, and then the for loop increments i to be 6. So we skip looking at that value. An easy fix is to decrement i when we remove:

        for (int i = 0; i < size; i++) {
            if (other.contains(elementData[i])) {
                remove(i);
                i--;
            }
        }
That version works. Another way to fix it is to run the loop backwards. When you do that, the values being shifted are values we have already examined, so we don't end up missing any. So this version works as well:

        for (int i = size - 1; i >= 0; i--) {
            if (other.contains(elementData[i])) {
                remove(i);
            }
        }
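To see the skipping behavior outside of the ArrayIntList class, here is a small self-contained sketch that uses java.util.ArrayList<Integer> as a stand-in for our list (the class and method names are made up for illustration). It runs the forward loop without the fix and the backwards loop on the same test data shown earlier:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: ArrayList<Integer> stands in for ArrayIntList here.
class RemoveAllDemo {
    // Buggy: after remove(i) the next value slides into index i and is skipped.
    static void removeAllSkipping(List<Integer> list, List<Integer> other) {
        for (int i = 0; i < list.size(); i++) {
            if (other.contains(list.get(i))) {
                list.remove(i);  // i is an int, so this removes by index
            }
        }
    }

    // Fixed: running backwards only shifts values we have already examined.
    static void removeAllBackwards(List<Integer> list, List<Integer> other) {
        for (int i = list.size() - 1; i >= 0; i--) {
            if (other.contains(list.get(i))) {
                list.remove(i);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> other = Arrays.asList(2, 4, 6, 8);
        List<Integer> a = new ArrayList<>(
            Arrays.asList(1, 2, 3, 4, 5, 2, 2, 3, 4, 4, 4, 4, 5, 6));
        List<Integer> b = new ArrayList<>(a);
        removeAllSkipping(a, other);
        removeAllBackwards(b, other);
        System.out.println("skipping:  " + a);  // [1, 3, 5, 2, 3, 4, 4, 5]
        System.out.println("backwards: " + b);  // [1, 3, 5, 3, 5]
    }
}
```
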
I said that both of these solutions are inefficient. Someone mentioned that it would be easier if we had a temporary array to work with. The idea would be to build up a new list of values that are the ones to keep from the original, placing them into a temporary array. So our pseudocode would be:

        make a new temporary array
        for (all values in list) {
            if (value is not in the other list) {
                add the value to the temporary array
            }
        }
We need to do a bit more, because we need to keep track of where to put values in the temporary array. The first value to be moved will go into index 0, the next one will go into index 1, the next one into index 2, and so on. In our ArrayIntList we manage this by keeping track of the current size of the list and we can apply the same idea here:

        int[] temp = new int[elementData.length];
        int newSize = 0;
        for (int i = 0; i < size; i++) {
            if (!other.contains(elementData[i])) {
                temp[newSize] = elementData[i];
                newSize++;
            }
        }
        // copy values back from temporary to original
We would have to figure out how to copy values back. But there is an easier way. It turns out we don't need a temporary array at all. We can use elementData itself. Consider, for example, a case where list1 and list2 have the following values:

        list1 : [3, 8, 5, 7, 0, 2, 4, 3, 9, 7]
        list2 : [0, 1, 2, 3, 5, 6, 8, 9]
When we make the call list1.removeAll(list2), the only values we should have left are the values 4 and 7. Think of how those values will be shifted from the original version of elementData to the new version:

        +---+---+---+---+---+---+---+---+---+---+
        | 3 | 8 | 5 | 7 | 0 | 2 | 4 | 3 | 9 | 7 |
        +---+---+---+---+---+---+---+---+---+---+
         [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
                      |           |           |
          +-----------+           |           |
          |   +-------------------+           |
          |   |   +---------------------------+
          |   |   |
          V   V   V
        +---+---+---+---+---+---+---+---+---+---+
        | 7 | 4 | 7 | - | - | - | - | - | - | - |
        +---+---+---+---+---+---+---+---+---+---+
         [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
I am displaying the indexes starting at 3 with a dash instead of a number because it doesn't matter what values are stored there: we are going to reset the size to be 3 when we are done. Notice that we are always copying values to the same spot or to an earlier spot in the array, so we never overwrite a value we have not yet examined. That means that we don't need a second temporary array. We can use elementData itself. But we have to be careful to reset the size after we finish copying the values to be retained in the list:

        public void removeAll(ArrayIntList other) {
            int newSize = 0;
            for (int i = 0; i < size; i++) {
                if (!other.contains(elementData[i])) {
                    elementData[newSize] = elementData[i];
                    newSize++;
                }
            }
            size = newSize;
        }
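To try the in-place version on the example above, here is a minimal sketch of a list class with only the pieces removeAll needs. MiniIntList is a hypothetical name, and the fixed capacity and the add, contains, and toString methods are simplified assumptions rather than the course versions:

```java
// Sketch only: a stripped-down list with just what removeAll depends on.
class MiniIntList {
    private int[] elementData = new int[100];  // fixed capacity for the sketch
    private int size = 0;

    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    public boolean contains(int value) {
        for (int i = 0; i < size; i++) {
            if (elementData[i] == value) {
                return true;
            }
        }
        return false;
    }

    // post: removes every value that also appears in the other list
    public void removeAll(MiniIntList other) {
        int newSize = 0;
        for (int i = 0; i < size; i++) {
            if (!other.contains(elementData[i])) {
                elementData[newSize] = elementData[i];  // newSize <= i always
                newSize++;
            }
        }
        size = newSize;
    }

    public String toString() {
        StringBuilder result = new StringBuilder("[");
        for (int i = 0; i < size; i++) {
            if (i > 0) {
                result.append(", ");
            }
            result.append(elementData[i]);
        }
        return result.append("]").toString();
    }

    public static void main(String[] args) {
        MiniIntList list1 = new MiniIntList();
        for (int n : new int[] {3, 8, 5, 7, 0, 2, 4, 3, 9, 7}) {
            list1.add(n);
        }
        MiniIntList list2 = new MiniIntList();
        for (int n : new int[] {0, 1, 2, 3, 5, 6, 8, 9}) {
            list2.add(n);
        }
        list1.removeAll(list2);
        System.out.println(list1);  // [7, 4, 7]
    }
}
```

Each value is examined once and copied at most once, so the only remaining cost is the call on contains for each value.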
Then I said I wanted to discuss how to implement an iterator. Recall that an iterator has three basic operations: hasNext, next, and remove.

I gave an example of client code we might want to write:

        Iterator<Integer> i = list.iterator();
        int product = 1;
        while (i.hasNext()) {
            int next = i.next();
            product = product * next;
        }
        System.out.println("product = " + product);
This variation of the code still computes the product, but it also removes any values that are multiples of 3:

        Iterator<Integer> i = list.iterator();
        int product = 1;
        while (i.hasNext()) {
            int next = i.next();
            product = product * next;
            if (next % 3 == 0) {
                i.remove();
            }
        }
        System.out.println("product = " + product);
This code examines each value in the list and removes all the multiples of 3.
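
For a version of this client code that can be run on its own, the sketch below uses java.util.ArrayList<Integer> as a stand-in for our list, since its iterator supports the same three operations. The class name, method name, and sample values are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: ArrayList<Integer> stands in for ArrayIntList here.
class IteratorClientDemo {
    // Returns the product of all values and removes the multiples of 3.
    static int productAndRemoveMultiplesOf3(List<Integer> list) {
        Iterator<Integer> i = list.iterator();
        int product = 1;
        while (i.hasNext()) {
            int next = i.next();
            product = product * next;
            if (next % 3 == 0) {
                i.remove();  // removes the value most recently returned by next
            }
        }
        return product;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(2, 3, 4, 6, 5));
        int product = productAndRemoveMultiplesOf3(list);
        System.out.println("product = " + product);  // product = 720
        System.out.println("list = " + list);        // list = [2, 4, 5]
    }
}
```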

Then we spent some time discussing how the ArrayIntListIterator is implemented. The main function the iterator performs is to keep track of a particular position in a list, so the primary field will be an integer variable for storing this position:

        public class ArrayIntListIterator {
            private int position;

            public ArrayIntListIterator(?) {
                position = 0;
            }

            public int next() {
                position++;
            }

            ...
        }
I asked people how we would implement hasNext and someone said we'd have to compare the position against the size of the list. I then said, "What list?" Obviously the iterator also needs to keep track of which list it is iterating over. We can provide this information in the constructor for the iterator. So the basic outline became:

        public class ArrayIntListIterator {
            private ArrayIntList list;
            private int position;

            public ArrayIntListIterator(ArrayIntList list) {
                position = 0;
                this.list = list;
            }

            public int next() {
                use get method of list & position
                position++;
            }

            public boolean hasNext() {
                check position against size
            }

            ...
        }
We briefly discussed how to implement remove. We have to keep track of when it's legal to remove a value. Recall that you can't remove before you have called next and you can't call remove twice in a row. We decided that this could be implemented with a boolean flag inside the iterator that would keep track of whether or not it is legal to remove at any given point in time. Using this flag, we can throw an exception in remove if it is not legal to remove at that point in time:

        public class ArrayIntListIterator implements Iterator<Integer> {
            private ArrayIntList list;
            private int position;
            private boolean removeOK;
        
            public ArrayIntListIterator(ArrayIntList list) {
                position = 0;
                this.list = list;
                removeOK = false;
            }
        
            public int next() {
                use get method of list & position
                position++;
                removeOK = true;
            }
        
            public boolean hasNext() {
                check position against size
            }
        
            public void remove() {
                if (!removeOK)
                    throw new IllegalStateException();
                call remove method on list
                removeOK = false;
            }
        }
This is a fairly complete sketch of the ArrayIntListIterator code. The calendar includes a complete version. You will notice some odd details that will make more sense after we have learned more about the collections framework (e.g., the class implements the Iterator<Integer> interface and the next method returns a value of type Integer instead of int).
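
For readers who want to see how the pseudocode lines might be filled in, here is one possible completion. It is a sketch under assumptions, not the version from the calendar: SketchIntList and SketchIntListIterator are hypothetical names, the list is reduced to the four methods the iterator needs, and throwing NoSuchElementException from next is a choice borrowed from the standard Iterator contract.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch only: a minimal list with just what the iterator calls.
class SketchIntList {
    private int[] elementData = new int[100];  // fixed capacity for the sketch
    private int size = 0;

    public int size() { return size; }
    public int get(int index) { return elementData[index]; }
    public void add(int value) { elementData[size] = value; size++; }

    // shifts later values left by one to close the gap
    public void remove(int index) {
        for (int i = index; i < size - 1; i++) {
            elementData[i] = elementData[i + 1];
        }
        size--;
    }

    public Iterator<Integer> iterator() {
        return new SketchIntListIterator(this);
    }
}

class SketchIntListIterator implements Iterator<Integer> {
    private SketchIntList list;  // list to iterate over
    private int position;        // index of the next value to return
    private boolean removeOK;    // whether it's legal to remove now

    public SketchIntListIterator(SketchIntList list) {
        this.list = list;
        position = 0;
        removeOK = false;
    }

    public boolean hasNext() {
        return position < list.size();
    }

    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        int result = list.get(position);
        position++;
        removeOK = true;
        return result;
    }

    public void remove() {
        if (!removeOK) {
            throw new IllegalStateException();
        }
        list.remove(position - 1);  // the value most recently returned
        position--;                 // later values shifted left by one
        removeOK = false;
    }
}
```

The detail that is easy to miss is the position-- in remove. It is the same adjustment as the i-- fix in the first removeAll loop: after the list shifts values left, the next unexamined value sits one index earlier.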

Then I discussed the fact that the new version of the list "grows" the list as needed if it runs out of capacity. It isn't, in general, easy to make an array bigger. We can't simply grab the memory next to it because that memory is probably being used by some other object. Instead, we have to allocate a brand new array and copy values from the old array to the new array. This is pretty close to how shops and other businesses work in the real world. If you need some extra space for your store, you can't generally break down the wall and grab some of the space from the store next door. More often a store has to relocate to a larger space.

The new version of ArrayIntList has this functionality built into it. In the previous version we manually checked the capacity and threw an exception if the array wasn't big enough. In the new version that has been replaced by an ensureCapacity method that constructs a new array if necessary.

Obviously you don't want to construct a new array too often. For example, suppose you had space for 1000 values and found you needed space for one more. You could allocate a new array of length 1001 and copy the 1000 values over. Then if you find you need space for one more, you could make an array that is 1002 in length and copy the 1001 old values over. This kind of growth policy would be very expensive.

Instead, we do something like doubling the size of the array when we run out of space. So if we have filled up an array of length 1000, we double its size to 2000 when the client adds something more. That makes that particular add expensive in that it has to copy 1000 values from the old array to the new array. But it means we won't need to copy again for a while. How long? We can add another 999 times before we'd need extra space. As a result, we think of the expense as being spread out or "amortized" over all 1000 adds. Spread out over 1000 adds, the cost is fairly low (a constant).
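
To put numbers on the two growth policies, the sketch below counts how many values get copied while adding 2000 values to a list that starts with capacity 1000. GrowthDemo, its copies counter, and the exact formula in ensureCapacity are assumptions made for this illustration; the ensureCapacity in the course code may differ in its details.

```java
import java.util.Arrays;

// Sketch only: counts element copies under two growth policies.
class GrowthDemo {
    private int[] elementData;
    private int size = 0;
    private long copies = 0;         // total values copied while growing
    private final boolean doubling;  // true: double; false: grow by exact need

    GrowthDemo(int capacity, boolean doubling) {
        elementData = new int[capacity];
        this.doubling = doubling;
    }

    void add(int value) {
        ensureCapacity(size + 1);
        elementData[size] = value;
        size++;
    }

    // post: the array has room for at least the given number of values
    private void ensureCapacity(int capacity) {
        if (capacity > elementData.length) {
            int newCapacity = capacity;
            if (doubling) {
                newCapacity = Math.max(capacity, elementData.length * 2);
            }
            elementData = Arrays.copyOf(elementData, newCapacity);
            copies += size;  // every current value was copied over
        }
    }

    // Adds the given number of values to a list of capacity 1000 and
    // returns how many values were copied along the way.
    static long copiesFor(int adds, boolean doubling) {
        GrowthDemo list = new GrowthDemo(1000, doubling);
        for (int i = 0; i < adds; i++) {
            list.add(i);
        }
        return list.copies;
    }

    public static void main(String[] args) {
        System.out.println(copiesFor(2000, true));   // 1000
        System.out.println(copiesFor(2000, false));  // 1499500
    }
}
```

With doubling, the 1001st add copies 1000 values and the next 999 adds copy nothing. Growing by one slot at a time copies 1000 + 1001 + ... + 1999 values over the same adds.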

You will find that the built-in ArrayList class does something similar. The documentation is a little coy about this, saying, "The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost." If you look at the actual code, you'll find that it increases the capacity by 50% each time (a multiplier of 1.5).

The latest versions of the ArrayIntList class and the ArrayIntListIterator class are included in the calendar for this lecture.


Stuart Reges
Last modified: Mon Apr 5 15:35:34 PDT 2021