CSE143 Notes 4/5/06

"Infinite" StringLists

Before we move on to linked lists, it's worth taking another look at StringList to fix a major limitation that it has relative to standard Java containers like ArrayList. The standard library containers have the property that they expand as needed to give the illusion of an (almost) infinitely-large container.

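Terminology: when talking about collections, it's standard to refer to two properties: the capacity, which is the number of items the collection currently has room for, and the size, which is the number of items actually stored in it.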

Analogy: think of a parking lot. The capacity is the number of spaces available in the lot. The current "size" is the number of cars currently parked in the lot. The difference between the capacity and size is the available space for additional cars.

What needs to change to provide this illusion of infinite size? When we are asked to add something to a StringList and all of the available space is in use, we need to do something different instead of throwing an exception.

How? We can't reach inside and make the array "bigger". Instead, we'll allocate a new array, copy the old entries over, and replace the existing items variable with a reference to the new array.
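As a rough sketch of the allocate-copy-replace step described above (the class name GrowArraySketch and the helper grow are made up for illustration and are not part of the course sample code):

```java
import java.util.Arrays;

class GrowArraySketch {
    // Returns a new, larger array holding the same entries as old.
    // Assumes newCapacity >= old.length (hypothetical helper).
    static String[] grow(String[] old, int newCapacity) {
        String[] bigger = new String[newCapacity];
        for (int i = 0; i < old.length; i++) {
            bigger[i] = old[i];   // copy each existing reference
        }
        return bigger;
    }

    public static void main(String[] args) {
        String[] items = {"a", "b", "c"};
        items = grow(items, 6);   // items now refers to the new array
        System.out.println(items.length);            // prints 6
        System.out.println(Arrays.toString(items));  // prints [a, b, c, null, null, null]
    }
}
```

The old array is not changed or resized; once items refers to the new array, nothing refers to the old one any more.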

How much bigger should it be? If we're only adding a single item to the collection, all we really need is a new array whose capacity is 1 larger than the existing one. But allocating a new array and copying the existing entries is a relatively expensive operation, and there's a good chance that we'll add more items to the collection later, so we'd like to avoid doing all of this work on every add once the initial capacity is exceeded. So instead of increasing the capacity by 1 each time we run out of room, we allocate a new array that is a substantial percentage larger than the old one. One good heuristic is to double the capacity each time; the Java standard library's ArrayList grows its capacity by a factor of about 1.5, which also gives good performance. Either way, every now and then adding a new item is expensive, because we need to increase the capacity, but, on average, adding a new item remains cheap, since almost all of the additions just store the new item at the end of the existing array.
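To get a feel for the difference, here is a small counting experiment. It is only a sketch: the class GrowthCostSketch, the method copiesFor, and the starting capacity of 1 are all assumptions chosen for illustration. It counts how many entries get copied while adding n items one at a time under the two growth policies.

```java
class GrowthCostSketch {
    // Counts the entry copies made while appending n items one at a time,
    // starting from capacity 1. If doubling is true, the capacity doubles
    // whenever the array is full; otherwise it grows by exactly 1.
    static long copiesFor(int n, boolean doubling) {
        int capacity = 1;
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;   // every existing entry is copied over
                capacity = doubling ? capacity * 2 : capacity + 1;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println(copiesFor(1000, false));  // prints 499500
        System.out.println(copiesFor(1000, true));   // prints 1023
    }
}
```

Growing by 1 copies 1 + 2 + ... + 999 entries in total, while doubling copies 1 + 2 + 4 + ... + 512, which is about one copy per item added.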

So let's implement it. One issue: where does the code to increase the capacity go? We could put it in the add method (e.g., if (size() == capacity) allocate the new array; copy everything over; replace items). But this is a nice example of something that is a self-contained operation, so we'll implement it as a separate method ensureExtraCapacity(howMuch). We can then call this method whenever we are about to add things to the list to be sure that there is enough unused capacity available.
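One possible shape for this, as a sketch only and not the course sample code: the class name GrowableStringList, the initial capacity of 4, the capacity() accessor, and the choice to at least double are all assumptions made for this example.

```java
class GrowableStringList {
    private String[] items = new String[4];  // initial capacity: an assumption
    private int size = 0;

    int size() { return size; }

    // Hypothetical accessor, included only so the growth can be observed.
    int capacity() { return items.length; }

    // Makes sure there is room for at least howMuch more items.
    void ensureExtraCapacity(int howMuch) {
        int needed = size + howMuch;
        if (needed <= items.length) {
            return;                           // enough unused space already
        }
        // Grow to at least double, or more if that still isn't enough.
        int newCapacity = Math.max(needed, items.length * 2);
        String[] bigger = new String[newCapacity];
        for (int i = 0; i < size; i++) {
            bigger[i] = items[i];
        }
        items = bigger;                       // replace the old array
    }

    void add(String s) {
        ensureExtraCapacity(1);
        items[size] = s;
        size++;
    }

    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return items[index];
    }

    public static void main(String[] args) {
        GrowableStringList list = new GrowableStringList();
        for (int i = 0; i < 10; i++) {
            list.add("s" + i);
        }
        System.out.println(list.size() + " " + list.capacity());  // prints 10 16
    }
}
```

Because add calls ensureExtraCapacity(1) before storing anything, it never needs to think about whether the array is full. A method that adds several items at once could call ensureExtraCapacity with a larger argument and grow the array only once.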

(see sample code for details)

Two other StringList refinements

There are two other things we'd like to do to our array-based StringList before we move on. First, we want to add a simple iterator capability. The idea is to add an iterator() method to StringList that constructs and returns a new object that we can use to retrieve items from the StringList one at a time. This object needs to know a couple of things: first, it needs to be able to access the instance variables of the StringList so it can retrieve items from the list and also know when there are more items to retrieve. It also needs to provide the hasNext() and next() methods that client code can use to perform the iteration. And to make all this work, we'll need a constructor that properly initializes the StringListIterator object when it is created.

There are several ways to define the StringListIterator class. One is to just define it inside the StringList class. Java classes can, as we've seen, contain instance variable and method definitions. It turns out that we can also declare new classes inside of a class. The advantage here is that since the new class is inside the original StringList, it can access private data that is not observable outside of StringList, which makes it easy to retrieve items and decide when no more items are available during the iteration.
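Here is a minimal sketch of the nested-class approach. It is not the course sample code: the class name IterableStringList and the fixed capacity of 10 are assumptions to keep the example short, and it implements java.util.Iterator on the assumption of Java 8 or later, where remove() has a default implementation and can be left out.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class IterableStringList {
    private String[] items = new String[10];  // fixed capacity, for brevity only
    private int size = 0;

    void add(String s) {
        items[size] = s;   // no growth here; adding an 11th item would fail
        size++;
    }

    // Constructs and returns a new iterator positioned at the first item.
    Iterator<String> iterator() {
        return new StringListIterator();
    }

    // Declared inside the list class, so it can read the private
    // items and size variables of the list that created it.
    private class StringListIterator implements Iterator<String> {
        private int nextIndex;   // position of the next item to return

        StringListIterator() {
            nextIndex = 0;
        }

        public boolean hasNext() {
            return nextIndex < size;
        }

        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = items[nextIndex];
            nextIndex++;
            return result;
        }
    }

    public static void main(String[] args) {
        IterableStringList list = new IterableStringList();
        list.add("a");
        list.add("b");
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());   // prints a, then b
        }
    }
}
```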

See the sample code on the web for details. The sample code implements a simple iterator with only hasNext() and next() methods. It does not contain remove(), although this wouldn't be too hard to include. It also doesn't attempt to detect errors caused by modifications to the StringList between calls of the iterator methods, which, in production code, should throw an exception. This takes more work to implement, but the basic idea isn't too complex: store extra information in the StringList and in the iterator to keep track of when modifications are made, and check that no changes happen during an iteration. In the standard Java libraries, this is done by including an instance variable modCount in the collection that is increased by 1 each time a structural modification is made to the collection. When an iterator is created, it stores a copy of the current modCount value and then checks that this value has not changed each time an iterator operation is attempted. If the collection's modCount differs from the one stored in the iterator when it was created, a ConcurrentModificationException is thrown.
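The modCount idea can be sketched roughly as follows. This is an illustration under assumptions, not the sample code or the library source: the names CheckedStringList, CheckedIterator, and expectedModCount are made up, the capacity is fixed at 10 for brevity, and this sketch checks only in next().

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class CheckedStringList {
    private String[] items = new String[10];  // fixed capacity, for brevity only
    private int size = 0;
    private int modCount = 0;                 // counts structural modifications

    void add(String s) {
        items[size] = s;
        size++;
        modCount++;                           // the list's structure changed
    }

    Iterator<String> iterator() {
        return new CheckedIterator();
    }

    private class CheckedIterator implements Iterator<String> {
        private int nextIndex = 0;
        // Snapshot of the list's modCount at the time of creation.
        private final int expectedModCount = modCount;

        public boolean hasNext() {
            return nextIndex < size;
        }

        public String next() {
            if (modCount != expectedModCount) {
                throw new ConcurrentModificationException();
            }
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = items[nextIndex];
            nextIndex++;
            return result;
        }
    }

    public static void main(String[] args) {
        CheckedStringList list = new CheckedStringList();
        list.add("a");
        Iterator<String> it = list.iterator();
        list.add("b");                        // modifies the list mid-iteration
        try {
            it.next();
        } catch (ConcurrentModificationException e) {
            System.out.println("modification detected");  // this line is printed
        }
    }
}
```

An iterator created after the second add would work normally, since it would take its snapshot of modCount after the change.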

Last change: Java provides automatic memory management, known as garbage collection, to recycle memory that was allocated for objects or arrays but is no longer in use. Normally we don't need to worry too much about how this works, which is a big advantage over languages like C and C++, where a lot of programmer time is spent handling and debugging memory management issues.

However, good Java code should be a bit careful to help out the garbage collector in the following way. If we have a variable that refers to an object and we are done using that object, then we should store the special value null in the variable to break the link between the variable and the object. As long as some active variable(s) refer to an object, it cannot be reclaimed by the garbage collector, since it can potentially be used in the future. By storing null in the variable, we no longer retain a reference to the object, and if that was the last active reference to that object, the space it occupies can be reclaimed.

In a good implementation of a class like StringList, we should go through the code and store null in array slots that no longer hold an active reference to something in the collection. In particular, when we use remove to delete an entry, once the later entries have been shifted down and size has been decreased by 1, we should store null in items[size], since that location in the array is now "outside" the part of the array that refers to objects currently stored in the list.
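A sketch of what remove might look like with this refinement. The class name NullingStringList, the fixed capacity, and the rawSlot helper are assumptions for illustration; the sample code may be organized differently.

```java
class NullingStringList {
    private String[] items = new String[10];  // fixed capacity, for brevity only
    private int size = 0;

    int size() { return size; }

    void add(String s) {
        items[size] = s;
        size++;
    }

    // Removes and returns the item at index, shifting later items down.
    String remove(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        String removed = items[index];
        for (int i = index; i < size - 1; i++) {
            items[i] = items[i + 1];
        }
        size--;
        items[size] = null;   // clear the stale duplicate reference in the old last slot
        return removed;
    }

    // Hypothetical helper that exposes a raw array slot, for demonstration only.
    String rawSlot(int i) { return items[i]; }

    public static void main(String[] args) {
        NullingStringList list = new NullingStringList();
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(list.remove(0));   // prints a
        System.out.println(list.rawSlot(2));  // prints null
    }
}
```

Without the assignment of null, items[2] would still refer to "c" after the shift. Then, if "c" were later removed from the list, the leftover reference would keep that String from being reclaimed.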