We first need to spend a couple of minutes on compareTo() and equals(), which we glossed over last time.
Binary search evaluation. How fast is it?
Informally, it looks like we do a lot less work, measured by the number of items in a list we need to examine when we're searching for a value. Let's see if we can be a bit more precise about this.
What we'd like to figure out is, given a list of size n, how much work does it take to search for a value using binary search? One way to approach the problem is to look at it the other way around: given a fixed number of comparisons, how big a list can we search?
k = number of comparisons, n = list size
OK, given this information, what is the relationship between k and n? That is, what is k as a function of n?
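One way to explore the relationship is to tabulate it. The sketch below assumes each "comparison" is a three-way probe of the middle item (less, equal, or greater), so one probe settles a list of 1 item and each extra probe covers the middle item plus two halves. The class and method names are made up for illustration.

```java
public class BinarySearchCost {
    // Largest sorted list that k probes can always search.
    // Each extra probe covers a middle item plus two halves of the previous size.
    static long maxListSize(int k) {
        long n = 0;
        for (int i = 0; i < k; i++) {
            n = 2 * n + 1;
        }
        return n; // works out to 2^k - 1
    }

    // Worst-case number of probes for a list of n items: floor(log2 n) + 1.
    // Computed by counting how many times n can be halved before reaching 0.
    static int comparisonsNeeded(long n) {
        int k = 0;
        while (n > 0) {
            n /= 2;
            k++;
        }
        return k;
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 10; k++) {
            System.out.println(k + " comparisons -> up to " + maxListSize(k) + " items");
        }
        System.out.println("1000000 items -> " + comparisonsNeeded(1000000) + " comparisons");
    }
}
```

Under these assumptions n grows as 2^k - 1, so k grows roughly as the base-2 logarithm of n.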
Comparing linear and binary search. So now we have an idea of the cost of binary search in a (sorted) list of n items. How does that compare to linear search (while (k < n && item not found yet) k++)?
Algorithm        Cost to search a list of size n
linear search
binary search
Graph:
Bottom line: binary search isn't just 2x or 5x or 10x faster than linear search. Its running time is a different kind of function of n than the running time of linear search - and that difference is independent of implementation details, the particular kind of processor we're using, how much memory we've got, or other details. This is a first example of a computational complexity result, comparing two algorithms abstractly without reference to lots of low-level coding specifics. We'll return to this idea throughout the course as a way of comparing and characterizing algorithms.
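To make the difference concrete, here is a small sketch that counts how many items each algorithm examines in the worst case (searching for a value larger than everything in the list). It works on a plain sorted String array rather than our SortedStringList class, and all names in it are illustrative assumptions.

```java
public class SearchComparison {
    // Linear search: number of items examined while looking for target.
    static int linearProbes(String[] items, String target) {
        int k = 0;
        while (k < items.length) {
            if (items[k].equals(target)) {
                return k + 1; // found after examining k + 1 items
            }
            k++;
        }
        return k; // not found: every item was examined
    }

    // Binary search on a sorted array: number of items examined.
    static int binaryProbes(String[] sorted, String target) {
        int lo = 0;
        int hi = sorted.length - 1;
        int probes = 0;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            probes++;
            int cmp = target.compareTo(sorted[mid]);
            if (cmp == 0) {
                return probes;
            } else if (cmp < 0) {
                hi = mid - 1; // target can only be in the left half
            } else {
                lo = mid + 1; // target can only be in the right half
            }
        }
        return probes;
    }

    // Build a sorted array of n zero-padded number strings: "0000000", "0000001", ...
    static String[] sortedStrings(int n) {
        String[] a = new String[n];
        for (int i = 0; i < n; i++) {
            a[i] = String.format("%07d", i);
        }
        return a;
    }

    public static void main(String[] args) {
        int[] sizes = {1000, 10000, 100000};
        for (int n : sizes) {
            String[] a = sortedStrings(n);
            String missing = "9999999"; // sorts after every item in the array
            System.out.println("n = " + n
                + ": linear examines " + linearProbes(a, missing)
                + ", binary examines " + binaryProbes(a, missing));
        }
    }
}
```

Multiplying n by 10 multiplies the linear count by 10 but adds only three or four probes to the binary count.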
Iterators
One of our reasons for looking at simple examples like the StringList and SortedStringList classes is to get a concrete idea of the basics behind container or collection classes - general purpose classes that are used to hold collections of data. These are needed so often that most programming languages these days include a library of generally useful ones. Java, for example, includes things like ArrayList, which is a general version of our StringList that can hold objects of any type.
One operation that we often need once we have a collection of items is the ability to go through the collection and process the items in it one at a time. For collections like ArrayList and StringList, where the items are stored in an underlying data structure (array) that permits efficient access to individual elements by their location, we could always do this to access the items by their position:
for (int k = 0; k < s.size(); k++) {
    ...process s.get(k)...
}
While this works fine for array-based collections, it isn't a general solution. In some data structures like a linked list (something we'll get to shortly), the get(k) operation is very inefficient - access to individual items in the list by their position is slow. For some other collections, the notion of position doesn't even make sense. A set of items is a collection, but there is no notion of a first item in a set, then a second, and so on.
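To see why positional access is slow on a linked list, here is a deliberately tiny sketch of one. The class LinkedGetCost and its nodesVisited counter are invented for this illustration and are not part of the Java library or of our StringList code. The point is that get(k) has to start at the front and walk k links every time.

```java
public class LinkedGetCost {
    // One link in the chain: an item plus a reference to the next link.
    private static class Node {
        String item;
        Node next;

        Node(String item, Node next) {
            this.item = item;
            this.next = next;
        }
    }

    private Node head = null;
    private int size = 0;
    long nodesVisited = 0; // total nodes touched by all get() calls so far

    void addFirst(String s) {
        head = new Node(s, head);
        size++;
    }

    int size() {
        return size;
    }

    // Precondition: 0 <= k < size(). Walks from the front of the list.
    String get(int k) {
        Node cur = head;
        nodesVisited++;
        for (int i = 0; i < k; i++) {
            cur = cur.next;
            nodesVisited++;
        }
        return cur.item;
    }

    public static void main(String[] args) {
        LinkedGetCost lst = new LinkedGetCost();
        for (int i = 0; i < 1000; i++) {
            lst.addFirst("item " + i);
        }
        for (int k = 0; k < lst.size(); k++) {
            lst.get(k); // "process" each item by position
        }
        System.out.println("nodes visited: " + lst.nodesVisited);
    }
}
```

In this sketch, visiting all n items by position touches 1 + 2 + ... + n = n(n+1)/2 nodes, even though the list only holds n of them.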
What we want is a general mechanism to "process all the items in a collection one by one", in a way that works efficiently with any kind of collection. The mechanism to do this is an iterator. The idea is that we can ask a collection to give us an iterator object that can be used to access the items in the collection. Any iterator provides the following capabilities in some form or another:
- find out whether there are any more objects left in the iteration
- return the current object
- advance to the next object
The Java library provides iterators for all of its collection classes, but in a slightly different form. Java's iterators combine the "return the current object" and "advance to the next object" operations into a single method. The Java versions of these operations are
hasNext() - return true if there are more objects available in this iteration
next() - return the next object in the collection, and advance to the following one. Precondition: hasNext() is true.
The classic versions of the Java collections like ArrayList can store arbitrary objects, but as a result, you have to use a cast when you retrieve an item from the collection to specify the actual type of the object you've retrieved. It's a bit of extra noise, but not too bad. So using the classic version of ArrayList, here is how to create a list, fill it with a few strings, then retrieve them one by one.
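import java.util.ArrayList;
import java.util.Iterator;
...
ArrayList lst = new ArrayList();
lst.add("some strings");   // add some strings
...
Iterator it = lst.iterator();   // ask the list for an iterator
while (it.hasNext()) {
    String nextString = (String)it.next();   // cast needed: next() returns Object
    ...process nextString...
}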
Picture: (in particular, what information does the iterator object need to know to do its job?)
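One possible answer, as a sketch: for an array-based list, the iterator only needs to know which list it belongs to and the position of the next item to hand out. The class TinyStringList below is a made-up, stripped-down stand-in for our StringList, not the actual course code, and the real ArrayList iterator keeps some additional bookkeeping.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class TinyStringList implements Iterable<String> {
    private String[] items = new String[10];
    private int size = 0;

    public void add(String s) {
        if (size == items.length) {
            items = Arrays.copyOf(items, 2 * size); // grow when full
        }
        items[size++] = s;
    }

    // Each call creates a fresh iterator positioned at the start of the list.
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            // The iterator's only state: index of the next item to return.
            // It reaches the list's array and size through the enclosing object.
            private int nextIndex = 0;

            public boolean hasNext() {
                return nextIndex < size;
            }

            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return items[nextIndex++]; // return current item, then advance
            }

            public void remove() {
                throw new UnsupportedOperationException(); // not supported in this sketch
            }
        };
    }

    public static void main(String[] args) {
        TinyStringList lst = new TinyStringList();
        lst.add("one");
        lst.add("two");
        lst.add("three");
        Iterator<String> it = lst.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Because the position lives in the iterator rather than in the list, two iterators over the same list can be at different places at the same time.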
Java 5 (the latest version when these notes were written) introduced the notion of generic containers (and generic types for all sorts of things besides containers). The idea is that instead of declaring something like a plain ArrayList, which can hold objects of any type, we can specify the kind of objects we want the container to hold by putting the type name between angle brackets in the declaration.
import java.util.ArrayList;
import java.util.Iterator;
...
ArrayList<String> lst = new ArrayList<String>();
lst.add("some strings");   // add some strings
...
Iterator<String> it = lst.iterator();
while (it.hasNext()) {
    String nextString = it.next();   // no cast needed: next() returns String
    ...process nextString...
}
We'll come back to generic types and explore their significance later. For now, the idea is to get the hang of the iterator pattern.
One issue is what should happen if changes are made to the underlying collection while an iteration is in progress. Say someone adds a new item to the list, or deletes something? While there are various ways to handle this, Java's answer is that these operations are not allowed, since changes to the underlying list could leave the iteration in a strange state where the meaning of hasNext() or next() is not clear. If someone changes a list and then an iterator operation is attempted, a ConcurrentModificationException is normally thrown.
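Here is a small sketch of that behavior using the generic ArrayList. The class and method names are invented for the example. Note that the library documentation describes this check as best-effort, so code should not rely on the exception for correctness.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;

public class ConcurrentChangeDemo {
    // Returns true if next() throws after the list is changed behind the iterator's back.
    static boolean failsAfterChange() {
        ArrayList<String> lst = new ArrayList<String>();
        lst.add("a");
        lst.add("b");
        lst.add("c");

        Iterator<String> it = lst.iterator();
        it.next();     // fine: nothing has changed yet
        lst.add("d");  // change the list directly, not through the iterator

        try {
            it.next(); // the iterator notices the list was modified
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("exception thrown: " + failsAfterChange());
    }
}
```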
Java iterators support one other method: remove(). This method allows us to remove the last item returned by next() during an iteration, and in that sense it is the one exception to the general rule that no changes may be made to a container in the middle of an iteration. For example, we can eliminate all occurrences of "bad words" from a list of strings as follows (using the old-style ArrayList collection):
import java.util.ArrayList;
import java.util.Iterator;
...
ArrayList lst = new ArrayList();
lst.add("some strings");   // add some strings
...
Iterator it = lst.iterator();   // retrieve an iterator object
while (it.hasNext()) {
    String s = (String)it.next();
    if (s.equals("bad words")) {
        it.remove();   // removes the item just returned by next()
    }
}
remove() throws an exception if it is used incorrectly. Calling remove() before retrieving an object with next(), or calling remove() twice without a next() between them, are examples of this. For more details, it's worth looking at the description of the ArrayList iterator() method and the Iterator interface in the standard library JavaDoc pages.
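The sketch below shows the same removal loop with the generic collections, plus the two misuse cases just described. With ArrayList's iterator both misuses throw IllegalStateException. The class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RemoveDemo {
    // Remove every occurrence of unwanted from lst, using the iterator's remove().
    static void removeAll(List<String> lst, String unwanted) {
        Iterator<String> it = lst.iterator();
        while (it.hasNext()) {
            if (it.next().equals(unwanted)) {
                it.remove(); // removes the item just returned by next()
            }
        }
    }

    // remove() before any next(): there is no "last item returned" to remove.
    static boolean removeBeforeNextFails() {
        List<String> lst = new ArrayList<String>(Arrays.asList("a", "b"));
        Iterator<String> it = lst.iterator();
        try {
            it.remove();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Two remove() calls with no next() between them: the second one fails.
    static boolean doubleRemoveFails() {
        List<String> lst = new ArrayList<String>(Arrays.asList("a", "b"));
        Iterator<String> it = lst.iterator();
        it.next();
        it.remove(); // fine: removes "a"
        try {
            it.remove();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> lst = new ArrayList<String>(
            Arrays.asList("ok", "bad words", "fine", "bad words"));
        removeAll(lst, "bad words");
        System.out.println(lst);
        System.out.println("remove before next fails: " + removeBeforeNextFails());
        System.out.println("double remove fails: " + doubleRemoveFails());
    }
}
```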