CSE143 Notes for Wednesday, 1/25/06

I spent a few minutes running the assassin program and answering questions about it. Then I reviewed the concept of interfaces. I mentioned that the basic idea is to specify a certain set of behaviors without specifying how those behaviors are implemented. I spent some time looking at the class definitions in the Java API documentation. If you go to ArrayList, for example, you will find a list of all of the interfaces that it implements. If you follow the link to the List interface, you'll find all of the classes that implement it.

Of course, many of these interfaces are defined using generics. So it's not just the List interface implemented by the ArrayList and LinkedList classes. It's the List<E> interface implemented by the ArrayList<E> and LinkedList<E> classes. Remember that the "E" is short for "Element type". We will be exploring the use of generics in the next programming assignment with the Stack<E> and Queue<E> interfaces that are implemented by the ArrayStack<E> and LinkedQueue<E> classes.
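As a small sketch of why this matters, the method below is written against the List<String> interface, so it accepts either implementation. The class name ListInterfaceDemo and the method totalLength are made up for illustration and are not part of the lecture code.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; the names here are illustrative only.
class ListInterfaceDemo {
    // Accepts any List<String>, regardless of which class implements it.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<String>();
        List<String> b = new LinkedList<String>();
        a.add("hello");
        a.add("there");
        b.add("hello");
        b.add("there");
        System.out.println(totalLength(a)); // prints 10
        System.out.println(totalLength(b)); // prints 10
    }
}
```

The caller picks the implementation; the method only depends on the behaviors that the interface promises.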

I also briefly discussed a very useful interface known as the Comparable<T> interface. Classes that implement the Comparable<T> interface can be used for many built-in operations like sorting and searching. Some people pronounce it as "come-pair-a-bull" and some people pronounce it as "comp-ra-bull". Either pronunciation is okay. The interface contains a single method called compareTo.

        public interface Comparable<T> {
            public int compareTo(T other);
        }
Notice that this is a generic declaration using "T" to stand for "Type". And notice that we have just a method header for compareTo. Instead of a method body that would be enclosed in curly braces we have just a semicolon. That's because an interface indicates that the method should exist, but doesn't say how it is implemented.

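Given this interface, we can define the subset of Java classes for which sorting makes sense. Anything that implements the Comparable interface can be sorted. So what does compareTo return? The rule in Java is that a call like a.compareTo(b) returns a negative integer if a is less than b, 0 if a is equal to b, and a positive integer if a is greater than b.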

Exactly which negative or positive integer is returned is left up to the person who implements the class. In the case of String, the method returns the difference between the character values at the first position where the two Strings differ. For example, if str1 is "hello" and str2 is "there", then str1.compareTo(str2) returns ('h' - 't'), which equals -12, indicating that "hello" is less than "there".
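Here is a minimal sketch of a class that implements Comparable so that it can be sorted. The Version class is hypothetical (it was not part of the lecture); it orders values by major number and then by minor number. The sketch also checks the String result described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration; not from the lecture.
class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // Negative if this < other, 0 if equal, positive if this > other.
    public int compareTo(Version other) {
        if (major != other.major) {
            return major < other.major ? -1 : 1;
        }
        if (minor != other.minor) {
            return minor < other.minor ? -1 : 1;
        }
        return 0;
    }

    public String toString() {
        return major + "." + minor;
    }
}

class CompareDemo {
    public static void main(String[] args) {
        System.out.println("hello".compareTo("there")); // prints -12

        List<Version> versions = new ArrayList<Version>();
        versions.add(new Version(1, 10));
        versions.add(new Version(1, 2));
        versions.add(new Version(0, 9));
        Collections.sort(versions); // allowed because Version is Comparable
        System.out.println(versions); // prints [0.9, 1.2, 1.10]
    }
}
```

Collections.sort never looks inside a Version; it relies only on the sign of the value that compareTo returns.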

I then spent a little time discussing the issue of primitive data versus objects. Even though we can construct an ArrayList<E> for any class E, we can't construct an ArrayList<int> because int is a primitive type, not a class. To get around this problem, Java has a set of classes known as "wrapper" classes that "wrap up" primitive values like ints to turn them into objects. It's very much like taking a candy and putting a wrapper around it. For the case of ints, there is a class known as Integer that can be used to store an individual int. Each Integer object has a single data field: the int that is wrapped up inside.

Java 5 also has quite a bit of support that makes a lot of this invisible to programmers. If you want to put int values into an ArrayList, you have to remember to use the type ArrayList<Integer> rather than ArrayList<int>, but otherwise Java does a lot of things for you. For example, you can construct such a list and add simple int values to it:

        ArrayList<Integer> list = new ArrayList<Integer>();
        list.add(18);
        list.add(34);
In the two calls on add, we are passing simple ints as arguments to something that really requires an Integer. This is okay as of Java 5 because Java will automatically "box" the ints for us (i.e., wrap them up in an Integer object). We can also refer to elements of this list and treat them as simple ints, as in:

        int product = list.get(0) * list.get(1);
The calls on list.get return references to Integer objects and normally you wouldn't be allowed to multiply two objects together. In this case Java automatically "unboxes" the values for you, unwrapping the Integer objects and giving you the ints that are contained inside.

Every primitive type has a corresponding wrapper class: Integer for int, Double for double, Character for char, Boolean for boolean, and so on.
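To make the boxing and unboxing visible, the sketch below writes out by hand what Java otherwise does automatically, using Integer.valueOf to box and intValue to unbox. The class name BoxingDemo and the method productOfFirstTwo are illustrative names, not part of the lecture.

```java
import java.util.ArrayList;

// Hypothetical demo class; the names here are illustrative only.
class BoxingDemo {
    // Explicit unboxing: the longhand form of list.get(0) * list.get(1).
    static int productOfFirstTwo(ArrayList<Integer> list) {
        return list.get(0).intValue() * list.get(1).intValue();
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        list.add(Integer.valueOf(18)); // explicit boxing
        list.add(34);                  // automatic boxing, same effect

        int product = list.get(0) * list.get(1); // automatic unboxing
        System.out.println(product);                 // prints 612
        System.out.println(productOfFirstTwo(list)); // prints 612

        // The other wrapper classes box in the same way.
        Double d = 3.5;
        Character c = 'x';
        Boolean b = true;
        System.out.println(d + " " + c + " " + b); // prints 3.5 x true
    }
}
```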

At that point I switched to a new topic: complexity. The word "complexity" can be interpreted in many ways. It sounds like a measure of how complex or how complicated a program is, but that's not how computer scientists use the term. When we refer to the complexity of an algorithm or a code fragment, we are referring to the resources that it requires to execute. The two resources that we are generally most interested in are time (how long it takes to run) and space (how much memory it requires).

We'll find that a common result is that these two primary resources can often be traded off. We can generally make a program work with less memory if we're willing to have it take more time to run. We can also generally get programs to run faster if we're willing to allocate some extra memory to the task.
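One way to see this tradeoff is with a task that was not discussed in lecture: checking whether an array contains a duplicate value. The sketch below is only an illustration under that assumption. The first method uses no extra memory but compares every pair of values; the second makes a single pass but allocates a HashSet that can grow to hold all n values.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; the task and names are illustrative only.
class TradeoffDemo {
    // Less memory, more time: compares every pair of elements.
    static boolean hasDuplicateSlow(int[] data) {
        for (int i = 0; i < data.length; i++) {
            for (int j = i + 1; j < data.length; j++) {
                if (data[i] == data[j]) {
                    return true;
                }
            }
        }
        return false;
    }

    // More memory, less time: remembers every value seen so far.
    static boolean hasDuplicateFast(int[] data) {
        Set<Integer> seen = new HashSet<Integer>();
        for (int value : data) {
            // add returns false when the value is already in the set
            if (!seen.add(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] data = {3, 8, 5, 8, 1};
        System.out.println(hasDuplicateSlow(data)); // prints true
        System.out.println(hasDuplicateFast(data)); // prints true
    }
}
```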

Of these two, the resource that computer scientists most often refer to when talking about complexity is time. In particular, we are interested in the growth rate as the input size increases. We begin by deciding on some way to measure the size of the input (e.g., the number of names to sort, the number of numbers to examine, etc.) and call this "n". We are interested in what happens when we change n. For example, if it takes time "t" to execute for n items, how much time does it take to execute for 2n items?

I pointed out that this is one of the few places where computer science is actually like a science. Some instructors ask their students to collect empirical timing data for different input sizes and have them plot these values to see if the plot matches the prediction. Unfortunately, these experiments are more difficult to perform on modern computers because features like cache memory skew the results. The important thing is that the predictions hold for large values of n.
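A timing experiment of this kind might look like the sketch below. The workload (counting pairs with a nested loop) and the input sizes are arbitrary choices made for illustration, and the reported times will vary from run to run and from machine to machine, since caching and just-in-time compilation both affect the measurements. The prediction to check is that doubling n roughly quadruples the time once n is large.

```java
// Hypothetical demo class; the workload and sizes are illustrative only.
class TimingDemo {
    // A deliberately quadratic workload: counts the pairs (i, j) with i < j.
    static long countPairs(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Double n each time and report how long the workload takes.
        for (int n = 1000; n <= 16000; n *= 2) {
            long start = System.nanoTime();
            long result = countPairs(n);
            long elapsed = System.nanoTime() - start;
            System.out.println("n = " + n + ", pairs = " + result
                    + ", ms = " + elapsed / 1000000.0);
        }
    }
}
```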

I then mentioned a simple rule of thumb that you can apply to Java programs to figure out the complexity of a code segment. I mentioned that it "almost" works. The idea is to find the line of code that is executed most often. In thinking about this, you have to be careful how you count. For example, with a for loop, we'd count the loop itself as executing just once, but the statements controlled by the loop might be executed many times. Of course, a for loop can be inside a for loop in which case the inner loop is executed multiple times. But think in terms of how many times you enter the loop when counting the number of executions of the line of code that begins with "for".
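The counting rule can be tried out directly by adding a counter to the most frequently executed line. The sketch below does this for a single loop and for a nested loop; the class and method names are made up for illustration. It also answers the earlier question about 2n items: doubling n doubles the count for the single loop and quadruples it for the nested loop.

```java
// Hypothetical demo class; the names here are illustrative only.
class CountDemo {
    static long singleLoop(int n) {
        long executions = 0;
        for (int i = 0; i < n; i++) {     // this "for" line is entered once
            executions++;                 // most frequent line: runs n times
        }
        return executions;
    }

    static long nestedLoop(int n) {
        long executions = 0;
        for (int i = 0; i < n; i++) {     // this "for" line is entered once
            for (int j = 0; j < n; j++) { // this "for" line is entered n times
                executions++;             // most frequent line: runs n * n times
            }
        }
        return executions;
    }

    public static void main(String[] args) {
        System.out.println(singleLoop(1000)); // prints 1000
        System.out.println(singleLoop(2000)); // prints 2000
        System.out.println(nestedLoop(1000)); // prints 1000000
        System.out.println(nestedLoop(2000)); // prints 4000000
    }
}
```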

In terms of the growth rate of different algorithms, I mentioned that some of your intuitions from calculus will be helpful. You've probably been asked to solve problems like figuring out the limit, as n approaches infinity, of an expression like this:

        n^3 - 18 n^2 + 385 n + 708
        --------------------------
        0.005 n^4 - 13 n^2 + 73842
When you solve a limit like this, you ignore things like coefficients and you ignore small terms. What matters here is that you basically have:

        n^3
        ---
        n^4
The rest is noise. So this is something that you know is going to approach 0 because eventually the n^4 will dominate the n^3 no matter what the coefficients and lower-order terms are. We use similar reasoning with complexity. We ignore constant multipliers and we ignore lower order terms to focus on the main term.
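To check this reasoning numerically, the sketch below evaluates the full expression for increasingly large n and prints it next to the ratio of the leading terms alone, n^3 / (0.005 n^4), which simplifies to 200/n. The class and method names are made up for illustration. The two columns should get closer together as n grows, and both head toward 0.

```java
// Hypothetical demo class; the names here are illustrative only.
class LimitDemo {
    // The full expression from the notes.
    static double ratio(double n) {
        double numerator = n * n * n - 18 * n * n + 385 * n + 708;
        double denominator = 0.005 * n * n * n * n - 13 * n * n + 73842;
        return numerator / denominator;
    }

    // Leading terms only: n^3 / (0.005 n^4) = 200 / n.
    static double leadingTerms(double n) {
        return 200.0 / n;
    }

    public static void main(String[] args) {
        for (double n = 1e3; n <= 1e9; n *= 1000) {
            System.out.println("n = " + n + ": full = " + ratio(n)
                    + ", leading terms = " + leadingTerms(n));
        }
    }
}
```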

We ran out of time, so I had to cut this discussion short, but we'll return to this topic later and discuss other typical complexity classes.


Stuart Reges
Last modified: Tue Jan 31 21:51:08 PST 2006