CSE143 Notes for Monday, 1/23/06

I began by comparing the IntList and LinkedIntList classes that we have been looking at. In the IntList class we have some data fields and methods including the "appending add" method that adds a value at the end of the list:

        public class IntList {
            private int[] elementData;
            private int size;

            ...

            public void add(int value) {
                ...
            }

            ...
        }

The LinkedIntList class also has data fields and an appending add method, although they are different data fields and the implementation of the appending add is different:

        public class LinkedIntList {
            private ListNode front;

            ...

            public void add(int value) {
                ...
            }

            ...
        }

In computer science we try to use abstraction to find what is common between these two classes even though we recognize that there are things that are quite different about the two. We would imagine an "integer list" abstraction of which these are two possible implementations. They're both the same in the sense that they provide basic "integer list" functionality like an appending add. But they are different in the sense that they are implemented quite differently (one using an array and the other using a linked list).

We have been using this concept for many years in programming. It is traditionally referred to as an Abstract Data Type or ADT:

        A bstract
        D ata
        T ype

With Java, we have actual language support for this concept. Not only can we talk abstractly about an "integer list" abstraction, we can actually define it using what is known in Java as an "interface".

        public interface IntList {
            public void add(int value);
            ...
        }

Notice that in place of the word "class" we have the word "interface". An interface has similar properties to a class. I said that a good way of thinking of an interface is to think of it as being a "hollow" class. It would be as if you had taken a working radio and removed all of the electronics that make it work. You'd be left with a hollow shell that has all of the buttons and controls for the radio (on/off switch, volume setting, station setting, etc.), but none of the electronics to actually make the radio work.

In particular, you cannot include data fields in an interface. There is no "state" allowed in an interface (other than constants). It specifies behaviors only, not how those behaviors are implemented. In fact, as a shorthand, you can think of an interface as specifying the "what" part of an ADT and the implementation as specifying the "how" part. Notice that this interface has the header for the appending add method. But in an interface, we aren't allowed to include an actual implementation of the method. Remember that it is like a hollow class without the actual electronics that make it work. So in place of the curly braces that would normally contain the method definition, we put a semicolon. This is a convention borrowed from C and C++ where this is known as a "function prototype". From the header we know the signature of the method, and that's all we need for the abstraction (the "what" part).

But remember that we have two implementations. We have an array-based implementation that we used to call IntList. Now that we are using the name IntList for the interface, we can change its name to ArrayIntList. We would now define it as follows:

        public class ArrayIntList implements IntList {
            ...
        }

Notice how the class header contains an "implements" clause that mentions the fact that this class implements the IntList interface. We'd make a similar change to the LinkedIntList class header:

        public class LinkedIntList implements IntList {
            ...
        }

By saying that a class implements an interface, you are promising that all of the methods mentioned in the interface will be implemented in the class itself. So both classes need to have a definition for the appending add method. If the class doesn't implement one of the methods mentioned in the interface, then the compiler generates an error when you try to compile it.
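As a rough sketch of what this promise looks like in practice, the two classes might fill in the appending add along the following lines. This is not the code from lecture: the size() and get() methods, the initial array capacity of 10, and the ListNode details are assumptions added so the example can run, and "public" is dropped from the top-level types only so that everything fits in one source file.

```java
import java.util.Arrays;

// Assumed version of the interface; size() and get() are extras for the demo.
interface IntList {
    public void add(int value);
    public int size();
    public int get(int index);
}

// Array-based sketch: values live in elementData[0..size-1].
class ArrayIntList implements IntList {
    private int[] elementData = new int[10]; // capacity 10 is an arbitrary choice
    private int size = 0;

    public void add(int value) {
        if (size == elementData.length) {
            // Array is full, so copy into one twice as large.
            elementData = Arrays.copyOf(elementData, 2 * size);
        }
        elementData[size] = value;
        size++;
    }

    public int size() {
        return size;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return elementData[index];
    }
}

// Hypothetical node class for the linked version.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data) {
        this.data = data;
    }
}

// Linked sketch: walks to the last node to append.
class LinkedIntList implements IntList {
    private ListNode front;

    public void add(int value) {
        if (front == null) {
            front = new ListNode(value);
        } else {
            ListNode current = front;
            while (current.next != null) {
                current = current.next;
            }
            current.next = new ListNode(value);
        }
    }

    public int size() {
        int count = 0;
        for (ListNode current = front; current != null; current = current.next) {
            count++;
        }
        return count;
    }

    public int get(int index) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        ListNode current = front;
        for (int i = 0; i < index && current != null; i++) {
            current = current.next;
        }
        if (current == null) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return current.data;
    }
}

class IntListDemo {
    public static void main(String[] args) {
        IntList[] lists = { new ArrayIntList(), new LinkedIntList() };
        for (IntList list : lists) {
            list.add(42);
            list.add(-3);
            list.add(17);
            // Prints "3 values, last is 17" for both implementations.
            System.out.println(list.size() + " values, last is " + list.get(list.size() - 1));
        }
    }
}
```

Deleting any one of the three methods from either class should produce the compiler error described above, because the class would no longer keep the promise made by its "implements" clause.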

I mentioned that the best analogy I have for interfaces is that they are similar to how we use the concept of certification. You can't claim to be a certified doctor unless you have been trained to do certain specific tasks. Similarly, to be a certified teacher you have to know how to behave like a teacher, to be a certified nurse you have to know how to behave like a nurse, and so on. In Java, if you want to claim to be a certified IntList object, then you have to have several different methods, including an appending add method. Java classes are allowed to implement as many interfaces as they want to, just as a single person might be certified to do several jobs (e.g., both a certified doctor and a certified lawyer).

Because an interface is like a hollow class, you can't create instances of it. For example, given our new IntList interface, we can't say:

        IntList list = new IntList();

The left-hand side of this is okay. You can declare variables of type IntList. But the right-hand side is not okay. It wouldn't make sense to construct one of these hollow objects. Instead, we'd have to choose one of the two implementations. For example, we can say:

        IntList list = new LinkedIntList();

or we can say:

        IntList list = new ArrayIntList();

In this case it is better to use the interface type for the variable. It makes the code more flexible. For example, we might write hundreds of lines of code in terms of the interface IntList. If we decide to switch the implementation (changing, for example, from a LinkedIntList to an ArrayIntList or vice versa), then we'd only have to change the lines of code where we actually call "new". The rest of the code would be unchanged.

I then mentioned that this is an idea that has been used throughout the collections classes in Java (the java.util package). I read a short passage from the book "Effective Java" by Joshua Bloch, which I said was one of the most useful books I've read about Java. Joshua Bloch was the primary architect of the collections framework and has influenced much of Sun's work. He now works at Google.

In the collections framework, Bloch was careful to define ADTs with interfaces. For example, there are interfaces for List, Set and Map which are ADTs that we'll be discussing this quarter (we've already been examining the List concept). In addition to the interfaces, there are various implementations of each. For example, ArrayList and LinkedList both implement the List interface. TreeMap and HashMap both implement the Map interface. And TreeSet and HashSet both implement the Set interface.

Bloch's book is written as a series of 57 suggested practices for Java programmers. I have a link to the book and the suggested practices from our class web page (under "useful links"). His item 34 is to "Refer to objects by their interfaces." The passage I quoted was, "you should favor the use of interfaces rather than classes to refer to objects. If appropriate interface types exist, parameters, return values, variables and fields should all be declared using interface types." This last sentence was in bold face in the book, indicating how important Bloch thinks this is, and I've reproduced that here. He goes on to say that, "The only time you really need to refer to an object's class is when you're creating it."
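To make Bloch's advice concrete, here is a small sketch using the real java.util interfaces and classes mentioned above. The class name InterfaceTypesDemo and the two helper methods are made up for illustration. Notice that class names such as ArrayList, TreeSet and TreeMap appear only after the word "new"; every parameter, return value and variable is declared with an interface type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical demo class; the helper methods are illustrative only.
class InterfaceTypesDemo {
    // Parameter and return type are interfaces, so any List can be passed in.
    static Set<String> distinctWords(List<String> words) {
        return new TreeSet<String>(words);
    }

    // Counts how many times each word occurs.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String word : words) {
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Switching to a LinkedList would change only this one line.
        List<String> words = new ArrayList<String>();
        words.add("to");
        words.add("be");
        words.add("or");
        words.add("not");
        words.add("to");
        words.add("be");

        System.out.println(distinctWords(words)); // [be, not, or, to]
        System.out.println(countWords(words));    // {be=2, not=1, or=1, to=2}
    }
}
```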

You'll find almost identical language in my assignment writeup for Sieve. I ask you to declare all variables using the Queue interface and say that you should use the LinkedQueue class name only when it is constructed (i.e., only after the word "new"). We will take points off when people don't follow this rule, because it is an important point to understand about interfaces versus implementations and an important style issue, which is why Bloch included it in his book.

Then I discussed two classic data structures known as stacks and queues. The two structures are similar in that they each store a sequence of values in a particular order. But stacks are what we call LIFO structures while queues are FIFO structures:

        stacks        queues

        L-ast         F-irst
        I-n           I-n
        F-irst        F-irst
        O-ut          O-ut

The analogy for stacks is to think of a cafeteria and how trays are stacked up. When you go to get a tray, you take the one on the top of the stack. You don't bother to try to get the one on the bottom, because you'd have to move a lot of trays to get to it. Similarly if someone brings clean trays to add to the stack, they are added on the top rather than on the bottom. The result is that stacks tend to reverse things. Each new value goes to the top of the stack, and when we take them back out, we draw from the top, so they come back out in reverse order.

The analogy for queues is to think about standing in line at the grocery store. As new people arrive, they are told to go to the back of the line. When the store is ready to help another customer, the person at the front of the line is helped. In fact, the British use the word "queue" the way we use the word "line" telling people to "queue up" or to "go to the back of the queue".

For a minimal stack, we'd need:

a way to put a value on the top of the stack (something we call "pushing" a value on the top)
a way to remove a value from the top of the stack (something we call "popping" the stack)
and a way to test whether the stack is empty

I showed people an interface with these three operations and a fourth one that turns out to be convenient that tells you how many values are currently in the stack:

        public interface Stack<E> {
            public void push(E value);
            public E pop();
            public boolean isEmpty();
            public int size();
        }

Notice that we are using Java generics to define the Stack in terms of an unspecified element type E. That way we'll be able to have a Stack<String> or Stack<Integer> or a Stack of any other kind of element type we are interested in.
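The following is a minimal sketch of how a class could implement this interface and how client code would use it. The ArrayStack shown here is not the class from the course files; it is an assumed implementation that simply wraps a java.util.ArrayList, and throwing IllegalStateException on an empty pop is also an assumption. "public" is dropped from the top-level types so the example fits in one file.

```java
import java.util.ArrayList;
import java.util.List;

interface Stack<E> {
    public void push(E value);
    public E pop();
    public boolean isEmpty();
    public int size();
}

// Assumed implementation: the top of the stack is the end of the list.
class ArrayStack<E> implements Stack<E> {
    private List<E> elements = new ArrayList<E>();

    public void push(E value) {
        elements.add(value);
    }

    public E pop() {
        if (elements.isEmpty()) {
            throw new IllegalStateException("pop on empty stack");
        }
        return elements.remove(elements.size() - 1);
    }

    public boolean isEmpty() {
        return elements.isEmpty();
    }

    public int size() {
        return elements.size();
    }
}

class StackDemo {
    public static void main(String[] args) {
        // Interface type on the left, class name only after "new".
        Stack<String> trays = new ArrayStack<String>();
        trays.push("a");
        trays.push("b");
        trays.push("c");
        while (!trays.isEmpty()) {
            System.out.print(trays.pop() + " "); // values come out as: c b a
        }
        System.out.println();
    }
}
```

The demo shows the reversing behavior described in the cafeteria tray analogy: the last value pushed is the first one popped.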

For queues, we have a corresponding set of operations but they have different names. When values go into a queue we refer to it as "enqueueing" a value. When values are removed from a queue we refer to it as "dequeueing" a value. So the Queue interface looks like this:

        public interface Queue<E> {
            public void enqueue(E value);
            public E dequeue();
            public boolean isEmpty();
            public int size();
        }

These are interfaces I have written and I have developed an implementation for each. For the stack, I have a class called ArrayStack<E> and for the queue I have a class called LinkedQueue<E>. We don't really care how these implementations work. For now, we are interested in the general properties of stacks and queues, so the only thing we need to know about ArrayStack and LinkedQueue is that we use those classes when we call "new" to construct an actual object.
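For readers curious about what might sit behind the interface, here is one possible sketch of a linked queue along with client code that uses it. It is not the LinkedQueue from the course files: the private Node class, the front/back references, and the choice to throw IllegalStateException on an empty dequeue are all assumptions made for illustration. "public" is dropped from the top-level types so the example fits in one file.

```java
interface Queue<E> {
    public void enqueue(E value);
    public E dequeue();
    public boolean isEmpty();
    public int size();
}

// Assumed implementation: values enter at the back and leave from the front.
class LinkedQueue<E> implements Queue<E> {
    // Hypothetical node type holding one value and a link to the next node.
    private static class Node<E> {
        E data;
        Node<E> next;

        Node(E data) {
            this.data = data;
        }
    }

    private Node<E> front;
    private Node<E> back;
    private int size;

    public void enqueue(E value) {
        Node<E> node = new Node<E>(value);
        if (back == null) {
            front = node;     // queue was empty
        } else {
            back.next = node; // link after the current last node
        }
        back = node;
        size++;
    }

    public E dequeue() {
        if (front == null) {
            throw new IllegalStateException("dequeue on empty queue");
        }
        E value = front.data;
        front = front.next;
        if (front == null) {
            back = null;      // queue is empty again
        }
        size--;
        return value;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public int size() {
        return size;
    }
}

class QueueDemo {
    public static void main(String[] args) {
        // Interface type on the left, class name only after "new".
        Queue<String> line = new LinkedQueue<String>();
        line.enqueue("a");
        line.enqueue("b");
        line.enqueue("c");
        while (!line.isEmpty()) {
            System.out.print(line.dequeue() + " "); // values come out as: a b c
        }
        System.out.println();
    }
}
```

Unlike the stack, the queue preserves arrival order, matching the grocery store line analogy.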

I did not have time to review the example code from handout #12, but I mentioned that we'd be reviewing stacks and queues in section.

Below are links to the interface files and implementation files for stacks and queues:

Stack.java, the Stack interface
Queue.java, the Queue interface
ArrayStack.java, one Stack implementation (we could imagine defining many such implementations)
LinkedQueue.java, one Queue implementation (we could imagine defining many such implementations)

Stuart Reges

Last modified: Tue Jan 31 17:34:03 PST 2006