CSE143 Notes for Friday, 1/8/18

I began by discussing the concept of interfaces. I started with a nonprogramming example. I mentioned that you can plug in many kinds of devices to a modern TV including a cable box, a DVD player, a VCR, a game console, etc. It doesn't make sense to have specific plugs for each kind of device. Instead these different devices are usually each HDMI-compliant source devices. This is an example of the interface concept in use. In fact, the "I" of HDMI stands for "Interface" (High-Definition Multimedia Interface). This provides a good example of where being less specific makes something more flexible. Instead of having a place to plug in a cable box and only a cable box, it is better to have a place to plug in any HDMI-compliant device (anything that satisfies the interface).

Then I used jGRASP to look at a programming example. I pointed out that I had available to us the ArrayIntList class that we talked about in the first week of the quarter. Soon we will discuss a different way of implementing such a list using a linked structure that we will refer to as a LinkedIntList. For now the only important thing to understand is that it is a second way of implementing a list. I purposely designed the two classes to have very similar methods. They each have:

The point is that these classes are similar in terms of what they can do by they are very different in how they do it.

To underscore the similarity, I wrote the following client code that does parallel operations on two different lists, adding three values, removing one, and printing it before and after the remove:

        public class ListClient {
            public static void main(String[] args) {
                ArrayIntList list1 = new ArrayIntList();
                list1.add(18);
                list1.add(27);
		list1.add(93);
		System.out.println(list1);
		list1.remove(1);
                System.out.println(list1);

                LinkedIntList list2 = new LinkedIntList();
                list2.add(18);
                list2.add(27);
		list2.add(93);
		System.out.println(list2);
		list2.remove(1);
                System.out.println(list2);
            }
        }
The program produced the following output:

        [18, 27, 93]
        [18, 93]
        [18, 27, 93]
        [18, 93]
As expected, the two kinds of list behave the same way. I pointed out that in CSE142 we tried to emphasize the idea that you shouldn't have redundant code like this. So ideally we'd like to move this code into a method. As a first attempt, I said:

        public class ListClient {
            public static void main(String[] args) {
                ArrayIntList list1 = new ArrayIntList();
                processList(list1);

                LinkedIntList list2 = new LinkedIntList();
                processList(list2);
            }

            public static void processList(ArrayIntList list) {
                list.add(18);
                list.add(27);
		list.add(93);
		System.out.println(list);
		list.remove(1);
                System.out.println(list);
            }
        }
This is obviously a better way to write this program, but it didn't compile. jGRASP highlighted the second call on processList and said:

        File: /Users/reges/143/ListClient.java  [line: 4]
        Error: processList(ArrayIntList) in ListClient cannot be applied to (LinkedIntList)
The error indicates that the method processList takes an ArrayIntList as an argument and that it cannot be applied to a call that passes a LinkedIntList as a parameter. So I tried changing the parameter type to LinkedIntList and then it produced an error for the other call.

The point is that we want to be able to think of these lists as being the same thing. In computer science we try to use abstraction to find what is common between these two classes even though we recognize that there are things that are quite different about the two. We would imagine an "integer list" abstraction of which these are two possible implementations. They're both the same in the sense that they provide basic "integer list" functionality like an appending add. But they are different in the sense that they are implemented quite differently (one using an array and the other using a linked list).

With Java, we have actual language support for this concept. Not only can we talk abstractly about an "integer list" abstraction, we can actually define it using what is known in Java as an "interface".

In an interface, we want to specify that certain behaviors exist without saying how they are implemented. We want to specify the "what" part without specifying the "how" part. So I went to the ArrayIntList class and deleted all of its comments and all of the method bodies. That left me with this:

        public class ArrayIntList {
            public int size()
            public int get(int index)
            public String toString()
            public int indexOf(int value)
            public void add(int value)
            public void add(int index, int value)
            public void remove(int index)
        }
To turn this into an interface, I had to change the name to something new. I decided to call it IntList. I also had to change the word "class" to "interface". The method have no curly braces because I deleted those lines of code. I replaced each of those with a semicolon. That left me with:

        public interface IntList {
            public int size();
            public int get(int index);
            public String toString();
            public int indexOf(int value);
            public void add(int value);
            public void add(int index, int value);
            public void remove(int index);
        }
This is how you define an interface. In place of the method bodies, we have a semicolon to indicate, "The implementation isn't given." You can think of an interface as being like a hollow radio. It has all of the knobs and buttons that are used to control the radio, but it has none of the "innards" that make it work.

So I went back to our code and changed the header for the processList method to use the interface instead:

        public static void processList(IntList list) {
Unfortunately, this led to two errors. Both of the calls now failed with messages like this:

        File: /Users/reges/143/ListClient.java  [line: 4]
        Error: processList(IntList) in ListClient cannot be applied to (ArrayIntList)
That seems a bit odd, because both ArrayIntList and LinkedIntList have the methods mentioned in the IntList interface. The explanation is that Java requires classes to explicitly state what interfaces they implement. So we had to modify the two classes to include this notation:

        public class ArrayIntList implements IntList {
             ...
        }
        
        public class LinkedIntList implements IntList {
             ...
        }
With this change, the code compiled and executed properly.

I then tried creating an instance of the IntList class:

        IntList list = new IntList();
This produced an error. Interfaces cannot be instantiated because they are incomplete.

I then asked people to think about the types of these objects. Consider our variable list1 from main:

        ArrayIntList list1 = new ArrayIntList();
This describes an object of type ArrayIntList, but it also describes an object of type IntList. That's why the method call works. The object is of more than one type. This idea is important in object oriented programming because Java objects can typically fill many roles. This is related to the notion of type because each role is defined by a new type. Interfaces are a way to define a new role, a new type. By saying that ArrayIntList and LinkedIntList both implement the IntList interface, we say that they both fill that role. So given an ArrayIntList, we can say that it is of type ArrayIntList, but we can also say that it is of type IntList. Similarly, a LinkedIntList is of type LinkedIntList and of type IntList.

As I have mentioned before, it is a good idea to use interfaces when defining the types of variables. I changed our main method to use type IntList for the variables instead of listing the individual types:

        public static void main(String[] args) {
            IntList list1 = new ArrayIntList();
            processList(list1);

            IntList list2 = new LinkedIntList();
            processList(list2);
        }
These variables are more flexible than the old variables. The variable list1, for example, can now refer to any IntList object, not just one of type ArrayIntList. In fact, list1 can even store a reference to other kinds of objects, as long as they implement the IntList interface.

Then I discussed the built-in ArrayList class. Remember that we're studying the ArrayIntList class as a way to understand the built-in ArrayList class. We know that for arrays, it is possible to construct arrays that store different types of data:

But arrays are a special case in Java. If they didn't already exist, we couldn't easily add them to Java. Instead, Java now allows you to declare generic classes and generic interfaces. For example, the ArrayList class is similar to an array. Instead of declaring ordinary ArrayList objects, we declare ArrayList<E> where E is some type (think of E as being short for "Element type"). The "E" is a type parameter that can be filled in with the name of any class.

For example, suppose, we want an ArrayList of Strings. We describe the type as:

        ArrayList<String>
When we construct an ArrayIntList, we say:
        ArrayIntList list = new ArrayIntList();
Imagine replacing both occurrences of "ArrayIntList" with "ArrayList<String>" and you'll see how to construct an ArrayList<String>:
        ArrayList<String> list = new ArrayList<String>();
Once you have declared an ArrayList<String>, you can manipulate it with the kinds of calls we have made on our ArrayIntList but using Strings instead of ints:

        ArrayList<String> list = new ArrayList<String>();
        list.add("four");
        list.add("score");
        list.add("seven");
        list.add("years");
        list.add("what was next?");
        list.add("ago");
        System.out.println("list = " + list);
which produces this output:

        list = [four, score, seven, years, what was next?, ago]
Obviously this isn't the correct quote. We could fix the individual calls on the appending add method, but it was a more interesting exercise to explore how to fix the list after it is constructed.

The first problem is that there is a missing word. The word "and" should appear between "score" and "seven". We want to include it at index 2 in the structure, so we used this line of code to do so:

        list.add(2, "and");
The other problem is that it includes the string "what was next?". This was originally at index 4 of the list, but now that we have inserted a new value, it has been shifted over so that it appears at index 5. So we added this line of code:

        list.remove(5);
With those two lines of code added after constructing the list, the output is now correct:

        list = [four, score, and, seven, years, ago]
All of the methods we have seen with ArrayIntList are defined for ArrayList: the appending add, add at an index, remove, size, get, etc. I said that I wanted to explore all of the ways to traverse the list, printing the values one per line. The most basic is to use the standard traversal loop for a list that uses calls on the size and get methods:

        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }
I mentioned that another way to do this is with a foreach loop. Chapter 7 describes how to use the foreach loop for arrays and chapter 10 describes how to use it for an ArrayList. It is the best structure to use when you are performing a "read only" operation on a list that doesn't require keeping track of the index.

The foreach loop syntax is fairly simple:

        for (String s : list) {
            System.out.println(s);
        }
We read this loop as, "for each string s that is in list..." The choice of "s" is arbitrary. It defines a local variable for the loop. I could just as easily have called it "x" or "foo" or "value". Each time through the loop Java sets the variable s to the next value from the list. You don't have to test for the size of the list or to use a call on the get method. Java does that for you when you use a for-each loop.

There are some limitations of for-each loops. You can't use them to change the contents of the list. If you assign a value the variable s, you are just changing a local variable inside the loop. It has no effect on the list itself.

Then we explored a different approach to traversing a list using what is known as an iterator. In Java there are three fundamental operations that an iterator performs:

We wrote the following code to use an iterator for printing our list elements:

        Iterator<String> i = list.iterator();
        while (i.hasNext()) {
            System.out.println(i.next());
        }
Then I discussed the idea of interfaces. An interface is a description of a set of behaviors. For example, all of the behaviors we have just discussed that are included in the ArrayList<E> class are also included in an interface known as List<E>. The List<E> interface says that a list has to have an appending add method, a method to add at an index, a method to remove at an index, a get method, a size method, an indexOf method, and so on.

But there are different ways to implement a list. We have been looking at how to implement it using an array. Next week we will look at how to implement it using something called a linked list. To make your programs flexible, you should declare your variables, parameters, fields, and method return types using interfaces. So instead of saying:

        ArrayList<String> list = new ArrayList<String>();
you should instead say:
        List<String> list = new ArrayList<String>();
With this declaration, the variable list is more flexible. It can store a reference to any list, not just an ArrayList. I mentioned that the best analogy I have for interfaces is that they are similar to how we use the concept of certification. You can't claim to be a certified doctor unless you have been trained to do certain specific tasks. Similarly, to be a certified teacher you have to know how to behave like a teacher, to be a certified nurse you have to know how to behave like a nurse, and so on. In Java, if you want to claim to be a certified List<E>, then you have to have several different methods. I then mentioned that this is an idea that has been used throughout the collections classes in Java (the java.util package). This idea is stressed by Joshua Bloch, the author of Effective Java. Joshua Bloch was the primary architect of the collections framework and has influenced much of Sun's work.

In the collections framework, Bloch was careful to define data structure abstractions with interfaces. For example, there are interfaces for List, Set and Map which are abstractions that we'll be discussing this week. class web page (under "useful links"). His item 34 is to "Refer to objects by their interfaces." He says, "you should favor the use of interfaces rather than classes to refer to objects. If appropriate interface types exist, parameters, return values, variables and fields should all be declared using interface types." This last sentence was in bold face in the book, indicating how important Bloch thinks this is, and I've reproduced that here. He goes on to say that, "The only time you really need to refer to an object's class is when you're creating it."

For now, this will mostly be a style issue for us. In a few weeks we will look at how interfaces are actually defined. For now, just realize that we will require you to use interface types for defining variables, fields, parameters, and method return types.

Then I then spent a little time discussing the issue of primitive data versus objects. Even though we can construct an ArrayList<E> for any class E, we can't construct an ArrayList<int> because int is a primitive type, not a class. To get around this problem, Java has a set of classes that are known as "wrapper" classes that "wrap up" primitive values like ints to make them an object. It's very much like taking a candy and putting a wrapper around it. For the case of ints, there is a class known as Integer that can be used to store an individual int. Each Integer object has a single data field: the int that it wrapped up inside.

Java also has quite a bit of support that makes a lot of this invisible to programmers. If you want to put int values into an ArrayList, you have to remember to use the type ArrayList<Integer> rather than ArrayList<int>, but otherwise Java does a lot of things for you. For example, you can construct such a list and add simple int values to it:

        List<Integer> numbers1 = new ArrayList<Integer>();
        numbers.add(18);
        numbers.add(34);
In the two calls on add, we are passing simple ints as arguments to something that really requires an Integer. This is okay because Java will automatically "box" the ints for us (i.e., wrap them up in Integer objects). We can also refer to elements of this list and treat them as simple ints, as in:

        int product = numbers.get(0) * numbers.get(1);
The calls on list.get return references to Integer objects and normally you wouldn't be allowed to multiply two objects together. In this case Java automatically "unboxes" the values for you, unwrapping the Integer objects and giving you the ints that are contained inside.

Every primitive type has a corresponding wrapper class: Integer for int, Double for double, Character for char, Boolean for boolean, and so on.

Then I mentioned that we will be looking at a kind of structure known as a Set. There is an interface that defines the behaviors of a set known as Set<E>. For now, all of the sets we will construct all of our sets using the TreeSet<E> class. For example, I used an array of data to initialize both a list and a set by adding values from the array to each:

        int[] data = {18, 4, 97, 3, 4, 18, 72, 4, 42, 42, -3};
        List<Integer> numbers1 = new ArrayList<Integer>();
        Set<Integer> numbers2 = new TreeSet<Integer>();

        for (int n : data) {
            numbers1.add(n);
            numbers2.add(n);
        }
        System.out.println("numbers1 = " + numbers1);
        System.out.println("numbers2 = " + numbers2);
This produced the following output:

        numbers1 = [18, 4, 97, 3, 4, 18, 72, 4, 42, 42, -3]
        numbers2 = [-3, 3, 4, 18, 42, 72, 97]
There are two major differences between a set and a list. Sets don't allow duplicates. So the duplicate values like 42 and 4 in the array appear just once in the set. Sets also don't allow the client to control the order of elements. The TreeSet class keeps things in sorted order. So the numbers will always be in that order. If you want to control the order, then you should use a list instead.

Sets have many of the same methods that lists do. You can add to a set, get its size, ask for an iterator, use it with a foreach loop. But it doesn't have a notion of indexing. So you can't remove at an index. Instead you remove a specific value. For example, we wrote this code to remove the value 42 from the set:

        numbers2.remove(42);
        System.out.println("numbers2 = " + numbers2);
After executing this line of code, the set no longer had 42 in it:

        numbers2 = [-3, 3, 4, 18, 72, 97]
If you don't know exactly what values you want to remove from a set, you typically use an iterator to do the removal. We began by writing this code as an attempt to remove all of the multiples of 3 from the set:

        Iterator<Integer> i2 = numbers2.iterator();
        while (i2.hasNext()) {
            int n = i2.next();
            if (n % 3 == 0) {
                numbers2.remove(n);
            }
        }
This code doesn't work. It throws a ConcurrentModificationException. Java has a rule that you can't call a mutating method on a collection while you are iterating over it. You can potentially talk to two different objects: the set or the iterator. What Java doesn't want you to do is to ask the set to change its contents while you are also talking to an iterator.

The solution is to ask the iterator to do the removal so that all of your communication is with that one object:

        Iterator<Integer> i2 = numbers2.iterator();
        while (i2.hasNext()) {
            int n = i2.next();
            if (n % 3 == 0) {
                i2.remove();
            }
        }
        System.out.println("numbers2 = " + numbers2);
This code worked and produced the following output:
        now numbers2 = [4, 97]
This is the approach you need to take when you want to both examine and remove values from a set. Because you are not allowed to alter a set while you are iterating over it, you also can't modify it with a foreach loop. That is why the foreach loop is appropriate only if you are doing a "read only" operation.

An iterator is what we would call a lightweight object. You can use the iterator to gain access to everything in the structure, but it doesn't store the data itself. I gave the analogy that this is like going to a pharmacy and you'd really like to just jump over the counter and grab your prescription, but instead you have to talk to the a person behind the counter. The person behind the counter has access to everything in the pharmacy. But that person is not the pharmacy. The person has access to the pharmacy and you (the client) talk to the person behind the counter to get things done. That's how an iterator works. It has full access to the underlying structure and it keeps track of how much of the structure it has traversed, but that's not the same thing as being the structure.

I said that this would be much clearer in section when we practice writing code that manipulates sets. Chapter 13 also has a useful table of set operations.


Stuart Reges
Last modified: Mon Jan 8 16:59:22 PST 2018