CSE143X Notes for Monday, 10/29/18

I began by talking about the LinkedIntList implementation that was discussed in section along with a class called ArrayIntList. In the 143 class, we spend a lot more time developing the ArrayIntList class and it is the subject of chapter 15 of the textbook. We don't have time to do the full version, but I pointed out some important things to notice about the class.

It stores a list of integers by keeping track of an array of ints along with a field that stores the size. Any given ArrayIntList, therefore, has a capacity (how many ints it can store) versus a current size (how many ints it currently stores). This is a little like how a hotel has many rooms and often has vacancies.

The LinkedIntList class mentions various preconditions that must be satisfied before a method is called. As we move into the material from 143, it is important to understand the idea of documenting for a client of a class. We think of this as the contract for the class. If a method is making some kind of assumption, then that needs to be documented as a precondition.

The ArrayIntList class goes further than just documenting the preconditions. It throws an exception if the precondition is violated, as in this code at the beginning of the constructor that takes a capacity:

        // pre : capacity >= 0 (throws IllegalArgumentException if not)
        // post: constructs an empty list with the given capacity
        public ArrayIntList(int capacity) {
            if (capacity < 0) {
                throw new IllegalArgumentException("capacity: " + capacity);
            }
            elementData = new int[capacity];
            size = 0;
        }

The string passed to the IllegalArgumentException constructor is optional. If you have useful information to provide, however, it is helpful to the client to pass that along. In this case, we show the client the illegal capacity that was passed to the constructor.

Going forward, we will expect you to comment your classes in a much more detailed manner. In particular, we will expect you to document all of the following:

all preconditions (assumptions)
all exceptions that are thrown and a description of when it is thrown and exactly what type of exception is thrown (a client needs to know the type in order to "catch" an exception)
the meaning of all parameters passed to the method
a description of all meaningful behavior that a client will want to know about, including the postconditions (what action is performed and/or what value is returned)

Then I turned to the topic of interfaces. I used jGRASP so that we could play with some code. The LinkedIntList and ArrayIntList classes have very similar methods. They each have:

a size method
a get method
a toString method
an indexOf method
a one-argument add method (the appending add)
a two-argument add method (add at an index)
and a remove method (remove at an index)

This isn't an accident. I purposely added this minimal set of operations to each class. The point is that these classes are similar in terms of what they can do by they are very different in how they do it.

To underscore the similarity, I wrote the following client code that does parallel operations on two different lists, adding three values, removing one, and printing it before and after the remove:

        public class ListClient {
            public static void main(String[] args) {
                ArrayIntList list1 = new ArrayIntList();
                list1.add(18);
                list1.add(27);
		list1.add(93);
		System.out.println(list1);
		list1.remove(1);
                System.out.println(list1);

                LinkedIntList list2 = new LinkedIntList();
                list2.add(18);
                list2.add(27);
		list2.add(93);
		System.out.println(list2);
		list2.remove(1);
                System.out.println(list2);
            }
        }

The program produced the following output:

        [18, 27, 93]
        [18, 93]
        [18, 27, 93]
        [18, 93]

As expected, the two kinds of list behave the same way. I pointed out that in CSE142 we tried to emphasize the idea that you shouldn't have redundant code like this. So ideally we'd like to move this code into a method. As a first attempt, I said:

        public class ListClient {
            public static void main(String[] args) {
                ArrayIntList list1 = new ArrayIntList();
                processList(list1);

                LinkedIntList list2 = new LinkedIntList();
                processList(list2);
            }

            public static void processList(ArrayIntList list) {
                list.add(18);
                list.add(27);
		list.add(93);
		System.out.println(list);
		list.remove(1);
                System.out.println(list);
            }
        }

This is obviously a better way to write this program, but it didn't compile. jGRASP highlighted the second call on processList and said:

        File: /Users/reges/143X/ListClient.java  [line: 4]
        Error: processList(ArrayIntList) in ListClient cannot be applied to (LinkedIntList)

The error indicates that the method processList takes an ArrayIntList as an argument and that it cannot be applied to a call that passes a LinkedIntList as a parameter. So I tried changing the parameter type to LinkedIntList and then it produced an error for the other call.

The point is that we want to be able to think of these lists as being the same thing. In computer science we try to use abstraction to find what is common between these two classes even though we recognize that there are things that are quite different about the two. We would imagine an "integer list" abstraction of which these are two possible implementations. They're both the same in the sense that they provide basic "integer list" functionality like an appending add. But they are different in the sense that they are implemented quite differently (one using an array and the other using a linked list).

With Java, we have actual language support for this concept. Not only can we talk abstractly about an "integer list" abstraction, we can actually define it using what is known in Java as an "interface".

In an interface, we want to specify that certain behaviors exist without saying how they are implemented. We want to specify the "what" part without specifying the "how" part. So I went to the ArrayIntList class and deleted all of its comments and all of the method bodies. That left me with this:

        public class ArrayIntList {
            public int size()
            public int get(int index)
            public String toString()
            public int indexOf(int value)
            public void add(int value)
            public void add(int index, int value)
            public void remove(int index)
        }

To turn this into an interface, I had to change the name to something new. I decided to call it IntList. I also had to change the word "class" to "interface". The method have no curly braces because I deleted those lines of code. I replaced each of those with a semicolon. That left me with:

        public interface IntList {
            public int size();
            public int get(int index);
            public String toString();
            public int indexOf(int value);
            public void add(int value);
            public void add(int index, int value);
            public void remove(int index);
        }

This is how you define an interface. In place of the method bodies, we have a semicolon to indicate, "The implementation isn't given." You can think of an interface as being like a hollow radio. It has all of the knobs and buttons that are used to control the radio, but it has none of the "innards" that make it work.

So I went back to our code and changed the header for the processList method to use the interface instead:

        public static void processList(IntList list) {

Unfortunately, this led to two errors. Both of the calls now failed with messages like this:

        File: /Users/reges/143X/ListClient.java  [line: 4]
        Error: processList(IntList) in ListClient cannot be applied to (ArrayIntList)

That seems a bit odd, because both ArrayIntList and LinkedIntList have the methods mentioned in the IntList interface. The explanation is that Java requires classes to explicitly state what interfaces they implement. So we had to modify the two classes to include this notation:

        public class ArrayIntList implements IntList {
             ...
        }
        
        public class LinkedIntList implements IntList {
             ...
        }

With this change, the code compiled and executed properly.

I then tried creating an instance of the IntList class:

        IntList list = new IntList();

This produced an error. Interfaces cannot be instantiated because they are incomplete.

I then asked people to think about the types of these objects. Consider our variable list1 from main:

        ArrayIntList list1 = new ArrayIntList();

This describes an object of type ArrayIntList, but it also describes an object of type IntList. That's why the method call works. The object is of more than one type. This idea is important in object oriented programming because Java objects can typically fill many roles. This is related to the notion of type because each role is defined by a new type. Interfaces are a way to define a new role, a new type. By saying that ArrayIntList and LinkedIntList both implement the IntList interface, we say that they both fill that role. So given an ArrayIntList, we can say that it is of type ArrayIntList, but we can also say that it is of type IntList. Similarly, a LinkedIntList is of type LinkedIntList and of type IntList.

It is a good idea to use interfaces when defining the types of variables. I changed our main method to use type intList for the variables instead of listing the individual types:

        public static void main(String[] args) {
            IntList list1 = new ArrayIntList();
            processList(list1);

            IntList list2 = new LinkedIntList();
            processList(list2);
        }

These variables are more flexible than the old variables. The variable list1, for example, can now refer to any IntList object, not just one of type ArrayIntList. In fact, list1 can even store a reference to other kinds of objects, as long as they implement the IntList interface.

I then mentioned that this is an idea that has been used throughout the collections classes in Java (the java.util package). This idea is stressed by Joshua Bloch, the author of a book called Effective Java, which I said was one of the most useful books I've read about Java. Joshua Bloch was the primary architect of the collections framework and has influenced much of Sun's work. He now works at Google.

In the collections framework, Bloch was careful to define data structure abstractions with interfaces. For example, there are interfaces for List, Set and Map which are abstractions that we'll be discussing in the coming weeks. His item 64 is, "Refer to objects by their interfaces," which begins by saying:

You should favor the use of interfaces over classes to refer to objects. If appropriate interface types exist, then parameters, return values, variables and fields should all be declared using interface types. The only time you really need to refer to an object's class is when you're creating it with a constructor.

The middle sentence was in bold face in the book, indicating how important Bloch thinks this is, and I've reproduced that here.

We have been using this concept for many years in programming. It is traditionally referred to as an Abstract Data Type or ADT:

        A bstract
        D ata
        T ype

The List, Set and Map abstractions from the collections framework are all ADTs.

I also mentioned that the best analogy I have for interfaces is that they are similar to how we use the concept of certification. You can't claim to be a certified doctor unless you have been trained to do certain specific tasks. Similarly, to be a certified teacher you have to know how to behave like a teacher, to be a certified nurse you have to know how to behave like a nurse, and so on. In Java, if you want to claim to be a certified IntList object, then you have to have several different methods, including an appending add method . Java classes are allowed to implement as many interfaces as they want to, just as a single person might be certified to do several jobs (e.g., both a certified doctor and a certified lawyer). When there are more than one, you list them separated by commas after the word "implements".

Then I discussed the built-in ArrayList class. Remember that we're studying the LinkedIntList and ArrayIntList class as a way to understand the built-in LinkedList and ArrayList classes. We know that for arrays, it is possible to construct arrays that store different types of data:

an int[] to store an array of int values
a double[] to store an array of double values
a String[] to store an array of references to Strings

But arrays are a special case in Java. If they didn't already exist, we couldn't easily add them to Java. Instead, Java now allows you to declare generic classes and generic interfaces. For example, the ArrayList class is similar to an array. Instead of declaring ordinary ArrayList objects, we declare ArrayList<E> where E is some type (think of E as being short for "Element type"). The "E" is a type parameter that can be filled in with the name of any class.

For example, suppose, we want an ArrayList of Strings. We describe the type as:

        ArrayList<String>

When we construct an ArrayIntList, we say:

        ArrayIntList list = new ArrayIntList();

Imagine replacing both occurrences of "ArrayIntList" with "ArrayList<String>" and you'll see how to construct an ArrayList<String>:

        ArrayList<String> list = new ArrayList<String>();

But remember that we want to use interface types whenever possible. The Java collections framework includes a List<E> interface that is like our IntList interface. So we can declare this variable using that interface:

        List<String> list = new ArrayList<String>();

Once you have declared an ArrayList<String>, you can manipulate it with the kinds of calls we have made on our ArrayIntList but using Strings instead of ints:

        ArrayList<String> list = new ArrayList<String>();
        list.add("four");
        list.add("score");
        list.add("seven");
        list.add("years");
        list.add("what was next?");
        list.add("ago");
        System.out.println("list = " + list);

which produces this output:

        list = [four, score, seven, years, what was next?, ago]

Obviously this isn't the correct quote. We could fix the individual calls on the appending add method, but it was a more interesting exercise to explore how to fix the list after it is constructed.

The first problem is that there is a missing word. The word "and" should appear between "score" and "seven". We want to include it at index 2 in the structure, so we used this line of code to do so:

        list.add(2, "and");

The other problem is that it includes the string "what was next?". This was originally at index 4 of the list, but now that we have inserted a new value, it has been shifted over so that it appears at index 5. So we added this line of code:

        list.remove(5);

With those two lines of code added after constructing the list, the output is now correct:

        list = [four, score, and, seven, years, ago]

Then I said that I wanted to explore all of the ways to traverse the list, printing the values one per line. The most basic is to use the standard traversal loop for a list that uses calls on the size and get methods:

        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }

I mentioned that another way to do this is with a foreach loop. Chapter 7 describes how to use the foreach loop for arrays and chapter 10 describes how to use it for an ArrayList. It is the best structure to use when you are performing a "read only" operation on a list that doesn't require keeping track of the index.

The foreach loop syntax is fairly simple:

        for (String s : list) {
            System.out.println(s);
        }

We read this loop as, "for each string s that is in list..." The choice of "s" is arbitrary. It defines a local variable for the loop. I could just as easily have called it "x" or "foo" or "value". Each time through the loop Java sets the variable s to the next value from the list. You don't have to test for the size of the list or to use a call on the get method. Java does that for you when you use a for-each loop.

There are some limitations of for-each loops. You can't use them to change the contents of the list. If you assign a value the variable s, you are just changing a local variable inside the loop. It has no effect on the list itself.

Then we explored a different approach to traversing a list using what is known as an iterator. In Java there are three fundamental operations that an iterator performs:

a "hasNext" method that tells you whether or not there are any values left
a "next" method that returns the next value and advances to the one beyond
a "remove" method that removes from the structure the value that was most recently returned by a call on "next"

We wrote the following code to use an iterator for printing our list elements:

        Iterator<String> itr = list.iterator();
        while (itr.hasNext()) {
            String s = itr.next();
            System.out.println(s);
        }

Then I then spent a little time discussing the issue of primitive data versus objects. Even though we can construct an ArrayList<E> for any class E, we can't construct an ArrayList<int> because int is a primitive type, not a class. To get around this problem, Java has a set of classes that are known as "wrapper" classes that "wrap up" primitive values like ints to make them an object. It's very much like taking a candy and putting a wrapper around it. For the case of ints, there is a class known as Integer that can be used to store an individual int. Each Integer object has a single data field: the int that it wrapped up inside.

Java also has quite a bit of support that makes a lot of this invisible to programmers. If you want to put int values into an ArrayList, you have to remember to use the type ArrayList<Integer> rather than ArrayList<int>, but otherwise Java does a lot of things for you. For example, you can construct such a list and add simple int values to it:

        List<Integer> numbers1 = new ArrayList<Integer>();
        numbers.add(18);
        numbers.add(34);

In the two calls on add, we are passing simple ints as arguments to something that really requires an Integer. This is okay because Java will automatically "box" the ints for us (i.e., wrap them up in Integer objects). We can also refer to elements of this list and treat them as simple ints, as in:

        int product = numbers.get(0) * numbers.get(1);

The calls on list.get return references to Integer objects and normally you wouldn't be allowed to multiply two objects together. In this case Java automatically "unboxes" the values for you, unwrapping the Integer objects and giving you the ints that are contained inside.

Every primitive type has a corresponding wrapper class: Integer for int, Double for double, Character for char, Boolean for boolean, and so on.

Then I mentioned that we will be looking at a kind of structure known as a Set. There is an interface that defines the behaviors of a set known as Set<E>. For now, all of the sets we will construct all of our sets using the TreeSet<E> class. For example, I used an array of data to initialize both a list and a set by adding values from the array to each:

        int[] data = {18, 4, 97, 3, 4, 18, 72, 4, 42, 42, -3};
        List<Integer> numbers1 = new ArrayList<Integer>();
        Set<Integer> numbers2 = new TreeSet<Integer>();

        for (int n : data) {
            numbers1.add(n);
            numbers2.add(n);
        }
        System.out.println("numbers1 = " + numbers1);
        System.out.println("numbers2 = " + numbers2);

This produced the following output:

        numbers1 = [18, 4, 97, 3, 4, 18, 72, 4, 42, 42, -3]
        numbers2 = [-3, 3, 4, 18, 42, 72, 97]

There are two major differences between a set and a list. Sets don't allow duplicates. So the duplicate values like 42 and 4 in the array appear just once in the set. Sets also don't allow the client to control the order of elements. The TreeSet class keeps things in sorted order. So the numbers will always be in that order. If you want to control the order, then you should use a list instead.

Stuart Reges

Last modified: Mon Oct 29 17:55:15 PDT 2018