CSE143 Notes for Friday, 1/5/18

I spent a few minutes reviewing some basic terminology of object-oriented programming. I mentioned that the following is a good summary of the basic object idea:

An object encapsulates state and behavior.
The terms state and behavior are the technical terms we use, but they are the same ideas people had mentioned. For example, for a radio, the states would include on/off, volume setting, station setting, am/fm and so on. The behavior is that it plays music and that it allows us to change these settings. In programming, state usually means variables (data) and behavior usually means methods.

Then we continued our discussion of the ArrayIntList class. We began by discussing some issues related to the toString method. The version we discussed in section had the following method:

        public String toString() {
            if (size == 0) {
                return "[]";
            } else {
                String result = "[" + elementData[0];
                for (int i = 1; i < size; i++) {
                    result += ", " + elementData[i];
                }
                result += "]";
                return result;
            }
        }
I changed the name of the method to ToString (different capitalization--this is actually what C# uses). Then I added the following lines of code to the client program:

        System.out.println("list1 = " + list1.ToString());
        System.out.println("list2 = " + list2.ToString());
This worked fairly well. The call on the ToString method showed the list contents properly, and we got these lines of output:
        list1 = [1, 82, 97]
        list2 = [7, -8]
I then said let's just see what happens when we include a simple println statement for each list. So at the end of the client code, we added these two lines of code:

        System.out.println("list1 = " + list1);
        System.out.println("list2 = " + list2);
This did not produce good output. It produced output that included the name of the class, an "@" and a hexadecimal number, which is what the default toString method that every class inherits from Object returns. But when we changed the name of our method back to toString, we found that the two lines of code above produced exactly the same output as these two lines of code:

        System.out.println("list1 = " + list1.toString());
        System.out.println("list2 = " + list2.toString());
You can explicitly call the toString method if you want to, but if you don't, Java will call it for you whenever it needs to turn the object into a String, making an implicit call on toString.
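
As a minimal sketch of the implicit call, the example below uses two hypothetical classes, Labeled and Unlabeled, that are not part of the lecture code. It assumes nothing beyond standard Java: a class that defines toString gets it called automatically during string concatenation, and a class that does not define it falls back to the version inherited from Object.

```java
// Hypothetical classes for illustration only; not part of ArrayIntList.
class Labeled {
    private final String label;

    Labeled(String label) {
        this.label = label;
    }

    // Overrides Object.toString, so Java uses it for implicit conversions.
    @Override
    public String toString() {
        return "[" + label + "]";
    }
}

// No toString is defined here, so the inherited Object.toString is used.
class Unlabeled {
}

class ToStringDemo {
    public static void main(String[] args) {
        Labeled a = new Labeled("x");
        System.out.println("explicit: " + a.toString());  // prints "explicit: [x]"
        System.out.println("implicit: " + a);             // prints "implicit: [x]"
        // Prints the class name, an "@" and a hexadecimal number.
        System.out.println("default:  " + new Unlabeled());
    }
}
```

The @Override annotation is optional and does not appear in the lecture code. With it, the compiler would reject a misspelled name such as ToString instead of quietly treating it as a brand new method.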

Then we discussed the idea of encapsulation. When you buy a radio or other appliance at Best Buy, you'll find that all of the electronics are inside of a plastic or metal case. We would say that the electronics are encapsulated inside this case. You can't see them or touch them from the outside. In fact, if you flip the device over, you're likely to find a metal plate with screws that can be removed, but it often comes with a warning along the lines of, "Do not remove. You will void your warranty if you remove this."

Why the warning? Someone said that they don't want you to damage the electronics and that is exactly right. So is there something analogous in the ArrayIntList class we have been writing? What might the client do to damage the object? Someone said that they could set the size to a huge number or a negative number:

        list1.size = 10000;
        list2.size = -384;
We can prevent this kind of interference by changing the fields of the class to be private. Currently they are declared this way:

    int[] elementData;
    int size;
We added the word private to each:

    private int[] elementData;
    private int size;
Then when we tried to compile the client code that changed size, we got an error message indicating that the size field is private and cannot be changed. Private fields cannot be accessed, let alone modified, from outside of the class. This allows us to guarantee that a client can never put our object into a corrupt state.

I mentioned that Joshua Bloch, who was the chief architect of the Java Collections Classes, emphasizes this in his book Effective Java (I have a link to it under "useful links"). He says:

The single most important factor that distinguishes a well-designed module from a poorly designed one is the degree to which the module hides its internal data and other implementation details from other modules. A well-designed module hides all of its implementation details, cleanly separating its API from its implementation. Modules then communicate only through their APIs and are oblivious to each others' inner workings. This concept, known as information hiding or encapsulation, is one of the fundamental tenets of software design.

Encapsulation is a good thing, but it seems like we want to allow clients to check the size of a list. To do so, we introduce a size method that the client can call to ask for the current list size:

        public int size() {
            return size;
        }
This allows the client to write lines of code like this:
        System.out.println("list1 size = " + list1.size());
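
Putting the private fields and the size method together, here is a stripped-down stand-in for the class (a sketch, not the full version from lecture). The body of add is an assumption, since it is not shown in these notes, and it omits any capacity check.

```java
// Stripped-down stand-in for ArrayIntList: just enough to show private
// fields combined with a read-only accessor.
class ArrayIntList {
    private int[] elementData;
    private int size;

    public ArrayIntList() {
        elementData = new int[100];
        size = 0;
    }

    // Assumed body for add; capacity checks are omitted in this sketch.
    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    // Lets the client read the size without being able to change it.
    public int size() {
        return size;
    }
}

class SizeDemo {
    public static void main(String[] args) {
        ArrayIntList list1 = new ArrayIntList();
        list1.add(1);
        list1.add(82);
        list1.add(97);
        System.out.println("list1 size = " + list1.size());  // prints "list1 size = 3"
        // list1.size = 10000;  // would not compile: size is private
    }
}
```
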
I asked people whether there were other methods we probably want to supply that would allow a client to access the list. Someone mentioned that there should be a method to look at the individual values in the list. This is a method called "get" and it takes an index as a parameter, returning the integer at that index:

        public int get(int index) {
            return elementData[index];
        }
Someone asked what happens when a client calls the get method with an illegal value for index. So I spent a few minutes talking about the concept of preconditions and postconditions. This is a way of describing the contract that a method has with the client. Preconditions are assumptions the method makes. They are a way of describing any dependencies that the method has ("this has to be true for me to do my work"). Postconditions describe what the method accomplishes assuming the preconditions are met ("I'll do this as long as the preconditions are met.").

I have included pre/post comments on all of my ArrayIntList methods. I encourage people to use this style of commenting. It is not required, but if you use a different style, be sure that you have addressed the preconditions and postconditions of each method in the comments for the method.

As an example, I pointed out that methods like "get" that are passed an index assume that the index is legal. The method wouldn't know how to get something that is at a negative index or at an index that is beyond the size of the list. Whenever you find a method that has this kind of dependence, you should document it as a precondition of the method.

So we added the following comments to the get method:

        // pre : 0 <= index < size()
        // post: returns the value at the given index
        public int get(int index) {
            return elementData[index];
        }
It is important to document every precondition, every assumption that you are making about the method you are writing. But Java provides a mechanism for doing more. You can throw an exception when a client violates a precondition, which is what we did for the get method.

The precondition for get says:

        // pre : 0 <= index < size()
We turned this into code by saying:

        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException();
        }
There are different kinds of exceptions that can be thrown. For illegal values being passed to parameters, you would normally throw an IllegalArgumentException. But when you can be more specific, you should be. In this case, the IndexOutOfBoundsException is more helpful because it tells you more about the error (that it had to do with an index).

It may seem a little odd to construct an exception object each time you want to throw an exception, but there is a good reason for this. The exception object stores information about what was going on at the moment it was created. For example, it can provide a stack trace (backtrace) showing the sequence of calls that led to the exception. This is the information that Java displays when an exception like this is thrown. As a result, it is best to follow the pattern above of constructing the object as you are throwing it.

The throw statement is like a return statement in that it stops the method from executing. So we don't need an else branch for the other lines of code. The general pattern that Java programmers follow is to include one or more if statements that contain a throw statement at the beginning of a method to check obvious error conditions and then to include the actual code after the if statements.

You can also pass a String as a parameter to the exception object and that text will be displayed when the exception is thrown. This can be useful to provide feedback to the client. For example, in this case, you could let the client know what the illegal index was:

        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
Whenever a method of yours throws an exception, it is important to document the exception by name and indicate when it is thrown. So our precondition became:
        // pre : 0 <= index < size() (throws IndexOutOfBoundsException if not)
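
Here is how the pieces above fit together in one place, as a sketch built on a stripped-down stand-in for the class. The body of add is an assumption (it is not shown in these notes) and GetDemo is a hypothetical client.

```java
// Stripped-down stand-in for ArrayIntList showing get with its check.
class ArrayIntList {
    private int[] elementData;
    private int size;

    public ArrayIntList() {
        elementData = new int[100];
        size = 0;
    }

    // Assumed body for add; capacity checks are omitted in this sketch.
    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    // pre : 0 <= index < size() (throws IndexOutOfBoundsException if not)
    // post: returns the value at the given index
    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        // No else needed: the throw above already left the method.
        return elementData[index];
    }
}

class GetDemo {
    public static void main(String[] args) {
        ArrayIntList list2 = new ArrayIntList();
        list2.add(7);
        list2.add(-8);
        System.out.println(list2.get(1));  // prints -8
        try {
            list2.get(2);  // one past the last legal index
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());  // prints "caught: index: 2"
        }
    }
}
```

Without the check, the call list2.get(2) in this sketch would quietly return 0, because index 2 is a legal position in the 100-element array even though it is beyond the size of the list. The array's own bounds checking is not enough to enforce the precondition.
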
Then we discussed the issue of constructors. We have one constructor that creates an array of length 100. I asked people whether they had any criticism of this code and someone pointed out that it's not very flexible to have the value 100 as the array size. What if you wanted an array 200 long or 500 long? So we modified the constructor to take an integer capacity:

        public ArrayIntList(int capacity) {
            elementData = new int[capacity];
            size = 0;
        }
Then we modified the client code to indicate a capacity:

        ArrayIntList list1 = new ArrayIntList(200);
        ArrayIntList list2 = new ArrayIntList(500);
This worked fine. But then I changed one of them so that it didn't list a capacity:

        ArrayIntList list1 = new ArrayIntList(200);
        ArrayIntList list2 = new ArrayIntList();
We got an error message. Java said that it could not find a zero-argument constructor. It turns out that by adding the constructor that takes a capacity, we lose the old constructor that takes no parameters. The rule is that if you don't define any constructors at all, Java will define a zero-argument constructor, but if you define even one constructor, then Java assumes you know what you're doing and doesn't define any for you. This means we have to add a new constructor for the zero-argument case:

        public ArrayIntList() {
            elementData = new int[100];
            size = 0;
        }
While you can define the constructor this way, it is better style to define this constructor in terms of the other constructor. Most Java classes have one "real" constructor that all the others call. It is most often the constructor that has more parameters than any of the others. You can have one constructor call another by including the keyword this and a set of parameter values inside parentheses, as in:

        public ArrayIntList() {
            this(100);
        }
The call on the other constructor must appear as the first statement. Java can tell that you are calling the other constructor because it sees an int inside of parentheses. I pointed out that if you accidentally wrote it this way:

        public ArrayIntList() {
            this();  // bad!!
        }
you'd have a constructor that calls itself. Java catches this mistake at compile time and reports a recursive constructor invocation.

We also discussed the idea that the number 100 is arbitrary, so we can introduce a class constant for it:

        public static final int DEFAULT_CAPACITY = 100;
We then rewrote our zero-argument constructor to be:

        public ArrayIntList() {
            this(DEFAULT_CAPACITY);
        }
I asked why it's okay to have a public constant but it's not okay to have public fields. Someone said that it's because it's a constant. The keyword "final" in the definition guarantees that nobody can alter the value of the constant, so there is no danger of someone damaging the constant. The same is not true of fields, which is why they should always be declared private.
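
The two constructors and the constant can be seen working together in the sketch below. The capacity method is a hypothetical accessor added only so that the demo can show which array size each constructor produced; it is not part of the class from lecture.

```java
// Stripped-down stand-in for ArrayIntList showing constructor chaining.
class ArrayIntList {
    public static final int DEFAULT_CAPACITY = 100;

    private int[] elementData;
    private int size;

    // Delegates to the "real" constructor; this(...) must come first.
    public ArrayIntList() {
        this(DEFAULT_CAPACITY);
    }

    public ArrayIntList(int capacity) {
        elementData = new int[capacity];
        size = 0;
    }

    // Hypothetical accessor, included only for this demonstration.
    public int capacity() {
        return elementData.length;
    }
}

class ConstructorDemo {
    public static void main(String[] args) {
        ArrayIntList list1 = new ArrayIntList(200);
        ArrayIntList list2 = new ArrayIntList();
        System.out.println(list1.capacity());  // prints 200
        System.out.println(list2.capacity());  // prints 100
        // ArrayIntList.DEFAULT_CAPACITY = 5;  // would not compile: the constant is final
    }
}
```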

Then we discussed the fact that the new version of ArrayIntList has a method called contains that can be used to ask the list if it contains a specific value. In some sense we don't need this method because we already have indexOf to find the location of a value in the list. But this is a standard method that many of the Java collections classes have, including the ArrayList that we are trying to understand.

I asked people what the return type should be and someone said boolean. So we want the method to look like this:

        public boolean contains(int value) {
            ...
        }
So how do we write this? We don't want to duplicate the code we included in indexOf, so instead we'll call indexOf. Remember that it returns a value of -1 if the value is not found, so we can test whether or not indexOf returned an index that is greater than or equal to 0:

        // very bad code
        public boolean contains(int value) {
            if (indexOf(value) >= 0) {
               return true;
            } else {
               return false;
            }
        }
I mentioned that we expect 143 students to use boolean variables and boolean expressions in their simplest form. This is something we refer to as "boolean zen" and it is described in chapter 5 of the textbook. Students will lose style points if they write code like what I have above.

Think about the core expression we're working with:

        indexOf(value) >= 0
What does this evaluate to? It evaluates either to true or to false. If it evaluates to true, we want to return true. If it evaluates to false, we want to return false. There's no need to use an if/else for this. We can simply return the value of the expression rather than including it in an if/else:

        return (indexOf(value) >= 0);
I put the expression inside parentheses to make it clear that it computes the value of that particular test, but even the parentheses are not necessary.
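
For reference, here is a sketch of the finished contains method in context. The indexOf shown here is an assumed implementation based only on the description above (it returns -1 when the value is not found), and the body of add is likewise an assumption.

```java
// Stripped-down stand-in for ArrayIntList showing contains built on indexOf.
class ArrayIntList {
    private int[] elementData;
    private int size;

    public ArrayIntList() {
        elementData = new int[100];
        size = 0;
    }

    // Assumed body for add; capacity checks are omitted in this sketch.
    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    // Assumed implementation: index of the first occurrence, or -1 if absent.
    public int indexOf(int value) {
        for (int i = 0; i < size; i++) {
            if (elementData[i] == value) {
                return i;
            }
        }
        return -1;
    }

    // Returns the boolean test directly instead of wrapping it in if/else.
    public boolean contains(int value) {
        return indexOf(value) >= 0;
    }
}

class ContainsDemo {
    public static void main(String[] args) {
        ArrayIntList list1 = new ArrayIntList();
        list1.add(1);
        list1.add(82);
        list1.add(97);
        System.out.println(list1.contains(82));  // prints true
        System.out.println(list1.contains(5));   // prints false
    }
}
```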

Finally we talked about a method called addAll. The idea is that we want to be able to append the sequence of values stored in another ArrayIntList. For example, if a list stores [1, 3, 8] and we do an addAll with a list that stores the values [4, 12, 17], we would expect the list to store [1, 3, 8, 4, 12, 17]. The header for the method looks like this:

        public void addAll(ArrayIntList other) {
            ...
        }
I asked people how we could do this. We have an add method that we can call to append individual values to the end of our list. We can write a loop that does that multiple times. But how many values do we want to add? Someone said the size of the other list. That's the right approach to use. So we can write a loop that looks like this:

        for (int i = 0; i < other.size; i++) {
            ...
        }
There is an interesting issue that comes up here. Do we write the code the way it is written above or do we use other.size()? The version above refers to the field while using the parentheses calls the size method. This issue comes up when we write the body of the loop as well. How do we find out what values are stored in the other list? We can call the get method, but we could also refer to the elementData field of the other object. I wrote it using a reference to the field as follows:

        for (int i = 0; i < other.size; i++) {
            add(other.elementData[i]);
        }
I said that this code probably wouldn't compile because we're referring to the fields and we have been careful to declare the fields to be private. But it did compile. This works because the understanding of the word private is that it is "private to the class." This is not at all the way that we as humans understand the meaning of private (if something is private to me, then it shouldn't be available to other humans). But in Java, one ArrayIntList object can access private elements of another ArrayIntList object because they are both of the same class.

There is an argument to be made that it can be good to use methods instead for this code:

        for (int i = 0; i < other.size(); i++) {
            add(other.get(i));
        }
Each approach has its own advantages and disadvantages. By referring directly to fields, the first version is slightly more efficient. But by using methods, the second version is easier to maintain, because if the internal representation changes we only have to rewrite the size and get methods and won't have to change this one as well. This is a classic engineering tradeoff where there isn't really a best solution. You can use either approach in writing your own code unless we tell you otherwise.
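
To compare the two versions side by side, the sketch below puts both in one stripped-down class. The name addAllUsingMethods is hypothetical, chosen only so that both versions can coexist, and the body of add is an assumption that omits capacity checks.

```java
// Stripped-down stand-in for ArrayIntList with both addAll variants.
class ArrayIntList {
    private int[] elementData;
    private int size;

    public ArrayIntList() {
        elementData = new int[100];
        size = 0;
    }

    // Assumed body for add; capacity checks are omitted in this sketch.
    public void add(int value) {
        elementData[size] = value;
        size++;
    }

    public int size() {
        return size;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return elementData[index];
    }

    // Field-based version: legal because private means private to the class.
    public void addAll(ArrayIntList other) {
        for (int i = 0; i < other.size; i++) {
            add(other.elementData[i]);
        }
    }

    // Method-based version (hypothetical name): goes through size() and get().
    public void addAllUsingMethods(ArrayIntList other) {
        for (int i = 0; i < other.size(); i++) {
            add(other.get(i));
        }
    }

    public String toString() {
        if (size == 0) {
            return "[]";
        } else {
            String result = "[" + elementData[0];
            for (int i = 1; i < size; i++) {
                result += ", " + elementData[i];
            }
            result += "]";
            return result;
        }
    }
}

class AddAllDemo {
    public static void main(String[] args) {
        ArrayIntList list1 = new ArrayIntList();
        list1.add(1);
        list1.add(3);
        list1.add(8);
        ArrayIntList list2 = new ArrayIntList();
        list2.add(4);
        list2.add(12);
        list2.add(17);
        list1.addAll(list2);
        System.out.println(list1);  // prints [1, 3, 8, 4, 12, 17]
        System.out.println(list2);  // prints [4, 12, 17] (the other list is unchanged)
    }
}
```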

These methods along with the methods from the first section constitute a more complete version of the ArrayIntList class that will serve as a good model for how to solve the first homework assignment. The calendar includes links to this new version and an example of a client program.

The new version has two extra improvements. There are preconditions for the adding methods related to not exceeding the capacity of the list. These have been included and the relevant methods throw exceptions if the precondition is violated. The new version also introduces two private helper methods to factor out the redundant error checking. The methods get and remove perform the same check for an illegal index and this has been captured in a private method called checkIndex. The two methods for adding a single value and the addAll method perform the same check for not exceeding the capacity and this has been captured in a private method called checkCapacity. You should not add new public methods to a class that are not part of the specification, but it is good style to introduce private methods that allow you to write better code for the implementation.
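
The following is only a sketch of how such helper methods might look; the actual version is the one linked from the calendar. The helper names checkIndex and checkCapacity come from the description above, but their parameters, the choice of IllegalStateException for the capacity check, and the message text are all assumptions made for illustration.

```java
// Sketch of private helpers that factor out repeated precondition checks.
class ArrayIntList {
    public static final int DEFAULT_CAPACITY = 100;

    private int[] elementData;
    private int size;

    public ArrayIntList() {
        this(DEFAULT_CAPACITY);
    }

    public ArrayIntList(int capacity) {
        elementData = new int[capacity];
        size = 0;
    }

    public int size() {
        return size;
    }

    // pre : 0 <= index < size() (throws IndexOutOfBoundsException if not)
    // post: returns the value at the given index
    public int get(int index) {
        checkIndex(index);
        return elementData[index];
    }

    // pre : size() < capacity (throws IllegalStateException if not)
    // post: appends the given value to the end of the list
    public void add(int value) {
        checkCapacity(size + 1);
        elementData[size] = value;
        size++;
    }

    // pre : size() + other.size() <= capacity
    //       (throws IllegalStateException if not)
    // post: appends all values in the other list to the end of this list
    public void addAll(ArrayIntList other) {
        checkCapacity(size + other.size);
        for (int i = 0; i < other.size; i++) {
            add(other.elementData[i]);
        }
    }

    // Assumed signature: throws if index is not a legal index of the list.
    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
    }

    // Assumed signature: throws if the array cannot hold the needed count.
    private void checkCapacity(int needed) {
        if (needed > elementData.length) {
            throw new IllegalStateException("would exceed list capacity");
        }
    }
}
```

Because addAll checks the combined size before it adds anything, a call that would overflow the array fails before the list is changed at all, rather than partway through the loop.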

I said that in section we would discuss a "bad" version of this class that is correct but has poor programming style.


Stuart Reges
Last modified: Fri Jan 5 18:33:08 PST 2018