CSE143 Notes for Friday, 1/12/18

We started a new topic. We are going to learn about what are known as linked lists. In doing so, we will explore in great detail the difference between:

data
a reference to data

We have been examining how to store a list of values in an array and we saw that arrays use a contiguous block of memory. One downside to that is that it is not easy to expand the array to store more values. So why have contiguous memory? Someone pointed out that it makes it fast. That's true. Arrays are what we call "random access" structures because we can quickly access any value within the array. Even if you ask for element 5000, you'd know right where to find it in memory because everything is stored together as one big block.

We'll find that linked lists have what we would call "sequential access." That means that it can be slow to access things in the middle of the list. A good analogy is to think of CDs versus cassette tapes. With a CD, you can quickly jump from track 2 to track 18. With a cassette tape, you have to fast forward through the tracks in between. This can take a long time. The same is true with linked lists. In fact, we'll find that the things arrays do particularly well, linked lists tend to do badly, and vice versa.

I asked people if they could think of some other thing that arrays do badly. Someone mentioned that inserting or removing in the middle can be expensive. That's exactly right. If you have something like 10 thousand values in an array and you want to get rid of the first one, you have to shift 9,999 values over to fill in the gap. This will be a case where linked lists are much faster than arrays.
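
To make the cost of removing from the front concrete, here is a minimal sketch of the shifting an array requires. The class name ArrayShiftDemo and the helper removeFirst are made up for illustration and were not part of lecture.

```java
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
class ArrayShiftDemo {
    // Removes the value at index 0 by shifting every later value one slot
    // to the left. Returns the number of values that had to be moved.
    static int removeFirst(int[] values, int size) {
        int shifts = 0;
        for (int i = 1; i < size; i++) {
            values[i - 1] = values[i];
            shifts++;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int[] values = {0, 2, 40, 23, 14, 72};
        int shifts = removeFirst(values, values.length);
        // The last slot still holds a stale copy of 72, so we print only
        // the first five slots (the new logical size of the list).
        System.out.println(Arrays.toString(Arrays.copyOf(values, values.length - 1)));
        System.out.println("values shifted: " + shifts);
    }
}
```

With 6 values the loop moves 5 of them. With 10 thousand values it would move 9,999, which is the cost described above.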

With arrays, we might store a list of 6 ints as follows:

                         [0]   [1]   [2]   [3]   [4]   [5]
             +---+     +-----+-----+-----+-----+-----+-----+
        list | +-+-->  |  0  |  2  |  40 |  23 |  14 |  72 |
             +---+     +-----+-----+-----+-----+-----+-----+

Imagine cutting this array up into individual variables and scattering them throughout memory:

        +-----+   +-----+   +-----+   +-----+   +-----+   +-----+
        |  23 |   |  2  |   |  40 |   |  0  |   |  14 |   |  72 |
        +-----+   +-----+   +-----+   +-----+   +-----+   +-----+

If the values are going to be scattered throughout memory, we would have to somehow connect them to each other to keep track of the order of our list:

           +---->---->---->---->---->---->---->----+
           ^                                       |
           |         +----<----<----<----+         V
           ^         |                   ^         |
           |         V                   |         V
        +-----+   +-----+   +-----+   +-----+   +-----+   +-----+
        |  23 |   |  2  |-->|  40 |   |  0  |   |  14 |-->|  72 |-end
        +-----+   +-----+   +-----+   +-----+   +-----+   +-----+
           ^                   |
           |                   V
           +----<----<----<----+

Each bit of data is going to point to the next bit of data and the final bit of data (72) will have a special value that will indicate that we are at the end of the list. You might think that even with this interconnected structure, we'd have to keep track of where each value is stored. In fact, we just need a reference to the front of the list. So if we can get to the value that stores 0 in it, then from there we can get to every other value in the list. This is the basic idea that we are going to explore with linked lists.

Linked lists are composed of individual elements called nodes. Each node is like a Lego building block. It looks unimpressive by itself, but once you put a bunch of them together, it can form an interesting structure.

A basic list node looks like this:

        +------+------+
        | data | next |
        |  18  |  +---+--->
        +------+------+

It's an object with two data fields: one for storing a single item of data and one for storing a reference to the next node in the list. For a list of int values, we'd declare this as follows:

        public class ListNode {
            public int data;
            public ListNode next;
        }

I pointed out that this isn't a nicely encapsulated object because of the public data fields. I said that later in the week I'd discuss why this is okay to do. I also pointed out that this is a recursive data structure (a class that is defined in terms of itself in that the class is called ListNode and it has a data field of type ListNode).

Then we wrote some code that would build up the list (3, 7, 12). Obviously we're going to need three nodes that are linked together. With linked lists, if you have a reference to the front of the list, then you can get to anything in the list. So we'll usually have a single variable of type ListNode that refers to (or points to) the front of the list. So we began with this declaration:

        ListNode list;

The variable "list" is not itself a node. It's a variable that is capable of referring to a node. So we'd draw it something like this:

         +---+
    list | ? |
         +---+

where we understand that the "?" is going to be replaced with a reference to a node. So this box does not have a "data" field or a "next" field. It's a box where we can store a reference to such an object.

We don't have an actual node until we call new:

        list = new ListNode();

This constructs a new node and tells Java to have the variable "list" refer to it:

                    +------+------+
         +---+      | data | next |
    list | +-+--->  |      |      |
         +---+      +------+------+

What do we want to do with this node? We want to store 3 in its data field (list.data) and we want its next field to point to a new node:

        list.data = 3;
        list.next = new ListNode();

which leads us to this situation:

                    +------+------+      +------+------+
         +---+      | data | next |      | data | next |
    list | +-+--->  |   3  |   +--+--->  |      |      |
         +---+      +------+------+      +------+------+

When you program linked lists, you have to be careful to keep track of what you're talking about. The variable "list" stores a reference to the first node. We can get inside that node with the dot notation (list.data and list.next). So "list.next" is the way to refer to the "next" box of the first node. We wrote code to assign it to refer to a new node, which is why "list.next" is pointing at this second node.

Now we want to assign the second node's data field (list.next.data) to the value 7 and assign the second node's next field to refer to a third node:

        list.next.data = 7;
        list.next.next = new ListNode();

which leads us to this situation:

                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
    list | +-+--->  |   3  |   +--+--->  |   7  |   +--+--->  |      |      |
         +---+      +------+------+      +------+------+      +------+------+

I again repeated the idea of paying close attention to list versus list.next versus list.next.next and remembering which box each of those corresponds to:

                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
    list | +-+--->  |   3  |   +--+--->  |   7  |   +--+--->  |      |      |
         +---+      +------+------+      +------+------+      +------+------+
           |                   |                    |
           |                   |                    |
          list             list.next          list.next.next

Finally, we want to set the data field of this third node to 12 (list.next.next.data) and we want to set its next field to null. The keyword "null" is a Java word that means "no object". This provides a "terminator" for the linked list (a special value that indicates that we are at the end of the list). So we'd execute these statements:

        list.next.next.data = 12;
        list.next.next.next = null;

which leaves us in this situation:

                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
    list | +-+--->  |   3  |   +--+--->  |   7  |   +--+--->  |  12  |   /  |
         +---+      +------+------+      +------+------+      +------+------+

We draw a diagonal line through the last "next" field as a way to indicate that its value is null. The assignment to null is actually unnecessary. Java will initialize all data fields to the "zero equivalent" for that particular type. For type int, that means initializing to 0. For double, it initializes to 0.0. For boolean, it initializes to false. For arrays and other objects, it initializes to null. But it's not a bad idea to include the code to make it perfectly clear what's going on.
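
The zero-equivalent rule can be checked with a short sketch. The Defaults class below is hypothetical and exists only to show one field of each kind. ListNode is the two-field version from earlier in these notes.

```java
// Hypothetical demo class; it only prints the values Java assigns to
// fields that were never explicitly initialized.
class DefaultValuesDemo {
    public static void main(String[] args) {
        ListNode node = new ListNode();
        System.out.println(node.data);   // prints 0
        System.out.println(node.next);   // prints null

        Defaults d = new Defaults();
        System.out.println(d.ratio);     // prints 0.0
        System.out.println(d.flag);      // prints false
        System.out.println(d.values);    // prints null
    }
}

// The simple node class from the notes.
class ListNode {
    public int data;
    public ListNode next;
}

// Hypothetical class with one field of each remaining kind mentioned above.
class Defaults {
    double ratio;
    boolean flag;
    int[] values;
}
```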

Obviously this is a very tedious way to manipulate a list. It's much better to write code that involves loops to manipulate lists. But it takes a while to get used to this idea, so we're first going to practice how to do some raw list operations without a loop.
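
As a preview only, here is a rough sketch of what loop-based processing might look like. We did not write this in lecture. The class name TraversalPreview, the method toText and the variable current are all assumptions made for illustration.

```java
// Hypothetical preview of loop-based list processing.
class TraversalPreview {
    // Builds a string such as "(3, 7, 12)" by following next references
    // from the front of the list until reaching null.
    static String toText(ListNode front) {
        StringBuilder sb = new StringBuilder();
        ListNode current = front;
        while (current != null) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(current.data);
            current = current.next;
        }
        return "(" + sb + ")";
    }

    public static void main(String[] args) {
        // Same raw construction steps as in the notes above.
        ListNode list = new ListNode();
        list.data = 3;
        list.next = new ListNode();
        list.next.data = 7;
        list.next.next = new ListNode();
        list.next.next.data = 12;
        list.next.next.next = null;

        System.out.println(toText(list));   // prints (3, 7, 12)
    }
}

// The simple node class from the notes.
class ListNode {
    public int data;
    public ListNode next;
}
```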

The calendar includes this simple code along with a new version of the ListNode class that includes several constructors:

        public class ListNode {
            public int data;
            public ListNode next;
        
            public ListNode() {
                this(0, null);
            }
        
            public ListNode(int data) {
                this(data, null);
            }
        
            public ListNode(int data, ListNode next) {
                this.data = data;
                this.next = next;
            }
        }

As with the other classes we've seen, there is one "real" constructor (the one that takes two arguments). The other two use the "this(...)" notation to call the third constructor with default values (0 for the data, null for next). With the new version of the class, it is possible to write a single line of code to construct the list.
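
One way that single line might look is sketched below, using the two-argument constructor for the first two nodes and the one-argument constructor for the last. The class name OneLineListDemo and the helper build are made up for this example.

```java
// Hypothetical demo class showing the list (3, 7, 12) built in one statement.
class OneLineListDemo {
    static ListNode build() {
        // The innermost call makes the last node (its next is null), and
        // each enclosing call links a new node in front of it.
        return new ListNode(3, new ListNode(7, new ListNode(12)));
    }

    public static void main(String[] args) {
        ListNode list = build();
        System.out.println(list.data);                     // prints 3
        System.out.println(list.next.data);                // prints 7
        System.out.println(list.next.next.data);           // prints 12
        System.out.println(list.next.next.next == null);   // prints true
    }
}

// The node class with constructors, as given in the notes.
class ListNode {
    public int data;
    public ListNode next;

    public ListNode() {
        this(0, null);
    }

    public ListNode(int data) {
        this(data, null);
    }

    public ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}
```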

In section we will go over 10 different exercises that involve list operations. Each will have a "before" picture and an "after" picture. The challenge is to write code that gets you from the one state to the other state.

As an example, suppose that you have two variables of type ListNode called p and q and that you have the following situation:

                    +------+------+      +------+------+
         +---+      | data | next |      | data | next |
       p | +-+--->  |   2  |   +--+--->  |   4  |   /  |
         +---+      +------+------+      +------+------+

                    +------+------+      +------+------+
         +---+      | data | next |      | data | next |
       q | +-+--->  |   3  |   +--+--->  |   9  |   /  |
         +---+      +------+------+      +------+------+

and you want to get to this situation:

                    +------+------+      +------+------+      +------+------+
         +---+      | data | next |      | data | next |      | data | next |
       p | +-+--->  |   2  |   +--+--->  |   4  |   +--+--->  |   3  |   /  |
         +---+      +------+------+      +------+------+      +------+------+

                    +------+------+
         +---+      | data | next |
       q | +-+--->  |   9  |   /  |
         +---+      +------+------+

How do we do it? I started by asking people how many variables of type ListNode we have. I got various answers. Some people said two (probably thinking of p and q). Other people said four (probably thinking of p, q and the two non-null links). But in fact, there are six different variables of type ListNode. I numbered each one:

                              2                    3
           1        +------+------+      +------+------+
         +---+      | data | next |      | data | next |
       p | +-+--->  |   2  |   +--+--->  |   4  |   /  |
         +---+      +------+------+      +------+------+

                               5                    6
           4        +------+------+      +------+------+
         +---+      | data | next |      | data | next |
       q | +-+--->  |   3  |   +--+--->  |   9  |   /  |
         +---+      +------+------+      +------+------+

Then I asked which of these variables has to change in value. The answer is that the boxes numbered 3, 4 and 5 have to be changed. If we change them appropriately, we'll be done. But we have to be careful of how we do so. Order can be important. For example, suppose we were going to start by changing box 4. In the final situation, it's supposed to point at the node with 9 in it. But if we started with that change, then what would happen to the node with 3 in it? We'd lose track of it. This is potentially a problem.

Of the three values we have to change to solve this problem, the one that is safe to change is box 3 because it's currently null. So we begin by setting it to point to the node with 3 in it:

        p.next.next = q;

Now that we've used the value of box 4 to reset box 3, we can reset box 4. It's supposed to point to the node that has 9 in it. We can do this by "leap frogging" over the current node it's pointing to:

        q = q.next;

Now we just have to reset box 5. But we can no longer refer to box 5 as q.next because we've changed q. Now we have to refer to it this way:

        p.next.next.next = null;

Putting these three lines together, we see the code that is needed to get from the initial state to the final state:

        p.next.next = q;
        q = q.next;
        p.next.next.next = null;
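
To check the three assignments, here is a small sketch that sets up the "before" picture and then runs them. The class name RewireDemo and the helper rewire are hypothetical. The helper returns the new value of q because assigning to a parameter inside a method does not change the caller's variable.

```java
// Hypothetical demo class for the p/q exercise.
class RewireDemo {
    // Applies the three assignments from the notes, in the same order,
    // and returns the node that q should refer to afterward.
    static ListNode rewire(ListNode p, ListNode q) {
        p.next.next = q;           // box 3 now points at the node with 3
        q = q.next;                // box 4 leapfrogs to the node with 9
        p.next.next.next = null;   // box 5 (reached through p) becomes null
        return q;
    }

    public static void main(String[] args) {
        // "Before" picture: p -> 2 -> 4 and q -> 3 -> 9
        ListNode p = new ListNode(2, new ListNode(4));
        ListNode q = new ListNode(3, new ListNode(9));

        q = rewire(p, q);

        // "After" picture: p -> 2 -> 4 -> 3 and q -> 9
        System.out.println(p.data + " " + p.next.data + " " + p.next.next.data);   // prints 2 4 3
        System.out.println(p.next.next.next == null);                              // prints true
        System.out.println(q.data + " " + (q.next == null));                       // prints 9 true
    }
}

// The node class with constructors, as given in the notes.
class ListNode {
    public int data;
    public ListNode next;

    public ListNode() {
        this(0, null);
    }

    public ListNode(int data) {
        this(data, null);
    }

    public ListNode(int data, ListNode next) {
        this.data = data;
        this.next = next;
    }
}
```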

Obviously this can be very confusing. It is essential that you draw pictures to keep track of what is pointing where and what is going on when this code executes. It's the only way to master linked list code. We'll practice these small problems in section so that in lecture on Wednesday we can turn to the question of how to use loops to do more generalized processing of linked lists.

Then I switched to talking about the next programming assignment. I first gave an example that involved constructing an array of objects. I used the example of constructing an array of Point objects. The Point class is part of the java.awt package that is used for graphics. It was also a primary example in the CSE142 class.

I began with this code:

        import java.awt.*;
        
        public class PointArray {
            public static void main(String[] args) {
                Point[] points;
            }
        }

I asked what kind of objects this program creates. The answer is that it doesn't create any objects. It defines a variable called points that is of type Point[], which means that it is capable of storing a reference to an array of Point objects. But if we want an actual array or some actual Point objects, we have to explicitly construct them.

So I added code to construct the array:

        Point[] points = new Point[5];

We used jGRASP to see what the program is doing and we saw that it constructs the array, but not any Point objects. When you work with an array of objects, you have to construct not just the array, but also every individual object.

Java initializes the array to the zero-equivalent for the type, which in the case of an array of objects means that Java initializes each array element to null:

           +--+     +---------+---------+---------+---------+---------+
    points | -+-->  |    /    |    /    |    /    |    /    |    /    |
           +--+     +---------+---------+---------+---------+---------+
                        [0]       [1]       [2]       [3]       [4]

It is a common convention to use a slash ("/") to represent null. To fill up this array, we had to write a loop that constructed individual Point objects that were stored in the array:

        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i, 2 * i + 1);
        }

This constructed 5 different Point objects:

                     +-------+ +-------+ +-------+ +-------+ +-------+
                     | x = 0 | | x = 1 | | x = 2 | | x = 3 | | x = 4 |
                     | y = 1 | | y = 3 | | y = 5 | | y = 7 | | y = 9 |
                     +-------+ +-------+ +-------+ +-------+ +-------+
                         ^         ^         ^         ^         ^
                         |         |         |         |         |
           +--+     +----+----+----+----+----+----+----+----+----+----+
    points | -+-->  |    *    |    *    |    *    |    *    |    *    |
           +--+     +---------+---------+---------+---------+---------+
                        [0]       [1]       [2]       [3]       [4]

I then asked how we could print each of the Point objects with a println. Someone mentioned that we could use a foreach loop:

        for (Point p : points) {
            System.out.println(p);
        }

This produced the following output:

        java.awt.Point[x=0,y=1]
        java.awt.Point[x=1,y=3]
        java.awt.Point[x=2,y=5]
        java.awt.Point[x=3,y=7]
        java.awt.Point[x=4,y=9]
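
Putting the pieces above together, a complete version of the program might look like the following sketch. The class name PointArrayDemo and the helper buildPoints are made up here. The extra println of the unfilled array is an addition to show the null elements.

```java
import java.awt.Point;
import java.util.Arrays;

// Hypothetical demo class combining the steps from the notes.
class PointArrayDemo {
    // Constructs the array and then constructs each individual Point.
    static Point[] buildPoints(int n) {
        Point[] points = new Point[n];
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i, 2 * i + 1);
        }
        return points;
    }

    public static void main(String[] args) {
        // Constructing the array alone creates no Point objects.
        Point[] empty = new Point[5];
        System.out.println(Arrays.toString(empty));   // prints [null, null, null, null, null]

        Point[] points = buildPoints(5);
        for (Point p : points) {
            System.out.println(p);
        }
    }
}
```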

Then I started discussing the programming assignment. The assignment involves simulating guitar strings and musical instruments built from those strings. Each instrument will be stored in its own class. You will be provided with a class called GuitarLite that has just two guitar strings and you will define a class called Guitar37 that has 37 strings.

I asked people to consider the issue of making our code more general. If we know that we are going to want to use different objects as our guitar, then how do we structure our code so that we can make minimal changes to the code? We don't want to write all of our code for one kind of guitar and then find that it doesn't work for another kind of guitar.

Someone mentioned that this is a good place to introduce an interface. To do so, we have to think about what behaviors we expect of a guitar object. We expect the guitar to have these methods:

a playNote method that plays a specific note given its pitch
a hasString method that can be used to test whether the guitar recognizes a certain character as corresponding to one of its strings
a pluck method that plucks one of the strings
a sample method that will return the current sound sample
a tic method that will advance the simulation one step
a time method that will return the number of times tic has been called

This can be turned into an interface:

        public interface Guitar {
            public void playNote(int pitch);
            public boolean hasString(char string);
            public void pluck(char string);
            public double sample();
            public void tic();
            public int time();
        }

I mentioned that the first part of the assignment involves implementing a GuitarString class that simulates a string that can be plucked. The writeup gives you all of the information about how the simulation works.

You are provided with a GuitarLite object that has two strings. Once you have a working GuitarString class, you can use the GuitarLite object to write some client code. I said it would be instructive for us to write some client code. I began by constructing a GuitarLite object:

        Guitar g = new GuitarLite();

Notice that the variable is declared with the interface type Guitar. We only need to name the specific kind of Guitar object when we construct it.

The Guitar interface has methods for playing the guitar. You can play a note using the playNote method. The notes are specified using a chromatic scale where concert-A has the value 12. But you have to give more instructions to the guitar object than just to play the note. Telling it to play the note will pluck an appropriate string. But there are two important methods called sample and tic. The sample method returns the current sound information that we send to the sound card and the tic method advances the simulation. Most often you'll do these two things together, calling sample to play the sound and then calling tic. But they are distinct operations, so it's best to have them in separate methods. For example, you might want to mute the program so you might call tic to move time forward without calling sample.

This simulation involves sending information to the sound card at a rapid rate. We are using a class called StdAudio that has a constant called SAMPLE_RATE. It indicates that we are sampling 44,100 times a second. So that means we have to call sample and tic thousands of times just to get a fraction of a second of audio.
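
The sample/tic pairing can be sketched as a loop. This is not the provided client code. SilentGuitar is a hypothetical stand-in for GuitarLite that only counts tics, and the local SAMPLE_RATE constant simply copies the 44,100 figure above so that the sketch does not depend on StdAudio.

```java
// Hypothetical client sketch; none of these class names come from the assignment.
class SampleTicDemo {
    static final int SAMPLE_RATE = 44100;   // assumed to match StdAudio.SAMPLE_RATE

    // Advances the guitar by the given number of seconds, reading one
    // sample before each tic. Returns the last sample that was read.
    static double playFor(Guitar g, double seconds) {
        int steps = (int) Math.round(seconds * SAMPLE_RATE);
        double last = 0.0;
        for (int i = 0; i < steps; i++) {
            last = g.sample();   // a real client would send this to the sound card
            g.tic();
        }
        return last;
    }

    public static void main(String[] args) {
        Guitar g = new SilentGuitar();   // only this line names the concrete class
        playFor(g, 0.5);
        System.out.println(g.time());    // prints 22050
    }
}

// The interface from the notes.
interface Guitar {
    public void playNote(int pitch);
    public boolean hasString(char string);
    public void pluck(char string);
    public double sample();
    public void tic();
    public int time();
}

// Hypothetical stand-in: produces silence and only keeps track of time.
class SilentGuitar implements Guitar {
    private int tics;

    public void playNote(int pitch) { }
    public boolean hasString(char string) { return false; }
    public void pluck(char string) { }
    public double sample() { return 0.0; }
    public void tic() { tics++; }
    public int time() { return tics; }
}
```

Half a second of audio already takes 22,050 calls to each method, which is why the loop is needed.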

Then I mentioned that these Guitar objects also have a different interface for playing notes. Each Guitar object is allowed to introduce a mapping of characters to notes. For the GuitarLite object, the mapping is from "a" and "c" to concert-A and concert-C. The Guitar37 class has a more complex mapping that allows you to use the computer's keyboard to play it like a piano. I opened up a client program that is called GuitarHero. This is being provided to you. It is set up to play the GuitarLite guitar. I compiled it and ran it and I showed that it was able to play those two notes as I would hit the keys "a" and "c". Then I changed this line of code:

        Guitar g = new GuitarLite();

to be:

        Guitar g = new Guitar37();

and then I was able to play 37 different keys. I played a bad version of "Three Blind Mice." I spent the rest of the time discussing details about the homework that are included in the assignment writeup.

Stuart Reges

Last modified: Fri Jan 12 17:26:44 PST 2018