CSE341 Notes for Wednesday, 4/8/09

I started the lecture by discussing polymorphism in ML. For example, we can write a function to swap the order of values in a tuple by saying:

        fun switch(a, b) = (b, a)

When I typed that into the interpreter, it responded with this:

        val switch = fn : 'a * 'b -> 'b * 'a

The 'a and 'b are generic type parameters, similar to the type parameter E that we use in Java when defining generic structures like ArrayList. Just as in Java the "E" is filled in with a specific type, in ML the 'a and 'b are filled in with specific types each time the function is used.
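As a small illustration of the type parameters being filled in, here is a sketch that calls the same switch function on two differently typed pairs. The names p1 and p2 are just illustrative bindings, not something from the lecture.

```sml
(* one polymorphic definition, used at two different types *)
fun switch(a, b) = (b, a)

(* here 'a = int and 'b = string, so the result is a string * int *)
val p1 = switch(1, "one")

(* here 'a = bool and 'b = int list, so the result is an int list * bool *)
val p2 = switch(true, [2, 3])
```

In each call ML picks the types for 'a and 'b from the arguments, so no type annotations are needed.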

You don't have to define functions polymorphically. We can, for example, say:

        fun switch(a:string, b:int) = (b, a)

In this case ML responds with:

        val switch = fn : string * int -> int * string

In general, though, we prefer to declare functions with polymorphism so that they can be applied to a wider range of values.

I then asked people to consider this definition of a function called last that is supposed to return the last value of a list:

        fun last(lst) =
            if lst = [hd(lst)] then hd(lst)
            else last(tl(lst))

When I loaded this in the ML interpreter, I got a warning and a slightly different type notation:

        wed.sml:2.12 Warning: calling polyEqual
        val last = fn : ''a list -> ''a

The warning is generated by line 2 (in fact, character 12 of line 2 is what the "2.12" means). That happens because we have written this function in such a way that it depends on recognizing the equality of two different expressions. Many types in ML can be compared for equality, but not all. For example, we got an error when we went into the interpreter and asked:

        - 3.8 = 3.8;
        stdIn:1.1-1.10 Error: operator and operand don't agree [equality type required]
          operator domain: ''Z * ''Z
          operand:         real * real
          in expression:
            3.8 = 3.8

ML does not allow you to compare values of type real for equality. The reasoning is that floating point numbers are stored as approximations, not as exact representations, so you shouldn't use a strict equality operation.
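If you do need to compare reals, one common approach is to check whether two values are within some tolerance of each other. The sketch below assumes a helper named approxEqual (a hypothetical name) and also shows Real.== from the Basis Library, which performs an exact comparison when you really want one.

```sml
(* hypothetical helper: true when a and b differ by at most eps *)
fun approxEqual(a: real, b: real, eps: real) = Real.abs(a - b) <= eps

(* 0.1 + 0.2 is not exactly 0.3 in floating point, but it is very close *)
val close = approxEqual(0.1 + 0.2, 0.3, 1.0E~9)   (* true *)

(* Real.== does an exact comparison, so rounding error makes this false *)
val exact = Real.==(0.1 + 0.2, 0.3)               (* false *)
```

The choice of tolerance (1.0E~9 here) is an assumption that depends on the computation being checked.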

So the warning is letting us know that we have written the function in such a way that we can apply it only to lists of equality types. We would not be able to use it on a list of real values. ML indicates that with the double apostrophe on the generic type. Instead of 'a, ML describes it in terms of ''a.

In general, you want to write your functions so that they don't have this limitation. There is no reason that you can't write the last function in such a way that it will be general. But sometimes you'll be writing a more specific kind of function where this limitation isn't a problem. In fact, in some cases you won't be able to avoid it because part of the work of the function is to compare values for equality.
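One way to remove the limitation is to use pattern matching instead of an equality test. The following is a minimal sketch, named last2 here only to distinguish it from the version above. Raising the built-in Empty exception for an empty list is an assumption about how that case should be handled.

```sml
(* no use of =, so ML infers the general type 'a list -> 'a *)
fun last2([]) = raise Empty       (* assumption: an empty list is an error *)
|   last2([x]) = x                (* one element left: that is the answer *)
|   last2(_::xs) = last2(xs)      (* otherwise skip the head and keep going *)

(* works on a list of reals, which the equality-based version rejects *)
val lastReal = last2([1.5, 2.5, 3.5])
val lastInt = last2([1, 2, 3])
```

Because nothing is compared for equality, there is no polyEqual warning and the type uses 'a rather than ''a.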

I then talked about how to implement a function called member that would return true or false depending upon whether a particular value is a member of a list. I asked what kind of lists would make it easy to answer this question and someone said an empty list, in which case the answer is false, so we began with:

        fun member(x, []) = false
        ...

I asked people whether the variable x is used in this case and everyone said no. In ML, when you're not using the value of a variable, it is customary to use an anonymous variable (a wildcard) instead, which we indicate with an underscore:

        fun member(_, []) = false
        ...

I then asked if any other cases would be easy. Someone said that if the list begins with the value you're looking for, then we'd know it's a member, so we tried saying:

        fun member(_, []) = false
        |   member(x, x::xs) = true

And what if it doesn't occur at the beginning of the list? Then we search the rest of the list for it:

        fun member(_, []) = false
        |   member(x, x::xs) = true
        |   member(x, y::ys) = member(x, ys);

When I loaded this definition into ML, we got an error message:

        Error: duplicate variable in pattern(s): x

Pattern matching is limited in what it can handle. In particular, ML does not allow the same variable to appear more than once in a single pattern, so it can't express this kind of match where two parts of the argument must be the same value. But we can do the same kind of thing ourselves with a boolean expression:

        fun member(_, []) = false
        |   member(x, y::ys) = (x = y) orelse member(x, ys);

This is a correct implementation of the function. When we loaded it into ML we got the polyEqual warning:

        wed.sml:7.11 Warning: calling polyEqual
        val member = fn : ''a * ''a list -> bool

The warning is okay in this case because it is implicit in the nature of member that it has to perform an equals comparison.
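Here is a short usage sketch for member. It also includes a hypothetical variant called memberBy (not from the lecture) that takes the equality test as an extra argument, which is one way to search a list of reals.

```sml
fun member(_, []) = false
|   member(x, y::ys) = (x = y) orelse member(x, ys)

val found = member(3, [1, 2, 3])               (* true *)
val missing = member("d", ["a", "b", "c"])     (* false *)

(* hypothetical variant: the caller supplies the equality test eq *)
fun memberBy(_, _, []) = false
|   memberBy(eq, x, y::ys) = eq(x, y) orelse memberBy(eq, x, ys)

(* Real.== is the Basis Library's exact comparison for reals *)
val foundReal = memberBy(Real.==, 2.5, [1.5, 2.5])   (* true *)
```

Since memberBy never uses = itself, its type does not require an equality type.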

Then I asked people how to write a function that would return a list in reverse order. For example, the call rev([1, 2, 3, 4]) should return [4, 3, 2, 1]. This can be easily written using the concatenation operator:

        fun rev([]) = []
        |   rev(x::xs) = rev(xs) @ [x]

This version works, but it is inefficient. To explain why, I spent a few minutes discussing how the :: and @ operators work in ML.

Consider the following val bindings:

        val x = [1, 2, 3];
        val y = 0::x;
        val z = x @ [4];

ML stores lists internally as a linked list of nodes that each have data and a reference to the next node (these correspond to the head and the tail). So the structure is similar to what we get with standard Java linked list nodes:

        public class ListNode {
            public int data;
            public ListNode next;
            ...
        }

So when we execute:

        val x = [1, 2, 3];

ML creates a list of 3 nodes:

        x --> [1] --> [2] --> [3]

What happens when we execute the second binding?

        val y = 0::x;

The :: operator is often referred to as "cons," which is short for "construct." In other words, it constructs a new list element:

        y --> [0] --> ??

This new list element has 0 as the head, but what does it use for the tail? ML could make a copy of the list that x refers to, but that's not what happens. Instead, it sets up a link that shares the memory set aside for x:

             x --> [1] --> [2] --> [3]
                    ^
                    |
        y --> [0] --+

In the Java universe this would be a bad idea. If you share part of the list, then you're likely to end up with a confusing outcome. For example, what if you use the variable y to traverse the list and you change the second and third values in the list? The second and third values in the list that y refers to are the first and second values in the list that x refers to. So in that case, changing y would change x as well. Normally we'd want to avoid that kind of interference.

This is where the concept of mutable state comes into play. In ML, lists are immutable. Once you have constructed a list element with a particular head and tail, you can never change them. So it's not dangerous to allow this kind of sharing because ML prevents this kind of interference. This is an example where the choice to prevent mutation has made it easier for ML to be efficient about the allocation of space.

You can simulate this in Java as well. To get immutable lists in Java, you'd make the fields final:

        public class ListNode {
            public final int data;
            public final ListNode next;
            ...
        }

But what about the final binding?

        val z = x @ [4];

This is a case where ML can't make use of sharing. The variable x refers to a list that ends with 3. If you tried to change it to instead point to a new list element storing 4, then you'd be damaging the original list. So this is a case where ML has no choice but to make a copy of the contents of x:

        z --> [1] --> [2] --> [3] --> [4]

A simple rule of thumb to remember is that the :: operator always executes in O(1) time (constant time) because it always constructs exactly one list element, while the @ operator runs in O(n) time, where n is the length of the first list, because that first list has to be copied.
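To see where the O(n) cost comes from, here is a sketch of how an append function could be written using only ::. This is an illustration under the name append, not necessarily how the built-in @ operator is actually implemented.

```sml
(* rebuilds the first list one cons at a time; the second list is shared *)
fun append([], ys) = ys
|   append(x::xs, ys) = x :: append(xs, ys)

val x = [1, 2, 3]
val z = append(x, [4])    (* [1, 2, 3, 4] *)
```

Each element of the first list costs one :: operation, so the work is proportional to the length of the first list, and the original x is left untouched.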

So let's return to the inefficient version of the reversing function:

        fun rev([]) = []
        |   rev(x::xs) = rev(xs) @ [x]

Because we are using the @ operator, we are going to be creating lots of list copies. In effect, to reverse a list of n elements, we're going to make a copy of a list of length 1, and a copy of a list of length 2, and a copy of a list of length 3, and so on, ending with making a copy of a list of length n-1. That will require O(n²) time.
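To make the arithmetic concrete, here is a small sketch that counts how many list elements get copied when the inefficient rev is applied to a list of length n. The function name copies is hypothetical, and it assumes that x @ y copies exactly the elements of x.

```sml
(* reversing n elements copies the n-1 elements of rev(xs),
   plus whatever was copied while reversing the tail *)
fun copies(0) = 0
|   copies(n) = (n - 1) + copies(n - 1)

val c4 = copies(4)          (* 3 + 2 + 1 + 0 = 6 *)
val c1000 = copies(1000)    (* 999 * 1000 div 2 = 499500 *)
```

The total is n(n-1)/2, which grows in proportion to n².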

We can do better than that. I asked people how you'd approach it iteratively. We came up with this pseudocode:

        list = [1, 2, 3, 4]
        result = []
        while (list is not empty) {
           x = remove first element
           result = x::result;
        }

As I mentioned in the previous lecture, you can translate an iterative process like this into a functional equivalent by thinking about the different states that this computation goes through. There are two different variables involved here: list and result. Here's how they change as you iterate through the loop:

        list            result
        ----------------------------
        [1, 2, 3, 4]    []
        [2, 3, 4]       [1]
        [3, 4]          [2, 1]
        [4]             [3, 2, 1]
        []              [4, 3, 2, 1]

Instead of having two variables that change in value each time you iterate through the loop (the mutable state approach), you can instead have a function of two arguments where each time you call the function you compute the next pair of values to use in the computation. So we'll write this using a helper function:

        fun rev2(lst) =
            let fun loop(list, result) = ??
            in ??
            end

The loop starts with list being the overall list and result being empty. In the functional version, we make this the initial call on the helper function:

        fun rev2(lst) =
            let fun loop(list, result) = ??
            in loop(lst, [])
            end

The loop ends when list becomes empty, in which case the answer is stored in result, so this becomes one of the cases for our helper function:

        fun rev2(lst) =
            let fun loop([], result) = result
                ...
            in loop(lst, [])
            end

Now we just need a case for the other iterations. In the pseudocode, we pulled an x off the front of the list and moved it to result. We can accomplish this with a pattern of x::xs for the list and by moving x into the result in our recursive call:

        fun rev2(lst) =
            let fun loop([], result) = result
                |   loop(x::xs, result) = loop(xs, x::result)
            in loop(lst, [])
            end

We saw that this version worked and ran in a reasonable amount of time even for lists with a million values.
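A test along these lines can be sketched as follows. The use of List.tabulate with an anonymous function to build the million-element list is an assumption made for this example, not necessarily how the list was built in lecture.

```sml
fun rev2(lst) =
    let fun loop([], result) = result
        |   loop(x::xs, result) = loop(xs, x::result)
    in loop(lst, [])
    end

(* build [0, 1, ..., 999999] using the Basis Library's List.tabulate *)
val big = List.tabulate(1000000, fn i => i)

val reversed = rev2(big)
val first = hd(reversed)    (* 999999, the last value of the original list *)
```

Each step of loop does a single :: operation, so the whole reversal takes time proportional to the length of the list.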

Ullman describes this example on pages 84-88 of the book. He mentions in the text that this second approach is known as a difference list.

Then I discussed the fact that ML allows you to introduce infix operators. It's important to understand that an infix operator is really a function. The only difference is that it's called in a different way using infix notation rather than prefix notation (having the function name in between the arguments instead of in front of the arguments). As an example, I showed how to define an infix operator that can be used to construct lists of consecutive integers:

        infix --;
        fun x--y =
            if x > y then []
            else x::((x+1)--y);

This allows you to ask for a list of integers in a particular range. Here are a few examples from the interpreter:

        - 1--10;
        val it = [1,2,3,4,5,6,7,8,9,10] : int list
        - 1--100;
        val it =
          [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,
           29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,
           54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,
           79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100]
          : int list
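Here is a sketch of a few more cases, including an empty range and a call that uses the op keyword to treat the operator as an ordinary function of a pair. The binding names are illustrative only.

```sml
infix --
fun x--y =
    if x > y then []
    else x::((x+1)--y)

val empty = 5--1              (* lower bound exceeds upper bound: [] *)
val single = 3--3             (* [3] *)

(* op turns the infix operator back into a normal prefix function *)
val viaOp = (op --)(1, 5)     (* [1, 2, 3, 4, 5] *)
```

The last binding shows that the infix operator is the same function, just called with different notation.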

Then I briefly introduced a function called map:

        fun map(f, []) = []
        |   map(f, x::xs) = f(x)::map(f, xs);

This function applies a function to every element of a list. Using the -- operator and map, along with the built-in function real, I converted the integers 1 through 100 to real numbers:

        - map(real, 1--100);
        val it =
          [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0,
           17.0,18.0,19.0,20.0,21.0,22.0,23.0,24.0,25.0,26.0,27.0,28.0,29.0,30.0,31.0,
           32.0,33.0,34.0,35.0,36.0,37.0,38.0,39.0,40.0,41.0,42.0,43.0,44.0,45.0,46.0,
           47.0,48.0,49.0,50.0,51.0,52.0,53.0,54.0,55.0,56.0,57.0,58.0,59.0,60.0,61.0,
           62.0,63.0,64.0,65.0,66.0,67.0,68.0,69.0,70.0,71.0,72.0,73.0,74.0,75.0,76.0,
           77.0,78.0,79.0,80.0,81.0,82.0,83.0,84.0,85.0,86.0,87.0,88.0,89.0,90.0,91.0,
           92.0,93.0,94.0,95.0,96.0,97.0,98.0,99.0,100.0] : real list

I then used this list to compute the square roots of these numbers:

        - map(Math.sqrt, map(real, 1--100));
        val it =
          [1.0,1.41421356237,1.73205080757,2.0,2.2360679775,2.44948974278,
           2.64575131106,2.82842712475,3.0,3.16227766017,3.31662479036,3.46410161514,
           3.60555127546,3.74165738677,3.87298334621,4.0,4.12310562562,4.24264068712,
           4.35889894354,4.472135955,4.58257569496,4.69041575982,4.79583152331,
           4.89897948557,5.0,5.09901951359,5.19615242271,5.29150262213,5.38516480713,
           5.47722557505,5.56776436283,5.65685424949,5.74456264654,5.83095189485,
           5.9160797831,6.0,6.0827625303,6.16441400297,6.2449979984,6.32455532034,
           6.40312423743,6.48074069841,6.5574385243,6.63324958071,6.7082039325,
           6.78232998313,6.8556546004,6.92820323028,7.0,7.07106781187,7.14142842854,
           7.21110255093,7.28010988928,7.34846922835,7.4161984871,7.48331477355,
           7.54983443527,7.61577310586,7.68114574787,7.74596669241,7.81024967591,
           7.87400787401,7.93725393319,8.0,8.0622577483,8.12403840464,8.18535277187,
           8.24621125124,8.30662386292,8.36660026534,8.42614977318,8.48528137424,
           8.54400374532,8.60232526704,8.66025403784,8.71779788708,8.77496438739,
           8.83176086633,8.88819441732,8.94427191,9.0,9.05538513814,9.11043357914,
           9.16515138991,9.21954445729,9.2736184955,9.32737905309,9.38083151965,
           9.43398113206,9.48683298051,9.53939201417,9.59166304663,9.64365076099,
           9.69535971483,9.74679434481,9.79795897113,9.8488578018,9.89949493661,
           9.94987437107,10.0] : real list
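As a smaller example of the same idea, here is a sketch that uses map with a user-defined function and with the built-in size function on strings. The function square is a hypothetical helper introduced for this example.

```sml
(* this tuple-style map shadows the curried map from the standard library *)
fun map(f, []) = []
|   map(f, x::xs) = f(x)::map(f, xs)

(* hypothetical helper for the example *)
fun square(n) = n * n

val squares = map(square, [1, 2, 3, 4])      (* [1, 4, 9, 16] *)
val lengths = map(size, ["a", "bcd", ""])    (* [1, 3, 0] *)
```

The element type of the result depends on what the function f returns, so the input and output lists can have different types.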

I said that we'd discuss this in more detail in Friday's lecture.

Stuart Reges

Last modified: Wed Apr 8 15:26:35 PDT 2009