CSE413 Notes for Friday, 1/19/24

I began by saying that I wanted to modify the qsort function we wrote earlier to take a comparison function as an extra parameter. That will allow us to sort data using whatever ordering criteria we prefer. Recall that we ended up with this code:

         let rec qsort(lst) =
             match lst with
             | []          -> []
             | pivot::rest ->
                 let rec split(lst1, lst2, lst3) =
                     match lst1 with
                     | []    -> qsort(lst2) @ [pivot] @ qsort(lst3)
                     | x::xs -> if x <= pivot then split(xs, x::lst2, lst3)
                                else split(xs, lst2, x::lst3)
                 in split(rest, [], [])

We don't have to change much. We have to change the function header to have an extra parameter that I called "less". We have to use that function instead of testing whether x<=pivot. And we have to include it as a parameter for our two recursive calls:

        let rec qsort(less, lst) =
            match lst with
            | []          -> []
            | pivot::rest ->
                let rec split(lst1, lst2, lst3) =
                    match lst1 with
                    | []    -> qsort(less, lst2) @ [pivot] @ qsort(less, lst3)
                    | x::xs -> if less(x, pivot) then split(xs, x::lst2, lst3)
                               else split(xs, lst2, x::lst3)
                in split(rest, [], [])

I set up a list we could use for testing and tried calling qsort passing it the built-in less-than operator:

        # let test = [17; 3; 42; 198; -5; 14; -7; 0; -203; -2; 6; 155];;
        val test : int list = [17; 3; 42; 198; -5; 14; -7; 0; -203; -2; 6; 155]
        # qsort((<), test);;
        Error: This expression has type 'a * 'a -> 'a * 'a -> bool
               but an expression was expected of type 'a * 'a -> bool
               Type 'a * 'a -> bool is not compatible with type bool

This produced an error because we wrote qsort with the assumption that the comparison function would take a tuple. The built-in less-than operator is curried. Remember that our utility file has a function called uncurry that can be used to convert a curried function into an uncurried one:

        # qsort(uncurry(<), test);;
        - : int list = [-203; -7; -5; -2; 0; 3; 6; 14; 17; 42; 155; 198]
        # qsort(uncurry(>), test);;
        - : int list = [198; 155; 42; 17; 14; 6; 3; 0; -2; -5; -7; -203]

Notice that we can sort in either ascending or descending order by using either the standard less-than or the standard greater-than. We also can define an anonymous function:

        # qsort((fun (a, b) -> a mod 3 < b mod 3), test);;
        - : int list = [-5; -2; -203; -7; 6; 3; 0; 42; 198; 17; 155; 14]
        # qsort((fun (a, b) -> abs(a) < abs(b)), test);;
        - : int list = [0; -2; 3; -5; 6; -7; 14; 17; 42; 155; 198; -203]

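In the first call we are sorting by the remainder when dividing by 3. Because OCaml's mod operator returns a negative remainder when its left operand is negative, the negative values come first (-5, -2 and -203 have a remainder of -2 and -7 has a remainder of -1), followed by the multiples of 3 and then the values that are 2 more than a multiple of 3 (this list has no nonnegative values that are 1 more than a multiple of 3). In the second call we use the abs function to allow us to sort by absolute value.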
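
The utility file is not reproduced in these notes, so the following is only a sketch of what uncurry might look like (the one-line definition is an assumption, not the course's actual code). It checks that uncurry(<) behaves like the anonymous function fun (a, b) -> a < b, and the last binding shows the remainders that mod produces for negative and positive values:

```ocaml
(* Assumed definition: turn a curried two-argument function into one
   that takes a pair. The course's utility file may define it differently. *)
let uncurry f (a, b) = f a b

(* qsort as defined above, parameterized by a comparison on pairs *)
let rec qsort(less, lst) =
    match lst with
    | []          -> []
    | pivot::rest ->
        let rec split(lst1, lst2, lst3) =
            match lst1 with
            | []    -> qsort(less, lst2) @ [pivot] @ qsort(less, lst3)
            | x::xs -> if less(x, pivot) then split(xs, x::lst2, lst3)
                       else split(xs, lst2, x::lst3)
        in split(rest, [], [])

let test = [17; 3; 42; 198; -5; 14; -7; 0; -203; -2; 6; 155]

(* these two calls should produce the same ascending list *)
let sorted1 = qsort(uncurry(<), test)
let sorted2 = qsort((fun (a, b) -> a < b), test)

(* mod keeps the sign of its left operand, so the first two are negative *)
let remainders = [(-5) mod 3; (-7) mod 3; 6 mod 3; 17 mod 3]
```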

Then I asked people how to write a function that would return a list in reverse order. For example, the call reverse(1--4) should return [4; 3; 2; 1]. We use the familiar pattern match of an empty list and a non-empty list, but we had to think about what to do with the "x" at the front of the list we are processing. Someone suggested using the cons operator (::) to add it to the end of the list.

        let rec reverse(lst) =
            match lst with
            | []    -> []
            | x::xs -> reverse(xs)::x

The problem is that :: expects a value followed by a list and this has it in the reverse order. Someone mentioned that we could use the append operator instead:

        let rec reverse(lst) =
            match lst with
            | []    -> []
            | x::xs -> reverse(xs) @ [x]

This version works, but it is inefficient. To explain why, I spent a few minutes discussing how the :: (cons) and @ (append) operators work in OCaml.

Consider the following bindings:

        let x = [1; 2; 3]
        let y = 0::x
        let z = x @ [4]

OCaml stores lists internally as a linked list of nodes that each have data and a reference to the next node (these correspond to the head and the tail). So the structure is similar to what we get with standard Java linked list nodes:

        public class ListNode {
            public int data;
            public ListNode next;
            ...
        }

So when we execute:

        let x = [1; 2; 3]

OCaml creates a list of 3 nodes:

        x --> [1] --> [2] --> [3]

What happens when we execute the second binding?

        let y = 0::x

Recall that the :: operator is referred to as "cons," which is short for "construct." In other words, it constructs a new list element:

        y --> [0] --> ??

This new list element has 0 as the head, but what does it use for the tail? OCaml could make a copy of the list that x refers to, but that's not what happens. Instead, it sets up a link that shares the memory set aside for x:

             x --> [1] --> [2] --> [3]
                    ^
                    |
        y --> [0] --+

In the Java universe this would be a bad idea. If you share part of the list, then you're likely to end up with a confusing outcome. For example, what if you use the variable y to traverse the list and you change the second and third values in the list? The second and third values in the list that y refers to are the first and second values in the list that x refers to. So in that case, changing y would change x as well. Normally we'd want to avoid that kind of interference.

This is where the concept of mutable state comes into play. In OCaml, lists are immutable. Once you have constructed a list element with a particular head and tail, you can never change them. So it's not dangerous to allow this kind of sharing because OCaml prevents this kind of interference. This is an example where the choice to prevent mutation has made it easier for OCaml to be efficient about the allocation of space.

You can simulate this in Java as well. To get immutable lists in Java, you'd make the fields final:

        public class ListNode {
            public final int data;
            public final ListNode next;
            ...
        }

But what about the final binding?

        let z = x @ [4]

This is a case where OCaml can't make use of sharing. The variable x refers to a list that ends with 3. If you tried to change it to instead point to a new list element storing 4, then you'd be damaging the original list. So this is a case where OCaml has no choice but to make a copy of the contents of x:

        z --> [1] --> [2] --> [3] --> [4]

A simple rule of thumb to remember is that the :: operator always executes in O(1) time (constant time) because it always constructs exactly one list element, while the @ operator runs in O(n) time, where n is the length of the first list, because that first list has to be copied.
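
One way to observe the sharing is with OCaml's physical equality operator (==), which tests whether two values are the same object in memory rather than whether they have the same contents. The following is a small sketch, not something we ran in lecture, and the names on the left of each binding are just for illustration:

```ocaml
let x = [1; 2; 3]
let y = 0::x        (* one new cell whose tail is x itself *)
let z = x @ [4]     (* new cells for 1, 2 and 3, followed by the list [4] *)

(* the tail of y is physically the same list as x: nothing was copied *)
let y_shares_x = (List.tl y == x)

(* appending copies the first list, even when the second list is empty *)
let append_copies = not ((x @ []) == x)

(* x still has its original contents after both operations *)
let x_unchanged = (x = [1; 2; 3])
```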

So let's return to the inefficient version of the reversing function:

        let rec reverse(lst) =
            match lst with
            | []    -> []
            | x::xs -> reverse(xs) @ [x]

Because we are using the @ operator, we are going to be creating lots of list copies. In effect, to reverse a list of n elements, we're going to make a copy of a list of length 1, and a copy of a list of length 2, and a copy of a list of length 3, and so on, ending with making a copy of a list of length n-1. That will require O(n²) time. We saw the effect of this when we called this version of reverse on a list with 10 thousand elements (1--10000).
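
To make the count concrete: the copied lists have lengths 1, 2, ..., n-1, which add up to n(n-1)/2 copied elements. The sketch below checks this by replacing @ with a hand-written append that counts the cells it copies. The names copies, append_counting and slow_reverse are made up for this illustration, and the counter is a mutable ref only so that we can take the measurement:

```ocaml
(* number of list cells copied so far (mutable only for measurement) *)
let copies = ref 0

(* behaves like @, but counts each cell it copies from the first list *)
let rec append_counting(lst1, lst2) =
    match lst1 with
    | []    -> lst2
    | x::xs -> (incr copies; x :: append_counting(xs, lst2))

(* the inefficient reverse, using the counting append *)
let rec slow_reverse(lst) =
    match lst with
    | []    -> []
    | x::xs -> append_counting(slow_reverse(xs), [x])

let n = 100
let reversed = slow_reverse(List.init n (fun i -> i + 1))
let total = !copies     (* n * (n - 1) / 2 cells copied *)
```

For n = 100 the total comes to 4950, and doubling n roughly quadruples it.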

We can do better than that. I asked people how you'd approach it iteratively. We came up with this pseudocode:

        set result to an empty list
        while (list we are reversing is not empty) {
           x = remove first element of list we are reversing
           add x to result list
        }

You can translate an iterative process like this into a functional equivalent by thinking about the different states that this computation goes through. There are two different variables involved here: list and result. Here's how they change as you iterate through the loop, assuming we are working with the list 1--4:

        list            result
        ----------------------------
        [1; 2; 3; 4]    []
        [2; 3; 4]       [1]
        [3; 4]          [2; 1]
        [4]             [3; 2; 1]
        []              [4; 3; 2; 1]

Instead of having two variables that change in value each time you iterate through the loop (the mutable state approach), you can instead have a function of two arguments where each time you call the function you compute the next pair of values to use in the computation. So we'll write this using a helper function:

        let reverse(lst) =
            let rec helper(lst1, lst2) = ??
            in helper(??)

The loop starts with list being the overall list and result being empty. In the functional version, we make this the initial call on the helper function:

        let reverse(lst) =
            let rec helper(lst1, lst2) = ??
            in helper(lst, [])

The loop ends when list becomes empty, in which case the answer is stored in result, so this becomes one of the cases for our helper function:

        let reverse(lst) =
            let rec helper(lst1, lst2) =
                match lst1 with
                | [] -> lst2
                ...
            in helper(lst, [])

Now we just need a case for the other iterations. In the pseudocode, we pulled an x off the front of the list and moved it to result. We can accomplish this with a pattern of x::xs for the list and by moving x into the result in our recursive call:

        let reverse(lst) =
            let rec helper(lst1, lst2) =
                match lst1 with
                | []    -> lst2
                | x::xs -> helper(xs, x::lst2)
            in helper(lst, [])

We saw that this version worked and ran in a reasonable amount of time even for lists with a million values.
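
The same helper-with-two-arguments idea can be used to build a list. As a sketch, here is one way the -- operator from the utility file might be written (the utility file is not shown in these notes, so this definition is an assumption), followed by calls on the efficient reverse:

```ocaml
(* assumed definition: lo--hi is the list [lo; lo+1; ...; hi].
   Counting down from hi lets us build the list with :: so that it
   ends up in ascending order. *)
let (--) lo hi =
    let rec helper(i, result) =
        if i < lo then result
        else helper(i - 1, i::result)
    in helper(hi, [])

let reverse(lst) =
    let rec helper(lst1, lst2) =
        match lst1 with
        | []    -> lst2
        | x::xs -> helper(xs, x::lst2)
    in helper(lst, [])

let small = reverse(1--4)
let big = reverse(1--1000000)
let first_of_big = List.hd big
```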

Then we turned to a new topic: data types. We started by defining a type that corresponds to an enumerated type in Java, C, and C++. Suppose that you want to keep track of various colors and you'd like to have meaningful names for them. This is easy to do in OCaml:

        type color = Red | Blue | Green

This definition introduces a new type called "color". We use the vertical bar or pipe character ("|") to separate different possibilities for the type. This type has three possible forms. OCaml refers to the three identifiers as constructors, even though in this case they are very simple and don't require any data. OCaml requires that type names start with a lowercase letter and constructors start with an uppercase letter. You can ask about the constructors in the interpreter:

        # Red;;
        - : color = Red

You can also write functions that use these identifiers, as in:

        let rgb(c) =
            match c with
            | Red   -> (255, 0, 0)
            | Green -> (0, 255, 0)
            | Blue  -> (0, 0, 255)

This function returns a tuple of integers that correspond to standard RGB sequences for a given color (three integers in the range of 0 to 255 that represent the red, green, and blue components of each).
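
Because the type definition lists every possible constructor, OCaml can check that a match covers all of them. As a small sketch (color_name is a made-up function for illustration), here is a second function over the same type along with a call on rgb:

```ocaml
type color = Red | Blue | Green

let rgb(c) =
    match c with
    | Red   -> (255, 0, 0)
    | Green -> (0, 255, 0)
    | Blue  -> (0, 0, 255)

(* hypothetical helper: leaving out one of the three cases here would
   make the compiler warn that the match is not exhaustive *)
let color_name(c) =
    match c with
    | Red   -> "red"
    | Green -> "green"
    | Blue  -> "blue"

let names = List.map color_name [Red; Green; Blue]
let (r, g, b) = rgb(Green)
```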

I then turned to a more complex example. I said that I wanted to explore the definition of a binary search tree in OCaml. I asked people what binary trees look like and someone said that they can be empty or they have a node with left and right subtrees. This becomes the basis of our type definition:

        type int_tree = Empty | Node of int * int_tree * int_tree

The name of the type is int_tree. It has two different forms. The first form uses the constructor Empty and has no associated data. The second form uses the constructor Node and takes a triple composed of the data for this node (an int), the left subtree and the right subtree. Notice how the keyword "of" is used to separate the constructor from the data type description.

Given this definition, we could make an empty tree or a tree of one node simply by saying:

        # Empty;;
        - : int_tree = Empty
        # Node(38, Empty, Empty);;
        - : int_tree = Node (38, Empty, Empty)

Notice that we use parentheses to enclose the arguments to the Node constructor. The Node constructor is similar to a function but has a slightly different status, as we'll see. In particular, we can use constructors in patterns, which makes our function definitions much clearer.

For example, we wrote the following function to insert a value into a binary search tree of ints.

        let rec insert(value, tree) =
            match tree with
            | Empty                   -> Node(value, Empty, Empty)
            | Node(root, left, right) ->
                if (value <= root) then Node(root, insert(value, left), right)
                else Node(root, left, insert(value, right))

If we are asked to insert a value into an empty tree, we simply create a leaf node with the value. Otherwise, we compare the value against the root and insert it into either the left or the right subtree. In a language like Java, we would think of the tree as being changed (mutated). In OCaml, we instead think of returning a new tree that includes the new value.
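
Because insert builds a new tree rather than changing the old one, the original tree is still available afterwards. A small sketch of this (the names t1 and t2 are just for illustration):

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert(value, tree) =
    match tree with
    | Empty                   -> Node(value, Empty, Empty)
    | Node(root, left, right) ->
        if value <= root then Node(root, insert(value, left), right)
        else Node(root, left, insert(value, right))

let t1 = insert(9, insert(5, Empty))  (* 5 at the root, 9 to its right *)
let t2 = insert(3, t1)                (* a new tree with 3 to the left of 5 *)

(* t1 still has only two nodes; t2 has all three *)
let t1_unchanged = (t1 = Node(5, Empty, Node(9, Empty, Empty)))
let t2_has_three = (t2 = Node(5, Node(3, Empty, Empty), Node(9, Empty, Empty)))
```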

To insert a sequence of values, you can use list recursion calling the insert function repeatedly:

        let rec insert_all(lst) =
            match lst with
            | []    -> Empty
            | x::xs -> insert(x, insert_all(xs))
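
Note that insert_all inserts the last element of the list first, because the recursive call on xs has to finish before x can be inserted. One way to check the result is an in-order traversal, which should list the values in sorted order. The inorder function below is a sketch added for illustration, not something from lecture:

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert(value, tree) =
    match tree with
    | Empty                   -> Node(value, Empty, Empty)
    | Node(root, left, right) ->
        if value <= root then Node(root, insert(value, left), right)
        else Node(root, left, insert(value, right))

let rec insert_all(lst) =
    match lst with
    | []    -> Empty
    | x::xs -> insert(x, insert_all(xs))

(* hypothetical helper: left subtree, then the root, then right subtree *)
let rec inorder(tree) =
    match tree with
    | Empty                   -> []
    | Node(root, left, right) -> inorder(left) @ [root] @ inorder(right)

(* 3 is inserted first, so it ends up at the root *)
let t = insert_all([2; 1; 3])
let values = inorder(insert_all([17; 3; 42; -5; 14; 0]))
```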

Then we wrote a function for finding the height of a tree. I mentioned that I'm using a slightly different definition for the height of a tree. In the usual definition, the empty tree has a height of -1. I prefer to define the height of the empty tree as 0, so this is returning a count of the number of levels in the tree:

        let rec height(t) =
            match t with
            | Empty                   -> 0
            | Node(root, left, right) -> max (height left) (height right) + 1

In writing this, we had to use parentheses slightly differently because the built-in max function is a curried function. Notice how we follow max by two parenthesized calls on height.

I pointed out that we are not using the value of "root" (the data stored at the root). This is a good place to use an anonymous variable, which you indicate with an underscore:

        let rec height(t) =
            match t with
            | Empty                -> 0
            | Node(_, left, right) -> max (height left) (height right) + 1
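
With this definition the height counts levels, so a single leaf has a height of 1. A quick sketch on hand-built trees (the names leaf and small are just for illustration):

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec height(t) =
    match t with
    | Empty                -> 0
    | Node(_, left, right) -> max (height left) (height right) + 1

let leaf = Node(38, Empty, Empty)

(* 38 at the root, 12 to its left, and 20 to the right of 12 *)
let small = Node(38, Node(12, Empty, Node(20, Empty, Empty)), Empty)

let heights = [height(Empty); height(leaf); height(small)]
```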

In the interpreter, I constructed a tree with 100,000 random values and asked for its height by saying:

        let t = insert_all(random_numbers(100000))
        height(t)

We found that the height was around 50 even though we haven't done anything special to balance the tree.
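
The random_numbers function comes from the utility file, which is not shown in these notes. As a rough sketch, assuming it returns a list of n random nonnegative ints (the range used below is a guess), the experiment might look like the following. It uses a smaller n so that it runs quickly. The height varies from run to run, but a tree with 1000 nodes must have at least 10 levels and at most 1000:

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert(value, tree) =
    match tree with
    | Empty                   -> Node(value, Empty, Empty)
    | Node(root, left, right) ->
        if value <= root then Node(root, insert(value, left), right)
        else Node(root, left, insert(value, right))

let rec insert_all(lst) =
    match lst with
    | []    -> Empty
    | x::xs -> insert(x, insert_all(xs))

let rec height(t) =
    match t with
    | Empty                -> 0
    | Node(_, left, right) -> max (height left) (height right) + 1

(* assumed stand-in for the utility function: n random ints from 0 to 999999 *)
let random_numbers(n) = List.init n (fun _ -> Random.int 1000000)

let n = 1000
let t = insert_all(random_numbers(n))
let h = height(t)
```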

The last topic we discussed was tail recursion. Recall that we wrote this version of the factorial function:

        let rec factorial(n) =
            if n = 0 then 1
            else n * factorial(n - 1)

This version works, but it turns out that we can improve on its efficiency. If we were writing this code in Java, we would say something like:

        int product = 1;
        for (int i = 1; i <= n; i++) {
            product = product * i;
        }

Several times I've tried to make the point that you can turn this kind of loop code into a functional equivalent. If it was useful for the loop to have an extra variable for storing the current product, then we can do the same thing with a helper function. We can have a 2-argument function that keeps track of the current product in addition to the value of i. Using that idea, I wrote the following variation of factorial:

        let factorial(n) =
            let rec helper(n, result) =
                match n with
                | 0 -> result
                | n -> helper(n - 1, result * n)
            in helper(n, 1)

They both compute factorial(n) in a similar manner, but the second one is more efficient. Think about what happens when we compute factorial(5) using the first version:

        factorial(5) =
        5 * factorial(4) =
        5 * 4 * factorial(3) =
        5 * 4 * 3 * factorial(2) =
        5 * 4 * 3 * 2 * factorial(1) =
        5 * 4 * 3 * 2 * 1 * factorial(0) =
        5 * 4 * 3 * 2 * 1 * 1 = 120

Notice how the computation expands as we make recursive calls. After we reach the base case, we'll have a lot of computing left to do on the way back out. But notice the pattern for the second version:

        factorial(5) =
        helper(5, 1) =
        helper(4, 5) =
        helper(3, 20) =
        helper(2, 60) =
        helper(1, 120) =
        helper(0, 120) = 120

There is no expansion to the computation. The key thing to notice is that once we reach the base case, we have the overall answer. There is no computation left as we come back out of the recursive calls. This is a classic example of tail recursion. By definition, a tail recursive function is one in which each recursive call is the last thing the function does, so that no additional computation remains to be performed after the base case is reached.

This extra variable used to keep track of the product is sometimes referred to as an accumulator. Our efficient version of reverse used a similar approach to build up the reversed list, so we would refer to that parameter as an accumulator as well. It turns out that the more efficient version of reverse is also tail recursive.

It is well known that tail recursive functions are easily written as a loop. Functional languages like Scheme and OCaml optimize tail recursive calls by internally executing them as if they were loops (which avoids generating a deep stack of function calls).
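
To experiment with the difference, a sum is easier to work with than factorial because the result doesn't overflow for large n. The sketch below (sum_to and sum_to_tail are made-up names) writes the same computation both ways. In the first version the addition is still pending when the recursive call returns, so each call needs its own stack frame. In the second version the recursive call is the last thing the helper does:

```ocaml
(* not tail recursive: n + ... is still pending when the call returns *)
let rec sum_to(n) =
    if n = 0 then 0
    else n + sum_to(n - 1)

(* tail recursive: the running total is carried along in the accumulator *)
let sum_to_tail(n) =
    let rec helper(n, result) =
        match n with
        | 0 -> result
        | n -> helper(n - 1, result + n)
    in helper(n, 0)

let small = sum_to(100)
let large = sum_to_tail(10000000)
```

Depending on the OCaml version and the stack limit, calling sum_to with an argument as large as ten million may fail with a stack overflow, while sum_to_tail needs only a constant amount of stack space.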

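I also mentioned that the versions of map, filter and reduce that I've shown are not tail-recursive. Some of the standard functions, like List.fold_left and List.filter, are written in a tail-recursive manner to make them more efficient. That is not true of all of them, though: the OCaml documentation describes List.fold_right as not tail-recursive, and List.map was not tail-recursive in older versions of the standard library.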
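
As a sketch of how map could be made tail recursive with the same accumulator technique (assuming an uncurried map(f, lst) like the functions in these notes; this is not the library's actual code), the helper collects the results in reverse order and then reverses them once at the end:

```ocaml
let map(f, lst) =
    let rec helper(rest, result) =
        match rest with
        | []    -> List.rev result      (* results were collected back to front *)
        | x::xs -> helper(xs, f(x)::result)
    in helper(lst, [])

let squares = map((fun x -> x * x), [1; 2; 3; 4])
let big = map((fun x -> x + 1), List.init 1000000 (fun i -> i))
```

The final reversal adds one extra pass over the list, so the function still runs in O(n) time.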

Stuart Reges

Last modified: Tue Feb 13 10:34:57 PST 2024