CSE413 Notes for Monday, 1/22/24

I began by talking more about tail recursion. I said to consider a simple counting function:

        let rec f1(n) =
            match n with
            | 0 -> 0
            | x -> 2 + f1(x - 1)
This is a silly function to write because it just computes 2 * n, but it will allow us to perform an experiment. I then asked people to think about how we might write something like this with a loop. Someone said that we'd use some kind of counter, so it might look like this:

        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum = sum + 2;
        }
Several times I've tried to make the point that you can turn this kind of loop code into a functional equivalent. If it was useful for the loop to have an extra variable for storing the current sum, then we can do the same thing with a helper function. We can have a 2-argument function that keeps track of the current sum in addition to the value of n. Using that idea, I wrote the following variation of f1:

        let f2(n) =
            let rec helper(n, sum) =
                match n with
                | 0 -> sum
                | x -> helper(x - 1, sum + 2)
            in helper(n, 0)
They both compute 2 * n in a similar manner, but they have very different behavior in the interpreter. The f1 function ran noticeably slower than f2, especially when we used very large input values like f1(5000000) vs f2(5000000). Why would that be? Think about what happens when we compute f1(5):

        f1(5) =
        2 + f1(4) =
        2 + 2 + f1(3) =
        2 + 2 + 2 + f1(2) =
        2 + 2 + 2 + 2 + f1(1) =
        2 + 2 + 2 + 2 + 2 + f1(0) =
        2 + 2 + 2 + 2 + 2 + 0 = 10
Notice how the computation expands as we make recursive calls. After we reach the base case, we'll have a lot of computing left to do on the way back out. But notice the pattern for f2:

        f2(5) =
        helper(5, 0) =
        helper(4, 2) =
        helper(3, 4) =
        helper(2, 6) =
        helper(1, 8) =
        helper(0, 10) = 10
There is no expansion to the computation. The key thing to notice is that once we reach the base case, we have the overall answer. There is no computation left as we come back out of the recursive calls. This is a classic example of tail recursion. By definition, a tail recursive function is one in which the recursive call is the last thing the function does, so no additional computation remains once the base case is reached.

As I mentioned in the previous lecture, it is well known that tail recursive functions are easily written as a loop. Someone pointed out that Scheme requires that this kind of conversion be done to speed up such computations. OCaml is obviously doing something similar to optimize the tail recursive call.
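To make the loop connection concrete, here is a minimal sketch of f2 next to a hand-written loop. The name f2_loop and its use of mutable references are assumptions made for illustration; they are not from lecture, and this is not a claim about what the OCaml compiler actually generates.

```ocaml
(* Tail-recursive version from the notes. *)
let f2(n) =
    let rec helper(n, sum) =
        match n with
        | 0 -> sum
        | x -> helper(x - 1, sum + 2)
    in helper(n, 0)

(* Hypothetical loop equivalent (not from lecture): the two helper
   parameters become two mutable variables, and the tail call becomes
   "update the variables and go around again". *)
let f2_loop(n) =
    let count = ref n in
    let sum = ref 0 in
    while !count <> 0 do
        sum := !sum + 2;
        count := !count - 1
    done;
    !sum
```

Each trip around the loop corresponds to one call on helper, which is why no stack of pending work builds up.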

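I also mentioned that the versions of map, filter and reduce that I've shown are not tail-recursive. I said I expected standard functions like List.map, List.filter and List.fold_left to be written in a tail-recursive manner to make them more efficient. One caveat: the standard library documentation describes List.fold_left as tail-recursive but List.fold_right as not tail-recursive, so it is worth checking the documentation for a particular function.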
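As an illustration of the difference, here is a sketch of a map in the non-tail-recursive style alongside a tail-recursive variant that uses the same helper-with-accumulator idea as f2. The exact signature of the map shown in lecture is an assumption here, and map_tr is a hypothetical name.

```ocaml
(* Assumed shape of the lecture's map: the cons (::) happens after the
   recursive call returns, so this is not tail-recursive. *)
let rec map(f, lst) =
    match lst with
    | []    -> []
    | x::xs -> f(x) :: map(f, xs)

(* Hypothetical tail-recursive variant: collect results in reverse order
   in an accumulator, then reverse once when the base case is reached. *)
let map_tr(f, lst) =
    let rec helper(lst, acc) =
        match lst with
        | []    -> List.rev acc
        | x::xs -> helper(xs, f(x) :: acc)
    in helper(lst, [])
```

The price of the tail-recursive version is the extra pass made by List.rev at the end.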

I then turned back to the binary search tree example (int_tree) that we were discussing in the previous lecture. I reviewed some of the functions we wrote now that we had a little more time to discuss them in detail. We had included this function to convert a list of ints into a binary search tree:

        let rec insert_all(lst) =
            match lst with
            | []    -> Empty
            | x::xs -> insert(x, insert_all(xs))
I said that for a list like [12; 38; 97], this function is computing:

        insert(12, insert(38, insert(97, Empty)))
This is a good example of where we can use a folding operation. This version is folding from right to left (first inserting the rightmost value 97, then 38, then 12). The reduce function we've looked at isn't powerful enough to capture this process, but List.fold_right is able to handle it. I asked for its type in the interpreter:

        # List.fold_right;;
        - : ('a -> 'acc -> 'acc) -> 'a list -> 'acc -> 'acc = <fun>
The first argument is a function. We have a function called insert, but it has the wrong syntax because it is not curried. But we can use the function called curry that I have included in our utility file to convert it to curried form:

        # insert;;
        - : int * int_tree -> int_tree = <fun>
        # (curry insert);;
        - : int -> int_tree -> int_tree = <fun>
The fold_right function also takes a list and an initial value to use for the accumulator, so we can define a variation of insert_all by saying:

        let insert_all2(lst) = List.fold_right (curry insert) lst Empty
We saw that this function produced the same result as our original insert_all.
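For anyone who wants to try this outside the course environment, here is a self-contained sketch. The int_tree type, the insert function and the curry function below are assumptions standing in for the versions from the previous lecture and the utility file; in particular, sending duplicates to the right subtree is an arbitrary choice made for this sketch.

```ocaml
(* Assumed definition of the tree type from the previous lecture. *)
type int_tree = Empty | Node of int * int_tree * int_tree

(* Assumed version of insert: smaller values go left, everything else
   (including duplicates) goes right. *)
let rec insert(n, tree) =
    match tree with
    | Empty -> Node(n, Empty, Empty)
    | Node(root, left, right) ->
        if n < root then Node(root, insert(n, left), right)
        else Node(root, left, insert(n, right))

(* Assumed shape of the utility-file curry: turn a function on a pair
   into a function that takes its two arguments one at a time. *)
let curry f x y = f(x, y)

(* The two versions from the notes. *)
let rec insert_all(lst) =
    match lst with
    | []    -> Empty
    | x::xs -> insert(x, insert_all(xs))

let insert_all2(lst) = List.fold_right (curry insert) lst Empty
```

With these definitions, both insert_all([12; 38; 97]) and insert_all2([12; 38; 97]) insert 97 first, so 97 ends up at the root of the tree.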

Then I asked people how we'd write a function called contains that takes a tree and a value n and that returns true if the value is in the tree and false otherwise. Someone quickly pointed out the base case: the empty tree doesn't contain anything:

        let rec contains(tree, n) =
            match tree with
            | Empty -> false
            ...
Remember that you typically want a different case for each of your different type constructors. The case above handles the empty tree, so we'll also need a case for a nonempty tree:

        let rec contains(tree, n) =
            match tree with
            | Empty                   -> false
            | Node(root, left, right) -> 
                ...
Someone said that if the root data is equal to n, then the answer is true. If not, someone said we could see if it is in either subtree:

        let rec contains(tree, n) =
            match tree with
            | Empty                   -> false
            | Node(root, left, right) -> 
                root = n || contains(left, n) || contains(right, n)
This code works, but it's not very efficient. It would potentially search the entire tree. Remember that we are working with a binary search tree. So the better thing to do is to check either the left subtree or the right subtree, but not both:

        let rec contains2(tree, n) =
            match tree with
            | Empty                   -> false
            | Node(root, left, right) ->
                if root = n then true
                else if n < root then contains2(left, n)
                else contains2(right, n)
This version turns out to be quite efficient, which I demonstrated in the OCaml interpreter by typing:

        let t = insert_all(random_numbers(1000000))
        filter((fun x -> contains2(t, x)), 1--100000)
The request for 100 thousand calls on contains2 executed almost without pause.
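Here is a rough, self-contained stand-in for that experiment using only the standard library. The definitions of int_tree, insert, random_numbers and the -- operator are assumptions replacing the course versions (the range of the random values and the smaller list size are arbitrary choices), so timings will not match what we saw in lecture.

```ocaml
(* Assumed definitions standing in for the course versions. *)
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert(n, tree) =
    match tree with
    | Empty -> Node(n, Empty, Empty)
    | Node(root, left, right) ->
        if n < root then Node(root, insert(n, left), right)
        else Node(root, left, insert(n, right))

(* contains2 from the notes: follow only one branch at each node. *)
let rec contains2(tree, n) =
    match tree with
    | Empty                   -> false
    | Node(root, left, right) ->
        if root = n then true
        else if n < root then contains2(left, n)
        else contains2(right, n)

(* Hypothetical stand-ins for the utility functions. *)
let random_numbers(n) = List.init n (fun _ -> Random.int 1000000)
let (--) a b = List.init (b - a + 1) (fun i -> a + i)

(* Build the tree with a fold, then look up every value from 1 to 100000. *)
let nums = random_numbers(100000)
let t = List.fold_left (fun tree x -> insert(x, tree)) Empty nums
let found = List.filter (fun x -> contains2(t, x)) (1 -- 100000)
```

Because the values are inserted in random order, the tree is likely to be reasonably balanced, so each lookup follows a short path from the root instead of visiting every node.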


Stuart Reges
Last modified: Tue Feb 13 10:35:11 PST 2024