CSE341 Notes for Monday, 1/22/07

I spent some time discussing the concept of tail recursion. I said to consider a simple counting function:

        fun f1(n) =
            if n = 0 then 0
            else 2 + f1(n - 1);

This is a silly function to write because it just computes 2 * n, but it will allow us to perform an experiment. I then asked people to think about how we might write something like this with a loop. Someone said that we'd use some kind of counter, so it might look like this:

        int count = 0;
        for (int i = 0; i < n; i++)
            count += 2;

Several times I've tried to make the point that you can turn this kind of loop code into a functional equivalent. If it was useful for the loop to have an extra variable for storing the current count, then we can do the same thing with a helper function. We can have a 2-argument function that keeps track of the current count in addition to the value of n. Using that idea, I wrote the following variation of f1:

        fun f2(n) =
            let fun helper(n, count) =
                    if n = 0 then count
                    else helper(n - 1, count + 2)
            in helper(n, 0)
            end;

They both compute 2 * n in a similar manner, but they have very different behavior in the interpreter. The f1 function ran noticeably slower than f2, especially when we used very large input values like f1(5000000) vs f2(5000000). Why would that be? Think about what happens when we compute f1(5):

        f1(5) =
        2 + f1(4) =
        2 + 2 + f1(3) =
        2 + 2 + 2 + f1(2) =
        2 + 2 + 2 + 2 + f1(1) =
        2 + 2 + 2 + 2 + 2 + f1(0) =
        2 + 2 + 2 + 2 + 2 + 0 = 10

Notice how the computation expands as we make recursive calls. After we reach the base case, we'll have a lot of computing left to do on the way back out. But notice the pattern for f2:

        f2(5) =
        helper(5, 0) =
        helper(4, 2) =
        helper(3, 4) =
        helper(2, 6) =
        helper(1, 8) =
        helper(0, 10) = 10

There is no expansion to the computation. The key thing to notice is that once we reach the base case, we have the overall answer. There is no computation left as we come back out of the recursive calls. This is a classic example of tail recursion. A tail recursive function is one in which each recursive call is the last thing the function does, so no additional computation is performed after the base case is reached.
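The same accumulator idea carries over to other functions. Here is a minimal sketch that sums a list of ints both ways. The names sum and sum2 are made up for illustration and were not part of lecture.

```sml
(* Not tail recursive: the addition is still pending when the
   recursive call returns. *)
fun sum([]) = 0
|   sum(x::xs) = x + sum(xs);

(* Tail recursive: the running total is passed along as an extra
   argument, so the recursive call is the last thing helper does. *)
fun sum2(lst) =
    let fun helper([], total) = total
        |   helper(x::xs, total) = helper(xs, total + x)
    in helper(lst, 0)
    end;
```

Both versions return the same answer. As with f1 and f2, the difference is in how much work is left over after each recursive call.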

It is well known that tail recursive functions are easily written as a loop. Someone pointed out that Scheme requires that this kind of conversion be done to speed up such computations. ML is obviously doing something similar to optimize the tail recursive call.

I also mentioned that the versions of map, filter and reduce that I've shown and that appear in the Ullman book are not tail-recursive. I'm sure that the standard operators like List.map, List.filter, List.foldl and List.foldr are written in a tail-recursive manner to make them more efficient.
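To give a sense of what a tail-recursive map might look like, here is one possible sketch. It assumes the tuple-style calling convention used for map in these notes. The name tailMap is hypothetical, and this is not meant to show how List.map is actually implemented.

```sml
(* Build the result in reverse in an accumulator, then reverse it
   once at the end.  The recursive call to helper is in tail position. *)
fun tailMap(f, lst) =
    let fun helper([], acc) = List.rev acc
        |   helper(x::xs, acc) = helper(xs, f(x)::acc)
    in helper(lst, [])
    end;
```

The price of making the call tail recursive is the extra pass over the list made by List.rev.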

I then turned back to the binary search tree example (intTree) that we were discussing in lecture on Friday. I reviewed some of the functions we wrote now that we had a little more time to discuss them in detail. We had included this function to convert a list of ints into a binary search tree:

        fun insertAll([]) = Empty
        |   insertAll(x::xs) = insert(x, insertAll(xs));

I said that for a list like [12, 38, 97], this function is computing:

        insert(12, insert(38, insert(97, Empty)));

This is a good example of where we can use List.foldl or List.foldr. The reduce function we've looked at isn't powerful enough to do this because reduce would require a function that reduces two ints to one int. The fold functions can be used in situations like these where we're using a function like insert that takes an int and a tree and returns a tree.

The expression above folds from the right (the rightmost value is used first for a call on insert), so we could rewrite this using List.foldr. In calling List.foldr we have to pass the function we are using to "fold" this into a single value (insert) and the value to use for the first folding operation (Empty). So we could have defined insertAll as follows:

        fun insertAll(lst) = List.foldr insert Empty lst;

or if you prefer to partially instantiate the function (which is the more elegant way to do it):

        val insertAll2 = List.foldr insert Empty;

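This is a case where it doesn't particularly matter whether we fold from the left or the right, so we could replace List.foldr with List.foldl and the function would still work. The values would be inserted in the opposite order, so the tree may end up with a different shape, but it would contain the same values. In that case we'd be computing a value like this for the list [12, 38, 97]: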

        insert(97, insert(38, insert(12, Empty)))

You can think of this as "apply the folding function from left to right, starting with the leftmost (first) value in the list."
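One way to see the two orders side by side is to fold with a function that builds a string describing the calls instead of building a tree. This is only an illustration: the function show is made up for this purpose and is not part of the intTree code.

```sml
(* Produces the text of a call on insert rather than performing it. *)
fun show(x, acc) = "insert(" ^ Int.toString(x) ^ ", " ^ acc ^ ")";

val fromRight = List.foldr show "Empty" [12, 38, 97];
(* "insert(12, insert(38, insert(97, Empty)))" *)

val fromLeft = List.foldl show "Empty" [12, 38, 97];
(* "insert(97, insert(38, insert(12, Empty)))" *)
```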

Then I asked people how we'd write a function called contains that takes a value n and a tree and that returns true if the value is in the tree and false otherwise. Someone quickly pointed out the base case: the empty tree doesn't contain anything.

        fun contains(n, Empty) = false

Remember that you typically want a different case for each of your different type constructors. The case above handles the empty tree, so we'll also need a case for a nonempty tree:

        fun contains(n, Empty) = false
        |   contains(n, Node(root, left, right)) =
                ...

Someone said that if the root data is equal to n, then the answer is true:

        fun contains(n, Empty) = false
        |   contains(n, Node(root, left, right)) =
                if n = root then true
                ...

If not, someone said we could see if it is in either subtree:

        fun contains(n, Empty) = false
        |   contains(n, Node(root, left, right)) =
                if n = root then true
                else contains(n, left) orelse contains(n, right);

This code works, but it's not very efficient. It would potentially search the entire tree. Remember that we are working with a binary search tree. So the better thing to do is to check either the left subtree or the right subtree, but not both:

        fun contains(n, Empty) = false
        |   contains(n, Node(root, left, right)) =
                if n = root then true
                else if n < root then contains(n, left)
                else contains(n, right);
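To try this version outside of lecture, it helps to have everything in one place. The sketch below is self-contained, but the datatype and the insert function are assumptions: they are written to match the Node(root, left, right) pattern used above and may differ from what we wrote on Friday (for example, in where duplicates go).

```sml
(* Assumed definition of the tree type. *)
datatype intTree = Empty | Node of int * intTree * intTree;

(* Assumed insert: smaller or equal values go left, larger go right. *)
fun insert(n, Empty) = Node(n, Empty, Empty)
|   insert(n, Node(root, left, right)) =
        if n <= root then Node(root, insert(n, left), right)
        else Node(root, left, insert(n, right));

(* The binary search tree version of contains from above. *)
fun contains(n, Empty) = false
|   contains(n, Node(root, left, right)) =
        if n = root then true
        else if n < root then contains(n, left)
        else contains(n, right);

val small = List.foldr insert Empty [12, 38, 97];
val found = contains(38, small);      (* true *)
val missing = contains(50, small);    (* false *)
```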

This version turns out to be quite efficient, which I demonstrated in the ML interpreter by typing:

        val t = insertAll(randList(1000000));
        filter(fn x => contains(x, t), 1--100000);

The request for 100 thousand calls on contains executed almost without pause.

Stuart Reges

Last modified: Fri Jan 26 07:52:23 PST 2007