let rec f1(n) =
    match n with
    | 0 -> 0
    | x -> 2 + f1(x - 1)
This is a silly function to write because it just computes 2 * n, but it will
allow us to perform an experiment. I then asked people to think about how we
might write something like this with a loop. Someone said that we'd use some
kind of counter, so it might look like this:
int sum = 0;
for (int i = 0; i < n; i++) {
    sum = sum + 2;
}
Several times I've tried to make the point that you can turn this kind of loop
code into a functional equivalent. If it was useful for the loop to have an
extra variable for storing the current sum, then we can do the same thing
with a helper function. We can have a 2-argument function that keeps track of
the current sum in addition to the value of n. Using that idea, I wrote the
following variation of f1:
let f2(n) =
    let rec helper(n, sum) =
        match n with
        | 0 -> sum
        | x -> helper(x - 1, sum + 2)
    in helper(n, 0)
They both compute 2 * n in a similar manner, but they have very different
behavior in the interpreter. The f1 function ran noticeably slower than f2,
especially when we used very large input values like f1(5000000) vs
f2(5000000). Why would that be? Think about what happens when we compute
f1(5):
f1(5) =
2 + f1(4) =
2 + 2 + f1(3) =
2 + 2 + 2 + f1(2) =
2 + 2 + 2 + 2 + f1(1) =
2 + 2 + 2 + 2 + 2 + f1(0) =
2 + 2 + 2 + 2 + 2 + 0 = 10
Notice how the computation expands as we make recursive calls. After we reach
the base case, we'll have a lot of computing left to do on the way back out.
But notice the pattern for f2:
f2(5) =
helper(5, 0) =
helper(4, 2) =
helper(3, 4) =
helper(2, 6) =
helper(1, 8) =
helper(0, 10) = 10
There is no expansion to the computation. The key thing to notice is that once
we reach the base case, we have the overall answer. There is no computation
left as we come back out of the recursive calls. This is a classic example of
tail recursion. A tail-recursive function is one in which the recursive call
is the last thing the function does, so no additional computation remains
once the base case is reached.
As I mentioned in the previous lecture, it is well known that tail-recursive functions are easily rewritten as loops. Someone pointed out that Scheme requires implementations to perform this kind of conversion so that such computations run efficiently. OCaml evidently does something similar to optimize tail-recursive calls.
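To make the connection between tail recursion and loops concrete, here is a minimal sketch that writes f2 as an explicit loop in OCaml. The name f2_loop and the use of mutable references are assumptions for illustration; this is not what the compiler literally generates, only a picture of the idea that the two arguments of helper behave like two loop variables.

```ocaml
(* f2 from the notes, restated so this block is self-contained. *)
let f2 n =
  let rec helper (n, sum) =
    match n with
    | 0 -> sum
    | x -> helper (x - 1, sum + 2)
  in
  helper (n, 0)

(* Hypothetical loop version: each argument of helper becomes a
   mutable reference that is updated once per iteration. *)
let f2_loop n =
  let remaining = ref n in
  let sum = ref 0 in
  while !remaining <> 0 do
    remaining := !remaining - 1;
    sum := !sum + 2
  done;
  !sum
```

Each pass through the while loop corresponds to one call of helper, and the final value of sum corresponds to what helper returns in its base case.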
I also mentioned that the versions of map, filter and reduce that I've shown are not tail-recursive. Standard functions such as List.fold_left are written in a tail-recursive manner to make them more efficient, although the standard library documentation notes that List.fold_right is not tail-recursive, and List.map was not in older OCaml releases.
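As an illustration of the difference, here is a sketch of a map in the tuple style used in these notes next to a tail-recursive variant. The exact definition of map shown in class is not reproduced in this section, so its shape here is an assumption, and map_tr is a hypothetical name. The variant applies the same helper-with-accumulator idea as f2.

```ocaml
(* Assumed shape of the map shown in class: the cons (::) happens
   after the recursive call returns, so it is not tail-recursive. *)
let rec map (f, lst) =
  match lst with
  | [] -> []
  | x :: xs -> f x :: map (f, xs)

(* Tail-recursive variant: build the result backwards in acc,
   then reverse it once when the input list is exhausted. *)
let map_tr (f, lst) =
  let rec helper (lst, acc) =
    match lst with
    | [] -> List.rev acc
    | x :: xs -> helper (xs, f x :: acc)
  in
  helper (lst, [])
```

The price of the tail-recursive version is the extra List.rev at the end, which is one more pass over the result.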
I then turned back to the binary search tree example (int_tree) that we were discussing in the previous lecture. I reviewed some of the functions we wrote now that we had a little more time to discuss them in detail. We had included this function to convert a list of ints into a binary search tree:
let rec insert_all(lst) =
    match lst with
    | [] -> Empty
    | x::xs -> insert(x, insert_all(xs))
I said that for a list like [12; 38; 97], this function is computing:
insert(12, insert(38, insert(97, Empty)))
This is a good example of where we can use a folding operation. This
version is folding from right to left (first inserting the rightmost
value 97, then 38, then 12). The reduce function we've looked at
isn't powerful enough to capture this process, but List.fold_right is
able to handle it. I asked for its type in the interpreter:
# List.fold_right;;
- : ('a -> 'acc -> 'acc) -> 'a list -> 'acc -> 'acc = <fun>
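To see how that type lines up with the nesting of insert calls, here is a minimal sketch of a function with the same argument order as List.fold_right. The name my_fold_right is hypothetical, and this is not claimed to be the standard library's actual source.

```ocaml
(* Hypothetical re-implementation with the same argument order as
   List.fold_right: function, then list, then initial accumulator. *)
let rec my_fold_right f lst init =
  match lst with
  | [] -> init
  | x :: xs -> f x (my_fold_right f xs init)

(* For a three-element list this computes f a (f b (f c init)).
   With subtraction: 1 - (2 - (3 - 0)). *)
let example = my_fold_right (fun x acc -> x - acc) [1; 2; 3] 0
```

The rightmost element is combined with the initial value first, which is the same order in which insert_all inserts values into the tree.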
The first argument is a function. We have a function called insert,
but it has the wrong form because it is not curried: it takes a single
tuple rather than two separate arguments. But we can use the function
called curry that I have included in our utility file to convert it to
curried form:
# insert;;
- : int * int_tree -> int_tree = <fun>
# (curry insert);;
- : int -> int_tree -> int_tree = <fun>
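The utility file is not shown in these notes, so the definition below is an assumption about what curry looks like, written in the usual way. The function add is a stand-in used only for illustration.

```ocaml
(* Assumed definition; the course's utility file is not shown here.
   It turns a function on a pair into a function of two arguments. *)
let curry f x y = f (x, y)

(* A stand-in uncurried function, used only for illustration. *)
let add (x, y) = x + y

(* Because the result is curried, it can be partially applied. *)
let add3 = curry add 3
```

With this definition, curry add has type int -> int -> int, which is the form List.fold_right expects for its first argument.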
The fold_right function also takes a list and an initial value to use
for the accumulator, so we can define a variation of insert_all by
saying:
let insert_all2(lst) = List.fold_right (curry insert) lst Empty
We saw that this function produced the same result as our original
insert_all.
Then I asked people how we'd write a function called contains that takes a tree and a value n and returns true if the value is in the tree and false otherwise. Someone quickly pointed out the base case, that the empty tree doesn't contain anything:
let rec contains(tree, n) =
    match tree with
    | Empty -> false
    ...
Remember that you typically want a different case for each of your different
type constructors. The case above handles the empty tree, so we'll also need a
case for a nonempty tree:
let rec contains(tree, n) =
    match tree with
    | Empty -> false
    | Node(root, left, right) ->
        ...
Someone said that if the root data is equal to n, then the answer is
true. If not, someone said we could see if it is in either
subtree:
let rec contains(tree, n) =
    match tree with
    | Empty -> false
    | Node(root, left, right) ->
        root = n || contains(left, n) || contains(right, n)
This code works, but it's not very efficient. It would potentially search the
entire tree. Remember that we are working with a binary search tree. So the
better thing to do is to check either the left subtree or the right subtree,
but not both:
let rec contains2(tree, n) =
    match tree with
    | Empty -> false
    | Node(root, left, right) ->
        if root = n then true
        else if n < root then contains2(left, n)
        else contains2(right, n)
This version turns out to be quite efficient, which I demonstrated in the OCaml
interpreter by typing:
let t = insert_all(random_numbers(1000000))
filter((fun x -> contains2(t, x)), 1--100000)
The request for 100 thousand calls on contains2 executed almost without
pause.
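For anyone who wants to try contains2 without the course files, here is a self-contained sketch. The int_tree type, insert and curry come from earlier lectures and the utility file and are not shown in this section, so the definitions below are assumptions about their shape, in particular the choice to send duplicates to the left subtree.

```ocaml
(* Assumed definitions from the previous lecture; not shown here. *)
type int_tree = Empty | Node of int * int_tree * int_tree

(* Assumption: values less than or equal to the root go left. *)
let rec insert (n, tree) =
  match tree with
  | Empty -> Node (n, Empty, Empty)
  | Node (root, left, right) ->
      if n <= root then Node (root, insert (n, left), right)
      else Node (root, left, insert (n, right))

(* Assumed definition of the utility function curry. *)
let curry f x y = f (x, y)

let insert_all2 lst = List.fold_right (curry insert) lst Empty

(* contains2 from the notes: follow only one subtree at each node. *)
let rec contains2 (tree, n) =
  match tree with
  | Empty -> false
  | Node (root, left, right) ->
      if root = n then true
      else if n < root then contains2 (left, n)
      else contains2 (right, n)

(* 97 is inserted first, then 38, then 12. *)
let t = insert_all2 [12; 38; 97]
```

Because each step of contains2 discards one subtree, a lookup only walks a single path from the root, which is short when the tree is reasonably balanced, as it tends to be for random input.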