CSE413 Notes for Wednesday, 1/24/24

We continued our discussion of the intTree example. I reviewed some of the code we wrote, pointing out that you'll be writing similar code for the next programming assignment.

As another example, I asked people how we could convert the binary search tree into a sorted list of ints. This took people a few minutes to figure out, but eventually we came to the realization that an inorder traversal of the tree will produce the values in sorted order, so we simply need to collapse it using an inorder traversal.

The easy case is to collapse an empty tree, which gives you an empty list:

        let rec collapse(tree) =
            match tree with
            | Empty -> []
            ...

As usual, we'll need a case for the nonempty tree:

        let rec collapse(tree) =
            match tree with
            | Empty                   -> []
            | Node(root, left, right) -> ...

In this case we want to recursively collapse the left and right subtrees and glue the two pieces together with the root data in the middle. Someone suggested doing it this way:

        let rec collapse(tree) =
            match tree with
            | Empty                   -> []
            | Node(root, left, right) ->
                collapse(left) @ root::collapse(right)

This code works well, but I have a slight preference in this case for expressing it as the appending of three different lists:

        let rec collapse(tree) =
            match tree with
            | Empty                   -> []
            | Node(root, left, right) ->
                collapse(left) @ [root] @ collapse(right)

The first version is better in the sense that it demonstrates an understanding of the difference between the cons operator (::) and the append operator (@), but I prefer the second because I conceive of the problem as putting together three different things. Both are perfectly fine ways to write the code.
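As a quick check that the two versions agree, here is a minimal self-contained sketch. The type definition is an assumption inferred from the constructors used in these notes, and the names collapse_cons and small are hypothetical, chosen only so that both versions can sit side by side.

```ocaml
(* Assumed type definition, inferred from the Empty and Node constructors. *)
type int_tree = Empty | Node of int * int_tree * int_tree

(* Inorder collapse that conses the root onto the collapsed right subtree. *)
let rec collapse_cons(tree) =
    match tree with
    | Empty                   -> []
    | Node(root, left, right) ->
        collapse_cons(left) @ root::collapse_cons(right)

(* Inorder collapse written as three lists appended together. *)
let rec collapse(tree) =
    match tree with
    | Empty                   -> []
    | Node(root, left, right) ->
        collapse(left) @ [root] @ collapse(right)

(* A small hypothetical search tree with 5 at the root. *)
let small = Node(5, Node(2, Empty, Empty), Node(9, Node(7, Empty, Empty), Empty))

let sorted = collapse(small)    (* [2; 5; 7; 9] *)
```

The first version parses the way we want because :: binds more tightly than @, so the root is consed onto the right-hand list before the append happens.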

We previously wrote a function to determine the height of a tree. A related notion is the depth of a given value stored in the tree. There are two different ways to compute this and I use a nonstandard definition.

Consider, for example, this tree:

What is the depth of the node with 9 in it? We compute it by finding the length of the path from the root to the node, but there are two things we could count: nodes or edges. If you're an edge counter (which is the current standard definition), you would say it has a depth of 2. If you're a node counter like me, you'd say it has a depth of 3. At least I have Don Knuth on my side. I'm a node counter because I want to have a good answer to the question, "What is the height of the empty tree?" For me, the answer is 0. For edge counters, it's either -1 or undefined or "I'm not sure."
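To make the two conventions concrete, here is a sketch of a node-counting height function. This is not necessarily the version we wrote earlier. The type definition is assumed, and edge_height and chain are hypothetical names used only for illustration.

```ocaml
(* Assumed type definition, inferred from the constructors used in these notes. *)
type int_tree = Empty | Node of int * int_tree * int_tree

(* Node-counting height: the empty tree has height 0 and a leaf has height 1. *)
let rec height(tree) =
    match tree with
    | Empty                -> 0
    | Node(_, left, right) -> 1 + max (height(left)) (height(right))

(* Edge-counting height is always one less, which gives -1 for the empty tree. *)
let edge_height(tree) = height(tree) - 1

(* A hypothetical tree with three nodes in a single chain. *)
let chain = Node(1, Empty, Node(2, Empty, Node(3, Empty, Empty)))

let nodes = height(chain)         (* 3 *)
let edges = edge_height(chain)    (* 2 *)
```

With node counting, the base case falls out naturally as 0. With edge counting, the empty tree ends up at -1.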

In any event, I said that we'd implement it using my definition. Then I asked how we'd write a function to find the depth of a given value in a search tree. We usually start with an empty tree as the base case, but it's not clear what to return. What does it mean if you have gotten to an empty tree? It means the value wasn't found in the tree. Someone suggested we could return -1 in that case, the way we do for calls on indexOf in a language like Java when it doesn't find a value in a list:

        let rec depth_of(tree, n) =
            match tree with
            | Empty -> -1
            ...

If the value stored at the root is n, then this value has a depth of 1 (it appears in level 1 of the tree):

        let rec depth_of(tree, n) =
            match tree with
            | Empty                   -> -1
            | Node(root, left, right) ->
                if root = n then 1
            ...

What if it's not at the root? It's a binary search tree, so to be efficient, we should either search the left subtree or the right subtree, but not both:

        let rec depth_of(tree, n) =
            match tree with
            | Empty                   -> -1
            | Node(root, left, right) ->
                if root = n then 1
                else if n < root then 1 + depth_of(left, n)
                else 1 + depth_of(right, n)

I had a list of ints that I used to define a variable "test" that we could use for testing:

        let test = [40; 72; 15; 0; -8; 95; 103; 72; 272; 143; 413; 341]

Then I defined a variable "t" by calling the insert_all function we wrote in the previous lecture to produce a binary search tree. I asked people to predict what would be the root of the tree and people thought it would be 40. That would be true if the first value we inserted into the tree was 40, but we wrote that function in a classic recursive way where it made calls on insert as it backed out of the recursion:

        let rec insert_all(lst) =
            match lst with
            | []    -> Empty
            | x::xs -> insert(x, insert_all(xs))

So in effect we are calling:

        insert(40, insert(72, insert(15, insert(0, ..., insert(341, Empty)))))

So it's 341 that ended up being the overall root of this tree and 40 ended up fairly deep in the tree because it was inserted last.

        # let t = insert_all(test);;
        val t : int_tree =
          Node (341,
           Node (143,
            Node (72,
             Node (-8, Empty,
              Node (0, Empty,
               Node (15, Empty, Node (72, Node (40, Empty, Empty), Empty)))),
             Node (103, Node (95, Empty, Empty), Empty)),
            Node (272, Empty, Empty)),
           Node (413, Empty, Empty))
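The effect of insertion order can be checked with a self-contained sketch. The insert function here is an assumption: it sends values less than or equal to the root into the left subtree, which is consistent with where the duplicate 72 lands in the tree printed above, but the version from the previous lecture may differ in detail. The names root_is and front_to_back are hypothetical.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

(* Assumed insert: values <= the root go left, larger values go right. *)
let rec insert(n, tree) =
    match tree with
    | Empty                   -> Node(n, Empty, Empty)
    | Node(root, left, right) ->
        if n <= root then Node(root, insert(n, left), right)
        else Node(root, left, insert(n, right))

(* Inserts as the recursion backs out, so the last list element goes in first. *)
let rec insert_all(lst) =
    match lst with
    | []    -> Empty
    | x::xs -> insert(x, insert_all(xs))

let test = [40; 72; 15; 0; -8; 95; 103; 72; 272; 143; 413; 341]

(* Hypothetical helper: is n the value stored at the root? *)
let root_is(tree, n) =
    match tree with
    | Empty            -> false
    | Node(root, _, _) -> root = n

let t = insert_all(test)

(* Hypothetical alternative that inserts front to back, so 40 goes in first. *)
let front_to_back = List.fold_left (fun tree x -> insert(x, tree)) Empty test
```

Under these assumptions root_is(t, 341) is true, while the fold version that inserts 40 first ends up with 40 at the root.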

I tried loading our definition for depth_of into the interpreter and using this variable t we could see that it worked fairly well:

        # depth_of(t, 341);;
        - : int = 1
        # depth_of(t, 143);;
        - : int = 2
        # depth_of(t, 413);;
        - : int = 2
        # depth_of(t, 40);;
        - : int = 8

But we ran into problems when we asked about a value not in the tree:

        # depth_of(t, 42);;
        - : int = 7

We were expecting a value of -1. How did we end up with 7? The answer is that our solution to depth_of descends the tree looking for the given value and then adds 1 to the result as it comes back out. This value of 42 would become the right child of the leaf node that has 40 in it. Remember that 40 had a depth of 8. So when we find that its right child is empty, we return a -1 and then add one to the result 8 different times to get an overall result of 7.
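The arithmetic can be traced with a short sketch. The tree is copied from the interpreter output above, the type definition is assumed, and path_to is a hypothetical helper that records which values the search visits.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

(* The tree t as printed by the interpreter. *)
let t =
    Node (341,
     Node (143,
      Node (72,
       Node (-8, Empty,
        Node (0, Empty,
         Node (15, Empty, Node (72, Node (40, Empty, Empty), Empty)))),
       Node (103, Node (95, Empty, Empty), Empty)),
      Node (272, Empty, Empty)),
     Node (413, Empty, Empty))

(* The first version of depth_of, which returns -1 for an empty tree. *)
let rec depth_of(tree, n) =
    match tree with
    | Empty                   -> -1
    | Node(root, left, right) ->
        if root = n then 1
        else if n < root then 1 + depth_of(left, n)
        else 1 + depth_of(right, n)

(* Hypothetical helper: the values visited while searching for n. *)
let rec path_to(tree, n) =
    match tree with
    | Empty                   -> []
    | Node(root, left, right) ->
        if root = n then [root]
        else if n < root then root :: path_to(left, n)
        else root :: path_to(right, n)

let visited = path_to(t, 42)    (* [341; 143; 72; -8; 0; 15; 72; 40] *)
let buggy = depth_of(t, 42)     (* -1 + 8 = 7 *)
```

The search passes through eight nodes before it falls off the tree, and each one adds 1 to the -1 that comes back from the empty tree.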

We could fix this by converting this into a tail recursive function with an accumulator so that when we reach an empty tree, there is nothing left to do but return the special value -1. Another option would be to raise an exception, which in effect puts a precondition on the function: clients shouldn't call it if the value isn't in the tree. That doesn't seem like a very friendly thing to do, though. Someone said clients could call contains before calling depth_of, but that means searching the tree twice: once to see if the value is there, and a second time to find its depth.
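Here is a sketch of what the accumulator fix with -1 might look like. We did not write this version in lecture. The type definition is assumed, and small is a hypothetical test tree.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

(* The helper carries the current depth down the tree, so nothing is added
   to the result on the way back out of the recursion. *)
let depth_of(tree, n) =
    let rec helper(tree, depth) =
        match tree with
        | Empty                   -> -1
        | Node(root, left, right) ->
            if root = n then depth
            else if n < root then helper(left, depth + 1)
            else helper(right, depth + 1)
    in helper(tree, 1)

(* A small hypothetical search tree with 5 at the root. *)
let small = Node(5, Node(2, Empty, Empty), Node(9, Node(7, Empty, Empty), Empty))

let found = depth_of(small, 7)      (* 3 *)
let missing = depth_of(small, 8)    (* -1 *)
```

Because each recursive call is the last thing the helper does, the -1 from the empty tree is returned unchanged.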

OCaml offers a good alternative. This is a good place to use what is known as an option type. An option is appropriate when the answer is "0 or 1 of" something. In other words, sometimes there is an answer and sometimes not. What is the maximum value in an empty list? There isn't one. This is a similar case. What is the depth of a value n in the tree? Sometimes there is an answer and sometimes the answer is, "There isn't one."
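The list-maximum question can be sketched with OCaml's built-in None and Some constructors. The function name max_of is hypothetical, and this is just one of several reasonable ways to write it.

```ocaml
(* Hypothetical helper: the largest value in a list, or None if it is empty. *)
let rec max_of(lst) =
    match lst with
    | []    -> None
    | x::xs ->
        match max_of(xs) with
        | None   -> Some x                            (* x is the only value *)
        | Some m -> Some (if x > m then x else m)     (* keep the larger one *)

let nothing = max_of([])             (* None *)
let biggest = max_of([3; 9; 4])      (* Some 9 *)
```

The caller can tell "no answer" apart from every real answer without reserving a special int value.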

The option type is defined as follows:

        type 'a option = None | Some of 'a

Notice the use of 'a, which means that it is polymorphic (you can have an int option, string option, float option, etc). Using these two constructors, I tried to rewrite our definition:

        let rec depth_of(tree, n) =
            match tree with
            | Empty                   -> None
            | Node(root, left, right) ->
                if root = n then Some 1
                else if n < root then 1 + depth_of(left, n)
                else 1 + depth_of(right, n)

This produced an error. It's the same problem we had with the attempt to return -1. We have to be consistent. The recursive calls on left and right are adding 1 to the result being returned. You can't add 1 to an option, and we can't return an option in one case and an int in the other. And we can't blindly extract the value from the option because we don't know that the recursive call will succeed. So we converted this into a tail recursive version by introducing a helper function with an accumulator that keeps track of the current depth:

        let depth_of(tree, n) =
            let rec helper(tree, depth) =
                match tree with
                | Empty                   -> None
                | Node(root, left, right) ->
                    if root = n then Some depth
                    else if n < root then helper(left, depth + 1)
                    else helper(right, depth + 1)
            in helper(tree, 1)

Obviously we will sometimes want to turn an option result into an actual value. You can do so using the function Option.get. For example:

        # let result = depth_of(t, 40);;
        val result : int option = Some 8
        # let d = Option.get(result);;
        val d : int = 8

More often we find ourselves using pattern matching with the None and Some constructors.
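As a sketch of the pattern matching approach, the function below turns the result of depth_of into a message. The type definition is assumed, and describe and small are hypothetical names.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

(* The option-returning depth_of from above. *)
let depth_of(tree, n) =
    let rec helper(tree, depth) =
        match tree with
        | Empty                   -> None
        | Node(root, left, right) ->
            if root = n then Some depth
            else if n < root then helper(left, depth + 1)
            else helper(right, depth + 1)
    in helper(tree, 1)

(* Hypothetical helper: handles both constructors, so no case is forgotten. *)
let describe(tree, n) =
    match depth_of(tree, n) with
    | None   -> "not found"
    | Some d -> "found at depth " ^ string_of_int(d)

(* A small hypothetical search tree with 5 at the root. *)
let small = Node(5, Node(2, Empty, Empty), Node(9, Node(7, Empty, Empty), Empty))

let yes = describe(small, 7)    (* "found at depth 3" *)
let no = describe(small, 8)     (* "not found" *)
```

Pattern matching is the safer choice because Option.get raises Invalid_argument when it is given None.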

As one final example I said that I wanted to write a function called at_depth that would take a tree and an int as parameters and that would return a list of the values at depth n in the given tree. Someone suggested that we could call filter to get us values at a particular depth. The question becomes what values to consider for the call on filter. I initially used the numbers 1 through a thousand:

        let at_depth(tree, n) =
            filter((fun x -> depth_of(tree, x) = Some n), 1--1000)

Notice the use of the Some constructor in the comparison. We didn't have to call Option.get because we could instead compare the option returned by depth_of directly against Some n. Collecting the results for depths 1 through 10 into a list of lists sort of worked:

        - : int list list =
        [[341]; [143; 413]; [72; 272]; [103]; [95]; [15]; []; [40]; []; []]

This output is not correct. We are missing some values. It has no 0, for example, and no -8. That's because I filtered on the list 1--1000. I asked how we could get the values in the tree and someone mentioned that we could use our function collapse that we wrote earlier in the lecture:

        let at_depth(tree, n) = 
            filter((fun x -> depth_of(tree, x) = Some n), collapse(tree))

This version included the missing values:

        - : int list list =
        [[341]; [143; 413]; [72; 72; 272]; [-8; 103]; [0; 95]; [15]; []; [40];
         []; []]

This lets us quickly see that there is one node at a depth of 1 (341), two at a depth of 2 (143 and 413), three at a depth of 3 including the duplicate 72, and so on. It seems impossible for there to be no nodes at a depth of 7 when we have the value 40 at a depth of 8. That is happening because of the duplicate 72. It is being counted as being at level 3 twice when really one of them is at level 3 and one is at level 7. There is no easy way to distinguish between these two occurrences of 72. The easiest fix would be to make the usual restriction that our binary search tree forbids duplicates and then we would have just a single occurrence of 72.
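The sketch below makes the final at_depth runnable on its own and then tries a different approach to the duplicate problem. The definitions of --, filter, and map are assumed stand-ins for the course's helpers, built on the standard library. The function at_depth_direct is hypothetical and not from lecture: it walks the tree itself rather than calling depth_of.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

(* The tree t as printed by the interpreter earlier in these notes. *)
let t =
    Node (341,
     Node (143,
      Node (72,
       Node (-8, Empty,
        Node (0, Empty,
         Node (15, Empty, Node (72, Node (40, Empty, Empty), Empty)))),
       Node (103, Node (95, Empty, Empty), Empty)),
      Node (272, Empty, Empty)),
     Node (413, Empty, Empty))

(* Assumed stand-ins for the course's --, filter, and map. *)
let rec (--) lo hi = if lo > hi then [] else lo :: ((lo + 1) -- hi)
let filter(f, lst) = List.filter f lst
let map(f, lst) = List.map f lst

let rec collapse(tree) =
    match tree with
    | Empty                   -> []
    | Node(root, left, right) -> collapse(left) @ [root] @ collapse(right)

let depth_of(tree, n) =
    let rec helper(tree, depth) =
        match tree with
        | Empty                   -> None
        | Node(root, left, right) ->
            if root = n then Some depth
            else if n < root then helper(left, depth + 1)
            else helper(right, depth + 1)
    in helper(tree, 1)

(* The filter-based version from above. *)
let at_depth(tree, n) =
    filter((fun x -> depth_of(tree, x) = Some n), collapse(tree))

(* Hypothetical alternative: depth 1 is the root, and depth n of a tree is
   depth n - 1 of its two subtrees. *)
let rec at_depth_direct(tree, n) =
    match tree with
    | Empty                   -> []
    | Node(root, left, right) ->
        if n = 1 then [root]
        else at_depth_direct(left, n - 1) @ at_depth_direct(right, n - 1)

let by_filter = map((fun n -> at_depth(t, n)), 1--10)
let by_walk = map((fun n -> at_depth_direct(t, n)), 1--10)
```

Under these assumptions by_filter matches the output shown above, while by_walk reports [72; 272] at depth 3 and [72] at depth 7, because it never has to look a value up by search. It also visits only the top n levels rather than searching the tree once per value.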

Stuart Reges

Last modified: Tue Feb 13 10:35:25 PST 2024