CSE413 Notes for Friday, 4/18/25

We turned to a new topic: defining your own types. You can introduce a synonym for an existing type, as in:

        # type int2 = int * int;;
        type int2 = int * int
This just gives you an option to refer to the type using a simpler identifier, as in:

        # let sum(x : int2) = fst(x) + snd(x);;
        val sum : int2 -> int = <fun>
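As a small sketch of what "synonym" means here: a value declared as a plain int * int is accepted wherever int2 is expected, because the two names refer to the same type. The names p, total, and swap below are made up for illustration and were not part of the lecture.

```ocaml
type int2 = int * int

let sum (x : int2) = fst x + snd x

(* p is annotated with the long form of the type, not the synonym *)
let p : int * int = (3, 4)

(* sum still accepts it, since int2 and int * int are the same type *)
let total = sum p

(* hypothetical helper: the synonym can also appear in a result type *)
let swap ((a, b) : int2) : int2 = (b, a)
```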
Then we defined a type that corresponds to an enumerated type in Java, C, and C++. Suppose that you want to keep track of various colors and you'd like to have meaningful names for them. This is easy to do in OCaml:

        type color = Red | Blue | Green
This definition introduces a new type called "color". We use the vertical bar or pipe character ("|") to separate different possibilities for the type. This type has three possible forms. OCaml refers to the three identifiers as constructors, even though in this case they are very simple and don't require any data. OCaml requires that type names start with a lowercase letter and constructors start with an uppercase letter. You can ask about the constructors in the interpreter:

        # Red;;
        - : color = Red
You can also write functions that use these identifiers, as in:

        let rgb(c) =
            match c with
            | Red   -> (255, 0, 0)
            | Green -> (0, 255, 0)
            | Blue  -> (0, 0, 255)
This function returns a tuple of integers that corresponds to the standard RGB sequence for a given color (three integers in the range of 0 to 255 that represent the red, green, and blue components, in that order).
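Here is a short sketch of how the result of rgb might be used by another function. The helpers color_name and describe are hypothetical names introduced only for this example. Because each match lists all three constructors, the compiler can check that no case has been forgotten.

```ocaml
type color = Red | Blue | Green

let rgb c =
  match c with
  | Red   -> (255, 0, 0)
  | Green -> (0, 255, 0)
  | Blue  -> (0, 0, 255)

(* hypothetical helper: a printable name for each constructor *)
let color_name c =
  match c with
  | Red   -> "red"
  | Green -> "green"
  | Blue  -> "blue"

(* hypothetical helper: take the tuple apart with a let pattern *)
let describe c =
  let (r, g, b) = rgb c in
  Printf.sprintf "%s = (%d, %d, %d)" (color_name c) r g b
```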

I then turned to a more complex example. I said that I wanted to explore the definition of a binary search tree in OCaml. I asked people what binary trees look like and someone said that they can be empty or they have a node with left and right subtrees. This becomes the basis of our type definition:

        type int_tree = Empty | Node of int * int_tree * int_tree
The name of the type is int_tree. It has two different forms. The first form uses the constructor Empty and has no associated data. The second form uses the constructor Node and takes a triple composed of the data for this node (an int), the left subtree and the right subtree. Notice how the keyword "of" is used to separate the constructor from the data type description.

Given this definition, we could make an empty tree or a tree of one node simply by saying:

        # Empty;;
        - : int_tree = Empty
        # Node(38, Empty, Empty);;
        - : int_tree = Node (38, Empty, Empty)
Notice that we use parentheses to enclose the arguments to the Node constructor. The Node constructor is similar to a function but has a slightly different status, as we'll see. In particular, we can use constructors in patterns, which makes our function definitions much clearer.
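A minimal sketch of both points, using two helpers (is_leaf and leaf) that are assumptions of this example rather than code from lecture. The first uses Node inside a pattern. The second wraps Node in an ordinary function, which is needed if we want to hand it to something like List.map, because a constructor that takes data cannot be passed around without its argument.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

(* hypothetical helper: Node appears in a pattern to recognize a leaf *)
let is_leaf t =
  match t with
  | Node (_, Empty, Empty) -> true
  | _                      -> false

(* hypothetical helper: an ordinary function that applies the constructor *)
let leaf n = Node (n, Empty, Empty)

(* leaf is a function value, so it can be passed to List.map *)
let leaves = List.map leaf [5; 38; 97]
```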

For example, we wrote the following function to insert a value into a binary search tree of ints.

        let rec insert(value, tree) =
            match tree with
            | Empty                   -> Node(value, Empty, Empty)
            | Node(root, left, right) ->
                if (value <= root) then Node(root, insert(value, left), right)
                else Node(root, left, insert(value, right))
        
If we are asked to insert a value into an empty tree, we simply create a leaf node with the value. Otherwise, we compare the value against the root and insert it into either the left or the right subtree. In a language like Java, we would think of the tree as being changed (mutated). In OCaml, we instead think of returning a new tree that includes the new value.
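To make the "new tree" idea concrete, here is a small sketch. The names before and after are made up for this example. After the second insert, the first tree is still bound to the same one-node value.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

(* a one-node tree *)
let before = insert (12, Empty)

(* insert builds and returns a new tree; before is not modified *)
let after = insert (38, before)
```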

I had defined a variable with values to use for testing:

        let test = [12; 38; 97; 5]
I asked people if they wanted to see these values inserted from left-to-right or right-to-left. Someone said to insert left-to-right.

        # let t1 = Empty;;
        val t1 : int_tree = Empty
        # let t2 = insert(12, t1);;
        val t2 : int_tree = Node (12, Empty, Empty)
        # let t3 = insert(38, t2);;
        val t3 : int_tree = Node (12, Empty, Node (38, Empty, Empty))
        # let t4 = insert(97, t3);;
        val t4 : int_tree =
          Node (12, Empty, Node (38, Empty, Node (97, Empty, Empty)))
        # let t5 = insert(5, t4);;
        val t5 : int_tree =
          Node (12, Node (5, Empty, Empty),
           Node (38, Empty, Node (97, Empty, Empty)))
The tree being constructed looks like this:

              +----+
              | 12 |
              +----+
             /      \
        +---+       +----+
        | 5 |       | 38 |
        +---+       +----+
                          \
                          +----+
                          | 97 |
                          +----+
To insert a sequence of values, you can use list recursion calling the insert function repeatedly:

        let rec insert_all(lst) =
            match lst with
            | []    -> Empty
            | x::xs -> insert(x, insert_all(xs))
Then we wrote a function for finding the height of a tree. I mentioned that I'm using a slightly different definition for the height of a tree. In the usual definition, the empty tree has a height of -1. I prefer to define the height of the empty tree as 0, so this is returning a count of the number of levels in the tree:

        let rec height(t) =
            match t with
            | Empty                   -> 0
            | Node(root, left, right) -> max (height left) (height right) + 1
In writing this, we had to use parentheses slightly differently because the built-in max function is a curried function. Notice how we follow max by two parenthesized calls on height.
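As a brief sketch of what "curried" means for max: it takes its two arguments one after the other, separated by spaces rather than packed into a tuple, and it can be given just one argument to produce a new function. The names a, at_least_zero, b, and c are assumptions of this example.

```ocaml
(* both arguments supplied, separated by spaces *)
let a = max 3 7

(* only the first argument supplied: the result is a function *)
let at_least_zero = max 0

let b = at_least_zero (-5)

(* function application binds tighter than +, so this is (max 2 9) + 1 *)
let c = max (abs (-2)) (abs 9) + 1
```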

I pointed out that we are not using the value of "root" (the data stored at the root). This is a good place to use an anonymous variable, which you indicate with an underscore:

        let rec height(t) =
            match t with
            | Empty                -> 0
            | Node(_, left, right) -> max (height left) (height right) + 1
In the interpreter, I constructed a tree with ten thousand random values (using a random_numbers helper that is not shown in these notes) and asked for its height by saying:

        let t = insert_all(random_numbers(10000));;
        height(t);;
We found that the height was around 33 even though we haven't done anything special to balance the tree.
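The notes do not show random_numbers, so the version below is only a guess at what such a helper might look like. The upper bound of 1,000,000 is an assumption, and the actual lecture code may differ. The height will vary from run to run because the values are random.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

let rec insert_all lst =
  match lst with
  | [] -> Empty
  | x :: xs -> insert (x, insert_all xs)

let rec height t =
  match t with
  | Empty -> 0
  | Node (_, left, right) -> max (height left) (height right) + 1

(* hypothetical helper: a list of n random ints in the range 0 to 999,999 *)
let random_numbers n = List.init n (fun _ -> Random.int 1_000_000)

let t = insert_all (random_numbers 10000)
let h = height t
```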

Then I spent some time discussing the code for insert_all.

        let rec insert_all(lst) =
            match lst with
            | []    -> Empty
            | x::xs -> insert(x, insert_all(xs))
I asked whether this will insert from left-to-right or right-to-left given a list like our testing list of [12; 38; 97; 5]. Someone said it will insert from right-to-left. In other words, it will compute:

        insert(12, insert(38, insert(97, insert(5, Empty))))
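For comparison, one way to insert from left-to-right instead is to use List.fold_left, which carries the tree built so far along the list. This is a sketch of an alternative, not something written in lecture, and the name insert_all_left is made up for it. On the testing list it produces the same tree as t5 above, while insert_all produces a different shape with 5 at the root.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

(* right-to-left: the last element of the list is inserted first *)
let rec insert_all lst =
  match lst with
  | [] -> Empty
  | x :: xs -> insert (x, insert_all xs)

(* hypothetical alternative: left-to-right, first element inserted first *)
let insert_all_left lst =
  List.fold_left (fun tree x -> insert (x, tree)) Empty lst

let test = [12; 38; 97; 5]
let right_to_left = insert_all test
let left_to_right = insert_all_left test
```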

Stuart Reges
Last modified: Fri Apr 18 14:29:43 PDT 2025