# type int2 = int * int;;
type int2 = int * int
This just gives you an option to refer to the type using a simpler identifier,
as in:
# let sum(x : int2) = fst(x) + snd(x);;
val sum : int2 -> int = <fun>
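To make the point about the alias concrete, here is a minimal sketch showing that int2 and int * int are interchangeable. The names p, q, total, and first are made up for this illustration and are not part of the lecture code.

```ocaml
(* Sketch: int2 is only another name for int * int *)
type int2 = int * int

let sum (x : int2) = fst x + snd x

(* An ordinary pair with no annotation is accepted by sum *)
let p = (3, 4)
let total = sum p

(* A value annotated as int2 still works with the usual pair functions *)
let q : int2 = (10, 20)
let first = fst q
```

No conversion is needed in either direction because the alias does not create a distinct type.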
Then we defined a type that corresponds to an enumerated type in Java, C, and
C++. Suppose that you want to keep track of various colors and you'd like to
have meaningful names for them. This is easy to do in OCaml:
type color = Red | Blue | Green
This definition introduces a new type called "color". We use the vertical bar
or pipe character ("|") to separate different possibilities for the type. This
type has three possible forms. OCaml refers to the three identifiers as
constructors, even though in this case they are very simple and don't require
any data. OCaml requires that type names start with a lowercase letter and
constructors start with an uppercase letter. You can ask about the
constructors in the interpreter:
# Red;;
- : color = Red
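As a small sketch of what it means for these constructors to be ordinary values of type color, they can be stored in a list and compared for equality. The names palette, same, and different are hypothetical and used only here.

```ocaml
type color = Red | Blue | Green

(* Constructors can be stored in a list like any other value *)
let palette = [Red; Blue; Green]

(* Structural equality works on them without extra code *)
let same = (Red = Red)
let different = (Red = Blue)
```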
You can also write functions that use these identifiers, as in:
let rgb(c) =
    match c with
    | Red   -> (255, 0, 0)
    | Green -> (0, 255, 0)
    | Blue  -> (0, 0, 255)
This function returns a tuple of integers that correspond to the standard RGB
values for a given color (three integers in the range of 0 to 255 that
represent the red, green, and blue components of each).

I then turned to a more complex example. I said that I wanted to explore the
definition of a binary search tree in OCaml. I asked people what binary trees
look like and someone said that they can be empty or they have a node with left
and right subtrees. This becomes the basis of our type definition:
type int_tree = Empty | Node of int * int_tree * int_tree
The name of the type is int_tree. It has two different forms. The first form
uses the constructor Empty and has no associated data. The second form uses
the constructor Node and takes a triple composed of the data for this node (an
int), the left subtree and the right subtree. Notice how the keyword "of" is
used to separate the constructor from the data type description.

Given this definition, we could make an empty tree or a tree of one node simply
by saying:
# Empty;;
- : int_tree = Empty
# Node(38, Empty, Empty);;
- : int_tree = Node (38, Empty, Empty)
Notice that we use parentheses to enclose the arguments to the Node
constructor. The Node constructor is similar to a function but has a slightly
different status, as we'll see. In particular, we can use constructors in
patterns, which makes our function definitions much clearer.

For example, we wrote the following function to insert a value into a binary
search tree of ints.
let rec insert(value, tree) =
    match tree with
    | Empty -> Node(value, Empty, Empty)
    | Node(root, left, right) ->
        if (value <= root) then Node(root, insert(value, left), right)
        else Node(root, left, insert(value, right))
If we are asked to insert a value into an empty tree, we simply create a leaf
node with the value. Otherwise, we compare the value against the root and
insert it into either the left or the right subtree. In a language like Java,
we would think of the tree as being changed (mutated). In OCaml, we instead
think of returning a new tree that includes the new value.

I had defined a variable with values to use for testing:
let test = [12; 38; 97; 5]
I asked people if they wanted to see these values inserted from left-to-right
or right-to-left. Someone said to insert left-to-right.
# let t1 = Empty;;
val t1 : int_tree = Empty
# let t2 = insert(12, t1);;
val t2 : int_tree = Node (12, Empty, Empty)
# let t3 = insert(38, t2);;
val t3 : int_tree = Node (12, Empty, Node (38, Empty, Empty))
# let t4 = insert(97, t3);;
val t4 : int_tree =
Node (12, Empty, Node (38, Empty, Node (97, Empty, Empty)))
# let t5 = insert(5, t4);;
val t5 : int_tree =
Node (12, Node (5, Empty, Empty),
Node (38, Empty, Node (97, Empty, Empty)))
The tree being constructed looks like this:
          +----+
          | 12 |
          +----+
         /      \
    +---+        +----+
    | 5 |        | 38 |
    +---+        +----+
                       \
                        +----+
                        | 97 |
                        +----+
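The earlier remark that insert returns a new tree instead of mutating the old one can be checked directly. The following is a minimal sketch that repeats the type and insert definitions so it runs on its own; t4_unchanged and shares_right_subtree are hypothetical names introduced only for this check.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

let t4 = insert (97, insert (38, insert (12, Empty)))
let t5 = insert (5, t4)

(* t4 still has its old shape after the call that produced t5 *)
let t4_unchanged =
  t4 = Node (12, Empty, Node (38, Empty, Node (97, Empty, Empty)))

(* The right subtree was not on the insertion path, so t5 reuses it.
   (==) is physical equality: true when both sides are the same object. *)
let shares_right_subtree =
  match t4, t5 with
  | Node (_, _, r4), Node (_, _, r5) -> r4 == r5
  | _ -> false
```

Only the nodes along the path from the root to the insertion point are rebuilt; everything else is shared between the old tree and the new one.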
To insert a sequence of values, you can use list recursion calling the insert
function repeatedly:
let rec insert_all(lst) =
    match lst with
    | [] -> Empty
    | x::xs -> insert(x, insert_all(xs))
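One way to sanity-check insert_all is to read the values back with an in-order traversal, which should list them in sorted order if the result is a valid binary search tree. This is only a sketch: to_list is a hypothetical helper that was not part of the lecture code.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

let rec insert_all lst =
  match lst with
  | [] -> Empty
  | x :: xs -> insert (x, insert_all xs)

(* Hypothetical helper: left subtree, then root, then right subtree *)
let rec to_list t =
  match t with
  | Empty -> []
  | Node (root, left, right) -> to_list left @ (root :: to_list right)

let sorted = to_list (insert_all [12; 38; 97; 5])
```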
Then we wrote a function for finding the height of a tree. I mentioned that
I'm using a slightly different definition for the height of a tree. In the
usual definition, the empty tree has a height of -1. I prefer to define the
height of the empty tree as 0, so this is returning a count of the number of
levels in the tree:
let rec height(t) =
    match t with
    | Empty -> 0
    | Node(root, left, right) -> max (height left) (height right) + 1
In writing this, we had to use parentheses slightly differently because the
built-in max function is a curried function. Notice how we follow max by two
parenthesized calls on height.

I pointed out that we are not using the value of "root" (the data stored at the
root). This is a good place to use an anonymous variable, which you indicate
with an underscore:
let rec height(t) =
    match t with
    | Empty -> 0
    | Node(_, left, right) -> max (height left) (height right) + 1
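To see the level-counting definition at work on small cases, here is a sketch that applies height to the empty tree, a single leaf, and the four-node tree built earlier from 12, 38, 97, and 5. The names h_empty, h_leaf, and h_t5 are made up for the example.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

let rec height t =
  match t with
  | Empty -> 0
  | Node (_, left, right) -> max (height left) (height right) + 1

(* The empty tree has no levels *)
let h_empty = height Empty

(* A single node counts as one level *)
let h_leaf = height (Node (38, Empty, Empty))

(* 12 at the top, 5 and 38 below it, 97 below 38: three levels *)
let t5 = insert (5, insert (97, insert (38, insert (12, Empty))))
let h_t5 = height t5
```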
In the interpreter, I constructed a tree with ten thousand random values and
asked for its height by saying:
let t = insert_all(random_numbers(10000))
height(t)
We found that the height was around 33 even though we haven't done anything
special to balance the tree.

Then I spent some time discussing the code for insert_all.
let rec insert_all(lst) =
    match lst with
    | [] -> Empty
    | x::xs -> insert(x, insert_all(xs))
I asked whether this will insert from left-to-right or right-to-left given a
list like our testing list of [12; 38; 97; 5]. Someone said it will insert
from right-to-left. In other words, it will compute:
insert(12, insert(38, insert(97, insert(5, Empty))))
This is a good example of where we can use a folding operation. The reduce
function we've looked at isn't powerful enough to capture this process, but
List.fold_right is able to handle this. I asked for its syntax in the
interpreter:
# List.fold_right;;
- : ('a -> 'acc -> 'acc) -> 'a list -> 'acc -> 'acc = <fun>
The first argument is a function. We have a function called insert,
but it has the wrong form because it is not curried: it takes a single
tuple rather than two separate arguments. But we can use the function
called curry that I have included in our utility file to convert it to
curried form:
# insert;;
- : int * int_tree -> int_tree = <fun>
# (curry insert);;
- : int -> int_tree -> int_tree = <fun>
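The utility file is not reproduced in these notes, so the following is only a guess at what curry might look like: a one-line definition consistent with the types shown above. The names leaf, insert_seven, and two_nodes are hypothetical.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

(* Assumed definition: turns a function on pairs into a curried function *)
let curry f a b = f (a, b)

(* The curried form takes its two arguments one at a time *)
let leaf = curry insert 5 Empty

(* Supplying only the first argument yields a function on trees *)
let insert_seven = curry insert 7
let two_nodes = insert_seven leaf
```

Being able to supply arguments one at a time is what lets a function be passed to the fold functions, which call it as f x acc instead of f (x, acc).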
The fold_right function also takes a list and an initial value to use
for the accumulator, so we can define a variation of insert_all by
saying:
let insert_all2(lst) = List.fold_right (curry insert) lst Empty
We saw that this function produced the same result as our original
insert_all.

What if we wanted to fold from left-to-right? I asked the interpreter for the
general form of List.fold_left:
# List.fold_left;;
- : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>
There is a problem here because when we fold from the left, fold_left expects a
function whose first parameter is the accumulator (our initial value) and whose
second parameter is the list element. Using our test list of [12; 38; 97; 5],
the first call it will make is insert(Empty, 12). But we wrote insert to take
the parameters in the other order (value first, tree second). Can we fix it so
that it takes the parameters in the opposite order?

This issue can come up in other contexts. For example, suppose you want to
write a function to divide an int value by 2, as in:
let halve(n) = n / 2
What if we wanted to instead partially instantiate the division operator? The
problem is we want to provide the value 2 and have the other parameter be
unspecified, but they're in the wrong order. We can fix that by making a
function that switches the order of the parameters for a curried function like
this:
let switch f a b = f b a
Given this function, we can define halve more simply:
let halve = switch (/) 2
We made a call on map to verify that this is working:
# map(halve, 1--10);;
- : int list = [0; 1; 1; 2; 2; 3; 3; 4; 4; 5]
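The map and -- helpers used in that call come from the course utility file. As a sketch that runs with only the standard library, List.map and List.init can stand in for them; one_to_ten, halves, minus_three, and seven are hypothetical names for this example.

```ocaml
(* switch swaps the two arguments of a curried function *)
let switch f a b = f b a

(* halve n computes n / 2, because switch (/) 2 n is (/) n 2 *)
let halve = switch (/) 2

(* Standard library stand-ins for the utility map and -- *)
let one_to_ten = List.init 10 (fun i -> i + 1)
let halves = List.map halve one_to_ten

(* The same trick works for any operator where order matters *)
let minus_three = switch (-) 3
let seven = minus_three 10
```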
We can do something similar with the curried version of insert to switch the
order of its parameters:
let insert_all3 = List.fold_left (switch (curry insert)) Empty
Notice that this is a partially instantiated function: the list to process is
the last parameter of List.fold_left, so we can leave it off. This version
produced the tree we had seen before that is obtained by processing the values
from left to right:
# insert_all3(test);;
- : int_tree =
Node (12, Node (5, Empty, Empty), Node (38, Empty, Node (97, Empty, Empty)))
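To compare the two folds side by side, here is a self-contained sketch. The definitions of curry and switch are assumptions matching the types shown earlier (the utility file itself is not reproduced here), and right_to_left and left_to_right are hypothetical names.

```ocaml
type int_tree = Empty | Node of int * int_tree * int_tree

let rec insert (value, tree) =
  match tree with
  | Empty -> Node (value, Empty, Empty)
  | Node (root, left, right) ->
      if value <= root then Node (root, insert (value, left), right)
      else Node (root, left, insert (value, right))

(* Assumed helper definitions *)
let curry f a b = f (a, b)
let switch f a b = f b a

let insert_all2 lst = List.fold_right (curry insert) lst Empty
let insert_all3 = List.fold_left (switch (curry insert)) Empty

let test = [12; 38; 97; 5]

(* fold_right f [a; b; c] init computes f a (f b (f c init)),
   so 5 is inserted first and becomes the root *)
let right_to_left = insert_all2 test

(* fold_left f init [a; b; c] computes f (f (f init a) b) c,
   so 12 is inserted first and becomes the root *)
let left_to_right = insert_all3 test
```

Both trees hold the same four values, but their shapes differ because the insertion order differs.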