CSE341 Notes for Monday, 4/19/10

Then we turned to a new topic: data types. This moves us into chapter 6 of the Ullman book. The chapter begins by showing how to use the keyword "type" to introduce a type synonym. For example, if you found yourself often dealing with a tuple of 4 ints, you could say:

        type int4 = int * int * int * int;

This sets up a synonym int4 that stands for int * int * int * int. That means that you can then say things like:

        fun f(x:int4) = ...
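
As a minimal sketch of how such a synonym behaves, the following defines a complete function on int4 and a small polymorphic synonym. The names sum4, pair, and swap are made up for this illustration and are not taken from the lecture or the textbook:

```sml
type int4 = int * int * int * int;

(* hypothetical helper: adds the four components of an int4 *)
fun sum4((a, b, c, d) : int4) = a + b + c + d;

(* an ordinary 4-tuple is accepted, because int4 is only another
   name for int * int * int * int, not a new type *)
val total = sum4(1, 2, 3, 4);

(* hypothetical polymorphic synonym: a pair whose parts share a type *)
type 'a pair = 'a * 'a;

fun swap((x, y) : 'a pair) : 'a pair = (y, x);

val swapped = swap(1, 2);
```

Because a synonym does not create a new type, the interpreter is free to report either int4 or the expanded tuple type when it prints results.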

The Ullman book also describes how to define polymorphic synonyms. A more interesting way to define a type is to use the keyword "datatype", which introduces a new type along with a set of constructors for it. For example, you could define a color type by saying:

        - datatype color = Red | Blue | Green;
        datatype color = Blue | Green | Red

We again use the vertical bar or pipe character ("|") to separate different possibilities for the type. This type has three possible forms. This is the ML equivalent of an enum type. This definition introduces a new type called "color". By convention, we use a lowercase letter for the first letter of a type name. It also introduces three constructors called Red, Blue and Green. You can find out about them in the interpreter:

        - Red;
        val it = Red : color

You can also write functions that use these identifiers, as in:

        - fun f(x) = x = Red;
        val f = fn : color -> bool

This function is a predicate that tells you whether or not a certain color is Red. It has fairly predictable results:

        - f(Red);
        val it = true : bool
        - f(Blue);
        val it = false : bool
        - f(Green);
        val it = false : bool
        - f(Yellow);
        stdIn:10.3-10.9 Error: unbound variable or constructor: Yellow

I showed another example that involved assigning each color a tuple of integers that corresponds to its standard RGB value (three integers in the range of 0 to 255 that represent the red, green, and blue components of each):

        fun rgb(Red) = (255, 0, 0)
        |   rgb(Blue) = (0, 0, 255)
        |   rgb(Green) = (0, 255, 0);

I mentioned that ML has a construct known as a case expression that was described in chapter 5 of the textbook. The pattern matching that we are using in function definitions like rgb is really just syntactic sugar for a case expression. The function definition above is converted into the following equivalent definition:

        fun rgb(c) =
            case c of
                Red => (255, 0, 0)
              | Blue => (0, 0, 255)
              | Green => (0, 255, 0);

I am not a fan of the case expression, so I don't use it a lot, although other ML programmers like it.
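
One reason some programmers like the case expression is that it can appear in the middle of a larger expression, where separate function clauses are not available. A minimal sketch of that use, with a made-up function called describe that is not from the lecture:

```sml
datatype color = Red | Blue | Green;

(* hypothetical function: the case expression is one operand of a
   string concatenation, so it is wrapped in parentheses *)
fun describe(c) =
    "The color is " ^
    (case c of
         Red => "red"
       | Blue => "blue"
       | Green => "green") ^ ".";

val s = describe(Blue);
```

The parentheses matter here: without them the last branch of the case would absorb the trailing concatenation.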

I pointed out that the bool type in ML is itself defined by one of these datatype definitions:

        datatype bool = true | false;

And the if/else construct is really just another case of syntactic sugar. For an if/else of this form:

       if e1 then e2 else e3

ML replaces this with the following case expression:

       case e1 of
           true => e2
         | false => e3
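
As a small illustration of this equivalence, here is the same function written both ways. The names absIf and absCase are made up for this sketch:

```sml
(* written with if/else *)
fun absIf(n) = if n < 0 then ~n else n;

(* the same function written with a case expression on the bool
   constructors true and false *)
fun absCase(n) =
    case n < 0 of
        true => ~n
      | false => n;

val a = absIf(~7);
val b = absCase(~7);
```

Both versions have type int -> int and return the same result for every argument.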

I then turned to a more complex example. I said that I wanted to explore the definition of a binary search tree in ML. Ullman uses this example in the book, but he does it with curried functions and makes it polymorphic. I am going to keep it simple by having uncurried functions and a simple tree of ints.

I asked people what binary trees look like and someone said that they can be empty or they have a node with left and right subtrees. This becomes the basis of our type definition:

        datatype intTree = Empty | Node of int * intTree * intTree;

The name of the type is intTree. It has two different forms. The first form uses the constructor Empty and has no associated data. The second form uses the constructor Node and takes a triple composed of the data for this node (an int), the left subtree and the right subtree. Notice how the keyword "of" is used to separate the constructor from the data type description.

Given this definition, we could make an empty tree or a tree of one node simply by saying:

        - Empty;
        val it = Empty : intTree
        - Node(38, Empty, Empty);
        val it = Node (38,Empty,Empty) : intTree

Notice that we use parentheses to enclose the arguments to the Node constructor. The Node constructor is similar to a function, as the ML interpreter will verify:

        - Node;
        val it = fn : int * intTree * intTree -> intTree

It has a slightly different status, as we'll see. In particular, we can use constructors in patterns, which makes our function definitions much clearer.
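
A minimal sketch of both sides of that status follows. The names leaves and isEmpty are made up for this illustration:

```sml
datatype intTree = Empty | Node of int * intTree * intTree;

(* Node behaves like a function value: it can be passed to map,
   which applies it to each triple in the list *)
val leaves = map Node [(1, Empty, Empty), (2, Empty, Empty)];

(* Node can also appear in a pattern, which an ordinary function
   cannot do *)
fun isEmpty(Empty) = true
|   isEmpty(Node(root, left, right)) = false;

val e1 = isEmpty(Empty);
val e2 = isEmpty(Node(38, Empty, Empty));
```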

For example, we wrote the following function to insert a value into a binary search tree of ints.

        fun insert(n, Empty) = Node(n, Empty, Empty)
        |   insert(n, Node(root, left, right)) =
                  if n <= root then Node(root, insert(n, left), right)
                  else Node(root, left, insert(n, right));

If we are asked to insert a value into an empty tree, we simply create a leaf node with the value. Otherwise, we compare the value against the root and insert it into either the left or the right subtree. In a language like Java, we would think of the tree as being changed (mutated). In ML, we instead think of returning a new tree that includes the new value.
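
The following sketch traces a few calls to make the "new tree" idea concrete. The names t1, t2, and t3 are made up for this illustration:

```sml
datatype intTree = Empty | Node of int * intTree * intTree;

fun insert(n, Empty) = Node(n, Empty, Empty)
|   insert(n, Node(root, left, right)) =
          if n <= root then Node(root, insert(n, left), right)
          else Node(root, left, insert(n, right));

val t1 = insert(5, Empty);   (* a single leaf holding 5 *)
val t2 = insert(3, t1);      (* 3 <= 5, so 3 goes into the left subtree *)
val t3 = insert(8, t2);      (* 8 > 5, so 8 goes into the right subtree *)
```

After these calls t1 is still the one-node tree; each call to insert built a new tree and left its argument alone.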

To insert a sequence of values, you can use list recursion calling the insert function repeatedly:

        fun insertAll([]) = Empty
        |   insertAll(x::xs) = insert(x, insertAll(xs));
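
Note the order in which this recursion inserts values: insertAll([3, 1, 2]) expands to insert(3, insert(1, insert(2, Empty))), so the last element of the list becomes the root. The sketch below checks that, and also shows an alternative written with the built-in foldr, which is not the version from lecture. The names t and insertAll2 are made up for this illustration:

```sml
datatype intTree = Empty | Node of int * intTree * intTree;

fun insert(n, Empty) = Node(n, Empty, Empty)
|   insert(n, Node(root, left, right)) =
          if n <= root then Node(root, insert(n, left), right)
          else Node(root, left, insert(n, right));

fun insertAll([]) = Empty
|   insertAll(x::xs) = insert(x, insertAll(xs));

(* 2 is inserted first, then 1, then 3 *)
val t = insertAll([3, 1, 2]);

(* hypothetical alternative: foldr performs the same right-to-left
   sequence of inserts *)
fun insertAll2(xs) = foldr insert Empty xs;
```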

Then we wrote a function for finding the height of a tree. I mentioned that I'm using a slightly different definition for the height of a tree. In the usual definition, the empty tree has a height of -1. I prefer to define the height of the empty tree as 0, so that the function returns a count of the number of levels in the tree:

    fun height(Empty) = 0
    |   height(Node(root, left, right)) = 1 + Int.max(height(left), height(right));

I pointed out that we are not using the value of "root" (the data stored at the root). This is a good place to use an anonymous variable, which you indicate with an underscore:

    fun height(Empty) = 0
    |   height(Node(_, left, right)) = 1 + Int.max(height(left), height(right));
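
To see how insertion order affects the result, here is a small sketch that builds two trees from the same three values. The names bushy and chain are made up for this illustration:

```sml
datatype intTree = Empty | Node of int * intTree * intTree;

fun insert(n, Empty) = Node(n, Empty, Empty)
|   insert(n, Node(root, left, right)) =
          if n <= root then Node(root, insert(n, left), right)
          else Node(root, left, insert(n, right));

fun insertAll([]) = Empty
|   insertAll(x::xs) = insert(x, insertAll(xs));

fun height(Empty) = 0
|   height(Node(_, left, right)) = 1 + Int.max(height(left), height(right));

(* 2 is inserted first and becomes the root, with 1 and 3 below it *)
val bushy = insertAll([3, 1, 2]);

(* 3 is inserted first, then 2, then 1, so every node hangs to the left *)
val chain = insertAll([1, 2, 3]);

val h0 = height(Empty);   (* 0 levels *)
val h1 = height(bushy);   (* 2 levels *)
val h2 = height(chain);   (* 3 levels *)
```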

In the interpreter, I constructed a tree with 1,000,000 random values and asked for its height by saying:

        val x = insertAll(randList(1000000));
        height(x);

We found that the height was around 50 even though we haven't done anything special to balance the tree.
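
The definition of randList is not shown in these notes. For readers who want to try a similar experiment, here is a rough sketch that uses a stand-in generator. The names nextRand and pseudoRandList, the constants, and the seed are all assumptions made for this illustration, and the generator is a simple linear congruential one whose values may repeat, so the height it produces need not match the figure quoted above:

```sml
datatype intTree = Empty | Node of int * intTree * intTree;

fun insert(n, Empty) = Node(n, Empty, Empty)
|   insert(n, Node(root, left, right)) =
          if n <= root then Node(root, insert(n, left), right)
          else Node(root, left, insert(n, right));

fun insertAll([]) = Empty
|   insertAll(x::xs) = insert(x, insertAll(xs));

fun height(Empty) = 0
|   height(Node(_, left, right)) = 1 + Int.max(height(left), height(right));

(* hypothetical generator: x' = (1021 * x + 1) mod 2^20; the constants
   are small enough that the product fits in a 31-bit int *)
fun nextRand(x) = (1021 * x + 1) mod 1048576;

(* hypothetical stand-in for randList: a list of n pseudo-random ints
   starting from a fixed seed *)
fun pseudoRandList(n) =
    let
        fun loop(0, _, acc) = acc
        |   loop(k, x, acc) = loop(k - 1, nextRand(x), x :: acc)
    in
        loop(n, 12345, [])
    end;

val t = insertAll(pseudoRandList(10000));
val h = height(t);
```

A tree of 10,000 nodes needs at least 14 levels, so comparing h against 14 gives a rough sense of how far the unbalanced tree is from the best case.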

Stuart Reges

Last modified: Mon May 10 08:32:37 PDT 2010