CSE341 Notes for Monday, 4/20/09

We spent some time finishing up the intTree example from the previous lecture:

        datatype intTree = Empty | Node of int * intTree * intTree;

        fun insert(n, Empty) = Node(n, Empty, Empty)
        |   insert(n, Node(root, left, right)) =
                  if n <= root then Node(root, insert(n, left), right)
                  else Node(root, left, insert(n, right));
This defines a binary tree of int values and the insert function implements the standard binary search tree insertion algorithm. To insert a sequence of values, you can use list recursion calling the insert function repeatedly:

        fun insertAll([]) = Empty
        |   insertAll(x::xs) = insert(x, insertAll(xs));
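As a quick check of these two functions, the sketch below repeats the definitions and adds an in-order traversal. The inorder function and the sample list are not from lecture; they are assumed here only to show that the tree keeps its values in sorted order. Note that insertAll inserts the last element of the list first, because the recursive call is evaluated before the outer call on insert.

```sml
datatype intTree = Empty | Node of int * intTree * intTree;

fun insert(n, Empty) = Node(n, Empty, Empty)
|   insert(n, Node(root, left, right)) =
        if n <= root then Node(root, insert(n, left), right)
        else Node(root, left, insert(n, right));

fun insertAll([]) = Empty
|   insertAll(x::xs) = insert(x, insertAll(xs));

(* hypothetical helper: left subtree, then root, then right subtree *)
fun inorder(Empty) = []
|   inorder(Node(root, left, right)) = inorder(left) @ [root] @ inorder(right);

val t = insertAll([3, 1, 2]);   (* 2 is inserted first, so it becomes the root *)
val sorted = inorder(t);        (* [1,2,3] *)
```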
Then we wrote a function for finding the height of a tree. I mentioned that I'm using a slightly different definition for the height of a tree. In the usual definition, the empty tree has a height of -1. I prefer to define the height of the empty tree as 0, so this is returning a count of the number of levels in the tree:

    fun height(Empty) = 0
    |   height(Node(root, left, right)) = 1 + Int.max(height(left), height(right));
I pointed out that we are not using the value of "root" (the data stored at the root). This is a good place to use an anonymous variable, which you indicate with an underscore:

    fun height(Empty) = 0
    |   height(Node(_, left, right)) = 1 + Int.max(height(left), height(right));
In the interpreter, I constructed a tree with 100,000 random values and asked for its height by saying:

        val x = insertAll(randList(100000));
        height(x);
We found that the height was around 40 even though we hadn't done anything special to balance the tree.
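The order of the input matters for a result like this. As a small sketch (the two lists below are made-up examples, not the data from lecture), compare a tree built from values in a mixed order with one built from values that are already sorted, where every insertion goes to the same side:

```sml
datatype intTree = Empty | Node of int * intTree * intTree;

fun insert(n, Empty) = Node(n, Empty, Empty)
|   insert(n, Node(root, left, right)) =
        if n <= root then Node(root, insert(n, left), right)
        else Node(root, left, insert(n, right));

fun insertAll([]) = Empty
|   insertAll(x::xs) = insert(x, insertAll(xs));

fun height(Empty) = 0
|   height(Node(_, left, right)) = 1 + Int.max(height(left), height(right));

(* insertAll inserts from the end of the list, so 4 becomes the root,
   then 2 and 6 become its children, and so on *)
val mixed = height(insertAll([7, 5, 3, 1, 6, 2, 4]));    (* 3 *)

(* sorted input: 7 is inserted first and every later value goes left *)
val chain = height(insertAll([1, 2, 3, 4, 5, 6, 7]));    (* 7 *)
```

With sorted input the tree degenerates into a linked list, so its height equals the number of values.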

Then I turned to a new topic: the option type. It solves a problem that comes up often in programming. Consider reading data from a file. Generally there is data to read, but what should be returned when you reach the end of the file? I asked people if they knew what happens when you call the Scanner class's nextLine method when there is no more input. Someone pointed out that it throws an exception. I asked if people knew what the built-in System.in variable returns when you call its read method. Nobody seemed to know. I said that the read method returns a value of type int, and when you reach end of file it returns -1 as a way to say, "There is no more legal input to read." Another reading class in the Java class libraries, BufferedReader, has a readLine method that returns null when you attempt to read beyond the end of a file.

None of these approaches is particularly elegant. What we want is the ability to return different things in different cases. If reading succeeds, we return a value. If it fails, we return a value that would correspond to "nothing". This is what the option type is used for in ML.

As Ullman explains in chapter 4, the TextIO structure has a function called openIn that can be used to open a text file. Once you do that, you can use the function inputLine to read individual lines. The return type of inputLine is a "string option". That means it returns SOME applied to a string when there is a line to read and NONE when there isn't. NONE and SOME are the two constructors of the option type. In fact, you can ask about these in the ML interpreter:

        - NONE;
        val it = NONE : 'a option
        - SOME;
        val it = fn : 'a -> 'a option
The constructor NONE is like our color constants. It doesn't have a value associated with it. But the SOME constructor takes a value of type 'a and returns an 'a option. Here are some examples:

        - SOME 3;
        val it = SOME 3 : int option
        - SOME "hello";
        val it = SOME "hello" : string option
        - SOME 45.8;
        val it = SOME 45.8 : real option
It takes a while for people to get used to the option type because languages like Java don't have anything that is like it. To extract a value from an option, you can call the function valOf. So you might say:

        - val x = SOME 82;
        val x = SOME 82 : int option
        - valOf(x);
        val it = 82 : int
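One caution, based on the standard basis library rather than anything shown in lecture: valOf only works on a SOME value. Applied to NONE, it raises the exception Option. The sketch below shows that, along with the basis function getOpt, which takes a default value to use in the NONE case. The variable names are made up for this example.

```sml
val x = SOME 82;
val y : int option = NONE;

val a = valOf(x);          (* 82 *)
val b = getOpt(x, 0);      (* 82, the default is ignored *)
val c = getOpt(y, 0);      (* 0, the default is used *)

(* valOf(y) raises Option, so handle the exception to recover *)
val d = valOf(y) handle Option.Option => ~1;   (* ~1 *)
```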
We often don't need the valOf function because we instead use pattern matching to define a function that operates on an option, as in:

        fun f(NONE) = 0
        |   f(SOME n) = 2 * n;
The option type is predefined in ML, but its definition is fairly simple:

        datatype 'a option = NONE | SOME of 'a;
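Returning to the file-reading problem that motivated the option type, here is a minimal sketch of reading every line of a file. It assumes a basis library in which TextIO.inputLine returns a string option (the current standard basis does; some older implementations returned a plain string). The name readLines is chosen here for illustration and is not from lecture.

```sml
(* returns the lines of the named file as a list of strings *)
fun readLines(filename) =
    let
        val ins = TextIO.openIn(filename)
        (* NONE means end of file; SOME line carries the text that was read *)
        fun loop() =
            case TextIO.inputLine(ins) of
                NONE => []
              | SOME line => line :: loop()
        val lines = loop()
    in
        TextIO.closeIn(ins);
        lines
    end;
```

Each string in the result still ends with its newline character. The pattern match on NONE and SOME replaces the special values like -1 and null that the Java approaches rely on.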
Then I turned to another new topic. I showed the following code:

        val x = 3;
        fun f(n) = x * n;
        f(8);
        val x = 5;
        f(8);
We found that the function uses the binding of x to 3 that exists when the function is defined. Changing the binding for x does not change the behavior of the function. My question is, how does that work? How does ML manage to figure that out?

The answer involves understanding two important concepts: environments and closures.

So in ML we really should think of function definitions as being a pair of things: some code to be evaluated when the function is called and an environment to use in executing that code. This pair has a name. We refer to this as the closure of a function.

Remember that functions can have free variables in them, as in our function f that refers to a variable x that is not defined inside the function. The idea of a closure is that we attach a context to the code in the body of the function to "close" all of these stray references.
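To see that each function really does carry its own environment, consider this sketch. The names makeMultiplier, triple, and quintuple are hypothetical and were not part of the lecture. Inside the anonymous function, k is a free variable, and each call on makeMultiplier produces a closure with its own binding for k:

```sml
(* each call builds a function whose closure holds its own binding for k *)
fun makeMultiplier(k) = fn n => k * n;

val triple = makeMultiplier(3);
val quintuple = makeMultiplier(5);

val a = triple(8);       (* 24 *)
val b = quintuple(8);    (* 40 *)

val k = 100;             (* a new top-level k, unrelated to either closure *)
val c = triple(8);       (* still 24 *)
```

The two functions share the same code but have different environments, which is why it helps to think of a function as the pair.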

We explored some examples to understand the difference between a val declaration, which fully evaluates the code included in the declaration, and a function definition, which delays evaluating the code used in the definition. For example, I included some sequence expressions with calls to print to show that val declarations are fully evaluated:

        - val x = 3;
        val x = 3 : int
        - val y = (print("hello\n"); 2 * x);
        hello
        val y = 6 : int
        - fun f1(n) = (print("hello\n"); 2 * x);
        val f1 = fn : 'a -> int
        - f1(3);
        hello
        val it = 6 : int
        - f1(10);
        hello
        val it = 6 : int
For a val declaration, the print is performed when you type in the declaration, indicating that ML is evaluating the code at that time. For the function definition, the print happens only when the function is called, indicating that ML delays evaluating the expression until a call is made.
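The same delay applies to anonymous functions. As a sketch (the name later is made up for this example), wrapping an expression in fn () => ... postpones its evaluation until the function is applied to ():

```sml
val x = 3;

(* the body is not run here; only a closure is built *)
val later = fn () => (print("hello\n"); 2 * x);

val first = later();     (* prints hello, gives 6 *)
val second = later();    (* prints hello again, gives 6 *)
```

Even though later is introduced with val, nothing is printed when it is declared, because the value being bound is the function itself and not the result of running its body.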


Stuart Reges
Last modified: Mon Apr 20 14:45:17 PDT 2009