CSE341 Notes for Friday, 2/2/07

I launched into a new topic. I said that in the next couple of lectures we would discuss two important concepts from ML known as structures and signatures. I started with structures.

When you write a large amount of code you find yourself having trouble managing the namespace. For example, what if you load two different ML files that each have a function with a particular name? The second file loaded wipes out the definition from the first file. And what if you have a utility like toString that you want to implement for many different types? A structure provides a way to establish independent namespaces. For example, ML has a function called Real.toString that is part of a structure called Real and also a function called Int.toString that is part of a structure called Int.
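As a small illustration of how the two structures keep their names apart, here is a sketch that calls both toString functions. The val names are arbitrary choices for this example.

```sml
(* Each toString lives in its own structure, so the names do not clash. *)
val intText  = Int.toString 42;       (* "42" *)
val realText = Real.toString 3.5;     (* "3.5" *)

(* The qualified name says which structure's function is meant. *)
val both = intText ^ " and " ^ realText;
```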

The basic form of a structure is:

        structure <name> = struct <definitions> end

The definitions can include the various elements we have been discussing:

function definitions
val declarations
exception definitions
type definitions, as in "type int3 = int * int * int"
datatype definitions
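
To make the list concrete, here is a minimal sketch of a structure that contains one definition of each kind. The structure Demo and everything inside it are hypothetical names made up for illustration (only the int3 type comes from the list above).

```sml
(* Hypothetical structure, for illustration only. *)
structure Demo =
struct
    (* type definition *)
    type int3 = int * int * int

    (* datatype definition *)
    datatype color = Red | Green | Blue

    (* exception definition *)
    exception BadTriple

    (* val declaration *)
    val origin : int3 = (0, 0, 0)

    (* function definition *)
    fun sum3((a, b, c) : int3) = a + b + c
end;

(* Outside the structure, each name is qualified with "Demo." *)
val six  = Demo.sum3(1, 2, 3);
val zero = Demo.sum3(Demo.origin);
```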

I said that we would be looking at an example that I got from Dan Grossman. We will define a structure called Rational that provides a datatype and functions for manipulating rational numbers. We are going to go through several different versions to better understand the details of structures and signatures.

So our overall structure will look like this:

        structure Rational =
        struct
            (* we'll fill in definitions here *)
        end;

We began with a datatype definition. Rational numbers are numbers that can be expressed as a ratio of two integers, so it would make sense to store them as a tuple of two ints. But most of us don't think of integers like 23 as a tuple. We know that 23 is a rational number, but we don't like to think of it as being "23 divided by 1". So we included two different cases, one for whole numbers and one for rationals that we need to express as a fraction:

        structure Rational =
        struct
            datatype rational = Whole of int | Fraction of int * int
        end;

I asked if there are any illegal rational numbers. Someone mentioned that numbers like "5 divided by 0" aren't legal rational numbers. So I included an exception that we can raise if any of our functions encounter an illegal rational number:

        structure Rational =
        struct
            datatype rational = Whole of int | Fraction of int * int
            exception NotARational
        end;

If we were providing a full-blown implementation, we'd include functions for adding, subtracting, multiplying and dividing such numbers. For our purposes, we'll implement just an add function as a way to explore these issues.

The type of the add function is rational * rational -> rational: it takes a tuple of two rationals and it returns a rational. Because there are two forms for a rational, we end up with four total cases for add:

        fun add(Whole(i), Whole(j)) = ?
        |   add(Whole(i), Fraction(c, d)) = ?
        |   add(Fraction(a, b), Whole(j)) = ?
        |   add(Fraction(a, b), Fraction(c, d)) = ?

It took us a while to work out the math, but together we filled in the following definitions:

        fun add(Whole(i), Whole(j)) = Whole(i + j)
        |   add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
        |   add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
        |   add(Fraction(a, b), Fraction(c, d)) =
                Fraction(a * d + c * b, b * d)

Someone pointed out that we should be reducing the result we get when we add two fractional numbers together. For example, if you were to add the rational numbers 1/8 and 1/4, our function would produce the rational number 12/32. This should be reduced to its lowest terms, 3/8.

How do we reduce a rational number? We find the greatest common divisor of the numerator and the denominator and divide both by it. And how do we find the GCD of two numbers? Someone suggested using Euclid's algorithm, but people didn't seem to remember the details. The idea behind Euclid's algorithm is that if you're trying to find gcd(32, 12), then you can subtract multiples of 12 from 32 and get the same answer:

        gcd(32, 12) = gcd(32 - 12, 12) = gcd(32 - 2 * 12, 12) = gcd(8, 12)

I won't try to prove why this works, but I briefly mentioned that it follows from a basic property of divisibility. If a number goes evenly into both 32 and 12, then it has to go evenly into (32 - 12) or (32 - 2 * 12).

So how do we quickly eliminate all of the multiples of 12 from 32? We can call mod:

        gcd(32, 12) = gcd(32 mod 12, 12) = gcd(8, 12)

We used this as a starting point for a gcd definition:

        fun gcd(x, y) = gcd(x mod y, y);

Of course, this won't work without a base case. So what's our base case? That's a bit tricky, but it turns out that the best base case is to stop when y becomes 0 because then the gcd is simply x (the gcd of 0 and any number n is n):

        fun gcd(x, y) =
            if y = 0 then x
            else gcd(x mod y, y);

This almost works. The problem is that the mod trick makes progress only when x is greater than y, and after one step (x mod y) is smaller than y. The function then makes an infinite series of calls that don't make any progress:

        gcd(32, 12) = gcd(8, 12) = gcd(8, 12) = gcd(8, 12) = ...

We have a nice alternative here. We know that the expression (x mod y) will always be less than y. So we can simply turn around the order of the two arguments in the recursive call:

        fun gcd(x, y) =
            if y = 0 then x
            else gcd(y, x mod y);

This works well for positive numbers. Someone asked what happens if x is less than y initially. If that happens, then we basically end up calling gcd(y, x) and that starts the process. It's going to turn out that we'll have to do some work to fix this for negative numbers, but it works for nonnegatives just fine.
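
As a sketch of how the calls unfold, the following traces a few cases by hand in the comments. The val names are arbitrary, and the negative case is only there to show the kind of behavior that still needs fixing, not a fix.

```sml
(* The gcd function from above. *)
fun gcd(x, y) =
    if y = 0 then x
    else gcd(y, x mod y);

(* gcd(32, 12) -> gcd(12, 8) -> gcd(8, 4) -> gcd(4, 0) -> 4 *)
val a = gcd(32, 12);

(* With x < y the first call just swaps the arguments:
   gcd(12, 32) -> gcd(32, 12) -> ... -> 4 *)
val b = gcd(12, 32);

(* In ML, x mod y takes the sign of y, so a negative argument
   can lead to a negative answer:
   gcd(32, ~12) -> gcd(~12, ~4) -> gcd(~4, 0) -> ~4 *)
val c = gcd(32, ~12);
```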

Using gcd, we were able to write a function to reduce a rational to its simplest form:

        fun reduce(Whole(i)) = Whole(i)
        |   reduce(Fraction(a, b)) =
                let val d = gcd(a, b)
                in if b = d then Whole(a div d)
                   else Fraction(a div d, b div d)
                end
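
Here is a short sketch that exercises reduce on its own, outside of any structure. The datatype and gcd are repeated so that the example can be loaded by itself, and the test values are just ones picked for illustration.

```sml
datatype rational = Whole of int | Fraction of int * int;

fun gcd(x, y) =
    if y = 0 then x
    else gcd(y, x mod y);

fun reduce(Whole(i)) = Whole(i)
|   reduce(Fraction(a, b)) =
        let val d = gcd(a, b)
        in if b = d then Whole(a div d)
           else Fraction(a div d, b div d)
        end;

(* gcd(12, 32) is 4, so 12/32 becomes 3/8 *)
val r1 = reduce(Fraction(12, 32));

(* gcd(6, 3) is 3, which equals the denominator, so we get Whole 2 *)
val r2 = reduce(Fraction(6, 3));

(* whole numbers are returned unchanged *)
val r3 = reduce(Whole 5);
```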

We then spent some time writing a toString function for rationals:

        fun toString(Whole(i)) = Int.toString(i)
        |   toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b)

Putting this all together, we ended up with the following overall structure:

        structure Rational =
        struct
            datatype rational = Whole of int | Fraction of int * int
            exception NotARational

            fun gcd(x, y) =
                if y = 0 then x
                else gcd(y, x mod y)

            fun reduce(Whole(i)) = Whole(i)
            |   reduce(Fraction(a, b)) =
                    let val d = gcd(a, b)
                    in if b = d then Whole(a div d)
                       else Fraction(a div d, b div d)
                    end

            fun add(Whole(i), Whole(j)) = Whole(i + j)
            |   add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
            |   add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
            |   add(Fraction(a, b), Fraction(c, d)) =
                    reduce(Fraction(a * d + c * b, b * d))

            fun toString(Whole(i)) = Int.toString(i)
            |   toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b)
        end;

We still have many issues left to discuss. For example, should we reduce before we print? After all, we haven't done anything to prevent someone from constructing a rational number that is not in reduced form. We will explore many of these issues on Monday.
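
To see the issue concretely, here is a sketch of a client using the structure as it stands. The structure is repeated so the example loads by itself, and the val names in the client code are arbitrary.

```sml
structure Rational =
struct
    datatype rational = Whole of int | Fraction of int * int
    exception NotARational

    fun gcd(x, y) =
        if y = 0 then x
        else gcd(y, x mod y)

    fun reduce(Whole(i)) = Whole(i)
    |   reduce(Fraction(a, b)) =
            let val d = gcd(a, b)
            in if b = d then Whole(a div d)
               else Fraction(a div d, b div d)
            end

    fun add(Whole(i), Whole(j)) = Whole(i + j)
    |   add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
    |   add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
    |   add(Fraction(a, b), Fraction(c, d)) =
            reduce(Fraction(a * d + c * b, b * d))

    fun toString(Whole(i)) = Int.toString(i)
    |   toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b)
end;

(* add reduces its result: 1/8 + 1/4 = 12/32 = 3/8 *)
val sum = Rational.add(Rational.Fraction(1, 8), Rational.Fraction(1, 4));
val sumText = Rational.toString(sum);

(* Nothing stops a client from building an unreduced value directly,
   and toString prints it exactly as given. *)
val rawText = Rational.toString(Rational.Fraction(12, 32));
```

Here sumText is "3/8" but rawText is "12/32", which is the inconsistency the question about reducing before printing is getting at.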

At this point I had just a few minutes left and I realized that it might not be fair to give out the programming assignment I had in mind because I hadn't yet talked about signatures. I gave people three choices:

I could give out the 50-point assignment I had been planning, but it would mean that people would have to learn some things on their own.
I could wait until Monday and give a 25-point assignment that would be somewhat easier.
I could cancel the assignment and try to find a way to include this on the midterm.

The vote was overwhelmingly in favor of the first choice. The Ullman book covers structures (section 8.2), signatures (8.2.1), and how the two work together (8.2.2), so you can read that over. We'll discuss it more on Monday.

For the assignment, you actually don't have to know much about signatures because I provide a skeleton file that has the signature you are supposed to use. A signature is like a Java interface. It lists a set of definitions without, in general, providing details about the implementation. So we'll be looking at a signature for rationals:

        signature RATIONAL =
        sig
           datatype rational = Whole of int | Fraction of int * int
           exception NotARational
           val add : rational * rational -> rational
           val toString : rational -> string
        end

The signature lists the parts of a structure that should be exposed to a client. We can add a simple notation in the header of the structure using the symbol ":>". This is similar to saying that a class implements an interface in Java:

        structure Rational :> RATIONAL =
        struct
            ...
        end;

We say that the Rational structure is restricted to the definitions contained in the RATIONAL signature. For example, the structure has functions called gcd and reduce, but these functions will not be available outside the structure. They are like private methods in a Java class. In Java you label each individual method as public or private. In ML we use a signature to describe a set of definitions that should be exposed to clients of a structure. A structure that is restricted to that signature is required to have all of those elements and anything not mentioned in the signature is considered to be private to the implementation (not visible outside the structure).

After introducing the signature above with the restriction on the Rational structure, we get the following message in the interpreter if we try to access the gcd or reduce functions:

        - Rational.gcd;
        stdIn:1.14-2.13 Error: unbound variable or constructor: gcd in path
        Rational.gcd
        - Rational.reduce;
        stdIn:1.1-2.2 Error: unbound variable or constructor: reduce in path
        Rational.reduce

Even if we open the structure, these functions will not be visible. We'll discuss this more in Monday's lecture. For the assignment, the skeleton file includes the signature and the notation to restrict the structure to the signature. This will guarantee that any helper functions you write to implement the structure will be hidden from clients of the structure.
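
Putting the pieces from these notes together, here is a sketch of the signature and the restricted structure in one file, followed by a little client code. The client val names are arbitrary, and the skeleton file for the assignment may differ in its details.

```sml
signature RATIONAL =
sig
    datatype rational = Whole of int | Fraction of int * int
    exception NotARational
    val add : rational * rational -> rational
    val toString : rational -> string
end;

structure Rational :> RATIONAL =
struct
    datatype rational = Whole of int | Fraction of int * int
    exception NotARational

    (* helper: not listed in RATIONAL, so hidden from clients *)
    fun gcd(x, y) =
        if y = 0 then x
        else gcd(y, x mod y)

    (* helper: not listed in RATIONAL, so hidden from clients *)
    fun reduce(Whole(i)) = Whole(i)
    |   reduce(Fraction(a, b)) =
            let val d = gcd(a, b)
            in if b = d then Whole(a div d)
               else Fraction(a div d, b div d)
            end

    fun add(Whole(i), Whole(j)) = Whole(i + j)
    |   add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
    |   add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
    |   add(Fraction(a, b), Fraction(c, d)) =
            reduce(Fraction(a * d + c * b, b * d))

    fun toString(Whole(i)) = Int.toString(i)
    |   toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b)
end;

(* Everything listed in the signature is available to clients. *)
val threeHalves = Rational.add(Rational.Whole 1, Rational.Fraction(1, 2));
val text = Rational.toString(threeHalves);

(* These would be rejected, because gcd and reduce are not in RATIONAL:
   val d = Rational.gcd(32, 12);
   val r = Rational.reduce(Rational.Fraction(12, 32));
*)
```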

Stuart Reges

Last modified: Sat Feb 3 20:06:50 PST 2007