When you write a large amount of code, you start having trouble managing the namespace. For example, what if you load two different ML files that each define a function with the same name? The second file loaded wipes out the definition from the first file. And what if you have a utility like toString that you want to implement for many different types? A structure provides a way to establish independent namespaces. For example, ML has a function called Real.toString that is part of a structure called Real and also a function called Int.toString that is part of a structure called Int.
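Here is a minimal sketch of how structures keep same-named functions apart. Int.toString and Real.toString come from the standard library; the structures Metric and Imperial and their describe functions are hypothetical names made up for this illustration.

```sml
(* Two structures can each define a function named describe
   without either definition wiping out the other. *)
structure Metric = struct
    fun describe(n) = Int.toString(n) ^ " cm"
end;

structure Imperial = struct
    fun describe(n) = Int.toString(n) ^ " in"
end;

val a = Metric.describe(30);     (* "30 cm" *)
val b = Imperial.describe(12);   (* "12 in" *)

(* The standard library uses the same idea for toString. *)
val c = Int.toString(42);        (* "42" *)
val d = Real.toString(3.5);      (* "3.5" *)
```

Each call names the structure it wants, so the two describe functions never collide.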
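The basic form of a structure is:

    structure <name> = struct
        <definitions>
    end;

We explored this by writing a structure called Rational for working with rational numbers. So our overall structure will look like this: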

    structure Rational = struct
        (* we'll fill in definitions here *)
    end;

We began with a datatype definition. Rational numbers are numbers that can be expressed as a ratio of two integers, so it would make sense to store them as a tuple of two ints. But most of us don't think of integers like 23 as a tuple. We know that 23 is a rational number, but we don't like to think of it as being "23 divided by 1". So we included two different cases, one for whole numbers and one for rationals that we need to express as a fraction:

    structure Rational = struct
        datatype rational = Whole of int | Fraction of int * int
    end;

I asked if there are any illegal rational numbers. Someone mentioned that numbers like "5 divided by 0" aren't legal rational numbers. So I included an exception that we can raise if any of our functions encounter an illegal rational number:

    structure Rational = struct
        datatype rational = Whole of int | Fraction of int * int
        exception NotARational
    end;

If we were providing a full-blown implementation, we'd include functions for adding, subtracting, multiplying and dividing such numbers. For our purposes, we'll implement just an add function as a way to explore these issues.
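Before moving on to add, here is one way the NotARational exception might be raised. This is only a sketch: the helper check is a hypothetical function that was not part of what we wrote in lecture, and the definitions are shown at top level rather than inside the structure to keep the example short.

```sml
datatype rational = Whole of int | Fraction of int * int;
exception NotARational;

(* Hypothetical helper: pass a rational through unchanged,
   or raise NotARational if its denominator is zero. *)
fun check(Whole(i)) = Whole(i)
  | check(Fraction(a, b)) =
        if b = 0 then raise NotARational else Fraction(a, b);

val ok = check(Fraction(5, 2));   (* Fraction (5, 2) *)

(* A caller can recover from the exception with handle. *)
val caught = (check(Fraction(5, 0)); "accepted")
             handle NotARational => "rejected";   (* "rejected" *)
```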
The signature for the add function is that it takes a tuple of rationals (rational * rational) and it returns a rational. Because there are two forms for a rational, we end up with four total cases for add:

    fun add(Whole(i), Whole(j)) = ?
      | add(Whole(i), Fraction(c, d)) = ?
      | add(Fraction(a, b), Whole(j)) = ?
      | add(Fraction(a, b), Fraction(c, d)) = ?

It took us a while to work out the math, but together we filled in the following definitions:

    fun add(Whole(i), Whole(j)) = Whole(i + j)
      | add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
      | add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
      | add(Fraction(a, b), Fraction(c, d)) = Fraction(a * d + c * b, b * d)

Someone pointed out that we should be reducing the result we get when we add two fractional numbers together. For example, if you were to add the rational numbers 1/8 and 1/4, our function would produce the rational number 12/32. Really this should be reduced to its lowest terms of 3/8.
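To see the problem concretely, the sketch below runs this version of add on a few inputs. The datatype is repeated at top level so the example stands on its own; the names r1, r2 and r3 are just labels for this illustration.

```sml
datatype rational = Whole of int | Fraction of int * int;

fun add(Whole(i), Whole(j)) = Whole(i + j)
  | add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
  | add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
  | add(Fraction(a, b), Fraction(c, d)) = Fraction(a * d + c * b, b * d);

val r1 = add(Whole(2), Whole(3));               (* Whole 5 *)
val r2 = add(Whole(2), Fraction(1, 3));         (* Fraction (7, 3) *)
val r3 = add(Fraction(1, 8), Fraction(1, 4));   (* Fraction (12, 32), not reduced *)
```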
How do we reduce a rational number? We find the greatest common divisor of the numerator and the denominator. And how do we find the GCD of two numbers? Someone suggested using Euclid's algorithm, but people didn't seem to remember the details. The idea behind Euclid's algorithm is that if you're trying to find the gcd(32, 12), then you can subtract multiples of 12 from 32 and get the same answer:

    gcd(32, 12) = gcd(32 - 12, 12) = gcd(32 - 2 * 12, 12) = gcd(8, 12)

I won't try to prove why this works, but I briefly mentioned that you know this is true because of something known as the Fundamental Theorem of Arithmetic. If a number goes evenly into both 32 and 12, then it has to go evenly into (32 - 12) and (32 - 2 * 12).
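This is not a proof, but a quick experiment makes the claim plausible. The function commonDivisors below is a hypothetical helper written only for this check; it assumes both arguments are positive and simply tries every candidate divisor.

```sml
(* List every positive integer that divides both x and y.
   Assumes x and y are both positive. *)
fun commonDivisors(x, y) =
    List.filter (fn d => x mod d = 0 andalso y mod d = 0)
                (List.tabulate(Int.min(x, y), fn i => i + 1));

val step0 = commonDivisors(32, 12);   (* [1, 2, 4] *)
val step1 = commonDivisors(20, 12);   (* 32 - 12     gives [1, 2, 4] *)
val step2 = commonDivisors(8, 12);    (* 32 - 2 * 12 gives [1, 2, 4] *)
```

The set of common divisors is the same at every step, so the greatest one is unchanged too.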
So how do we quickly eliminate all of the multiples of 12 from 32? We can call mod:

    gcd(32, 12) = gcd(32 mod 12, 12) = gcd(8, 12)

We used this as a starting point for a gcd definition:

    fun gcd(x, y) = gcd(x mod y, y);

Of course, this won't work without a base case. So what's our base case? That's a bit tricky, but it turns out that the best base case is to stop when y becomes 0 because then the gcd is simply x (the gcd of 0 and any number n is n):

    fun gcd(x, y) = if y = 0 then x else gcd(x mod y, y);

This almost works. The problem is that the mod trick works well only if x is greater than y. This function makes an infinite series of calls that don't make any progress:

    gcd(32, 12) = gcd(8, 12) = gcd(8, 12) = gcd(8, 12) = ...

We have a nice alternative here. We know that the expression (x mod y) will always be less than y. So we can simply turn around the order of the two arguments in the recursive call:

    fun gcd(x, y) = if y = 0 then x else gcd(y, x mod y);

This works well for positive numbers. Someone asked what happens if x is less than y initially. If that happens, then we basically end up calling gcd(y, x) and that starts the process. It's going to turn out that we'll have to do some work to fix this for negative numbers, but it works for nonnegatives just fine.
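Tracing a few calls by hand shows both the swap and the trouble with negatives. The names g1, g2 and g3 are just labels for this sketch.

```sml
fun gcd(x, y) = if y = 0 then x else gcd(y, x mod y);

(* gcd(32, 12) -> gcd(12, 8) -> gcd(8, 4) -> gcd(4, 0) -> 4 *)
val g1 = gcd(32, 12);

(* x < y: the first call just swaps the arguments.
   gcd(12, 32) -> gcd(32, 12) -> ... -> 4 *)
val g2 = gcd(12, 32);

(* In ML, mod takes the sign of its second argument, so a
   negative argument can produce a negative answer:
   gcd(32, ~12) -> gcd(~12, ~4) -> gcd(~4, 0) -> ~4 *)
val g3 = gcd(32, ~12);
```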
Using gcd, we were able to write a function to reduce a rational to its simplest form:

    fun reduce(Whole(i)) = Whole(i)
      | reduce(Fraction(a, b)) =
            let val d = gcd(a, b)
            in if b = d then Whole(a div d) else Fraction(a div d, b div d)
            end

We then spent some time writing a toString function for rationals:

    fun toString(Whole(i)) = Int.toString(i)
      | toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b)

Putting this all together, we ended up with the following overall structure:

    structure Rational = struct
        datatype rational = Whole of int | Fraction of int * int
        exception NotARational

        fun gcd(x, y) = if y = 0 then x else gcd(y, x mod y)

        fun reduce(Whole(i)) = Whole(i)
          | reduce(Fraction(a, b)) =
                let val d = gcd(a, b)
                in if b = d then Whole(a div d) else Fraction(a div d, b div d)
                end

        fun add(Whole(i), Whole(j)) = Whole(i + j)
          | add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
          | add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
          | add(Fraction(a, b), Fraction(c, d)) = reduce(Fraction(a * d + c * b, b * d))

        fun toString(Whole(i)) = Int.toString(i)
          | toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b)
    end;

We still have many issues left to discuss. For example, should we reduce before we print? After all, we haven't done anything to prevent someone from constructing a rational number that is not in reduced form. We will explore many of these issues on Monday.
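The question about reducing before printing is easy to see with a small experiment. The sketch below repeats gcd, reduce and toString at top level so it stands on its own; raw, s1, s2 and s3 are names chosen just for this illustration.

```sml
datatype rational = Whole of int | Fraction of int * int;

fun gcd(x, y) = if y = 0 then x else gcd(y, x mod y);

fun reduce(Whole(i)) = Whole(i)
  | reduce(Fraction(a, b)) =
        let val d = gcd(a, b)
        in if b = d then Whole(a div d) else Fraction(a div d, b div d)
        end;

fun toString(Whole(i)) = Int.toString(i)
  | toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b);

(* Nothing stops a client from building an unreduced value. *)
val raw = Fraction(12, 32);

val s1 = toString(raw);                      (* "12/32" *)
val s2 = toString(reduce(raw));              (* "3/8" *)
val s3 = toString(reduce(Fraction(6, 3)));   (* "2" *)
```

As written, toString prints whatever it is given, so an unreduced value shows up unreduced unless the caller reduces it first.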
At this point I had just a few minutes left and I realized that it might not be fair to give out the programming assignment I had in mind because I hadn't yet talked about signatures. I gave people three choices:
For the assignment, you actually don't have to know much about signatures because I provide a skeleton file that has the signature you are supposed to use. A signature is like a Java interface. It lists a set of definitions without, in general, providing details about the implementation. So we'll be looking at a signature for rationals:

    signature RATIONAL = sig
        datatype rational = Whole of int | Fraction of int * int
        exception NotARational
        val add : rational * rational -> rational
        val toString : rational -> string
    end

The signature lists the parts of a structure that should be exposed to a client. We can add a simple notation in the header of the structure using the symbol ":>". This is similar to saying that a class implements an interface in Java:

    structure Rational :> RATIONAL = struct ... end;

We say that the Rational structure is restricted to the definitions contained in the RATIONAL signature. For example, the structure has functions called gcd and reduce, but these functions will not be available outside the structure. They are like private methods in a Java class. In Java you label each individual method as public or private. In ML we use a signature to describe a set of definitions that should be exposed to clients of a structure. A structure that is restricted to that signature is required to have all of those elements and anything not mentioned in the signature is considered to be private to the implementation (not visible outside the structure).
After introducing the signature above with the restriction on the Rational structure, we get the following message in the interpreter if we try to access the gcd or reduce functions:

    - Rational.gcd;
    stdIn:1.14-2.13 Error: unbound variable or constructor: gcd in path Rational.gcd
    - Rational.reduce;
    stdIn:1.1-2.2 Error: unbound variable or constructor: reduce in path Rational.reduce

Even if we open the structure, these functions will not be visible. We'll discuss this more in Monday's lecture. For the assignment, the skeleton file includes the signature and the notation to restrict the structure to the signature. This will guarantee that any helper functions you write to implement the structure will be hidden from clients of the structure.
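For reference, here is a sketch of a complete file that assembles the signature and structure from these notes and then uses them the way a client would. Only the two val bindings at the bottom are new; sum and text are names chosen for this example, and this is not the assignment's skeleton file.

```sml
signature RATIONAL = sig
    datatype rational = Whole of int | Fraction of int * int
    exception NotARational
    val add : rational * rational -> rational
    val toString : rational -> string
end;

structure Rational :> RATIONAL = struct
    datatype rational = Whole of int | Fraction of int * int
    exception NotARational

    (* gcd and reduce are not listed in RATIONAL, so they are hidden *)
    fun gcd(x, y) = if y = 0 then x else gcd(y, x mod y)

    fun reduce(Whole(i)) = Whole(i)
      | reduce(Fraction(a, b)) =
            let val d = gcd(a, b)
            in if b = d then Whole(a div d) else Fraction(a div d, b div d)
            end

    fun add(Whole(i), Whole(j)) = Whole(i + j)
      | add(Whole(i), Fraction(c, d)) = Fraction(i * d + c, d)
      | add(Fraction(a, b), Whole(j)) = Fraction(a + j * b, b)
      | add(Fraction(a, b), Fraction(c, d)) = reduce(Fraction(a * d + c * b, b * d))

    fun toString(Whole(i)) = Int.toString(i)
      | toString(Fraction(a, b)) = Int.toString(a) ^ "/" ^ Int.toString(b)
end;

(* Client code: only the names listed in RATIONAL can be used here. *)
val sum = Rational.add(Rational.Fraction(1, 8), Rational.Fraction(1, 4));
val text = Rational.toString(sum);   (* "3/8" *)
```

Because the datatype appears in the signature, clients can still build values with Rational.Whole and Rational.Fraction directly.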