CSE341 Notes for Wednesday, 1/10/07

I mentioned that now that we've finished the first three chapters of the Ullman book, I'm likely to start skipping around a bit in the book. For example, I wanted to talk about polymorphism and the Ullman book discusses it, but it glosses over it in chapter 3 and it goes more in depth than we need in chapter 5, so we're going to do something a little in between those two.

I started by asking people what polymorphism means. As with most computer science concepts, you'll find a good description at Wikipedia for polymorphism, but I asked people for their sense of it. Someone mentioned that inheritance gives you polymorphism because you might have two different methods that might be called depending upon the type of object you have. I said that's right. You know from the root words "poly" and "morph" that polymorphism means "many forms". The idea is that a single function call could take one of several forms (could call one of several different methods depending upon the type of objects you are using).

Java has another kind of polymorphism with types like ArrayList<E>. The "E" is a type parameter and this has traditionally been referred to as parametric polymorphism, although the more modern term is simply generic types or generics. The idea is that you can write a single class that can be used for many different types. We write one definition for ArrayList<E>, but then can define an ArrayList<String> or an ArrayList<Point> or whatever.

ML was the first language to have this kind of polymorphism. C++ was another major language that tried to implement this with what are known as templates (although it's rather clumsy in C++). Java now has generics as of Java 5 and generics were added to Microsoft's C# programming language as well.

In an earlier lecture I showed how to write a function that would switch tuples in a list and turn a (string * int) list into an (int * string) list. We used this helper function to do so:

        fun switch(a:string, b:int) = (b, a);

I included the types for a and b because I didn't want to talk about polymorphism at that time. But now we do want to talk about it, so I typed the function in the interpreter without the types:

        fun switch(a, b) = (b, a);

The interpreter responded by saying:

        val switch = fn : 'a * 'b -> 'b * 'a

The 'a and 'b are like the E in ArrayList<E> in Java. They are type parameters. ML is telling us that we can provide this function with any tuple whatsoever and that the result is a tuple in the opposite order. So we can feed it a string/int combination as we did before:

        switch("hello", 3);

which returns an int * string, or we can feed it an int/int combination:

        switch(3, 18);

which returns an int * int, or we can feed it a real and a list:

        switch(3.8, [2, 3]);

which returns an int list * real. The function is polymorphic in that it can take any combination of types 'a and 'b.
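To see the three calls side by side, here is a small sketch that can be loaded into the interpreter. The names p1 through p4 are just labels chosen for this illustration, and the types in the comments are what the inferred type 'a * 'b -> 'b * 'a implies for each call.

```sml
(* switch as defined in these notes: no type annotations needed *)
fun switch(a, b) = (b, a);

(* The same function used at three different pairs of types. *)
val p1 = switch("hello", 3);     (* int * string *)
val p2 = switch(3, 18);          (* int * int *)
val p3 = switch(3.8, [2, 3]);    (* int list * real *)

(* Applying switch twice gives back the original pair. *)
val p4 = switch(switch("hello", 3));   (* string * int *)
```

Each call picks its own types for 'a and 'b, so nothing about one call constrains the next.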

I then talked about how to implement a function called member that would return true or false depending upon whether a particular value is a member of a list. I asked what kind of lists would make it easy to answer this question and someone said an empty list, in which case the answer is false, so we began with:

        fun member(v, []) = false

What else would be easy? Someone said that if the list begins with v, then we'd know it's a member, so we tried saying:

        fun member(v, []) = false
        |   member(v, v::vs) = true;

When I loaded this definition into ML, we got an error message:

        Error: duplicate variable in pattern(s): v

Pattern matching is limited in what it can handle. In particular, ML doesn't allow the same variable to appear twice in a pattern, so we can't use a pattern to say that two values must be equal. But we can do the same kind of test ourselves with an if/else:

        fun member(v, []) = false
        |   member(v, x::xs) =
                if x = v then true
                else ...

I asked people what to fill in for the else part and they said we'd recursively search for v in the xs:

        fun member(v, []) = false
        |   member(v, x::xs) =
                if x = v then true
                else member(v, xs);

This is a correct implementation of the function. But when we loaded it into ML, we got a strange response:

        wed.sml:7.11 Warning: calling polyEqual
        val member = fn : ''a * ''a list -> bool

The warning is not a problem. It's just letting us know that because we used an equality comparison in the function, it is using something called "polyEqual" (polymorphic equals). As a result, our type parameter is ''a instead of 'a. ML uses the two quote marks as a way to indicate that this type parameter is slightly more constrained than usual. In particular, we are limited to types that support tests on equality. We were able to call member with lists of ints, lists of strings, even lists of tuples that had ints and strings. But we weren't able to call it on a list of reals. It generated this error message:

    stdIn:1.1-5.4 Error: operator and operand don't agree [equality type required]
      operator domain: ''Z * ''Z list
      operand:         real * real list
      in expression:
        member (3.8,4.7 :: nil)

The key thing to pay attention to here is "equality type required". The type real is not an equality type. I asked if anyone could think of other types that can't be compared for equality and someone said functions. We verified that by asking the interpreter whether:

        member = member;

We got a similar message about an equality type being required. This is a case where ML should be able to figure out that the functions are equal, but in general, this is an undecidable problem. In other words, ML couldn't have a general purpose solution to this even if it wanted to because it is not possible to always determine whether two functions are equivalent in behavior.
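For contrast with the failing calls, here is a short sketch of calls that do type-check because every element type involved is an equality type. The names m1 through m4 are just labels for this illustration.

```sml
(* member as written in these notes *)
fun member(v, []) = false
|   member(v, x::xs) =
        if x = v then true
        else member(v, xs);

(* ints, strings, tuples of equality types, and lists of
   equality types all support =, so these calls are accepted. *)
val m1 = member(3, [1, 2, 3]);                     (* true *)
val m2 = member("d", ["a", "b", "c"]);             (* false *)
val m3 = member((2, "b"), [(1, "a"), (2, "b")]);   (* true *)
val m4 = member([1, 2], [[1], [1, 2]]);            (* true *)
```

Loading this in SML/NJ should produce the same polyEqual warning described above, which is harmless.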

Someone also asked if we couldn't apply "Boolean Zen" to this function and eliminate the if/else, which we can:

        fun member(v, []) = false
        |   member(v, x::xs) = (x = v) orelse member(v, xs);
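One possible way around the equality-type restriction, which we did not cover in lecture, is to have the caller supply the comparison. The sketch below is only an illustration built on the orelse version: memberBy and its eq parameter are hypothetical names, and it relies on passing a function as an argument. Real.== is the Basis Library's equality test on reals.

```sml
(* Hypothetical variant of member: the caller passes the
   equality test as eq, so the function never uses = itself. *)
fun memberBy(eq, v, []) = false
|   memberBy(eq, v, x::xs) = eq(x, v) orelse memberBy(eq, v, xs);

(* Reals are not an equality type, but Real.== compares them. *)
val r1 = memberBy(Real.==, 3.8, [4.7, 3.8]);   (* true *)
val r2 = memberBy(Real.==, 3.8, [4.7]);        (* false *)

(* The ordinary = can still be passed in by writing op = *)
val r3 = memberBy(op =, 3, [1, 2, 3]);         (* true *)
```

Because the body no longer contains an = comparison, no equality type is required of the list elements. Keep in mind that exact comparison of reals is often not what you want in practice.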

Then we spent some time writing a function called stutter that would turn a string like "hello" into the string "hheelloo". Everyone knew that we'd begin by exploding the string into a list of characters, but then how to process the characters? This is a great place for a helper function. I suggested that we use a let construct to make it local to the function we're writing:

        fun stutter(str) =
                let fun helper(?) = ?
                in ? involving explode(str)
                end;

So what kind of helper function do we want? Someone said that we should write something that stutters a list:

        fun stutter(str) =
                let fun helper([]) = []
                    |   helper(x::xs) = x::x::helper(xs)
                in ? involving explode(str)
                end;

All that is left is to write the expression to include after "in". I said that your procedural instincts might lead you to think in terms of a sequence of actions to perform:

explode the string
call the helper function
implode the result

It's not bad to think of it this way, but you have to remember that a functional version ends up being written inside out. The first step becomes the innermost expression (exploding the string). That result is then passed to the helper function, and the helper's result in turn is passed to implode. So the functional code reads in almost the reverse order of the procedural version:

        fun stutter(str) =
                let fun helper([]) = []
                    |   helper(x::xs) = x::x::helper(xs)
                in implode(helper(explode(str)))
                end;

This is something you'll get used to as you program more in functional languages.

Someone asked if this helper function doesn't deserve to stand on its own rather than being embedded inside a let. I said that's fair and it would certainly be reasonable to do so. In general, I'm not going to require people to use a let construct for helper functions. It's more a matter of personal taste.
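As a sketch of the standalone alternative, the helper can be pulled out and given its own name. The name stutterList is made up for this illustration. One thing to notice is that nothing in the helper is specific to characters, so ML infers the polymorphic type 'a list -> 'a list for it.

```sml
(* Hypothetical standalone version of the helper *)
fun stutterList([]) = []
|   stutterList(x::xs) = x::x::stutterList(xs);

(* stutter now just wires the three steps together *)
fun stutter(str) = implode(stutterList(explode(str)));

val s1 = stutter("hello");          (* "hheelloo" *)
val s2 = stutterList([1, 2, 3]);    (* [1,1,2,2,3,3] *)
```

The tradeoff is the usual one: the standalone helper can be reused on other lists, while the let version keeps it hidden from the rest of the program.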

As our final example, I asked people how to write a function that will determine whether or not an integer is prime. People talked about some basic ideas. It should be odd. But what about 2? That's the one and only even prime, so maybe we could handle it separately:

        fun prime(2) = true
        |   prime(n) = ? it's odd and ...

What about negatives? By convention we don't consider them prime, and the same goes for 0 and 1. So we can eliminate lots of possibilities by saying:

        fun prime(2) = true
        |   prime(n) = n > 1 andalso n mod 2 = 1 andalso ?

To complete this, we have to say that it has no factors other than 1 and itself. I asked people how they'd solve it with a loop. They said they'd start a variable i at 3 and test whether the number is divisible by i. If not, they'd increment i by 2. I said that often if you can conceive of something in that way as a loop, you can translate it into a helper function where the loop control variable(s) are parameters:

        fun prime(2) = true
        |   prime(n) = 
                let fun noFactors(m) = something with call on noFactors(m + 2)
                in n > 1 andalso n mod 2 = 1 andalso ?
                end;

We can use an if/else or a boolean expression to complete this. If n is divisible by the current m, then it's not prime. Otherwise we explore m + 2:

        fun prime(2) = true
        |   prime(n) = 
                let fun noFactors(m) =
                    n mod m <> 0 andalso noFactors(m + 2)
                in n > 1 andalso n mod 2 = 1 andalso ?
                end;

But how do we make it stop? We could stop when m becomes n, but we can do better. We can stop when m passes the square root of n. We aren't supposed to call Math.sqrt, but we can test it by seeing if m * m is greater than n, in which case we'd know we have a prime (because we've explored all possible factors up to and including the square root of n):

        fun prime(2) = true
        |   prime(n) = 
                let fun noFactors(m) =
                    if m * m > n then true
                    else n mod m <> 0 andalso noFactors(m + 2)
                in n > 1 andalso n mod 2 = 1 andalso ?
                end;

The final change is to call the helper function starting with 3 in the main expression (which is a parallel of our loop initialization of it to 3):

        fun prime(2) = true
        |   prime(n) = 
                let fun noFactors(m) =
                    if m * m > n then true
                    else n mod m <> 0 andalso noFactors(m + 2)
                in n > 1 andalso n mod 2 = 1 andalso noFactors(3)
                end;
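To try the finished function out, here is a small sketch that collects the primes in a range. The helper primesBetween and its parameters lo and hi are hypothetical names added for this illustration. It uses the same idea of turning a loop variable into a parameter.

```sml
(* prime as completed in these notes *)
fun prime(2) = true
|   prime(n) =
        let fun noFactors(m) =
            if m * m > n then true
            else n mod m <> 0 andalso noFactors(m + 2)
        in n > 1 andalso n mod 2 = 1 andalso noFactors(3)
        end;

(* Hypothetical helper: the primes from lo through hi, inclusive.
   lo plays the role of the loop control variable. *)
fun primesBetween(lo, hi) =
        if lo > hi then []
        else if prime(lo) then lo::primesBetween(lo + 1, hi)
        else primesBetween(lo + 1, hi);

(* Starting below zero shows that negatives, 0 and 1 are rejected. *)
val ps = primesBetween(~5, 30);   (* [2,3,5,7,11,13,17,19,23,29] *)
```

Note that ML writes negative numbers with a tilde, so ~5 is negative five.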

Stuart Reges

Last modified: Sun Jan 21 10:22:52 PST 2007