CSE413 Notes for Wednesday, 1/10/24

I started by asking people what polymorphism means. As with most computer science concepts, you'll find a good description of polymorphism on Wikipedia, but I asked people for their sense of it.

Someone mentioned that inheritance gives you polymorphism because you might have two different methods that might be called depending upon the type of object you have. I said that's right. That form of polymorphism is known as subtyping. A class may define a method, but various subclasses might override that method (toString, for example, is often overridden). You know from the root words "poly" and "morph" that polymorphism means "many forms". The idea is that a single method call could take one of several forms (could call one of several different methods depending upon the type of objects you are using).

Java has another kind of polymorphism with types like ArrayList<E>. The "E" is a type parameter and this has traditionally been referred to as parametric polymorphism, although the more modern term is simply generic types or generics. The idea is that you can write a single class that can be used for many different types. We write one definition for ArrayList<E>, but then can define an ArrayList<String> or an ArrayList<Point> or whatever.

OCaml's parent language ML was the first language to have this kind of polymorphism. C++ was another major language that tried to implement this with what are known as templates (although it's rather clumsy in C++). Java now has generics as of Java 5 and generics were added to Microsoft's C# programming language as well.

We started with a function called switch that can be used to reverse the order of two values in a pair (a 2-tuple):

        let switch(a, b) = (b, a)

The interpreter responded by saying:

        val switch : 'a * 'b -> 'b * 'a = <fun>

The 'a and 'b are like the E in ArrayList<E> in Java. They are type parameters. OCaml is telling us that we can provide this function with any 2-tuple whatsoever and that the result is a tuple in the opposite order. So we can feed it a string/int combination as we did before:

        switch("hello", 3)

which returns an int * string, or we can feed it an int/int combination:

        switch(3, 18)

which returns an int * int, or we can feed it a float and a list:

        switch(3.8, [2; 3])

which returns an int list * float. The function is polymorphic in that it can take any combination of types 'a and 'b.
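To make the inferred types concrete, here is a small sketch that applies switch to pairs of different types. The type annotations are there only to document what OCaml infers (they are not required), and the names p1, p2 and p3 are just for illustration.

```ocaml
(* switch is polymorphic: 'a * 'b -> 'b * 'a *)
let switch(a, b) = (b, a)

(* The annotations below only document the inferred result types. *)
let p1 : int * string = switch("hello", 3)
let p2 : int * int = switch(3, 18)
let p3 : int list * float = switch(3.8, [2; 3])
```

The same definition serves all three calls; OCaml picks the types for 'a and 'b separately at each call.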

I then mentioned that I had written three utility functions that I plan to use in various examples and perhaps in homework. The first introduces an infix operator --:

        let rec (--) x y =
            if x > y then []
            else x::(x + 1--y)

This code is similar to the range function from homework 1. It allows you to request lists of sequential integers using a convenient notation, as in:

        
        # 1--10;;
        - : int list = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
        # 5--20;;
        - : int list = [5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; 20]
        # -3--6;;
        - : int list = [-3; -2; -1; 0; 1; 2; 3; 4; 5; 6]
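One detail that is easy to miss is that the recursive call x + 1--y has no parentheses around x + 1. That works because OCaml gives an infix operator its precedence based on its first character, so -- sits at the same level as + and - and groups from left to right. Here is a sketch of the same function with the grouping spelled out, plus a few calls showing where parentheses are and aren't needed (the names a, b, c and d are just for illustration):

```ocaml
(* Same definition as above, with the grouping written explicitly. *)
let rec (--) x y =
  if x > y then []
  else x :: ((x + 1) -- y)

let a = 1--5          (* the basic case *)
let b = 2 * 3--8      (* multiplication binds tighter, so this is 6--8 *)
let c = 1--(2 + 3)    (* an arithmetic upper bound needs parentheses *)
let d = 5--1          (* lower bound above upper bound gives an empty list *)
```

Without the parentheses, 1--2 + 3 would group as (1--2) + 3 and be rejected by the type checker, because it tries to add 3 to a list.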

I also include functions called explode and implode that allow you to convert a string to a list of char values and to go in the other direction from a list of character values to a string, as in:

        # explode("hello");;
        - : char list = ['h'; 'e'; 'l'; 'l'; 'o']
        # implode(['c'; 'o'; 'm'; 'p'; 'u'; 't'; 'e'; 'r']);;
        - : string = "computer"
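These notes don't show how explode and implode are defined, so the following is only one possible sketch using standard library functions (List.init, String.get, List.to_seq and String.of_seq, the last two available since OCaml 4.07). The actual course utility file may be written differently.

```ocaml
(* One possible definition: build the list of characters by index. *)
let explode(s) = List.init (String.length s) (String.get s)

(* One possible definition: turn the char list into a sequence,
   then build a string from that sequence. *)
let implode(chars) = String.of_seq (List.to_seq chars)
```

With these definitions, imploding the result of explode gives back the original string.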

I then talked about how to implement a function called member that would return true or false depending upon whether a particular value is a member of a list. I asked what kind of lists would make it easy to answer this question and someone said an empty list, in which case the answer is false, so we began with:

        let rec member(value, lst) =
            match lst with
            | []    -> false

What else would be easy? Someone said that if the list begins with the value we are searching for, then we'd know it's a member, so we tried saying:

        let rec member(value, lst) =
            match lst with
            | []          -> false
            | value::rest -> true

If this second case fails, then we would know that the list doesn't begin with value and we would want to return the result of searching for the value in the rest of the list, so we added one more case:

        let rec member(value, lst) =
            match lst with
            | []          -> false
            | value::rest -> true
            | x::xs       -> member(value, xs)

We got an interesting response from the interpreter:

        Warning 11: this match case is unused.

It would be nice if we could write the function this way, but pattern matching is not that powerful. We can't use the same variable twice and expect OCaml to figure out that we are looking for equality. The use of "value" in the middle case is not matched against the parameter "value". Instead it introduces a new binding for value, so that pattern matches any nonempty list. In effect, the second and third cases are duplicate patterns, which is why OCaml warns that the third case is unused.
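Here is a small sketch of the same effect in isolation, using a made-up function called head_or that is not part of the lecture. A name that appears in a pattern always creates a fresh binding, even when a parameter with the same name is in scope:

```ocaml
(* head_or is a hypothetical example: return the head of the list,
   or the given default when the list is empty. *)
let head_or(value, lst) =
  match lst with
  | [] -> value            (* here value is the parameter *)
  | value :: _ -> value    (* here value is a NEW binding: the list head *)

let r1 = head_or(0, [])        (* empty list, so we get the parameter back *)
let r2 = head_or(0, [7; 8])    (* the pattern binds 7; it never compares to 0 *)
```

If the second pattern compared against the parameter, the call with [7; 8] would have no matching case. Instead it matches and returns 7.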

So we need to do the equality test ourselves:

        let rec member(value, lst) =
            match lst with
            | []          -> false
            | x::xs       -> x = value || member(value, xs)
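If you prefer to keep the three-case shape of the earlier attempt, OCaml's when guard lets a pattern carry an explicit test. This is an alternative sketch, not what we wrote in lecture, and the name member_guard is made up to keep the two versions apart:

```ocaml
let rec member_guard(value, lst) =
  match lst with
  | [] -> false
  | x :: _ when x = value -> true          (* explicit equality test *)
  | _ :: xs -> member_guard(value, xs)     (* otherwise keep searching *)
```

Both versions do the same comparisons in the same order, so which one to use is mostly a matter of taste.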

Then we spent some time writing a function called stutter that would turn a string like "hello" into the string "hheelloo". Everyone knew that we'd begin by exploding the string into a list of characters, but then how do we process the characters? This is a great place for a helper function. I suggested that we use a let construct to make it local to the function we're writing:

        let stutter(str) =
            let rec helper(??) =
                ...
            in ??

So what kind of helper function do we want? Someone said that we should write something that stutters a list:

        let stutter(str) =
            let rec helper(lst) =
                match lst with
                | []    -> []
                | x::xs -> x::x::helper(xs)
            in

All that is left is to write the expression to include after "in". I said that your procedural instincts might lead you to think in terms of a sequence of actions to perform:

explode the string
call the helper function
implode the result

It's not bad to think of it this way, but you have to remember that a functional version ends up being written inside out. The first step (exploding the string) becomes the innermost expression. Its result is passed to the helper function, and that result in turn is passed to implode. So the functional code reads in almost the reverse order of the procedural steps:

        let stutter(str) =
            let rec helper(lst) =
                match lst with
                | []    -> []
                | x::xs -> x::x::helper(xs)
            in implode(helper(explode(str)))

This is something you'll get used to as you program more in functional languages.
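If the inside-out order bothers you, OCaml's standard pipeline operator |> lets you write the same composition in the order the steps happen. This is an alternative sketch, not what we wrote in lecture. The definitions of explode and implode are assumed here so that the example runs on its own, and the name stutter_pipe is made up.

```ocaml
(* Assumed definitions of the utility functions. *)
let explode(s) = List.init (String.length s) (String.get s)
let implode(chars) = String.of_seq (List.to_seq chars)

let stutter_pipe(str) =
  let rec helper(lst) =
    match lst with
    | [] -> []
    | x :: xs -> x :: x :: helper(xs)
  in
  (* x |> f means f x, so this reads: explode, then helper, then implode *)
  str |> explode |> helper |> implode
```

The result is the same as implode(helper(explode(str))); only the reading order changes.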

Someone asked if this helper function doesn't deserve to stand on its own rather than being embedded inside a let. I said that's fair and it would certainly be reasonable to do so. In general, I'm not going to require people to use a let construct for helper functions. It's more a matter of personal taste.

As our next example, I asked people how to write a function that will determine whether or not an integer is prime. People talked about some basic ideas. It should be odd. But what about 2? That's the one and only even prime, so maybe we could handle it separately:

        let is_prime(n) =
           n = 2 || n mod 2 <> 0...

What about negatives? By convention we don't consider them prime. So we can eliminate lots of possibilities by saying:

        let is_prime(n) =
           n = 2 || (n > 2 && n mod 2 <> 0)...

To complete this, we have to say that it has no factors other than 1 and itself. I asked people how they'd solve it with a loop. They said they'd start a variable i at 3 and test whether the number is divisible by i. If not, they'd increment i by 2. I said that often if you can conceive of something in that way as a loop, you can translate it into a helper function where the loop control variable(s) are parameters:

        let is_prime(n) =
            let rec no_factors(m) =
                something with a call on no_factors(m + 2)
            in n = 2 || (n > 2 && n mod 2 <> 0 && no_factors(3))

We can use an if/else or a boolean expression to complete this. If n is divisible by the current m, then it's not prime. Otherwise we explore m + 2:

        let is_prime(n) =
            let rec no_factors(m) =
                n mod m <> 0 && no_factors(m + 2)
            in n = 2 || (n > 2 && n mod 2 <> 0 && no_factors(3))

But how do we make it stop? We could stop when m becomes n, but we can do better. We can stop once m passes the square root of n. We can test this by seeing if m * m is greater than n, in which case we'd know we have a prime (because we've explored all possible factors up to and including the square root of n):

        let is_prime(n) =
            let rec no_factors(m) =
                m * m > n || (n mod m <> 0 && no_factors(m + 2))
            in n = 2 || (n > 2 && n mod 2 <> 0 && no_factors(3))
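As a quick check, here is the finished function again along with a filter over a small range. List.init and List.filter are standard library functions, and the name small_primes is just for illustration.

```ocaml
let is_prime(n) =
  let rec no_factors(m) =
    (* || and && short-circuit, so the recursion stops as soon as
       m * m > n holds or a factor is found *)
    m * m > n || (n mod m <> 0 && no_factors(m + 2))
  in n = 2 || (n > 2 && n mod 2 <> 0 && no_factors(3))

(* Candidates 1 through 30, built with List.init. *)
let candidates = List.init 30 (fun i -> i + 1)

(* small_primes = [2; 3; 5; 7; 11; 13; 17; 19; 23; 29] *)
let small_primes = List.filter (fun n -> is_prime(n)) candidates
```

Notice that 9, 15, 21, 25 and 27 are all odd but are rejected by no_factors, and that 1 is rejected by the n > 2 test before no_factors is ever called.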

Then we talked about how to write a function that would return the list obtained by merging two sorted lists into one sorted list. We had base cases for one or the other list being empty:

        let rec merge(lst1, lst2) =
            match (lst1, lst2) with
            | ([], ys) -> ys
            | (xs, []) -> xs
            ...

Notice that in this case we are matching a tuple with the match expression rather than a simple variable. We considered a case where both lists are empty, but we concluded that the first case takes care of that (actually either case takes care of it, but the first case is the one that will end up handling it). We then considered the case where each list has at least one value:

        let rec merge(lst1, lst2) =
            match (lst1, lst2) with
            | ([], ys)       -> ys
            | (xs, [])       -> xs
            | (x::xs, y::ys) -> ...

Someone said we test whether x is less than y. Given that test, we put either x or y at the front of the answer and we recurse on the tail of the list with the smaller value and the complete other list:

        let rec merge(lst1, lst2) =
            match (lst1, lst2) with
            | ([], ys) -> ys
            | (xs, []) -> xs
            | (x::xs, y::ys) -> if x < y then x::merge(xs, y::ys)
                                 else y::merge(x::xs, ys)

This version of the function worked fine.
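Here are a few sample calls, as a sketch (the names m1, m2 and m3 are just for illustration). Because the comparison uses the polymorphic < operator, the same merge works on lists of strings as well as lists of ints.

```ocaml
let rec merge(lst1, lst2) =
  match (lst1, lst2) with
  | ([], ys) -> ys
  | (xs, []) -> xs
  | (x :: xs, y :: ys) ->
      if x < y then x :: merge(xs, y :: ys)
      else y :: merge(x :: xs, ys)

let m1 = merge([1; 4; 9], [2; 3; 10])   (* [1; 2; 3; 4; 9; 10] *)
let m2 = merge([], [5; 6])              (* [5; 6] *)
let m3 = merge(["b"; "d"], ["a"; "c"])  (* ["a"; "b"; "c"; "d"] *)
```

When x and y are equal, the test x < y fails, so the value from the second list goes first. For plain integers you can't tell the difference.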

Then we talked about how to implement the merge sort algorithm. In doing so, we can use the split function discussed in the prior lecture. Recall that it takes a list as a parameter and it returns a tuple of two lists, each with half of the values from the original list.

In the general case, we split the list, sort the two sublists and then merge the two sorted lists. We used a let expression to introduce variables for the two lists that come back from a call on split:

        let rec merge_sort(lst) =
            let (lst1, lst2) = split(lst)
            in ...

If you think procedurally, you might think of it as three more steps:

sort the first sublist
sort the second sublist
merge the two together

We want to express this in a more functional way using a single expression:

        let rec merge_sort(lst) =
            let (lst1, lst2) = split(lst)
            in merge(merge_sort(lst1), merge_sort(lst2))

We tried running this version of the code and found that it didn't work. It went into infinite recursion. The problem is that eventually you will get down to a 1-element list and the split function returns a tuple with the same 1-element list along with an empty list. So we end up recursively sorting the same 1-element list over and over. We also never told it what to do with an empty list. So we added an extra test using an if/else expression to define our base case as a list with fewer than 2 elements, which is already sorted:

        let rec merge_sort(lst) =
            if (List.length(lst) <= 1) then lst
            else
                let (lst1, lst2) = split(lst)
                in merge(merge_sort(lst1), merge_sort(lst2))

This version worked fine.
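To try this out on its own you need split and merge in scope. The split below is an assumed definition, since the version from the prior lecture isn't reproduced in these notes. It deals elements alternately into two lists, and on a 1-element list it returns that list along with an empty list, which matches the behavior described above.

```ocaml
(* Assumed definition of split: alternate elements between two lists. *)
let rec split(lst) =
  match lst with
  | [] -> ([], [])
  | [x] -> ([x], [])
  | x :: y :: rest ->
      let (xs, ys) = split(rest)
      in (x :: xs, y :: ys)

let rec merge(lst1, lst2) =
  match (lst1, lst2) with
  | ([], ys) -> ys
  | (xs, []) -> xs
  | (x :: xs, y :: ys) ->
      if x < y then x :: merge(xs, y :: ys)
      else y :: merge(x :: xs, ys)

let rec merge_sort(lst) =
  if List.length(lst) <= 1 then lst   (* base case: already sorted *)
  else
    let (lst1, lst2) = split(lst)
    in merge(merge_sort(lst1), merge_sort(lst2))

let sorted = merge_sort([5; 2; 9; 1; 5; 6])   (* [1; 2; 5; 5; 6; 9] *)
```

The base case matters because split([7]) returns ([7], []). Without the length test, merge_sort would be called on [7] again and never finish.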

Stuart Reges

Last modified: Tue Feb 13 10:34:07 PST 2024