CSE341 Notes for Monday, 4/6/09

As a warmup we discussed how to write a function to return the values from a list that appear in odd positions. I said that we'd assume that we are using one-based indexing, so the values in odd positions are the first, third, fifth, and so on. Using patterns, we wrote the following function for odds:

        fun odds([]) = []
        |   odds([x]) = [x]
        |   odds(x::y::zs) = x::odds(zs);
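
As a quick check, here are a few sample calls (not from the lecture) that exercise all three patterns. The expected results in the comments come from tracing the cases by hand:

        fun odds([]) = []
        |   odds([x]) = [x]
        |   odds(x::y::zs) = x::odds(zs);

        (* odd length: ends in the one-element case *)
        val r1 = odds([1, 8, 6, 4, 9]);   (* [1, 6, 9] *)
        (* even length: ends in the empty-list case *)
        val r2 = odds([1, 8, 6, 4]);      (* [1, 6] *)
        val r3 = odds([42]);              (* [42] *)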

Then we turned to a related but tougher problem: how do we split a list into two lists, one holding the values in odd positions and one holding the values in even positions? Since the function has to return two things, we'll assume it returns a tuple. For example, the call split([1, 8, 6, 4, 9]) should return ([1, 6, 9], [8, 4]). We included a case for the empty list:

        fun split([]) = ([], [])

and then I asked people to think about the general case where we have at least two values:

        fun split([]) = ([], [])
        |   split(x::y::zs) = ?

I asked how we write this. Someone suggested that we make a recursive call passing in zs. That seems like a good idea. But then what do we do with the result? The result is a tuple. That's not particularly easy to work with. We could store the tuple in a variable using let:

        fun split([]) = ([], [])
        |   split(x::y::zs) = 
                let val result = split(zs)
                in ?
                end;

This would work, but we'd still have to use functions like #1 and #2 to pull apart the result. We can do even better by using a pattern that mentions the two parts of the tuple:

        fun split([]) = ([], [])
        |   split(x::y::zs) = 
                let val (M, N) = split(zs)
                in ?
                end;

This will recursively split the list and then bind the variables M and N to be the two parts of the resulting tuple. The overall result in this case is a new tuple that puts x at the front of M and y at the front of N:

        fun split([]) = ([], [])
        |   split(x::y::zs) = 
                let val (M, N) = split(zs)
                in (x::M, y::N)
                end;

When we tried to load this into the interpreter, we got the warning that the matches are not exhaustive. I said that for functions, you want to pay close attention to this. Maybe it's okay because you have a precondition that a certain case won't happen, but then be sure you've thought about it. In this case, when we tried to split a list, we got an error:

        uncaught exception Match [nonexhaustive match failure]

The problem is that we need a case in the original for a one-element list:

        fun split([]) = ([], [])
        |   split([x]) = ([x], [])
        |   split(x::y::zs) = 
                let val (M, N) = split(zs)
                in (x::M, y::N)
                end;

This version of the function worked fine.
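
For comparison, here is a sketch of what the version that stores the tuple in a single variable and pulls it apart with #1 and #2 might look like. The name splitSel is made up for this illustration and the sample list is the one from the example above. Both versions should produce the same tuple:

        fun split([]) = ([], [])
        |   split([x]) = ([x], [])
        |   split(x::y::zs) =
                let val (M, N) = split(zs)
                in (x::M, y::N)
                end;

        (* hypothetical variant: same logic, but uses the selector
           functions #1 and #2 instead of a tuple pattern *)
        fun splitSel([]) = ([], [])
        |   splitSel([x]) = ([x], [])
        |   splitSel(x::y::zs) =
                let val result = splitSel(zs)
                in (x::(#1 result), y::(#2 result))
                end;

        val s1 = split([1, 8, 6, 4, 9]);      (* ([1, 6, 9], [8, 4]) *)
        val s2 = splitSel([1, 8, 6, 4, 9]);   (* ([1, 6, 9], [8, 4]) *)

The pattern version avoids naming the intermediate tuple at all, which is why it reads more cleanly.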

Then we talked about how to write a function that would return the list obtained by merging two sorted lists into one sorted list. We had base cases for one or the other list being empty:

        fun merge(L, []) = L
        |   merge([], M) = M
        ...

We considered a case where both lists are empty, but we concluded that the first case takes care of that (actually either case takes care of it, but the first case is the one that will end up handling it). We then considered the case where each list has at least one value:

        fun merge(L, []) = L
        |   merge([], M) = M
        |   merge(x::xs, y::ys) =
        ...

Someone said we test whether y is greater than x. Given that test, we either put x or y at the front of the answer:

        fun merge(L, []) = L
        |   merge([], M) = M
        |   merge(x::xs, y::ys) =
                if y > x then x::merge(xs, y::ys)
                else y::merge(x::xs, ys)

This version of the function worked fine, but it seems a bit awkward to have to compute "x::xs" or "y::ys" in making the recursive call. ML gives an alternative with an "as" clause:

        fun merge(L, []) = L
        |   merge([], M) = M
        |   merge(L as x::xs, M as y::ys) =
                if y > x then x::merge(xs, M)
                else y::merge(L, ys)
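
A couple of sample calls (made up for illustration, not from the lecture) show the "as" version at work. Note that when x and y are equal the test y > x is false, so y is placed first:

        fun merge(L, []) = L
        |   merge([], M) = M
        |   merge(L as x::xs, M as y::ys) =
                if y > x then x::merge(xs, M)
                else y::merge(L, ys);

        (* both arguments must already be sorted *)
        val m1 = merge([1, 4, 9], [2, 3, 10]);   (* [1, 2, 3, 4, 9, 10] *)
        (* handled directly by the first base case *)
        val m2 = merge([5], []);                 (* [5] *)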

Then we talked about how to implement the merge sort algorithm. As usual, we have an empty list case:

        fun mergeSort([]) = []
        ...

In the general case, we split the list, sort the two sublists and then merge the two sorted lists. We used a val declaration to introduce variables for the two lists that come back from a call on split:

        fun mergeSort([]) = []
        |   mergeSort(lst) =
                let val (M, N) = split(lst)
                in ...
                end

If you think procedurally, you might think of it as three more steps:

sort the first sublist
sort the second sublist
merge the two together

Ullman seems to be thinking that way because he includes two extra val declarations:

        fun mergeSort([]) = []
        |   mergeSort(L) =
                let val (M, N) = split(L)
                    val list1 = mergeSort(M)
                    val list2 = mergeSort(N)
                in merge(list1, list2)
                end

I think it's better to express this in a more functional way using a single expression:

        fun msort([]) = []
        |   msort(L) =
                let val (M, N) = split(L)
                in merge(msort(M), msort(N))
                end

We tried running this version of the code and found that it didn't work. It went into infinite recursion. The problem is that eventually you will get down to a 1-element list and the split function returns a tuple with the same 1-element list along with an empty list. So we end up recursively sorting the same 1-element list over and over. For this function, we need a special case for the 1-element list:

        fun msort([]) = []
        |   msort([x]) = [x]
        |   msort(L) =
                let val (M, N) = split(L)
                in merge(msort(M), msort(N))
                end

This version worked fine.
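
Putting the pieces together, the following sketch collects split, merge, and msort into one self-contained file that can be loaded on its own. The sample values are made up for illustration. The call on split shows why the one-element case matters: the first list that comes back is the same one-element list we started with, so without that case msort would keep calling itself on it:

        fun split([]) = ([], [])
        |   split([x]) = ([x], [])
        |   split(x::y::zs) =
                let val (M, N) = split(zs)
                in (x::M, y::N)
                end;

        fun merge(L, []) = L
        |   merge([], M) = M
        |   merge(L as x::xs, M as y::ys) =
                if y > x then x::merge(xs, M)
                else y::merge(L, ys);

        fun msort([]) = []
        |   msort([x]) = [x]
        |   msort(L) =
                let val (M, N) = split(L)
                in merge(msort(M), msort(N))
                end;

        (* splitting a one-element list gives back the same list *)
        val stuck = split([7]);                         (* ([7], []) *)
        val sorted = msort([3, 1, 4, 1, 5, 9, 2, 6]);   (* [1, 1, 2, 3, 4, 5, 6, 9] *)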

Then we wrote a function called qsort that uses the quicksort algorithm to sort a list. Initially we limited ourselves to sorting lists of ints. We began with our usual question about an easy list to work with and someone said we should have a case for empty list, so we began with:

        fun qsort([]) = []
        |   qsort(x::xs) = ?

Then I asked if anyone remembered how quicksort works. Someone mentioned that the main steps are to:

Pick a value from the list that we refer to as the pivot.

Split the list into 2 parts: values less than the pivot and values greater than the pivot. I said that this step is often referred to as partitioning the list and the two parts are often referred to as partitions. I mentioned that we have to include the possibility of values equal to the pivot, although it doesn't matter which partition we put them into.

Quicksort the two partitions.

Put the pieces together.

One nice thing about quicksort is that putting the pieces together isn't difficult. We know that the values in the first partition all come before the values in the second partition, so it's just a matter of gluing the pieces together.

I said that we'd keep things simple by using the first value in the list as our pivot. This isn't an ideal choice, especially if the list is already sorted, but it will work well for the randomized lists we want to work with. So I changed the variable names in our code to reflect this choice:

        fun qsort([]) = []
        |   qsort(pivot::rest) = ?

The first step in solving this is to partition the list, so I asked people how to implement this in ML and people seemed to be stumped. That's not surprising because we're just starting to learn ML and this is a nontrivial computation. I said that you don't have to abandon your programming instincts from procedural programming. So how would you solve it in a language like Java if you were asked to work with a linked list?

By thinking through that, we developed the following pseudocode for partitioning the list:

        partition 1 = empty list
        partition 2 = empty list
        for (each value in list) {
            if (value <= pivot) {
                move it to partition 1
            } else {
                move it to partition 2
            }
        }
        finish up

I said that you can convert this iterative solution to a recursive solution without much trouble. With a loop you use a set of local variables to keep track of the current state of your computation. Here there are three such local variables: the list that we are partitioning, the first partition, and the second partition. Local variables like these become parameters to a helper function. Our helper function is supposed to partition the list, so we decided to call it "partition". Since the loop involves three variables, two of which are initialized to be empty lists, we write partition as a helper function of three parameters and we include empty lists for two of the parameters in the initial call:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition(list, part1, part2) = ?
                in partition(rest, [], [])
                end;

Notice how the call after the word "in" exactly parallels the situation before our loop begins executing. Our three state variables are the list of values to partition and two variables for storing the two partitions which are initially empty.

In our pseudocode we continue until the overall list becomes empty, at which time we "finish up" the computation. We can include this as one of the cases for our helper function:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) = finish up
                    |   partition(list, part1, part2) = ?
                in partition(rest, [], [])
                end;

Compare this initial case for partition versus the initial call for partition. We go from having these three values for our computation:

        (original list, [], [])

to having this set of values for our computation:

        ([], partition 1, partition 2)

In other words, we go from having all of the values stored in the original list and having two empty partitions to having an empty list of values and two partitions that have been filled in. Now we just have to describe how we go from one to the other. In our pseudocode, each time through the loop we handled one element of the original list, either moving it to partition 1 or moving it to partition 2. We can do the same thing with our helper function. So first we need to indicate that in the second case for partition, we want to process one value from the original list. We do so by replacing the "list" above with "x::xs":

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) = finish up
                    |   partition(x::xs, part1, part2) = ?
                in partition(rest, [], [])
                end;

We had an if/else in the loop and we can use an if/else here:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) = finish up
                    |   partition(x::xs, part1, part2) =
                            if x <= pivot then (x goes in 1st partition)
                            else (x goes in 2nd partition)
                in partition(rest, [], [])
                end;

If x belongs in the first partition, then we want to go from having this set of values:

        (x::xs, part1, part2)

to having this set of values:

        (xs, x::part1, part2)

In other words, we move x from the list of values to be processed into the first partition. In the loop, we'd then come around the loop for the next iteration. In a recursive solution, we simply make a recursive call with those new values:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) = finish up
                    |   partition(x::xs, part1, part2) =
                            if x <= pivot then partition(xs, x::part1, part2)
                            else (x goes in 2nd partition)
                in partition(rest, [], [])
                end;

In the second case we do something similar, moving x into the second partition and using a recursive call to continue the computation:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) = finish up
                    |   partition(x::xs, part1, part2) =
                            if x <= pivot then partition(xs, x::part1, part2)
                            else partition(xs, part1, x::part2)
                in partition(rest, [], [])
                end;
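
As an aside, the partitioning step can be tried out in isolation. The sketch below is a hypothetical standalone function (the names partitionAround and loop are made up here) whose "finish up" case simply returns the two partitions as a tuple. Because each value is added to the front of its partition, the partitions come out in the reverse of the order in which the values were seen. That doesn't matter for quicksort, since each partition gets sorted anyway:

        (* hypothetical helper, not part of the class version of qsort *)
        fun partitionAround(pivot, lst) =
                let fun loop([], part1, part2) = (part1, part2)
                    |   loop(x::xs, part1, part2) =
                            if x <= pivot then loop(xs, x::part1, part2)
                            else loop(xs, part1, x::part2)
                in loop(lst, [], [])
                end;

        (* values <= 5 land in the first list, the rest in the second *)
        val p = partitionAround(5, [8, 2, 5, 9, 1]);   (* ([1, 5, 2], [9, 8]) *)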

The only thing we had left to fill in was the "finish up" part. Remember that reaching that point in the code is like finishing the loop in our pseudocode. We started out by saying that we need to quicksort the two partitions. That needs to be part of what we do in the "finish up" part:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) = need to: qsort(part1), qsort(part2)
                    |   partition(x::xs, part1, part2) =
                            if x <= pivot then partition(xs, x::part1, part2)
                            else partition(xs, part1, x::part2)
                in partition(rest, [], [])
                end;

The qsort function returns a list, so what do we do with the two sorted lists that come back from these recursive calls? We glue them together with append:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) = qsort(part1) @ qsort(part2)
                    |   partition(x::xs, part1, part2) =
                            if x <= pivot then partition(xs, x::part1, part2)
                            else partition(xs, part1, x::part2)
                in partition(rest, [], [])
                end;

This is not quite right, though, because we haven't accounted for the pivot. We didn't add it to either of our partitions. That's, in general, a good thing, because it guarantees that each of the partitions we recursively pass to qsort will be smaller than the original list. But it means we have to manually place the pivot into the result. It belongs right in the middle of the two partitions. We could use a cons operator ("::") to indicate this, but I think it's a little clearer in this case to show that we are gluing three pieces together, one of which is the pivot itself:

        fun qsort([]) = []
        |   qsort(pivot::rest) =
                let fun partition([], part1, part2) =
                            qsort(part1) @ [pivot] @ qsort(part2)
                    |   partition(x::xs, part1, part2) =
                            if x <= pivot then partition(xs, x::part1, part2)
                            else partition(xs, part1, x::part2)
                in partition(rest, [], [])
                end;
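
For comparison, the cons alternative mentioned above might look like the following sketch. It is renamed qsortCons only to keep it distinct from the version above, and the sample list is made up. The parentheses are optional because :: and @ are both right-associative at the same precedence, but they make the grouping explicit:

        fun qsortCons([]) = []
        |   qsortCons(pivot::rest) =
                let fun partition([], part1, part2) =
                            (* pivot is consed onto the sorted second partition *)
                            qsortCons(part1) @ (pivot :: qsortCons(part2))
                    |   partition(x::xs, part1, part2) =
                            if x <= pivot then partition(xs, x::part1, part2)
                            else partition(xs, part1, x::part2)
                in partition(rest, [], [])
                end;

        val q = qsortCons([3, 1, 4, 1, 5, 9, 2, 6]);   (* [1, 1, 2, 3, 4, 5, 6, 9] *)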

At that point we were done. This is a working version of quicksort. I loaded it in the ML interpreter and we managed to sort some short lists.

I had only a few minutes left at that point, so I turned to the new idea I wanted to introduce: higher-order functions. In ML, functions are first-class data values just like ints, strings, reals, and lists. That means that you can pass functions as arguments to other functions. A function that takes another function as an argument is called a higher-order function.

So I turned back to our qsort function. I modified the function so that it takes a comparison function as an argument and I replaced the "<=" comparison with a call on the function passed as an argument. This required several changes because each recursive call and each pattern had to be updated:

        fun qsort(f, []) = []
        |   qsort(f, pivot::rest) =
                let fun partition([], part1, part2) =
                            qsort(f, part1) @ [pivot] @ qsort(f, part2)
                    |   partition(x::xs, part1, part2) =
                            if f(x, pivot) then partition(xs, x::part1, part2)
                            else partition(xs, part1, x::part2)
                in partition(rest, [], [])
                end

We were still able to do the original sorting by saying:

        qsort(op <=, test);

But now we had the flexibility to sort it backwards:

        qsort(op >=, test);

And to define our own comparison function that sorts on magnitude:

        fun lessMagnitude(x, y) = abs(x) < abs(y);
        qsort(lessMagnitude, test);

So we can easily change the definition of ordering to have this sort in a different way.
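
The following self-contained sketch pulls these calls together. The contents of test are made up here, since the notes don't record the list we used, and the shorter function and string list are hypothetical additions showing that the same qsort also works on non-int lists when given a suitable comparison:

        fun qsort(f, []) = []
        |   qsort(f, pivot::rest) =
                let fun partition([], part1, part2) =
                            qsort(f, part1) @ [pivot] @ qsort(f, part2)
                    |   partition(x::xs, part1, part2) =
                            if f(x, pivot) then partition(xs, x::part1, part2)
                            else partition(xs, part1, x::part2)
                in partition(rest, [], [])
                end;

        (* assumed sample data *)
        val test = [3, ~7, 2, ~1, 5];

        val up = qsort(op <=, test);            (* [~7, ~1, 2, 3, 5] *)
        val down = qsort(op >=, test);          (* [5, 3, 2, ~1, ~7] *)

        fun lessMagnitude(x, y) = abs(x) < abs(y);
        val mag = qsort(lessMagnitude, test);   (* [~1, 2, 3, 5, ~7] *)

        (* hypothetical comparison: order strings by length *)
        fun shorter(s, t) = size(s) <= size(t);
        val words = qsort(shorter, ["pear", "fig", "banana"]);
                                                (* ["fig", "pear", "banana"] *)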

Stuart Reges

Last modified: Mon Apr 6 11:07:51 PDT 2009