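    fun qsort([]) = []
    |   qsort(x::xs) = ?
Then I asked if anyone remembered how quicksort works. Someone mentioned that the main steps are to:
    - pick a value from the list to use as the pivot
    - partition the remaining values into two lists: those that belong before the pivot and those that belong after it
    - quicksort the two partitions
    - put the pieces together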
One nice thing about quicksort is that putting the pieces together isn't difficult. We know that the values in the first partition all come before the values in the second partition, so it's just a matter of gluing the pieces together.
I said that we'd keep things simple by using the first value in the list as our pivot. This isn't an ideal choice, especially if the list is already sorted, but it will work well for the randomized lists we want to work with. So I changed the variable names in our code to reflect this choice:
    fun qsort([]) = []
    |   qsort(pivot::rest) = ?
The first step in solving this is to partition the list, so I asked people how to implement this in ML and people seemed to be stumped. That's not surprising because we're just starting to learn ML and this is a nontrivial computation. I said that you don't have to abandon your programming instincts from procedural programming. So how would you solve it in a language like Java if you were asked to work with a linked list?
By thinking through that, we developed the following pseudocode for partitioning the list:
    given some list
    list1 = empty list
    list2 = empty list
    while (list is not empty) {
        if (first value in list <= pivot) {
            move it to list1
        } else {
            move it to list2
        }
    }
    finish up
I said that you can convert this iterative solution to a recursive solution without much trouble. With a loop you use a set of local variables to keep track of the current state of your computation. Here there are three such local variables: the list that we are partitioning, the first partition, and the second partition. Local variables like these become parameters to a helper function. Our helper function is supposed to partition the list, so we decided to call it "partition". Since the loop involves three variables, two of which are initialized to be empty lists, we write partition as a helper function of three parameters and we include empty lists for two of the parameters in the initial call:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition(list, list1, list2) = ?
            in partition(rest, [], [])
            end;
Notice how the call after the word "in" exactly parallels the situation before our loop begins executing. Our three state variables are the list of values to partition and two variables for storing the two partitions, which are initially empty.
In our pseudocode we continue until the overall list becomes empty, at which time we "finish up" the computation. We can include this as one of the cases for our helper function:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) = finish up
                |   partition(list, list1, list2) = ?
            in partition(rest, [], [])
            end;
Compare this initial case for partition versus the initial call for partition. We go from having these three values for our computation:
    (original list, [], [])
to having this set of values for our computation:
    ([], list1, list2)
In other words, we go from having all of the values stored in the original list and having two empty partitions to having an empty list of values and two partitions that have been filled in. Now we just have to describe how we go from one to the other. In our pseudocode, each time through the loop we handled one element of the original list, either moving it to partition 1 or moving it to partition 2. We can do the same thing with our helper function. So first we need to indicate that in the second case for partition, we want to process one value from the original list. We do so by replacing the "list" above with "x::xs":
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) = finish up
                |   partition(x::xs, list1, list2) = ?
            in partition(rest, [], [])
            end;
We had an if/else in the loop and we can use an if/else here:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) = finish up
                |   partition(x::xs, list1, list2) =
                        if x <= pivot then (x goes in 1st partition)
                        else (x goes in 2nd partition)
            in partition(rest, [], [])
            end;
If x belongs in the first partition, then we want to go from having this set of values:
    (x::xs, list1, list2)
to having this set of values:
    (xs, x::list1, list2)
In other words, we move x from the list of values to be processed into the first partition. In the loop, we'd then come around the loop for the next iteration. In a recursive solution, we simply make a recursive call with those new values:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) = finish up
                |   partition(x::xs, list1, list2) =
                        if x <= pivot then partition(xs, x::list1, list2)
                        else (value goes in 2nd partition)
            in partition(rest, [], [])
            end;
In the second case we do something similar, moving the value into the second partition and using a recursive call to continue the computation:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) = finish up
                |   partition(x::xs, list1, list2) =
                        if x <= pivot then partition(xs, x::list1, list2)
                        else partition(xs, list1, x::list2)
            in partition(rest, [], [])
            end;
The only thing we had left to fill in was the "finish up" part. Remember that reaching that point in the code is like finishing the loop in our pseudocode. We started out by saying that we need to quicksort the two partitions. That needs to be part of what we do in the "finish up" part:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) =
                        need to: qsort(list1), qsort(list2)
                |   partition(x::xs, list1, list2) =
                        if x <= pivot then partition(xs, x::list1, list2)
                        else partition(xs, list1, x::list2)
            in partition(rest, [], [])
            end;
The qsort function returns a list, so what do we do with the two sorted lists that come back from these recursive calls? We glue them together with append:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) =
                        qsort(list1) @ qsort(list2)
                |   partition(x::xs, list1, list2) =
                        if x <= pivot then partition(xs, x::list1, list2)
                        else partition(xs, list1, x::list2)
            in partition(rest, [], [])
            end;
This is not quite right, though, because we haven't accounted for the pivot. We didn't add it to either of our partitions. That is generally a good thing, because it guarantees that each of the partitions we recursively pass to qsort will be smaller than the original list. But it means we have to manually place the pivot into the result. It belongs right in the middle of the two partitions. We could use a cons operator ("::") to indicate this, but I think it's a little clearer in this case to show that we are gluing three pieces together, one of which is the pivot itself:
    fun qsort([]) = []
    |   qsort(pivot::rest) =
            let fun partition([], list1, list2) =
                        qsort(list1) @ [pivot] @ qsort(list2)
                |   partition(x::xs, list1, list2) =
                        if x <= pivot then partition(xs, x::list1, list2)
                        else partition(xs, list1, x::list2)
            in partition(rest, [], [])
            end;
At that point we were done. This is a working version of quicksort. I loaded it in the ML interpreter and we managed to sort some short lists.
I had only a few minutes left at that point, so I turned to the new idea I wanted to introduce: higher-order functions. In ML, functions are first-class data values just like ints, strings, reals, and lists. That means that you can pass functions as arguments to other functions. A function that takes another function as an argument is called a higher-order function.
So I turned back to our qsort function. I modified the function so that it takes a comparison function as an argument and I replaced the "<=" comparison with a call on the function passed as an argument. This required several changes because each recursive call and each pattern had to be updated:
    fun qsort(f, []) = []
    |   qsort(f, pivot::rest) =
            let fun partition([], list1, list2) =
                        qsort(f, list1) @ [pivot] @ qsort(f, list2)
                |   partition(x::xs, list1, list2) =
                        if f(x, pivot) then partition(xs, x::list1, list2)
                        else partition(xs, list1, x::list2)
            in partition(rest, [], [])
            end;
We were still able to do the original sorting by saying:
    qsort(op <=, test);
But now we had the flexibility to sort it backwards:
    qsort(op >=, test);
So we can easily change the definition of ordering to have this sort in a different way.
Then I asked people how to write a function that would return a list in reverse order. For example, the call rev([1, 2, 3, 4]) should return [4, 3, 2, 1]. This can be easily written using the concatenation operator:
    fun rev([]) = []
    |   rev(x::xs) = rev(xs) @ [x]
This version works, but it is inefficient. To explain why, I spent a few minutes discussing how the :: and @ operators work in ML.
Consider the following val bindings:
    val x = [1, 2, 3];
    val y = 0::x;
    val z = x @ [4];
ML stores lists internally as a linked list of nodes that each have data and a reference to the next node (these correspond to the head and the tail). So the structure is similar to what we get with standard Java linked list nodes:
    public class ListNode {
        public int data;
        public ListNode next;
        ...
    }
So when we execute:
    val x = [1, 2, 3];
ML creates a list of 3 nodes:
    x --> [1] --> [2] --> [3]
What happens when we execute the second binding?
    val y = 0::x;
The :: operator is often referred to as "cons," which is short for "construct." In other words, it constructs a new list element:
    y --> [0] --> ??
This new list element has 0 as the head, but what does it use for the tail? ML could make a copy of the list that x refers to, but that's not what happens. Instead, it sets up a link that shares the memory set aside for x:
         x --> [1] --> [2] --> [3]
                ^
                |
    y --> [0] --+
In the Java universe this would be a bad idea. If you share part of the list, then you're likely to end up with a confusing outcome. For example, what if you use the variable y to traverse the list and you change the second and third values in the list? The second and third values in the list that y refers to are the first and second values in the list that x refers to. So in that case, changing y would change x as well. Normally we'd want to avoid that kind of interference.
This is where the concept of mutable state comes into play. In ML, lists are immutable. Once you have constructed a list element with a particular head and tail, you can never change them. So it's not dangerous to allow this kind of sharing because ML prevents this kind of interference. This is an example where the choice to prevent mutation has made it easier for ML to be efficient about the allocation of space.
You can simulate this in Java as well. To get immutable lists in Java, you'd make the fields final:
    public class ListNode {
        public final int data;
        public final ListNode next;
        ...
    }
But what about the final binding?
    val z = x @ [4];
This is a case where ML can't make use of sharing. The variable x refers to a list that ends with 3. If you tried to change it to instead point to a new list element storing 4, then you'd be damaging the original list. So this is a case where ML has no choice but to make a copy of the contents of x:
    z --> [1] --> [2] --> [3] --> [4]
A simple rule of thumb to remember is that the :: operator always executes in O(1) time (constant time) because it always constructs exactly one list element, while the @ operator runs in O(n) time, where n is the length of the first list, because that first list has to be copied.
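One way to see where the O(n) cost comes from is to write an append function by hand. This is only a sketch: the name append is ours, and the library's actual implementation of @ may differ, but any version has to rebuild the cells of the first list.

```sml
(* hand-written stand-in for @; the name append is an assumption, not the library's *)
fun append([], ys) = ys                          (* nothing left to copy: share ys *)
|   append(x::xs, ys) = x :: append(xs, ys);     (* one new cell per element of the first list *)

val x = [1, 2, 3];
val y = 0 :: x;            (* one new cell whose tail is x itself *)
val z = append(x, [4]);    (* three new cells for 1, 2, 3; the list [4] is shared *)
```

The second argument is never traversed or copied, which is why only the length of the first list matters.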
So let's return to the inefficient version of the reversing function:
    fun rev([]) = []
    |   rev(x::xs) = rev(xs) @ [x]
Because we are using the @ operator, we are going to be creating lots of list copies. In effect, to reverse a list of n elements, we're going to make a copy of a list of length 1, and a copy of a list of length 2, and a copy of a list of length 3, and so on, ending with making a copy of a list of length n-1. That adds up to 1 + 2 + ... + (n-1) = n(n-1)/2 copied elements, which requires O(n^2) time.
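To make the count concrete, here is a rough sketch that instruments the reversal to report how many list elements get copied. The names appendCount and revCount are hypothetical helpers introduced for illustration. The hand-written append stands in for @ and assumes @ copies exactly the cells of its first argument.

```sml
(* append that also reports how many cells of the first list it copied *)
fun appendCount([], ys) = (ys, 0)
|   appendCount(x::xs, ys) =
        let val (rest, n) = appendCount(xs, ys)
        in (x::rest, n + 1)
        end;

(* the slow reversal, returning the reversed list and the total copies made *)
fun revCount([]) = ([], 0)
|   revCount(x::xs) =
        let val (r, n1) = revCount(xs)
            val (result, n2) = appendCount(r, [x])
        in (result, n1 + n2)
        end;

(* for 4 elements the copies are 0 + 1 + 2 + 3 = 6 *)
val (reversed, copies) = revCount([1, 2, 3, 4]);
```

Doubling the length of the list roughly quadruples the number of copies, which is the quadratic growth described above.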
We can do better than that. I asked people how you'd approach it iteratively. We came up with this pseudocode:
    list = [1, 2, 3, 4]
    result = []
    while (list is not empty) {
        x = remove first element
        result = x::result
    }
As I mentioned in the previous lecture, you can translate an iterative process like this into a functional equivalent by thinking about the different states that this computation goes through. There are two different variables involved here: list and result. Here's how they change as you iterate through the loop:
    list            result
    ----------------------------
    [1, 2, 3, 4]    []
    [2, 3, 4]       [1]
    [3, 4]          [2, 1]
    [4]             [3, 2, 1]
    []              [4, 3, 2, 1]
Instead of having two variables that change in value each time you iterate through the loop (the mutable state approach), you can instead have a function of two arguments where each time you call the function you compute the next pair of values to use in the computation. So we'll write this using a helper function:
    fun rev2(lst) =
        let fun loop(list, result) = ??
        in ??
        end
The loop starts with list being the overall list and result being empty. In the functional version, we make this the initial call on the helper function:
    fun rev2(lst) =
        let fun loop(list, result) = ??
        in loop(lst, [])
        end
The loop ends when list becomes empty, in which case the answer is stored in result, so this becomes one of the cases for our helper function:
    fun rev2(lst) =
        let fun loop([], result) = result
            ...
        in loop(lst, [])
        end
Now we just need a case for the other iterations. In the pseudocode, we pulled an x off the front of the list and moved it to result. We can accomplish this with a pattern of x::xs for the list and by moving x into the result in our recursive call:
    fun rev2(lst) =
        let fun loop([], result) = result
            |   loop(x::xs, result) = loop(xs, x::result)
        in loop(lst, [])
        end
We saw that this version worked and ran in a reasonable amount of time even for lists with a million values.
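A check along those lines might look like the sketch below. The names big, reversed, and sameAsLibrary are ours, and building the test list with List.tabulate is an assumption about how one could set this up rather than a record of what was typed in lecture.

```sml
fun rev2(lst) =
    let fun loop([], result) = result
        |   loop(x::xs, result) = loop(xs, x::result)
    in loop(lst, [])
    end;

(* a list of the integers 0 through 999999 *)
val big = List.tabulate(1000000, fn i => i);

val reversed = rev2(big);

(* compare against the library's own reversal as a sanity check *)
val sameAsLibrary = (reversed = List.rev big);
```

Each step of loop does a single :: and then makes the recursive call as its last action, so the total work grows in proportion to the length of the list.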
Ullman describes this example on pages 84-88 of the book. He mentions in the text that this second approach is known as a difference list.