CSE143 Notes for Wednesday, 5/4/11

I said that I wanted to talk about how to put things into sorted order. I began with a particular challenge in mind. I think it would be interesting for a company like Microsoft to take someone who has just graduated from the CS Department with a Bachelor's degree and tell them that they have one hour to produce a "well behaved" sorting routine. It would be up to them to decide what "well behaved" meant. My guess is that many of our graduates would not necessarily do that well in the challenge.

There are some classic sorting techniques that work well for short lists. For example, in the SortedIntList class we wrote in assignment #1, we were inserting values into a sorted list and shifting values over to make room for the new value being inserted. This is a technique known as insertion sort. Another technique involves scanning the list to find the smallest value and then moving it to the front. Then you scan the remaining values for the next smallest element and move it to the second spot. You continue in this way, scanning the entire list to select the next value for the sorted list. This technique is known as selection sort.
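
As a rough sketch of these two techniques (not code from the lecture), here is one way they might look on a plain int array. The class name SimpleSorts is made up for illustration, and swapping is just one way of "moving" the smallest value to the front:

```java
import java.util.Arrays;

public class SimpleSorts {
    // Selection sort: scan the unsorted part for the smallest value
    // and swap it into the next position at the front.
    public static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int smallest = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[smallest]) {
                    smallest = j;
                }
            }
            int temp = a[i];
            a[i] = a[smallest];
            a[smallest] = temp;
        }
    }

    // Insertion sort: keep a sorted prefix and shift larger values
    // to the right to make room for the next value.
    public static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int value = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > value) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = value;
        }
    }

    public static void main(String[] args) {
        int[] data1 = {13, 42, 97, -3, 53, 18, 92, 50};
        int[] data2 = data1.clone();
        selectionSort(data1);
        insertionSort(data2);
        System.out.println(Arrays.toString(data1));  // [-3, 13, 18, 42, 50, 53, 92, 97]
        System.out.println(Arrays.toString(data2));  // [-3, 13, 18, 42, 50, 53, 92, 97]
    }
}
```

Each method has a loop nested inside a loop over the array, which is where the quadratic growth comes from.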

Both insertion sort and selection sort turn out to be O(n^2) sorting techniques. They work fine for small values of n, but they become far too expensive for large values of n.

I said that we were going to explore something known as "merge sort." The idea of merge sort is to divide the list in half, then sort each half, and then merge the two sorted halves back together. This is going to end up being a recursive method, so this process of splitting and merging is performed at several different levels. We discussed the following specific case of a merge sort of 8 values:

             [13, 42, 97, -3, 53, 18, 92, 50]
                  /                       \
          [13, 42, 97, -3]         [53, 18, 92, 50]
             /          \            /          \
         [13, 42]    [97, -3]    [53, 18]    [92, 50]
           /   \       /   \      /   \       /   \
        [13]  [42]  [97]  [-3]  [53]  [18]  [92]  [50]
          \    /      \    /      \    /      \    /
         [13, 42]   [-3, 97]    [18, 53]    [50, 92]
             \         /            \          /
          [-3, 13, 42, 97]         [18, 50, 53, 92]
                  \                       /
             [-3, 13, 18, 42, 50, 53, 92, 97]
        
Then I turned to the computer to code this. We started by writing a particular method that will be helpful for the overall task. To make things easier, let's assume we are sorting a series of String values stored in a LinkedList<String>. The LinkedList class is part of the collections framework. It has very fast add/remove/peek at both the front and back of the list. It implements a standard interface known as Queue<E>. This is similar to the Queue interface we used for the Sieve assignment but with slightly different names. The enqueue method is simply called "add" and the dequeue method is called "remove". We have the isEmpty and size methods along with a method called "peek" that allows us to peek at the front of the queue without actually removing it.
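
To make the queue operations concrete, here is a small sketch (the class QueueDemo and its demo method are hypothetical names, not part of the lecture code) that exercises add, peek, remove, size and isEmpty on a LinkedList used through the Queue interface:

```java
import java.util.LinkedList;
import java.util.Queue;

public class QueueDemo {
    // Builds a small queue and reports what each operation returns.
    public static String demo() {
        Queue<String> q = new LinkedList<String>();
        q.add("banana");              // enqueue at the back
        q.add("apple");
        q.add("cherry");
        String front = q.peek();      // looks at "banana" without removing it
        int sizeBefore = q.size();    // still 3
        String removed = q.remove();  // dequeues "banana" from the front
        return front + " " + sizeBefore + " " + removed + " "
                + q.size() + " " + q.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(demo());   // banana 3 banana 2 false
    }
}
```

One detail worth knowing: on an empty queue, peek returns null while remove throws a NoSuchElementException, which is why the merging code later in these notes always tests isEmpty first.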

So I said let's begin by writing a method that takes three lists as arguments. The first list will be empty and that will be where we want to store our result. The other two lists will each contain a sorted list of values. The idea is to merge the two sorted lists into one list. By the time we're done, we want all of the values to be in the first list in sorted order and we want the other two lists to be empty.

So the header for our method would look like this:

        public static void mergeInto(Queue<String> result,
                                     Queue<String> list1,
                                     Queue<String> list2) {
So we have two sorted lists and we want to merge them together. How do we do it? A good wrong answer is to say that we glue the two lists together and call a sorting routine. We want to take advantage of the fact that the two lists are already sorted.

Someone said that we'd want to look at the first value in each list and pick the smaller one. Then we'd move that value into our result.

        if (list1.peek() <= list2.peek())
            move front of list1 to result
This has the right logic, but we have to work out the syntax. We can't compare String objects this way (str1 <= str2). Instead, we have to take advantage of the fact that String implements the Comparable interface. That means that it has a method called compareTo that allows you to compare one String to another. The compareTo method returns an integer that indicates how the values compare (a negative means "less", 0 means "equal", a positive means "greater"). We also have to fill in how to move something from the front of one of the two lists into our result. We "dequeue" from list1 by calling remove and we enqueue into result by calling "add". So our code becomes:

        if (list1.peek().compareTo(list2.peek()) <= 0)
            result.add(list1.remove());
And what do we do if the first value of list1 is not less than or equal to the first value of list2? Then we'd want to take from the other list:

        if (list1.peek().compareTo(list2.peek()) <= 0)
            result.add(list1.remove());
        else
            result.add(list2.remove());
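
As a quick illustration of how compareTo behaves (a sketch only; CompareDemo and describe are made-up names), the following turns the sign of the result into a word. Only the sign is meaningful, not the particular number returned:

```java
public class CompareDemo {
    // Translates the sign of a.compareTo(b) into "less", "equal" or "greater".
    public static String describe(String a, String b) {
        int result = a.compareTo(b);
        if (result < 0) {
            return "less";
        } else if (result == 0) {
            return "equal";
        } else {
            return "greater";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("apple", "banana"));  // less
        System.out.println(describe("pear", "pear"));     // equal
        System.out.println(describe("pear", "apple"));    // greater
        // Uppercase letters come before lowercase letters in this ordering.
        System.out.println(describe("Zebra", "apple"));   // less
    }
}
```
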
Of course, this just says how to handle one value. We want to keep doing this as long as there are values left to compare, so we need this inside a loop. So we want to continue while both lists are nonempty:

        !list1.isEmpty() && !list2.isEmpty()
So using this as a loop test our code becomes:

        while (!list1.isEmpty() && !list2.isEmpty()) {
            if (list1.peek().compareTo(list2.peek()) <= 0)
                result.add(list1.remove());
            else
                result.add(list2.remove());
        }
So this loop will take values from one list or the other while they both have something left to compare. Eventually one of the lists will become empty. Then what? Suppose it's the second list that becomes empty first. What do we do with the values left in the first list? Every one of them is larger than the values in the second list and they're in sorted order. So all we have to do is transfer them from the first list to the result (similar to the Sieve transferring primes after it processed a value greater than or equal to the square root of the maximum n).

        while (!list1.isEmpty())
            result.add(list1.remove());
This is the right code to execute if the second list is the one that has gone empty. But what if it's the first list that has gone empty? Then you'd want to do a corresponding transfer from the second list:

        while (!list2.isEmpty())
            result.add(list2.remove());
You might think that we need an if/else to figure out whether it's the first case or the second case, but it turns out that an if/else would be redundant. The loops already have tests to see if the list is empty. So we can simply execute both loops. What will end up happening is that one list will have something left in it, so one of these loops will execute, and the other list will be empty, in which case the other loop doesn't have any effect.

So the final code is as follows:

        public static void mergeInto(Queue<String> result,
                                     Queue<String> list1,
                                     Queue<String> list2) {
            while (!list1.isEmpty() && !list2.isEmpty()) {
                if (list1.peek().compareTo(list2.peek()) <= 0)
                    result.add(list1.remove());
                else
                    result.add(list2.remove());
            }
            while (!list1.isEmpty())
                result.add(list1.remove());
            while (!list2.isEmpty())
                result.add(list2.remove());
        }
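
To see the method in action, here is a small self-contained sketch that repeats mergeInto and calls it on two short sorted lists. The class name MergeDemo and the sample words are just for illustration:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.Queue;

public class MergeDemo {
    // Same method as above: merges two sorted queues into result,
    // leaving list1 and list2 empty.
    public static void mergeInto(Queue<String> result,
                                 Queue<String> list1,
                                 Queue<String> list2) {
        while (!list1.isEmpty() && !list2.isEmpty()) {
            if (list1.peek().compareTo(list2.peek()) <= 0)
                result.add(list1.remove());
            else
                result.add(list2.remove());
        }
        while (!list1.isEmpty())
            result.add(list1.remove());
        while (!list2.isEmpty())
            result.add(list2.remove());
    }

    public static void main(String[] args) {
        Queue<String> list1 =
            new LinkedList<String>(Arrays.asList("apple", "kiwi", "pear"));
        Queue<String> list2 =
            new LinkedList<String>(Arrays.asList("banana", "grape"));
        Queue<String> result = new LinkedList<String>();
        mergeInto(result, list1, list2);
        System.out.println(result);  // [apple, banana, grape, kiwi, pear]
        System.out.println(list1.isEmpty() + " " + list2.isEmpty());  // true true
    }
}
```

Here the second list runs out first, so the first of the two cleanup loops transfers "kiwi" and "pear" and the second one does nothing.
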
Then I said that we should turn our attention to how to sort a Queue<String>. So we're trying to write a method that looks like this:

        public static void sort(Queue<String> list) {
            ...
        }
If we want to think recursively, we can begin by thinking about base cases. What would be an easy list to sort? Someone said an empty list. That's certainly true. An empty list doesn't need to be sorted at all. Then someone mentioned that a list of 1 element also doesn't need to be sorted. I said, "In a country of one, you cannot be weird, you are the norm" and people looked a little puzzled, but they seemed to get it eventually. If there is only one thing, there is nothing else around to be out of order with it.

This is one of those cases where we don't have to do anything in the base case. So we can write a simple if statement with a test for the recursive case:

        public static void sort(Queue<String> list) {
            if (list.size() > 1) {
                ...
            }
        }
Then I said that we should think about how we could split such a list into two lists. We'd need some variables:

        Queue<String> half1 = new LinkedList<String>();
        Queue<String> half2 = new LinkedList<String>();
How many things should end up in each list? Someone said list.size() divided by 2. That's almost right, but we have to worry about the size being odd. An easy way to make this work is to set one of the sizes to list.size() divided by 2 and to set the other to list.size() minus the first one:

        int size1 = list.size() / 2;
        int size2 = list.size() - size1;
So now it's just a matter of transferring items from the list to the two new lists. We can do so with simple for loops (very similar to the stack/queue code we wrote for the midterm):

        for (int i = 0; i < size1; i++)
            half1.add(list.remove());
        for (int i = 0; i < size2; i++)
            half2.add(list.remove());
So where does that leave us? We have two lists, each with half of the items from the original list. That means that our original list is now empty. And we also know that we have a way to merge two sorted lists together (the method mergeInto that we wrote earlier). But will that work? Unfortunately not. These two lists aren't necessarily sorted. They're just the first and second half of the original list.

We were on the verge of despair and it was clear that time was running out when I told people not to give up hope, that recursion would come to our rescue. We've reached a point where we have two lists, each with half of the items from the original list. We need them to be sorted. If only we had a method for sorting a list, then we could call it. But we have such a method. We're writing it! So we sort these two sublists:

        sort(half1);
        sort(half2);
And once the two are sorted, we can merge them together putting them back into the original list using the method we wrote a minute ago:

        mergeInto(list, half1, half2);
And that's the entire method. We're done. Putting all the pieces together, we ended up with:

        public static void sort(Queue<String> list) {
            if (list.size() > 1) {
                Queue<String> half1 = new LinkedList<String>();
                Queue<String> half2 = new LinkedList<String>();
                int size1 = list.size() / 2;
                int size2 = list.size() - size1;
                for (int i = 0; i < size1; i++)
                    half1.add(list.remove());
                for (int i = 0; i < size2; i++)
                    half2.add(list.remove());
                sort(half1);
                sort(half2);
                mergeInto(list, half1, half2);
            }
        }
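
If you want to run the 8-value example from the diagram, one option is a generic variation of the same two methods. This is a generalization that we did not write in lecture; the class name GenericMergeSort and the type parameter E are my own choices, and the logic is otherwise unchanged:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.Queue;

public class GenericMergeSort {
    // Same algorithm as sort above, but for any type that is Comparable.
    public static <E extends Comparable<? super E>> void sort(Queue<E> list) {
        if (list.size() > 1) {
            Queue<E> half1 = new LinkedList<E>();
            Queue<E> half2 = new LinkedList<E>();
            int size1 = list.size() / 2;
            int size2 = list.size() - size1;
            for (int i = 0; i < size1; i++)
                half1.add(list.remove());
            for (int i = 0; i < size2; i++)
                half2.add(list.remove());
            sort(half1);
            sort(half2);
            mergeInto(list, half1, half2);
        }
    }

    // Merges two sorted queues into result, emptying both inputs.
    public static <E extends Comparable<? super E>> void mergeInto(
            Queue<E> result, Queue<E> list1, Queue<E> list2) {
        while (!list1.isEmpty() && !list2.isEmpty()) {
            if (list1.peek().compareTo(list2.peek()) <= 0)
                result.add(list1.remove());
            else
                result.add(list2.remove());
        }
        while (!list1.isEmpty())
            result.add(list1.remove());
        while (!list2.isEmpty())
            result.add(list2.remove());
    }

    public static void main(String[] args) {
        Queue<Integer> numbers = new LinkedList<Integer>(
            Arrays.asList(13, 42, 97, -3, 53, 18, 92, 50));
        sort(numbers);
        System.out.println(numbers);  // [-3, 13, 18, 42, 50, 53, 92, 97]
    }
}
```
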
This code is all included in handout #17. I then ran my testing program to demonstrate how it performs. In the main method, I constructed a list of 100 thousand Strings and ran this sorting technique against the built-in sorting facility that the programmers at Sun have developed (Collections.sort). Ours was not nearly as fast, taking 4 to 6 times longer to run. But I argued that being within a factor of 6 was pretty good. After all, Sun has had years to get their code just right and we had spent half an hour putting together our version. I also pointed out that people perceive recursion and linked lists as being slow, and yet we came within a factor of 6.
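
The testing program itself is in the handout and is not reproduced here. As a rough stand-in, a timing comparison along these lines could be sketched as follows. The class name SortTiming, the randomStrings helper, the seed and the way the Strings are generated are all assumptions of mine, and a single run like this is only a crude measurement, so the ratio you see will vary from machine to machine:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
import java.util.Random;

public class SortTiming {
    // The merge sort from these notes.
    public static void sort(Queue<String> list) {
        if (list.size() > 1) {
            Queue<String> half1 = new LinkedList<String>();
            Queue<String> half2 = new LinkedList<String>();
            int size1 = list.size() / 2;
            int size2 = list.size() - size1;
            for (int i = 0; i < size1; i++)
                half1.add(list.remove());
            for (int i = 0; i < size2; i++)
                half2.add(list.remove());
            sort(half1);
            sort(half2);
            mergeInto(list, half1, half2);
        }
    }

    public static void mergeInto(Queue<String> result,
                                 Queue<String> list1,
                                 Queue<String> list2) {
        while (!list1.isEmpty() && !list2.isEmpty()) {
            if (list1.peek().compareTo(list2.peek()) <= 0)
                result.add(list1.remove());
            else
                result.add(list2.remove());
        }
        while (!list1.isEmpty())
            result.add(list1.remove());
        while (!list2.isEmpty())
            result.add(list2.remove());
    }

    // Hypothetical helper: n pseudo-random Strings from a fixed seed.
    public static List<String> randomStrings(int n, long seed) {
        Random r = new Random(seed);
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < n; i++)
            result.add(Integer.toString(r.nextInt(1000000)));
        return result;
    }

    public static void main(String[] args) {
        List<String> data = randomStrings(100000, 42);
        Queue<String> ours = new LinkedList<String>(data);
        List<String> theirs = new ArrayList<String>(data);

        long start = System.nanoTime();
        sort(ours);
        long oursNanos = System.nanoTime() - start;

        start = System.nanoTime();
        Collections.sort(theirs);
        long theirsNanos = System.nanoTime() - start;

        System.out.println("merge sort:       " + oursNanos / 1000000 + " ms");
        System.out.println("Collections.sort: " + theirsNanos / 1000000 + " ms");
        // Sanity check that both approaches produced the same ordering.
        System.out.println("same order: "
                + theirs.equals(new ArrayList<String>(ours)));
    }
}
```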

I also mentioned that merge sort is a stable sort. A stable sort has the property that it preserves the relative order of data values that are considered equal. For example, suppose that you have a list of student data and you sort it by name. Then you sort it a second time by year-in-school. For the second sort, there will be many values that are considered equal (lots of freshmen, lots of sophomores, lots of juniors, and so on). A stable sort would preserve the order within these groups, which means you would end up with the freshmen grouped together and in alphabetical order, the sophomores grouped together and in alphabetical order, and so on. Excel uses a stable sorting algorithm and the sorting routine in the Java class libraries is also a stable sort. Our merge sort is also stable.
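
The student example can be sketched with the library sort, which is documented to be stable. The Student class, the sample names and the method sortByNameThenYear below are all hypothetical, and the lambda syntax assumes Java 8 or later:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class StableSortDemo {
    // Hypothetical student record, for illustration only.
    public static class Student {
        public final String name;
        public final int year;  // 1 = freshman, 2 = sophomore, and so on

        public Student(String name, int year) {
            this.name = name;
            this.year = year;
        }

        public String toString() {
            return name + "(" + year + ")";
        }
    }

    // Sorts a copy by name, then sorts again by year. Because the second
    // sort is stable, names stay alphabetical within each year.
    public static List<Student> sortByNameThenYear(List<Student> students) {
        List<Student> copy = new ArrayList<Student>(students);
        Collections.sort(copy, Comparator.comparing((Student s) -> s.name));
        Collections.sort(copy, Comparator.comparingInt((Student s) -> s.year));
        return copy;
    }

    public static void main(String[] args) {
        List<Student> students = new ArrayList<Student>();
        students.add(new Student("Dana", 1));
        students.add(new Student("Bo", 2));
        students.add(new Student("Cy", 1));
        students.add(new Student("Al", 2));
        System.out.println(sortByNameThenYear(students));
        // [Cy(1), Dana(1), Al(2), Bo(2)]
    }
}
```

In our own mergeInto, the stability comes from the "<= 0" test: when the two front values are equal, we take the one from list1, which held the earlier half of the original list.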

I then briefly described why this sort is so fast. The overall list is split into two halves, the halves are sorted, and then we merge the two back together. Ignore for a moment the amount of work necessary to sort the two sublists. What you're left with is splitting the list in two and merging the sorted lists back together. That will require something like n steps (really 2n steps, but the important part is the "n"). So how much work is done sorting the two sublists? They also are split in half, sorted and merged. Again ignoring the amount of work done in sorting the smaller lists, each sublist takes on the order of n / 2 steps to split and merge, so the two together take on the order of n steps. The same thing happens at the next level down, where four lists of n / 4 items again add up to n steps. And so on. So the question becomes, how many levels are there? We start with a list of n items at the first level, lists of n / 2 items at the next level, lists of n / 4 items at the next level, and so on. How many times would you have to divide n by 2 to get down to a list of length 1 (our base case)? By definition, it is the log to the base 2 of n. So the total number of levels will be on the order of log n. And each level requires work on the order of n. So the total work will be n times log n, or O(n log n). For large n, this grows far more slowly than O(n^2).
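
The same argument can be written as a recurrence. This is only a sketch under simplifying assumptions that were not part of the lecture: n is a power of 2, T(n) stands for the total work to sort n items, and c is some constant cost per item for splitting and merging at one level:

```latex
\begin{align*}
T(1) &= c \\
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + k\,cn \\
\intertext{Setting $k = \log_2 n$, so that $n/2^k = 1$:}
T(n) &= n\,T(1) + cn\log_2 n = cn + cn\log_2 n = O(n \log n)
\end{align*}
```

For the 8-value example earlier in these notes, log base 2 of 8 is 3, which matches the three levels of splitting in the diagram (8 to 4 to 2 to 1).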


Stuart Reges
Last modified: Thu May 5 14:41:36 PDT 2011