CSE143 Notes for Wednesday, 2/8/06

I said that I wanted to talk about how to put things into sorted order and I began with a particular challenge in mind. I think it would be interesting for a company like Microsoft to take someone who has just graduated from the CS Department with a Bachelor's degree and tell them that they have 45 minutes to produce a "well behaved" sorting routine. It would be up to them to decide what "well behaved" meant. My guess is that many of our graduates would not necessarily do that well in the challenge.

So we started by talking about what "well behaved" might mean. Someone mentioned that it should run fast even if the list is long. Someone else mentioned that it shouldn't require too much memory. The other requirement is that it has to be something you can code in 45 minutes. So it can't be overly complex.

So we started talking about different sorting techniques. Someone mentioned that there is something called quicksort and I agreed that quicksort is generally a fast way to sort things. Writing a quicksort in just 45 minutes isn't easy, so I said I wanted to find some other way.

Then someone mentioned an idea that is known as "insertion sort." You start with one item. Then you take a second item and you put it either in front of or in back of the current item so that the pair is sorted. Then you insert a third item where it belongs so that all three are sorted. And so on. This is similar to the SortedIntList class that we wrote for assignment 1. It will put a series of integers into sorted order because each time you call add, it makes sure to insert the value in just the right place in the array so as to preserve sorted order. How fast is it? It involves potentially shifting values over to make room for the new value being inserted. It works well for small lists, but not for large lists.

The worst case scenario is when you always insert at the front of the list, in which case you have to shift everything over. The result would be that the first value can be inserted directly, the second value would require shifting one value over, the third value would require shifting two values over, the fourth value would require shifting three values over, and so on. If you use a dot to represent each different value that has to be shifted over, you'd see that the dots form a pattern like this (where each line shows how much work is done for each successive insert):

        .
        . .
        . . .
        . . . .
        . . . . .
        . . . . . .
In general, the total number of dots will be in the neighborhood of 1/2 of n^2. That's in the worst case. On average, we'd expect to have to shift about half of the values, so on average we'd expect this to be closer to 1/4 of n^2. But remember that in terms of characterizing the complexity of an operation, we don't care about multiples like 1/4. What matters is that this grows at the same rate as n^2. We'd say that this technique is an O(n^2) algorithm.
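
To make the shifting concrete, here is a rough sketch of an insertion sort for an array of ints (this code isn't from lecture or the handout; it's just to illustrate the idea):

        public static void insertionSort(int[] data) {
            for (int i = 1; i < data.length; i++) {
                int value = data[i];   // next value to insert
                int j = i;
                // shift larger values to the right to make room
                while (j > 0 && data[j - 1] > value) {
                    data[j] = data[j - 1];
                    j--;
                }
                data[j] = value;       // drop it into its correct spot
            }
        }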

So insertion sort is O(n^2). Is there something better? Someone recommended a different technique that is known as bubble sort. I asked for a show of hands and it seemed that at least 10% of the class had seen bubble sort. It involves going through the list, comparing adjacent elements and swapping them if they are out of order. Will the list be sorted after doing this once on the list? No, but we'd know that the largest element has "bubbled" to the end of the list (like underwater bubbles floating to the surface). So on a second pass we could get the second largest element in the correct spot. And on a third pass we could get the third largest element in the correct spot. Each time we'd be dealing with a shorter list, so if we were to use dots to describe the pattern, they would look like this:

        . . . . . . . . [etc] .
          . . . . . . . [etc] .
            . . . . . . [etc] .
              . . . . . [etc] .
                . . . . [etc] .
In other words, the dot pattern here is the upper half of an n-by-n square, which means that this is also an O(n^2) sort.
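
For comparison, a rough sketch of bubble sort on an array of ints might look like this (again, not from lecture, just to illustrate; notice that each pass stops one position earlier):

        public static void bubbleSort(int[] data) {
            for (int pass = 0; pass < data.length - 1; pass++) {
                // each pass bubbles the largest remaining value to the end,
                // so the sorted tail grows by one and we stop one spot earlier
                for (int i = 0; i < data.length - 1 - pass; i++) {
                    if (data[i] > data[i + 1]) {
                        // swap adjacent values that are out of order
                        int temp = data[i];
                        data[i] = data[i + 1];
                        data[i + 1] = temp;
                    }
                }
            }
        }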

I then described something called selection sort. The idea is to look at all n items and select the smallest, which you move to the front of the list. Then look at the remaining (n - 1) items and pick the smallest of those and put it in the second position. Then look at the remaining (n - 2) items and pick the smallest which you move to the third position. And so on.

The problem here is that you have to look at so many values on each pass of the algorithm. Initially you have to look at all n values. Then you have to look at (n - 1), then (n - 2), then (n - 3), and so on. So we get the same dot pattern as bubble sort, which means it's another O(n^2) sort.
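
A sketch of selection sort (also just for illustration, not something we wrote in lecture):

        public static void selectionSort(int[] data) {
            for (int i = 0; i < data.length - 1; i++) {
                // find the smallest remaining value
                int smallest = i;
                for (int j = i + 1; j < data.length; j++)
                    if (data[j] < data[smallest])
                        smallest = j;
                // swap it into position i
                int temp = data[i];
                data[i] = data[smallest];
                data[smallest] = temp;
            }
        }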

I asked for other suggestions and someone suggested "merge sort." I said that is exactly the sort I want to talk about. So how does it work? People had some ideas about it, but nobody quite knew how it worked. So I said let's do the following. Let's start by writing a particular method that will be helpful for the overall task. To make things easier, let's assume we are sorting a series of String values stored in a LinkedList<String>. The LinkedList class is part of the collections framework. It has very fast insert/remove/peek at both the front and back of the list. The methods that perform these operations are known as addFirst/addLast, removeFirst/removeLast and getFirst/getLast, respectively.
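
For example (a quick illustration of those operations, not from lecture, assuming java.util.LinkedList has been imported):

        LinkedList<String> words = new LinkedList<String>();
        words.addLast("banana");                 // [banana]
        words.addFirst("apple");                 // [apple, banana]
        words.addLast("cherry");                 // [apple, banana, cherry]
        System.out.println(words.getFirst());    // apple (leaves it in the list)
        System.out.println(words.removeLast());  // cherry (removes and returns it)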

So I said let's begin by writing a method that takes three lists as arguments. The first list will be empty and that will be where we want to store our result. The other two lists will each contain a sorted list of values. The idea is to merge the two sorted lists into one list. By the time we're done, we want all of the values to be in the first list in sorted order and we want the other two lists to be empty.

So the header for our method would look like this:

        public static void mergeInto(LinkedList<String> result,
                                     LinkedList<String> list1,
                                     LinkedList<String> list2) {
So we have two sorted lists and we want to merge them together. How do we do it? A good wrong answer is to say that we glue the two lists together and call a sorting routine. We want to take advantage of the fact that the two lists are already sorted.

Someone said that we'd want to look at the first value in each list and pick the smaller one. Then we'd move that value into our result. I said that's absolutely right. And we'd want to keep doing that as long as both lists still have values left to compare. So we begin with code that looks like this:

        while (!list1.isEmpty() && !list2.isEmpty()) {
            // move smaller value from the two lists to result
        }
So how do we figure out which value is smaller? They are LinkedList objects, so we can call getFirst on each to get the first item from each one. So we want to say something like this:

        while (!list1.isEmpty() && !list2.isEmpty()) {
            if (list1.getFirst() <= list2.getFirst())
                move front of list1 to result
            else
                move front of list2 to result
        }
This has the right logic, but we have to work out the syntax. We can't compare String objects this way (str1 <= str2). Instead, we have to take advantage of the fact that String implements the Comparable interface. That means that it has a method called compareTo that allows you to compare one String to another. The compareTo method returns an integer that indicates how the values compare (a negative means "less", 0 means "equal", a positive means "greater"). We also have to fill in how to move something from the front of one of the two lists into our result. Remember that the LinkedList class has a removeFirst method. This method is like "dequeue" for a queue. It both removes the value and returns it. We can put something at the end of the result by calling addLast, which is very similar to "enqueue". So our code becomes:

        while (!list1.isEmpty() && !list2.isEmpty()) {
            if (list1.getFirst().compareTo(list2.getFirst()) <= 0)
                result.addLast(list1.removeFirst());
            else
                result.addLast(list2.removeFirst());
        }
So this loop will take values from one list or the other while they both have something left to compare. Eventually one of the lists will become empty. Then what? Suppose it's the second list that becomes empty first. What do we do with the values left in the first list? Every one of them is larger than the values in the second list and they're in sorted order. So all we have to do is transfer them from the first list to the result (similar to the Sieve transferring primes after it processed a value greater than or equal to the square root of the maximum n).

        while (!list1.isEmpty())
            result.addLast(list1.removeFirst());
This is the right code to execute if the second list is the one that has gone empty. But what if it's the first list that has gone empty? Then you'd want to do a corresponding transfer from the second list:

        while (!list2.isEmpty())
            result.addLast(list2.removeFirst());
You might think that we need an if/else to figure out whether it's the first case or the second case, but it turns out that an if/else would be redundant. The loops already have tests to see if the list is empty. So we can simply execute both loops. What will end up happening is that one list will have something left in it, so one of these loops will execute, and the other list will be empty, in which case the other loop doesn't have any effect.

So the final code is as follows:

        public static void mergeInto(LinkedList<String> result,
                                     LinkedList<String> list1,
                                     LinkedList<String> list2) {
            // as long as both lists have values, move the smaller front value
            while (!list1.isEmpty() && !list2.isEmpty()) {
                if (list1.getFirst().compareTo(list2.getFirst()) <= 0)
                    result.addLast(list1.removeFirst());
                else
                    result.addLast(list2.removeFirst());
            }
            // transfer whatever remains in the list that didn't go empty
            while (!list1.isEmpty())
                result.addLast(list1.removeFirst());
            while (!list2.isEmpty())
                result.addLast(list2.removeFirst());
        }
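As a quick sanity check (a hypothetical example, not in the handout), merging two small sorted lists with this method would look like this:

        LinkedList<String> result = new LinkedList<String>();
        LinkedList<String> list1 = new LinkedList<String>();
        LinkedList<String> list2 = new LinkedList<String>();
        list1.addLast("apple");
        list1.addLast("grape");
        list2.addLast("banana");
        list2.addLast("cherry");
        mergeInto(result, list1, list2);
        System.out.println(result);   // [apple, banana, cherry, grape]
        System.out.println(list1);    // []
        System.out.println(list2);    // []
Notice that the two input lists end up empty, just as we wanted.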
Then I said that we should turn our attention to how to sort a LinkedList<String>. So we're trying to write a method that looks like this:

        public static void sort(LinkedList<String> list) {
            ...
        }
If we want to think recursively, we can begin by thinking about base cases. What would be an easy list to sort? Someone said an empty list. That's certainly true. An empty list doesn't need to be sorted at all. Then someone mentioned that a list of 1 element also doesn't need to be sorted. I said, "In a country of one, you cannot be weird, you are the norm" and people looked a little puzzled, but they seemed to get it eventually. If there is only one thing, there is nothing else around to be out of order with it.

This is one of those cases where we don't have to do anything in the base case. So we can write a simple if statement with a test for the recursive case:

        public static void sort(LinkedList<String> list) {
            if (list.size() > 1) {
                ...
            }
        }
Then I said that we should think about how we could split such a list into two lists. We'd need some variables:

        LinkedList<String> half1 = new LinkedList<String>();
        LinkedList<String> half2 = new LinkedList<String>();
How many things should end up in each list? Someone said list.size() divided by 2. That's almost right, but we have to worry about the size being odd. An easy way to make this work is to set one of the sizes to list.size() divided by 2 and to set the other to list.size() minus the first one:

        int size1 = list.size() / 2;
        int size2 = list.size() - size1;
So now it's just a matter of transferring items from the list to the two new lists. We can do so with simple for loops (very similar to the stack/queue code we wrote for the midterm):

        for (int i = 0; i < size1; i++)
            half1.addLast(list.removeFirst());
        for (int i = 0; i < size2; i++)
            half2.addLast(list.removeFirst());
So where does that leave us? We have two lists, each with half of the items from the original list. That means that our original list is now empty. And we also know that we have a way to merge two sorted lists together (the method mergeInto that we wrote earlier). But will that work? Unfortunately not. These two lists aren't necessarily sorted. They're just the first and second half of the original list.

We were on the verge of despair and it was clear that time was running out when I told people not to give up hope, that recursion would come to our rescue. We've reached a point where we have two lists, each with half of the items from the original list. We need them to be sorted. If only we had a method for sorting a list, then we could call it. But we have such a method. We're writing it! So we sort these two sublists:

        sort(half1);
        sort(half2);
And once the two are sorted, we can merge them together putting them back into the original list using the method we wrote a minute ago:

        mergeInto(list, half1, half2);
And that's the entire method. We're done. Putting all the pieces together, we ended up with:

        public static void sort(LinkedList<String> list) {
            if (list.size() > 1) {
                // split the list into two halves
                LinkedList<String> half1 = new LinkedList<String>();
                LinkedList<String> half2 = new LinkedList<String>();
                int size1 = list.size() / 2;
                int size2 = list.size() - size1;
                for (int i = 0; i < size1; i++)
                    half1.addLast(list.removeFirst());
                for (int i = 0; i < size2; i++)
                    half2.addLast(list.removeFirst());
                // recursively sort each half, then merge them back together
                sort(half1);
                sort(half2);
                mergeInto(list, half1, half2);
            }
        }
This code is all included in handout #19. I had already downloaded the program to the computer in the classroom and I ran it to show people how it performs. In the main method, I constructed a list of 100 thousand Strings and I ran this sorting technique versus the built-in sorting facility that the programmers at Sun have developed (Collections.sort). Ours was not nearly as fast, taking almost 5 times longer to run. But I argued that I thought that being within a factor of 5 was pretty good. After all, Sun has had years to get their code just right and we had spent half an hour putting together our version. I also pointed out that people perceive recursion and linked lists as being slow and yet being within a factor of 5 is a pretty good result.
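
The actual main method is in the handout; a rough sketch of that kind of timing comparison (with made-up names and details, just to show the idea) might look like this:

        // build a list of 100 thousand random strings plus an identical copy
        Random r = new Random();
        LinkedList<String> ours = new LinkedList<String>();
        for (int i = 0; i < 100000; i++)
            ours.addLast("" + r.nextInt(1000000));
        LinkedList<String> theirs = new LinkedList<String>(ours);

        // time our merge sort against the library sort
        long start = System.currentTimeMillis();
        sort(ours);
        System.out.println("our sort took " + (System.currentTimeMillis() - start) + " ms");
        start = System.currentTimeMillis();
        Collections.sort(theirs);
        System.out.println("Collections.sort took " + (System.currentTimeMillis() - start) + " ms");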

I mentioned that merge sort is a stable sort. A stable sort has the property that it preserves the relative order of data values that are considered equal. For example, suppose that you have a list of student data and you sort it by name. Then you sort it a second time by year-in-school. For the second sort, there will be many values that are considered equal (lots of freshmen, lots of sophomores, lots of juniors, and so on). A stable sort would preserve the order within these groups, which means you would end up with the freshmen grouped together and in alphabetical order, the sophomores grouped together and in alphabetical order, and so on. Excel uses a stable sorting algorithm and the sorting routine in the Java class libraries is also a stable sort. Our merge sort is also stable.

I then briefly described why this sort is so fast. The overall list is split into two halves, the halves are sorted, and then we merge the two back together. Ignore for a moment the amount of work necessary to sort the two sublists. What you're left with is splitting the list in two and merging the sorted lists back together. That will require something like n steps (really 2n steps, but the important part is the "n"). So how much work is done sorting the two sublists? They also are split in half, sorted and merged. Again ignoring the amount of work done in sorting the smaller lists, the total amount of work done in splitting and merging is on the order of n steps. The same thing at the next level down. And so on. So the question becomes, how many levels are there? We start with n items at the first level, n / 2 at the next level, n / 4 at the next level, and so on. How many times would you have to divide n by 2 to get down to a list of length 1 (our base case)? By definition, it is the log to the base 2 of n. So the total number of levels will be on the order of log n. And each level requires work on the order of n. So the total work will be n times log n, or O(n log n). This is much faster than O(n^2).
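
To put rough numbers on that (this is just arithmetic, not something we computed in lecture): for a list of 100 thousand items, n^2 is 10 billion, while n log n is about 100,000 times 17, or roughly 1.7 million, which is smaller by a factor of several thousand.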


Stuart Reges
Last modified: Sun Feb 12 19:39:18 PST 2006