The time has finally come! So far this quarter we’ve mostly been doing setup, but now we can work towards our main goals for the quarter! Today we will be looking at our first algorithm design technique!
An algorithm design technique is sort of a design pattern for algorithms. It’s a deliberate way to approach building an algorithm, intended to be generalizable to many different problems. As we’re looking at algorithm design techniques throughout the quarter, it’s important to keep in mind that there are many of these, and none are universal. In general, there is no algorithm for writing algorithms, but there are strategies that often work.
The first algorithm design strategy of the quarter is called the Divide and Conquer technique. To very briefly summarize, divide and conquer algorithms work by breaking up a large problem into smaller versions of itself, solving each smaller version recursively, and then reassembling those solutions into a final answer. You’ve actually already seen some examples of this technique before! In this reading we will look at those examples, then highlight the patterns they use that we can then generalize for future algorithms.
The classic example of a divide and conquer algorithm is mergesort. As a refresher, mergesort works by:
- splitting the list in half,
- recursively sorting each half, and
- merging the two sorted halves back together into one sorted list.
The key insight that mergesort leverages is that merging two already-sorted lists is easier than sorting a list from scratch; it then applies that idea recursively. Because the algorithm is recursive, we also need to have a base case; in this case it is when the list is of size 0 or 1, as those lists cannot be split in half, and they are conveniently already sorted!
A divide and conquer algorithm consists of four parts:
- Base Case: inputs small enough to be solved directly, without any recursion.
- Divide: split the input into smaller subproblems, each a smaller instance of the same problem.
- Conquer: recursively solve each of those subproblems.
- Combine: reassemble the subproblem solutions into a solution for the original input.
As we saw with mergesort, divide and conquer algorithms are helpful in situations where combining solutions from smaller subproblems is easier than solving the larger superproblem from scratch.
Once we have identified each of these components, our algorithm will generally behave as follows:
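Here is a minimal sketch of that general shape in Python, where the four components are supplied as functions (the parameter names are just illustrative placeholders, not part of any particular algorithm):

def divide_and_conquer(problem, is_base_case, solve_directly, divide, combine):
    # Base Case: small enough inputs are solved directly, without recursion.
    if is_base_case(problem):
        return solve_directly(problem)
    # Divide: break the problem into smaller subproblems.
    subproblems = divide(problem)
    # Conquer: recursively solve each subproblem.
    subsolutions = [divide_and_conquer(sub, is_base_case, solve_directly, divide, combine)
                    for sub in subproblems]
    # Combine: reassemble the subproblem solutions into the final answer.
    return combine(subsolutions)

Any specific divide and conquer algorithm is then just a particular choice of those four pieces.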
We can now reverse engineer mergesort by identifying the Base Case, Divide, and Combine steps of divide and conquer algorithms, then plugging them into that generic pseudocode! (We do not need to identify the conquer step, as that is always the same: just recursively solve all subproblems.) In this case, those three steps for mergesort are:
- Base Case: a list of size 0 or 1 cannot be split in half, and is already sorted, so return it as-is.
- Divide: split the list into two halves at the middle index.
- Combine: merge the two sorted halves into one sorted list.
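Plugging those three pieces into the generic recipe gives something like the following sketch (a simple list-based Python version; this is one reasonable way to write it, not the only one):

def merge_sort(lst):
    # Base Case: lists of size 0 or 1 are already sorted.
    if len(lst) <= 1:
        return lst
    # Divide: split the list in half at the middle index.
    mid = len(lst) // 2
    # Conquer: recursively sort each half.
    left = merge_sort(lst[:mid])
    right = merge_sort(lst[mid:])
    # Combine: merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged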
Quicksort, which you saw in CSE373, is another example of a divide and conquer algorithm! As a reminder, here’s the behavior of quicksort:
- pick an element of the list to be the pivot,
- partition the list so that everything less than the pivot comes before it and everything greater than (or equal to) the pivot comes after it, and
- recursively sort the portion before the pivot and the portion after the pivot.
In this case, our three steps are:
- Base Case: a list of size 0 or 1 is already sorted, so there is nothing to do.
- Divide: choose a pivot and partition the list around it, so that smaller elements end up to its left and larger elements to its right.
- Combine: nothing left to do; once both sides are sorted the whole list is sorted, so just return it.
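A minimal in-place Python sketch of that behavior might look like the following (it assumes the last element of each range is used as the pivot; real implementations typically choose pivots more carefully, e.g. randomly):

def quick_sort(lst, lo=0, hi=None):
    # Sorts lst[lo..hi] in place.
    if hi is None:
        hi = len(lst) - 1
    # Base Case: ranges of size 0 or 1 are already sorted.
    if lo >= hi:
        return lst
    # Divide: partition around a pivot so that smaller elements end up
    # to its left and larger elements to its right.
    pivot = lst[hi]
    i = lo
    for j in range(lo, hi):
        if lst[j] < pivot:
            lst[i], lst[j] = lst[j], lst[i]
            i += 1
    lst[i], lst[hi] = lst[hi], lst[i]
    # Conquer: recursively sort both sides of the pivot.
    quick_sort(lst, lo, i - 1)
    quick_sort(lst, i + 1, hi)
    # Combine: nothing to do; the range is already sorted at this point.
    return lst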
Notice that even though quicksort has all the same divide-and-conquer components as mergesort, the computational work is not distributed the same way between them. In mergesort the work of comparing and reordering elements occurred in the combine step. In quicksort, that occurred in the divide step. By comparison, the divide step of mergesort was really simple (just split at the middle index), and the combine step of quicksort was trivial (literally just return the answer, because it was already finished).
These two algorithms are at extreme ends of the workload distribution. When discussing divide and conquer, we’ll see some algorithms like mergesort where the majority of the work is done when combining, we’ll see some like quicksort where the majority of the work is done when dividing, and we’ll see some where there is meaningful work done in both steps. All of these will still fit the divide and conquer paradigm!
As a reminder, when demonstrating the correctness of any algorithm, we need to show:
- Soundness: the algorithm never performs an invalid operation.
- Termination: the algorithm eventually finishes on every input.
- Validity: the answer the algorithm produces is the correct one.
Because divide and conquer algorithms are expressed recursively, we will have slightly different approaches to showing some of these.
Showing soundness of a divide and conquer algorithm should follow the same idea as the algorithms we have seen thus far. We simply need to defend how we know that steps which risk producing errors will never actually produce them (e.g. justifying that denominators are never 0, or that indices are always in range).
Showing termination will be a bit different. Before, when looking at iterative code (i.e. code that used for loops and while loops), showing termination meant justifying that the loop condition would eventually become false, so we’d exit the loop. For recursive code, termination happens when a base case is reached. Therefore, to justify termination, we need to show that every chain of subproblems will eventually reach a base case. Usually we do this by showing that all inputs that are larger than the base case threshold are split into subproblems that are strictly smaller than the original. This guarantees that we’re always making progress towards the base case, and so we will reach it eventually.
Showing validity is somewhat similar to what we saw with loop invariants; we just have a bit of a recursive twist on it. To show validity we will show two things:
1. The base cases produce correct answers.
2. Assuming the recursive calls (the conquer step) return correct answers for the subproblems, the combine step produces a correct answer for the original problem.
This achieves the goal of demonstrating validity because step 2 shows us that we always combine small correct answers into larger correct answers, and step 1 shows us that the smallest answers are always correct. Combining these insights together results in the conclusion that answers start out correct in the base cases, and we always combine correct answers into larger correct answers, so all our answers must be correct.
As an aside, for those with prior background in proof writing, you may recognize this proof technique as a proof by structural induction. If you have not heard this term before, no worries; there’s no expectation that you know it for this course.
Let’s apply these to a proof of correctness for mergesort (skipping soundness for brevity).
Termination: If the list is of size 0 or 1 then the algorithm clearly terminates. If the list is size n>1 then we will split it into two lists, each of which is no larger than \lceil\frac{n}{2}\rceil. When n>1 we are guaranteed that \lceil\frac{n}{2}\rceil < n, so we will eventually reach a base case.
Validity: Lists of size 0 or 1 only have 1 permutation, and so they are guaranteed to be sorted. This means that our base case produces the correct answer. Assuming our recursive calls correctly sort the left and right lists, we can then conclude that the merge step produces a correctly sorted list overall. First, we observe that all elements are present in the final list because its elements will be the union of those in the left and right sublists, therefore making it a permutation of the original list. Second, we can conclude all elements in the final list are in the correct order. By the assumption that the conquer step sorts the sublists, each sublist is itself in the correct order. Since we only ever extract elements from the sublists in order, the only possible way to have two elements in the wrong order is if they were from different sublists. Because the merge always adds whichever sublist's next element is smaller, this cannot happen either. Therefore our algorithm correctly produces a sorted permutation of the list.
Again, due to the recursive nature of divide and conquer algorithms, we’re going to need a slightly different approach in expressing their running times as well. Ultimately, our goal is still the same: we want to identify a function whose input is the size of the algorithm’s input, and whose output is the worst-case count of operations. Because the algorithm is recursive, it will be easiest to make our running time function recursive initially, then convert it into an equivalent non-recursive (i.e. closed) form. A running time function defined recursively like this is called a recurrence relation.
In general, the recurrence relation for the running time will have the form T(n)=aT(\frac{n}{b})+f(n), where:
- a is the number of subproblems we recursively solve,
- \frac{n}{b} is the size of each subproblem (i.e. b is the factor by which the input shrinks), and
- f(n) is the amount of non-recursive work done in the divide and combine steps.
With all of these components, we can understand the recurrence relation T(n)=aT(\frac{n}{b})+f(n) as saying: the amount of time it takes to run the algorithm on an input of size n is equal to however long it takes to run that algorithm on a copies that are each of size \frac{n}{b}, plus f(n) additional work.
Applying this to mergesort we get:
- a=2, since we make two recursive calls (one per half),
- b=2, since each half has size \frac{n}{2}, and
- f(n)=n, since splitting the list and merging the two sorted halves takes linear work.
And so the recurrence relation for the running time of mergesort is T(n)=2T(\frac{n}{2})+n.
From here we need to convert T(n) into a non-recursive function that gives us the same answer. There are ways to do this from scratch, but for this course we will just use a super handy tool called the Master Theorem.
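To give a taste of what the from-scratch approach looks like, we can unroll the mergesort recurrence a few levels (assuming, to keep the arithmetic clean, that n is a power of 2 and that T(1)=0):

T(n) = 2T(\frac{n}{2}) + n = 4T(\frac{n}{4}) + 2n = 8T(\frac{n}{8}) + 3n = \cdots = 2^k T(\frac{n}{2^k}) + kn

After k=\log_2 n levels the subproblems reach the base case, leaving T(n) = n \cdot T(1) + n\log_2 n = n\log_2 n, which is O(n \log n). The Master Theorem lets us skip this bookkeeping.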
Master Theorem: Suppose T(n)=aT(\frac{n}{b})+O(n^k). Then:
- Case 1: if a < b^k, then T(n)=O(n^k).
- Case 2: if a = b^k, then T(n)=O(n^k \log n).
- Case 3: if a > b^k, then T(n)=O(n^{\log_b a}).
Now let’s apply this to the mergesort relation T(n)=2T(\frac{n}{2})+n. The Master Theorem applies with a=2, b=2, and k=1, so we want to compare a with b^k, which are 2 and 2^1. Because a=b^k, case 2 applies, giving us a running time of O(n \log n).
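As a quick numerical sanity check (a sketch assuming n is a power of 2 and that the base case costs 0 operations), we can evaluate the recurrence directly in Python and compare it against n \log_2 n:

import math

def T(n):
    # Mergesort recurrence: T(n) = 2T(n/2) + n, with T(1) = 0 assumed.
    if n <= 1:
        return 0
    return 2 * T(n // 2) + n

for n in [2, 8, 64, 1024]:
    print(n, T(n), n * math.log2(n))

For these inputs the recurrence matches n \log_2 n exactly, which is consistent with the O(n \log n) bound the Master Theorem gives us.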