Recursive Algorithm Analysis

Complete the Reading Quiz by 3:00pm before lecture.

Table of contents

  1. Binary search
  2. Common recurrences
  3. Supplemental Material

As a case study, let’s analyze the runtime for the binary search algorithm on a sorted array. We’ve chosen this algorithm because it is commonly used in practice, and employs recursion to progressively narrow down which half of the array an element resides in. The binarySearch method below returns the index of x (the item to search for), or -1 if x is not in the sorted array. Note that we use a private recursive method to keep track of what portion of the array (lo to hi) is currently being considered (you may recall from CSE 143 that this arrangement is sometimes called a “public/private recursive pair”). Trivia: a bug in Java’s binary search was discovered in 2006.

public static int binarySearch(int[] sorted, int x) {
    return binarySearch(sorted, x, 0, sorted.length - 1);
}

private static int binarySearch(int[] sorted, int x, int lo, int hi) {
    if (lo > hi)
        return -1;
    int mid = lo + (hi - lo) / 2; // avoids overflow when lo + hi exceeds Integer.MAX_VALUE
    if (x < sorted[mid])
        return binarySearch(sorted, x, lo, mid - 1);
    else if (x > sorted[mid])
        return binarySearch(sorted, x, mid + 1, hi);
    else
        return mid;
}
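
A minimal sketch of exercising the method (the class name `BinarySearchDemo` is ours; the body mirrors the method above, with `hi` starting at `sorted.length - 1` so every index examined is in bounds):

```java
public class BinarySearchDemo {
    public static int binarySearch(int[] sorted, int x) {
        return binarySearch(sorted, x, 0, sorted.length - 1);
    }

    private static int binarySearch(int[] sorted, int x, int lo, int hi) {
        if (lo > hi)
            return -1;
        int mid = lo + (hi - lo) / 2; // overflow-safe midpoint
        if (x < sorted[mid])
            return binarySearch(sorted, x, lo, mid - 1);
        else if (x > sorted[mid])
            return binarySearch(sorted, x, mid + 1, hi);
        else
            return mid;
    }

    public static void main(String[] args) {
        int[] sorted = {2, 5, 8, 12, 16, 23, 38};
        System.out.println(binarySearch(sorted, 12)); // 3: the immediate middle item
        System.out.println(binarySearch(sorted, 38)); // 6: the last item
        System.out.println(binarySearch(sorted, 7));  // -1: not present
    }
}
```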

What is a good cost model for binary search?

We can count the number of comparisons, the number of calls to binarySearch, or the number of times we compute mid. In the general case, all three will be essentially the same count: for almost every call to binarySearch, mid is computed once, and one comparison is made. In the very first or very last call, these numbers might be off by one – but that’s exactly the kind of “messy constant” we ignore because it won’t affect the end result of our analysis.

What is the best-case order of growth of the runtime?

In the best case, x is the immediate middle item in the sorted array. This takes constant time since it’s not necessary to recurse on either the left or right sides.

What is the worst-case order of growth of the runtime?

We don’t really have the tools to be precise, but we know that each recursive call operates on half the problem. The problem continues to be divided in half until we’re only examining one item.

This pattern of repeatedly dividing in half is mathematically defined by the base-2 logarithmic function, log2 (sometimes written as lg). When no log base is specified, computer scientists typically assume base 2. This assumption doesn’t affect asymptotic analysis because all log bases are within a constant multiplicative factor of each other.
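
As a quick aside on that constant factor (a sketch; the class name `LogBase` is ours): Java's `Math` class provides natural and base-10 logarithms but no `log2`, so base 2 is typically computed with the change-of-base identity, and the ratio between any two log bases is a fixed constant.

```java
public class LogBase {
    // Change of base: log2(n) = ln(n) / ln(2). Any base b differs from base 2
    // only by the constant factor 1 / log2(b) -- hence the base is irrelevant
    // to asymptotic analysis.
    static double log2(double n) {
        return Math.log(n) / Math.log(2);
    }

    public static void main(String[] args) {
        System.out.println(log2(1024));                  // ~10, since 2^10 = 1024
        // The ratio log10(n) / log2(n) is the same constant for every n:
        System.out.println(Math.log10(100) / log2(100)); // ~0.301 = log10(2)
        System.out.println(Math.log10(1e6) / log2(1e6)); // ~0.301 again
    }
}
```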

Our goal is to prove that the worst-case order of growth of the runtime of binarySearch is in Theta(log2 N). In order to solve this, we’ll need a new idea known as a recurrence relation, which models the runtime of a recursive algorithm using mathematical functions.

Goal
Give the runtime in terms of N = hi - lo + 1, the size of the current recursive subproblem.

A recurrence relation, like a recursive function call, has two parts: the non-recursive work (represented by constants in the case of binary search) and the recursive work. To model our recurrence, we define a function T(N) as the maximum number of comparisons (remember, this is a worst-case analysis) to search a sorted subarray of length N. We can define the runtime of binary search using the following recurrence. (Assume floor division for N / 2 to keep the math simple.)

Binary search
T(N) = T(N / 2) + c for N > 1
T(1) = d

Note that we define T(N) and T(1) separately. This is because T(N) works in the general (recursive) case, but since its value recursively depends on T itself, we need T(1) as a base case to “stop” the recursion, similar to how binarySearch has a base case.

We use c to represent constant time spent on non-recursive work, such as checking lo > hi, computing mid, and comparing x with sorted[mid]. d is another constant used specifically for the non-recursive work in the base case, which takes a different amount of constant time: checking lo > hi and immediately returning -1. The time spent on recursive work is then modeled by T(N / 2) because a recursive call to binarySearch will examine either the lower half or upper half of the remaining N items.

We can solve this recurrence relation and find a closed-form solution by unrolling the recurrence: plugging the recurrence back into itself until the base case is reached.

  • T(N) = T(N / 2) + c
  • T(N) = T(N / 4) + c + c
  • T(N) = T(N / 8) + c(3)
  • T(N) = T(N / 2^k) + c(k), after k unrolling steps
  • T(N) = T(N / N) + c(log2 N), since the base case is reached when 2^k = N, i.e. k = log2 N
  • T(N) = d + c(log2 N)
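
We can sanity-check this closed form by counting the comparisons an actual worst-case search performs (a sketch; the class and counter names are ours, and the method body mirrors binarySearch). Counting one comparison per recursive call that examines sorted[mid], a worst-case search of a power-of-two-sized array should make exactly log2(N) + 1 comparisons.

```java
public class RecurrenceCheck {
    static int comparisons; // one "comparison" per call that examines sorted[mid]

    static int search(int[] sorted, int x, int lo, int hi) {
        if (lo > hi)
            return -1;              // base case: no sorted[mid] comparison
        comparisons++;
        int mid = lo + (hi - lo) / 2;
        if (x < sorted[mid])
            return search(sorted, x, lo, mid - 1);
        else if (x > sorted[mid])
            return search(sorted, x, mid + 1, hi);
        else
            return mid;
    }

    public static void main(String[] args) {
        for (int n : new int[]{16, 1024, 1 << 20}) {
            int[] sorted = new int[n];
            for (int i = 0; i < n; i++)
                sorted[i] = i;
            comparisons = 0;
            // Worst case: x is larger than every element, so we recurse until lo > hi.
            search(sorted, n + 1, 0, n - 1);
            int log2 = 31 - Integer.numberOfLeadingZeros(n);
            System.out.println("N = " + n + ": " + comparisons
                    + " comparisons; log2(N) + 1 = " + (log2 + 1));
        }
    }
}
```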

Common recurrences

Each of the common recurrences below has a geometric visualization that can be used to provide a visual proof of its corresponding order of growth. We’ll develop some of this visual intuition in class, but it’s helpful to walk through each of these recurrences to try and understand why it corresponds to that order of growth. It’s sometimes necessary to use one of the familiar summations introduced earlier to simplify a solution. We recommend that you keep these common recurrences handy as you continue working with algorithmic analysis.

Recurrence               Order of Growth
T(N) = T(N / 2) + c      log2 N
T(N) = 2T(N / 2) + c     N
T(N) = T(N - 1) + c      N
T(N) = T(N - 1) + cN     N^2
T(N) = 2T(N - 1) + c     2^N
T(N) = T(N / 2) + cN     N
T(N) = 2T(N / 2) + cN    N log2 N
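
One way to build intuition for the table is to evaluate a recurrence directly and compare it against the claimed order of growth. The sketch below (our own class, taking c = 1 and T(1) = 1) does this for the mergesort-style last row, T(N) = 2T(N / 2) + cN:

```java
public class CommonRecurrences {
    // Evaluate T(N) = 2T(N / 2) + cN with c = 1 and T(1) = 1 (floor division).
    // Both half-size subproblems are the same size, so one recursive call,
    // doubled, suffices to evaluate T.
    static long t(long n) {
        if (n <= 1)
            return 1;               // base case T(1) = 1
        return 2 * t(n / 2) + n;
    }

    public static void main(String[] args) {
        // For N a power of two, T(N) = N log2(N) + N, so T(N) / (N log2 N) -> 1.
        for (long n = 2; n <= (1L << 20); n *= 4) {
            double nLog2N = n * (Math.log(n) / Math.log(2));
            System.out.printf("N = %7d  T(N) = %9d  T(N) / (N log2 N) = %.3f%n",
                    n, t(n), t(n) / nLog2N);
        }
    }
}
```

The printed ratio settling toward 1 as N grows is the numerical counterpart of the N log2 N entry in the table.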

Supplemental Material

Although we only described how to use recurrences for a recursive algorithm here, recurrences can be used to model all kinds of algorithms, both recursive and iterative.

Consider how we could also express an iterative analysis using recurrences. As an example, suppose we have an iterative algorithm that examines every index i in an array using a for loop. This could be reformulated as a recursion: looping over an array of size N is equivalent to examining the first element and then recursively examining the remaining array of length N - 1. In this recursive formulation of the iterative problem, T(N) would be defined in terms of T(N - 1).
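
As a concrete sketch of that reformulation (the class and method names are ours): summing an array with a loop and summing it by recursing on the remaining N - 1 elements do the same work, and the recursive version's runtime follows T(N) = T(N - 1) + c, which is in Theta(N).

```java
public class IterativeAsRecurrence {
    // Iterative version: examines every index i with a for loop.
    static int sum(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++)
            total += a[i];
        return total;
    }

    // Recursive reformulation: examine a[i], then recurse on the remaining
    // N - 1 elements, so the runtime satisfies T(N) = T(N - 1) + c.
    static int sum(int[] a, int i) {
        if (i >= a.length)
            return 0;                // base case: nothing left to examine
        return a[i] + sum(a, i + 1);
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 4, 1, 5};
        System.out.println(sum(a));    // 14
        System.out.println(sum(a, 0)); // 14: same result, same linear runtime
    }
}
```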
