Asymptotic Analysis

Runtime analysis process
Big-theta
Big-theta notation
Big-oh
Big-oh notation
Big-omega notation
dup1
dup2
isPrime

Runtime analysis process

Recall that modeling the time complexity of an algorithm involves categorizing the factors that affect runtime. One factor was selected as the asymptotic variable, leaving all other remaining factors for case analysis.

The asymptotic variable is the factor that best represents the overall size of the input. We want to compare algorithms in terms of how they handle massive amounts of data since the size of the dataset if often what’s most directly responsible for longer computation time. The asymptotic variable is typically given in the problem by the phrases, “with respect to”, “in terms of”, or “as a function of”.
All other factors fall under the umbrella of case analysis. Even with a large input, depending on the other factors, the algorithm might be able to finish computation early if we can take advantage of the structure of the data. We will focus on the best case and worst case, which subsumes all of the remaining factors down to just the particular combinations of factors that result in the best (fastest) running time and worst (slowest) running time.

Consider the following dup1 algorithm for determining if an array A contains any duplicate values by checking every possible pair of values until a duplicate is found.

static boolean dup1(int[] A) {
    int N = A.length;
    for (int i = 0; i < N; i += 1)
        for (int j = i + 1; j < N; j += 1)
            if (A[i] == A[j])
                return true;
    return false;
}

In these questions, we’ll walk through how to characterize the time complexity for dup1 in the best case and worst case with respect to N, the length of the array.

Question 1

If the asymptotic variable is N, describe an input array that would result in dup1 executing only 2 less-than < operations.

Explanation

Asymptotic analysis suggests that we’re analyzing the runtime as the length of the array tends towards infinity, but it doesn’t say anything about the values in the array (case analysis)!

Question 2

Explain why this answer to the above question isn’t quite right.

An array of length 1 would only execute 2 less-than < operations. We’ll check i < N twice before returning false since there are no pairs of elements to compare.

Explanation

While this example does indeed only execute 2 less-than < operations, it does not agree with the asymptotic variable. By stating the asymptotic variable is the length of the array, we’re assuming a valid answer must describe a very long array, which doesn’t agree with a 1-element long example.

Question 3

In the above questions, we came up with a best case order of growth of the runtime for dup1 with respect to N, the length of the array.

Give the worst case order of growth of the runtime for dup1 with respect to N, the length of the array.

Explanation

The worst case runtime is fairly subtle. Make sure that your runtime argument takes into consideration that the number of iterations of j reduces as i gets bigger!

Big-theta

Big-theta notation

In formalizing the time complexity of an algorithm, we introduced the concept of order of growth as a relation between the size of the input and the time it takes to evaluate the algorithm on that input. In this slide, we’ll formally define three common asymptotic notation: Θ( ⋅ ), O( ⋅ ), and Ω( ⋅ ). These notation provide more precise mathematical definitions for our concept of order of growth.

The worst case order of growth of the runtime for dup1 is quadratic with respect to N.

In big-theta notation, we could let R(N) represent the runtime as a function of N and rephrase the above statement as R(N) ∈ Θ(N²).

R(N) ∈ Θ(f(N)) means there exist positive constants k₁ and k₂ such that k₁ ⋅ f(N) ≤ R(N) ≤ k₂ ⋅ f(N) for all values of N greater than some N₀.

In other words, we choose an order of growth f(N) and pick two constants k₁ and k₂ so that the runtime is bounded above and below for “very large N.” For example, we can bound 40sin (N) + 4N² above by 5N² (blue) and below by 3N² (green). This holds for all N greater than 6.

Question 1

Identify a big-theta bound for R(N) = (4N² + 3Nln N) / 2 by finding a simple f(N) and corresponding k₁ and k₂ in the online graphing calculator.

Explanation

You can definitely come up with tighter bounds for any of the constants k₁, k₂, N₀. But that’s not necessary since the big-theta definition doesn’t require anything more specific.

Question 2

Identify all valid big-theta bounds for R(N) = (4N² + 3Nln N) / 2.

Big-oh

Big-oh notation

R(N) ∈ O(f(N)) means there exists a positive constant k₂ such that R(N) ≤ k₂ ⋅ f(N) for all values of N greater than some N₀.

Big-oh notation is similar to big-theta notation, but only describes a loose upper bound. For example, we can bound 40sin (N) + 4N² above by 5N⁴ (blue). This holds for all N greater than 2.

Question 1

Identify all valid big-oh bounds for R(N) = (4N² + 3Nln N) / 2.

Big-omega notation

R(N) ∈ Ω(f(N)) means there exists a positive constant k₁ such that k₁ ⋅ f(N) ≤ R(N) for all values of N greater than some N₀.

If big-oh notation represented the upper bound only, big-omega notation represents the lower bound only. For example, we can bound 40sin (N) + 4N² below by 30N (green). This holds for all N greater than 7.

Question 1

Identify all valid big-omega bounds for R(N) = (4N² + 3Nln N) / 2.

dup1

Earlier, we walked through the runtime analysis process for dup1.

In the best case, the order of growth of the runtime for dup1 is constant.
In the worst case, the order of growth of the runtime for dup1 is quadratic with respect to N, the length of the array.

How might we express these descriptions using big-theta, big-oh, and big-omega notation?

Question 1

Which of these asymptotic bounds are valid descriptions for the best case runtime for dup1?

Question 2

Which of these asymptotic bounds are valid descriptions for the worst case runtime for dup1?

Question 3

Which of these asymptotic bounds are valid descriptions for the overall (considering best case, worst case, and everything in between) runtime for dup1?

Explanation

Because the Ω( ⋅ ) and O( ⋅ ) bounds do not agree, a Θ( ⋅ ) bound does not exist.

dup2

An algorithm for determining if an array contains duplicates can be more efficient than dup1 if the given array is already sorted. With a sorted array, any duplicate values must be found in neighboring array indices.

static boolean dup2(int[] sorted) {
    int N = sorted.length;
    for (int i = 0; i < N - 1; i += 1)
        if (sorted[i] == sorted[i + 1])
            return true;
    return false;
}

Question 1

Which of these asymptotic bounds are valid descriptions for the best case runtime for dup2?

Question 2

Which of these asymptotic bounds are valid descriptions for the worst case runtime for dup2?

Question 3

Which of these asymptotic bounds are valid descriptions for the overall (considering best case, worst case, and everything in between) runtime for dup2?

isPrime

Consider the isPrime algorithm for checking if an integer is a prime number. Prime numbers are integers greater than 1 that are only divisible by 1 and the prime number itself.

static boolean isPrime(int N) {
    if (N <= 1)
        return false;
    for (int i = 2; i < N; i += 1)
        if (N % i == 0)
            return false;
    return true;
}

A graph of the runtime shows how the time complexity depends on the value of N. Notice how even numbers (divisible by 2) are fast to compute but odd numbers can take much longer to determine primality.

isPrime

Question 1

True/False: Based on our definitions for asymptotic notation, the runtime for isPrime with respect to the parameter N has a separate best case and worst case analysis.

Explanation

Case analysis does not apply here because we have already chosen N as the asymptotic variable. All of the asymptotic notation state that the bound must hold “for all values of N greater than N₀.” Asymptotic analysis does not differentiate between odd and even numbers.

In contrast, dup1 and dup2 had other factors that allowed for us to perform separate case analysis by considering the possible values in the array (not just the asymptotic variable, the array length).

Question 2

Give a tight, simplified asymptotic (overall) runtime bound for isPrime with respect to the parameter N.

Give a big-theta bound if it exists.
Otherwise, give the tightest-possible big-oh and big-omega bounds.

Asymptotic Analysis

Table of contents

Runtime analysis process

Question 1

Question 2

Question 3

Big-theta

Big-theta notation

Question 1

Question 2

Big-oh

Big-oh notation

Question 1

Big-omega notation

Question 1

dup1

Question 1

Question 2

Question 3

dup2

Question 1

Question 2

Question 3

isPrime

Question 1

Question 2