# Asymptotic Analysis

## Table of contents

- Runtime analysis process
- Big-theta
- Big-theta notation
- Big-oh
- Big-oh notation
- Big-omega notation
- dup1
- dup2
- isPrime

## Runtime analysis process

Recall that modeling the time complexity of an algorithm involves categorizing the factors that affect runtime. One factor was selected as the asymptotic variable, leaving the remaining factors for case analysis.

- The **asymptotic variable** is the factor that best represents the overall size of the input. We want to compare algorithms in terms of how they handle massive amounts of data, since the size of the dataset is often what’s most directly responsible for longer computation time. The asymptotic variable is typically given in the problem by the phrases “with respect to”, “in terms of”, or “as a function of”.
- All other factors fall under the umbrella of **case analysis**. Even with a large input, depending on the other factors, the algorithm might be able to finish computation early if we can take advantage of the structure of the data. We will focus on the **best case** and **worst case**, which subsume all of the remaining factors down to just the particular combinations of factors that result in the best (fastest) running time and worst (slowest) running time.

Consider the following `dup1` algorithm for determining if an array `A` contains any duplicate values by checking every possible pair of values until a duplicate is found.

```java
static boolean dup1(int[] A) {
    int N = A.length;
    for (int i = 0; i < N; i += 1)
        for (int j = i + 1; j < N; j += 1)
            if (A[i] == A[j])
                return true;
    return false;
}
```
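As a quick sanity check, the method can be exercised on small inputs (the driver class below is a hypothetical wrapper around the `dup1` code from above):

```java
public class Dup1Demo {
    static boolean dup1(int[] A) {
        int N = A.length;
        for (int i = 0; i < N; i += 1)
            for (int j = i + 1; j < N; j += 1)
                if (A[i] == A[j])
                    return true;
        return false;
    }

    public static void main(String[] args) {
        System.out.println(dup1(new int[]{1, 2, 3, 1})); // true: A[0] == A[3]
        System.out.println(dup1(new int[]{1, 2, 3, 4})); // false: all values distinct
    }
}
```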

In these questions, we’ll walk through how to characterize the time complexity for `dup1` in the best case and worst case with respect to `N`, the length of the array.

### Question 1

If the asymptotic variable is `N`, describe an input array that would result in `dup1` executing only 2 less-than `<` operations.

## Explanation

Asymptotic analysis suggests that we’re analyzing the runtime as the length of the array tends towards infinity, but it doesn’t say anything about the values in the array (case analysis)!
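One such input is a very long array whose first two elements are equal. The sketch below (hypothetical class and `lt` helper names) wraps each `<` in a counting helper to show that `dup1` returns after exactly 2 less-than operations, no matter how long the array is:

```java
public class BestCaseDemo {
    static int lessThanCount;

    // Counting wrapper around the < operator.
    static boolean lt(int a, int b) {
        lessThanCount += 1;
        return a < b;
    }

    // dup1 with every < replaced by the counting wrapper.
    static boolean dup1(int[] A) {
        lessThanCount = 0;
        int N = A.length;
        for (int i = 0; lt(i, N); i += 1)
            for (int j = i + 1; lt(j, N); j += 1)
                if (A[i] == A[j])
                    return true;
        return false;
    }

    public static void main(String[] args) {
        int[] A = new int[1000000]; // a very long array...
        A[0] = 7;
        A[1] = 7;                   // ...with a duplicate in the first two slots
        System.out.println(dup1(A));        // true
        System.out.println(lessThanCount);  // 2
    }
}
```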

### Question 2

Explain why this answer to the above question isn’t quite right.

An array of length 1 would only execute 2 less-than `<` operations. We’ll check `i < N` twice before returning false since there are no pairs of elements to compare.

## Explanation

While this example does indeed only execute 2 less-than `<` operations, it does not agree with the asymptotic variable. By stating that the asymptotic variable is the length of the array, we’re assuming a valid answer must describe a very long array, which a 1-element example does not.

### Question 3

In the above questions, we came up with a **best case** order of growth of the runtime for `dup1` with respect to `N`, the length of the array.

Give the **worst case** order of growth of the runtime for `dup1` with respect to `N`, the length of the array.

- *Θ*(1)
- *Θ*(log *N*)
- *Θ*(*N* log *N*)
- *Θ*(*N*^{2})
- *Θ*(*N*^{2} log *N*)
- *Θ*(*N*^{3})

## Explanation

The worst case runtime is fairly subtle. Make sure that your runtime argument takes into consideration that the number of iterations of `j` decreases as `i` gets bigger!
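Concretely, on an all-distinct array (the worst case, since there is no early return), the inner loop body runs N − 1, N − 2, …, 1 times, for a total of N(N − 1)/2 ∈ *Θ*(*N*^{2}) comparisons. A small counting sketch (hypothetical class name):

```java
public class WorstCaseCount {
    // Returns the number of A[i] == A[j] comparisons dup1 performs on an
    // all-distinct array of length N (the worst case: no early return).
    static long comparisons(int N) {
        long count = 0;
        for (int i = 0; i < N; i += 1)
            for (int j = i + 1; j < N; j += 1)
                count += 1; // one equality comparison per inner iteration
        return count;
    }

    public static void main(String[] args) {
        for (int N : new int[]{10, 100, 1000})
            // Matches the closed form N * (N - 1) / 2.
            System.out.println(N + ": " + comparisons(N)
                    + " vs " + (long) N * (N - 1) / 2);
    }
}
```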

## Big-theta

## Big-theta notation

In formalizing the time complexity of an algorithm, we introduced the concept of **order of growth** as a relation between the size of the input and the time it takes to evaluate the algorithm on that input. In this slide, we’ll formally define three common asymptotic notations: *Θ*( ⋅ ), *O*( ⋅ ), and *Ω*( ⋅ ). These notations provide more precise mathematical definitions for our concept of order of growth.

The worst case order of growth of the runtime for `dup1` is quadratic with respect to `N`.

In big-theta notation, we could let *R*(*N*) represent the runtime as a function of *N* and rephrase the above statement as *R*(*N*) ∈ *Θ*(*N*^{2}).

*R*(*N*) ∈ *Θ*(*f*(*N*)) means there exist positive constants *k*_{1} and *k*_{2} such that *k*_{1} ⋅ *f*(*N*) ≤ *R*(*N*) ≤ *k*_{2} ⋅ *f*(*N*) for all values of *N* greater than some *N*_{0}.

In other words, we choose an order of growth *f*(*N*) and pick two constants *k*_{1} and *k*_{2} so that the runtime is bounded above and below for “very large *N*.” For example, we can bound 40 sin(*N*) + 4*N*^{2} above by 5*N*^{2} (blue) and below by 3*N*^{2} (green). This holds for all *N* greater than 6.
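We can spot-check these constants numerically, at least at integer values of *N* (a sketch; the class name is hypothetical):

```java
public class ThetaBoundCheck {
    // True iff 3N^2 <= 40 sin(N) + 4N^2 <= 5N^2 for every integer N in [7, limit].
    static boolean boundsHold(int limit) {
        for (int N = 7; N <= limit; N += 1) {
            double R = 40 * Math.sin(N) + 4.0 * N * N;
            if (!(3.0 * N * N <= R && R <= 5.0 * N * N))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // |40 sin(N)| <= 40 <= N^2 once N >= 7, so both bounds should hold.
        System.out.println(boundsHold(10000)); // true
    }
}
```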

### Question 1

Identify a big-theta bound for *R*(*N*) = (4*N*^{2} + 3*N*ln *N*) / 2 by finding a simple *f*(*N*) and corresponding *k*_{1} and *k*_{2} in the online graphing calculator.

## Explanation

You can definitely come up with tighter bounds for any of the constants *k*_{1}, *k*_{2}, *N*_{0}. But that’s not necessary since the big-theta definition doesn’t require anything more specific.

### Question 2

Identify all valid big-theta bounds for *R*(*N*) = (4*N*^{2} + 3*N*ln *N*) / 2.

- *Θ*(1)
- *Θ*(log *N*)
- *Θ*(*N*)
- *Θ*(*N* log *N*)
- *Θ*(*N*^{2})
- *Θ*(*N*^{2} log *N*)
- *Θ*(*N*^{3})

## Big-oh

## Big-oh notation

*R*(*N*) ∈ *O*(*f*(*N*)) means there exists a positive constant *k*_{2} such that *R*(*N*) ≤ *k*_{2} ⋅ *f*(*N*) for all values of *N* greater than some *N*_{0}.

Big-oh notation is similar to big-theta notation, but only describes an upper bound, which need not be tight. For example, we can bound 40 sin(*N*) + 4*N*^{2} above by 5*N*^{4} (blue). This holds for all *N* greater than 2.

### Question 1

Identify all valid big-oh bounds for *R*(*N*) = (4*N*^{2} + 3*N*ln *N*) / 2.

- *O*(1)
- *O*(log *N*)
- *O*(*N*)
- *O*(*N* log *N*)
- *O*(*N*^{2})
- *O*(*N*^{2} log *N*)
- *O*(*N*^{3})

## Big-omega notation

*R*(*N*) ∈ *Ω*(*f*(*N*)) means there exists a positive constant *k*_{1} such that *k*_{1} ⋅ *f*(*N*) ≤ *R*(*N*) for all values of *N* greater than some *N*_{0}.

Whereas big-oh notation represents only an upper bound, big-omega notation represents only a lower bound. For example, we can bound 40 sin(*N*) + 4*N*^{2} below by 30*N* (green). This holds for all *N* greater than 7.
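The same kind of numeric spot-check works for this one-sided bound (integer *N* only; hypothetical class name):

```java
public class OmegaBoundCheck {
    // True iff 30N <= 40 sin(N) + 4N^2 for every integer N in [8, limit].
    static boolean lowerBoundHolds(int limit) {
        for (int N = 8; N <= limit; N += 1)
            if (30.0 * N > 40 * Math.sin(N) + 4.0 * N * N)
                return false;
        return true;
    }

    public static void main(String[] args) {
        // The 4N^2 term dominates both 30N and the bounded sine term.
        System.out.println(lowerBoundHolds(10000)); // true
    }
}
```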

### Question 1

Identify all valid big-omega bounds for *R*(*N*) = (4*N*^{2} + 3*N*ln *N*) / 2.

- *Ω*(1)
- *Ω*(log *N*)
- *Ω*(*N*)
- *Ω*(*N* log *N*)
- *Ω*(*N*^{2})
- *Ω*(*N*^{2} log *N*)
- *Ω*(*N*^{3})

## dup1

Earlier, we walked through the runtime analysis process for `dup1`.

- In the **best case**, the order of growth of the runtime for `dup1` is constant.
- In the **worst case**, the order of growth of the runtime for `dup1` is quadratic with respect to `N`, the length of the array.

How might we express these descriptions using big-theta, big-oh, and big-omega notation?

### Question 1

Which of these asymptotic bounds are valid descriptions for the **best case** runtime for `dup1`?

- *O*(1)
- *O*(*N*^{2})
- *Ω*(1)
- *Ω*(*N*^{2})
- *Θ*(1)
- *Θ*(*N*^{2})

### Question 2

Which of these asymptotic bounds are valid descriptions for the **worst case** runtime for `dup1`?

- *O*(1)
- *O*(*N*^{2})
- *Ω*(1)
- *Ω*(*N*^{2})
- *Θ*(1)
- *Θ*(*N*^{2})

### Question 3

Which of these asymptotic bounds are valid descriptions for the **overall** (considering best case, worst case, and everything in between) runtime for `dup1`?

- *O*(1)
- *O*(*N*^{2})
- *Ω*(1)
- *Ω*(*N*^{2})
- *Θ*(1)
- *Θ*(*N*^{2})

## Explanation

Because the *Ω*( ⋅ ) and *O*( ⋅ ) bounds do not agree, a *Θ*( ⋅ ) bound does not exist.

## dup2

An algorithm for determining if an array contains duplicates can be more efficient than `dup1` if the given array is already sorted. With a sorted array, any duplicate values must be found in neighboring array indices.

```java
static boolean dup2(int[] sorted) {
    int N = sorted.length;
    for (int i = 0; i < N - 1; i += 1)
        if (sorted[i] == sorted[i + 1])
            return true;
    return false;
}
```
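As with `dup1`, we can sanity-check the method on small inputs (the class below is a hypothetical wrapper around the `dup2` code above):

```java
public class Dup2Demo {
    static boolean dup2(int[] sorted) {
        int N = sorted.length;
        for (int i = 0; i < N - 1; i += 1)
            if (sorted[i] == sorted[i + 1])
                return true;
        return false;
    }

    public static void main(String[] args) {
        System.out.println(dup2(new int[]{1, 2, 2, 3})); // true: equal neighbors
        System.out.println(dup2(new int[]{1, 2, 3, 4})); // false: strictly increasing
    }
}
```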

### Question 1

Which of these asymptotic bounds are valid descriptions for the **best case** runtime for `dup2`?

- *O*(1)
- *O*(*N*)
- *O*(*N*^{2})
- *Ω*(1)
- *Ω*(*N*)
- *Ω*(*N*^{2})
- *Θ*(1)
- *Θ*(*N*)
- *Θ*(*N*^{2})

### Question 2

Which of these asymptotic bounds are valid descriptions for the **worst case** runtime for `dup2`?

- *O*(1)
- *O*(*N*)
- *O*(*N*^{2})
- *Ω*(1)
- *Ω*(*N*)
- *Ω*(*N*^{2})
- *Θ*(1)
- *Θ*(*N*)
- *Θ*(*N*^{2})

### Question 3

Which of these asymptotic bounds are valid descriptions for the **overall** (considering best case, worst case, and everything in between) runtime for `dup2`?

- *O*(1)
- *O*(*N*)
- *O*(*N*^{2})
- *Ω*(1)
- *Ω*(*N*)
- *Ω*(*N*^{2})
- *Θ*(1)
- *Θ*(*N*)
- *Θ*(*N*^{2})

## isPrime

Consider the `isPrime` algorithm for checking if an integer is a prime number. **Prime numbers** are integers greater than 1 that are only divisible by 1 and themselves.

```java
static boolean isPrime(int N) {
    if (N <= 1)
        return false;
    for (int i = 2; i < N; i += 1)
        if (N % i == 0)
            return false;
    return true;
}
```
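A few sample calls (the class below is a hypothetical wrapper around the `isPrime` code above):

```java
public class IsPrimeDemo {
    static boolean isPrime(int N) {
        if (N <= 1)
            return false;
        for (int i = 2; i < N; i += 1)
            if (N % i == 0)
                return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPrime(2));   // true: loop body never runs
        System.out.println(isPrime(9));   // false: 9 % 3 == 0
        System.out.println(isPrime(13));  // true: no divisor in 2..12
    }
}
```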

A graph of the runtime shows how the time complexity depends on the value of `N`. Notice how even numbers (divisible by 2) are fast to compute, but odd numbers can take much longer to determine primality.

### Question 1

True/False: Based on our definitions for asymptotic notation, the runtime for `isPrime` with respect to the parameter `N` has a separate best case and worst case analysis.

## Explanation

Case analysis **does not** apply here because we have already chosen `N` as the asymptotic variable. All of the asymptotic notations state that the bound must hold “for all values of *N* greater than *N*_{0}.” Asymptotic analysis does not differentiate between odd and even numbers.

In contrast, `dup1` and `dup2` had other factors that allowed us to perform separate case analysis by considering the possible values in the array (not just the asymptotic variable, the array length).

### Question 2

Give a tight, simplified asymptotic (overall) runtime bound for `isPrime` with respect to the parameter `N`.

- Give a big-theta bound if it exists.
- Otherwise, give the tightest-possible big-oh and big-omega bounds.