Algorithm Analysis II

Three strategies for modeling runtime: finding an exact count, extrapolating from examples, and geometric arguments.
Kevin Lin, with thanks to many others.
Ask questions anonymously on Piazza. Look for the pinned Lecture Questions thread.

Feedback from the Reading Quiz

Overall Asymptotic Runtime Bound for dup1
Give an overall asymptotic runtime bound for R as a combination of Θ, O, and/or Ω notation. Take into account both the best and the worst case runtimes (R_best and R_worst).



Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.

Example: boolean dup1(int[] A) considers every pair of items.
Best case: the array contains a duplicate at the front. Constant time, Θ(1).
Worst case: the array contains no duplicate items. Quadratic time, Θ(N^2).
Overall: Ω(1) and O(N^2).
?: When do we need to consider doing case analysis? What does a large value of N say about int[] A? What does N not say about int[] A?

Modeling Iteration

Worst Case Order of Growth: Exact Count of == Operations
int N = A.length; // N == 6
for (int i = 0; i < N; i += 1)
  for (int j = i + 1; j < N; j += 1)
    if (A[i] == A[j])
      return true;
return false;
[Figure: grid of i against j, each from 0 to 5, with an == marked in each of the 15 cells where j > i.]
“The worst case order of growth of the runtime for dup1 is N^2.”
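The exact count can be checked by instrumenting the loop. The following is a minimal sketch, assuming a hypothetical helper `countComparisons` (not part of the lecture code) that mirrors dup1 but returns the number of == operations instead of a boolean.

```java
public class Dup1Count {
    /** Mirrors dup1, but returns how many == comparisons were performed. */
    static int countComparisons(int[] A) {
        int N = A.length;
        int count = 0;
        for (int i = 0; i < N; i += 1) {
            for (int j = i + 1; j < N; j += 1) {
                count += 1;               // one == comparison happens here
                if (A[i] == A[j]) {
                    return count;         // dup1 would return true
                }
            }
        }
        return count;                     // dup1 would return false
    }

    public static void main(String[] args) {
        int[] noDuplicates = {0, 1, 2, 3, 4, 5};   // worst case, N == 6
        int N = noDuplicates.length;
        System.out.println(countComparisons(noDuplicates));  // 15
        System.out.println(N * (N - 1) / 2);                 // 15
    }
}
```

For N == 6 the worst case performs 5 + 4 + 3 + 2 + 1 = 15 comparisons, which matches N(N - 1)/2.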

Worst Case Order of Growth: Geometric Argument
int N = A.length; // N == 6
for (int i = 0; i < N; i += 1)
  for (int j = i + 1; j < N; j += 1)
    if (A[i] == A[j])
      return true;
return false;
[Figure: grid of i against j, each from 0 to 5, with an == marked in each of the 15 cells where j > i. The marked cells form a right triangle.]
Area of right triangle of side length N - 1.
Order of growth of area is N^2.
“The worst case order of growth of the runtime for dup1 is N^2.”

Print Party: Attempt 1
Find a simple f(N) such that the runtime R(N) ∈ Θ(f(N)).
Choices: 1, log N, N, N log N, N^2, Other
void printParty(int N) {
  for (int i = 1; i <= N; i *= 2) {
    for (int j = 0; j < i; j += 1) {
      System.out.println("hello");
    }
  }
}
Note that there’s only one case. No separate case analysis!




?: How do we know that there’s only one case to consider?


Print Party: Attempt 2
void printParty(int N) {
  for (int i = 1; i <= N; i *= 2)
    for (int j = 0; j < i; j += 1)
      System.out.println("hello");
}
Find a simple f(N) such that the runtime R(N) ∈ Θ(f(N)).

N:      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
C(N):   1   3   3   7   7   7   7  15  15  15  15  15  15  15  15  31  31  31
Let the cost model C(N) be the number of calls to println for a given N. This is our representative operation for figuring out the runtime.

?: For each N, predict C(N).
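Writing out examples by hand gets tedious, so the predictions can be checked mechanically. This is a minimal sketch, assuming a hypothetical helper `countPrintlns` that has the same loops as printParty but increments a counter instead of calling println.

```java
public class PrintPartyCount {
    /** Returns C(N): how many times printParty(N) would call println. */
    static int countPrintlns(int N) {
        int count = 0;
        for (int i = 1; i <= N; i *= 2) {
            for (int j = 0; j < i; j += 1) {
                count += 1;   // stands in for System.out.println("hello")
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int N = 1; N <= 18; N += 1) {
            System.out.println("N = " + N + ", C(N) = " + countPrintlns(N));
        }
    }
}
```

C(N) only changes when N reaches the next power of 2, which is why the values come in runs: 1, 3, 3, 7, 7, 7, 7, 15, and so on.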

printParty: Find a simple f(N) such that the runtime R(N) ∈ Θ(f(N)).

N      C(N)                          ½N       2N
1      1                             0.5      2
4      1 + 2 + 4 = 7                 2        8
7      1 + 2 + 4 = 7                 3.5      14
8      1 + 2 + 4 + 8 = 15            4        16
…
27     1 + 2 + 4 + 8 + 16 = 31       13.5     54
185    … + 64 + 128 = 255            92.5     370
715    … + 256 + 512 = 1023          357.5    1430
?: Describe the relationship between C(N), ½N, and 2N. How does this relationship relate to Big-Theta?
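One way to explore this question is to test the sandwich ½N ≤ C(N) ≤ 2N for many values of N. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes the closed form C(N) = 2Q - 1, where Q is the largest power of 2 that is at most N (the value of the sum 1 + 2 + 4 + … + Q).

```java
public class PrintPartyBounds {
    /** C(N) = 1 + 2 + 4 + ... + Q = 2Q - 1, where Q is the largest power of 2 <= N. */
    static long count(int N) {
        long Q = Integer.highestOneBit(N);
        return 2 * Q - 1;
    }

    /** Returns true if 0.5 * N <= C(N) <= 2 * N for every N from 1 to maxN. */
    static boolean sandwiched(int maxN) {
        for (int N = 1; N <= maxN; N += 1) {
            long c = count(N);
            if (2 * c < N || c > 2L * N) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(count(185));             // 255
        System.out.println(count(715));             // 1023
        System.out.println(sandwiched(1_000_000));  // true
    }
}
```

Being trapped between two constant multiples of N for all large N is the kind of evidence that points to a Θ(N) bound, though a finite check is not a proof.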

Repeat After Me…
There is no magic shortcut for these problems (except in a few well-behaved cases).
We’ll expect you to know these two summations since they’re common patterns.
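The two summations themselves did not survive in this text. Judging from the dup1 and printParty examples, they are presumably the sum of the first Q natural numbers and the sum of the powers of 2 up to Q; treat the pairing with this slide as an assumption. The identities are standard:

```latex
1 + 2 + 3 + \cdots + Q = \frac{Q(Q + 1)}{2} \in \Theta(Q^2)

1 + 2 + 4 + \cdots + Q = 2Q - 1 \in \Theta(Q) \quad \text{(for } Q \text{ a power of 2)}
```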



Strategies.
Find the exact count of steps.
Write out examples.
Use a geometric argument (visualizations!).
Numerical Linear Algebra (Lloyd N. Trefethen, David Bau, III/SIAM)
Real world programs are often messy and difficult to model.

?: What’s different between these two summations?




?: How did we apply these strategies to analyze printParty?

Analyzing Recursion

Informal Recursion Analysis
Find a simple f(N) such that the runtime R(N) ∈ Θ(f(N)).
Inspect the example and give the order of growth of the runtime as a function of N.
Choices: 1, log N, N, N^2, 2^N
public static int f3(int n) {
  if (n <= 1)
    return 1;
  return f3(n-1) + f3(n-1);
}
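[Figure: call tree for f3(4). The root is labeled 4, with two children labeled 3; each 3 has two children labeled 2, and each 2 has two children labeled 1.]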
?: What does each node represent in the tree on the right?






Recursion and Exact Counts
Find a simple f(N) such that the runtime R(N) ∈ Θ(f(N)).
Approach 2: Count number of calls to f3, given by C(N).



Give a simple, exact expression for C(N).
public static int f3(int n) {
  if (n <= 1)
    return 1;
  return f3(n-1) + f3(n-1);
}
[Figure: call tree for f3(4), with one call labeled 4, two labeled 3, four labeled 2, and eight labeled 1.]
?: What is the exact value of the last term in the sum for C(N)?





Recursion and Exact Counts: Solving for C(N)
As long as Q is a power of 2, then 1 + 2 + 4 + … + Q = 2Q - 1.
Since Q = 2^(N-1), C(N) = 2Q - 1 = 2(2^(N-1)) - 1 = 2^N - 1.

Recursion and Exact Counts
Find a simple f(N) such that the runtime R(N) ∈ Θ(f(N)).
Approach 2: Count number of calls to f3, given by C(N).
C(N) = 2^N - 1
R(N) ∈ Θ(2^N), since work for each call is constant.
?: How would R(N) change if the work for each call was N rather than constant?
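The call count can also be measured directly. This is a minimal sketch, assuming a hypothetical static counter `calls` and helper `countCalls` added around the lecture's f3; it compares the measured count against 2^N - 1.

```java
public class F3Calls {
    static long calls = 0;   // hypothetical counter, not part of the original f3

    static int f3(int n) {
        calls += 1;          // every call, including base cases, counts once
        if (n <= 1) {
            return 1;
        }
        return f3(n - 1) + f3(n - 1);
    }

    /** Returns C(N): the total number of calls made by f3(N). */
    static long countCalls(int N) {
        calls = 0;
        f3(N);
        return calls;
    }

    public static void main(String[] args) {
        for (int N = 1; N <= 10; N += 1) {
            long expected = (1L << N) - 1;   // 2^N - 1
            System.out.println("N = " + N + ": " + countCalls(N) + " calls, 2^N - 1 = " + expected);
        }
    }
}
```

For N = 4 this reports 15 calls, matching the 15 nodes in the call tree.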

Recursion and Recurrences
Find a simple f(N) such that the runtime R(N) ∈ Θ(f(N)).
Approach 3: Count number of calls to f3, given by a “recurrence relation” for C(N).
C(1) = 1
C(N) = 2C(N - 1) + 1
More mathematical and out of scope for CSE 373 this quarter.

Out-of-Scope Recurrence Solution
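The worked solution from this slide is not reproduced in this text. One standard way to solve it, assuming the recurrence C(1) = 1 and C(N) = 2C(N - 1) + 1 (one call for f3 itself plus two recursive calls on N - 1), is to unroll it until the base case:

```latex
\begin{aligned}
C(N) &= 2C(N-1) + 1 \\
     &= 2\bigl(2C(N-2) + 1\bigr) + 1 = 4C(N-2) + 2 + 1 \\
     &= 8C(N-3) + 4 + 2 + 1 \\
     &\;\;\vdots \\
     &= 2^{N-1}C(1) + 2^{N-2} + \cdots + 2 + 1 \\
     &= 2^{N-1} + \bigl(2^{N-1} - 1\bigr) \\
     &= 2^{N} - 1
\end{aligned}
```

This agrees with the exact count obtained by summing the layers of the call tree.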

Modeling Recursion: Tree Method

The Merge Operation
Given two sorted arrays, the merge operation combines them into a single sorted array by successively copying the smallest remaining item from the two arrays into a target array.
[Figure: merging the sorted arrays [2, 3, 6, 10, 11] and [4, 5, 7, 8] into the target array [2, 3, 4, 5, 6, 7, 8, 10, 11], one item at a time.]
?: What is a cost model that we can use to evaluate the runtime of the merge operation?

How does the runtime of merge grow with respect to N, the total number of items?
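A minimal sketch of the merge operation on int arrays, using the arrays from the figure. The class and method names are hypothetical; the loop body runs exactly once per item copied into the target, which suggests one possible cost model.

```java
import java.util.Arrays;

public class MergeDemo {
    /** Merges two sorted arrays into a new sorted array. */
    static int[] merge(int[] a, int[] b) {
        int[] target = new int[a.length + b.length];
        int i = 0;   // index of the smallest remaining item in a
        int j = 0;   // index of the smallest remaining item in b
        for (int k = 0; k < target.length; k += 1) {
            // Take from a if b is used up, or if a's next item is no larger than b's.
            if (j >= b.length || (i < a.length && a[i] <= b[j])) {
                target[k] = a[i];
                i += 1;
            } else {
                target[k] = b[j];
                j += 1;
            }
        }
        return target;
    }

    public static void main(String[] args) {
        int[] left = {2, 3, 6, 10, 11};
        int[] right = {4, 5, 7, 8};
        System.out.println(Arrays.toString(merge(left, right)));
        // [2, 3, 4, 5, 6, 7, 8, 10, 11]
    }
}
```

Each of the N items is copied into the target exactly once, so the loop runs N times.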

Single-Merge Selection Sort
Merging can give us an improvement over Θ(N^2) selection sort.
Selection sort the left half of array.
Selection sort the right half of array.
Merge the sorted halves.

For N = 64, the total runtime is ~2112 AU (arbitrary units).
Merge: ~64 AU
Sort: 2(~1024 AU)
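[Figure: plain selection sort (SS) on N=64 items costs ~4096 AU. Single-merge selection sort runs SS on two halves of N=32 items (~1024 AU each), then one merge (M) of ~64 AU.]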
The runtime of plain selection sort for N = 64 items is ~4096 AU.

?: How does that runtime compare to single-merge selection sort? Give a mathematical argument based on asymptotic analysis of the variable N.




?: How could we improve the runtime even further?

How does the runtime of single-merge selection sort grow with respect to N, the total number of items?

Two Merge Layers
For N = 64, the total runtime is ~1152 AU.
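Merge: ~64 AU + 2(~32 AU)
Sort: 4(~256 AU)

Sorting Algorithm    Runtime (AU)
Selection Sort       ~4096 AU
One Merge Layer      ~2112 AU
Two Merge Layers     ~1152 AU

[Figure: the N=64 array is split into two halves of N=32 and then four quarters of 16. Each quarter is selection sorted (SS, ~256 AU each), pairs of quarters are merged (M, ~32 AU each), and the two halves are merged (M, ~64 AU).]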
?: About how much does two merge layers improve runtime over one merge layer? Over standard selection sort?
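The AU figures can be reproduced with a small cost model. This sketch assumes, as the slides' numbers suggest, that selection sort on N items costs ~N^2 AU and merging N items costs ~N AU; the method `cost` and its parameters are hypothetical.

```java
public class MergeLayersCost {
    /**
     * Approximate cost in AU of sorting N items using the given number of
     * merge layers on top of selection sort.
     * Assumes selection sort costs N^2 AU and a merge of N items costs N AU.
     */
    static long cost(long N, int layers) {
        if (layers == 0) {
            return N * N;                          // plain selection sort
        }
        return 2 * cost(N / 2, layers - 1) + N;    // sort both halves, then merge
    }

    public static void main(String[] args) {
        System.out.println(cost(64, 0));   // 4096
        System.out.println(cost(64, 1));   // 2112
        System.out.println(cost(64, 2));   // 1152
    }
}
```

Each added merge layer roughly halves the selection sort work while adding only ~N AU of merging.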

Merge Sort
Merge sort algorithm merges every layer.
If array is of size 1, return.
Merge sort the left half.
Merge sort the right half.
Merge the two sorted halves.
For N = 64, the total runtime is ~384 AU.
Top layer: ~64 AU
Second layer: 2(~32 AU) = ~64 AU
Third layer: 4(~16 AU) = ~64 AU
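ith layer: 2^(i-1) (~64 AU / 2^(i-1)) = ~64 AU
[Figure: merge sort call tree for N=64. The top merge (M) costs ~64 AU, the two merges below it cost ~32 AU each, the next four cost ~16 AU each, the next eight cost ~8 AU each, and so on.]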
?: How does the call tree for merge sort differ from the example we saw in f3?




?: How do these differences affect our runtime analysis?

How does the runtime of merge sort grow with respect to N, the total number of items?
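A minimal sketch of merge sort following the four steps above, with a hypothetical counter `itemsMerged` that records how many items the merge steps copy in total. The cost of splitting the array with `Arrays.copyOfRange` is deliberately left out of the count, which is an assumption of this cost model.

```java
import java.util.Arrays;

public class MergeSortDemo {
    static long itemsMerged = 0;   // hypothetical counter: items copied by merge

    static int[] mergeSort(int[] items) {
        if (items.length <= 1) {
            return items;                                   // size 1: already sorted
        }
        int mid = items.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(items, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(items, mid, items.length));
        return merge(left, right);
    }

    static int[] merge(int[] a, int[] b) {
        int[] target = new int[a.length + b.length];
        int i = 0;
        int j = 0;
        for (int k = 0; k < target.length; k += 1) {
            if (j >= b.length || (i < a.length && a[i] <= b[j])) {
                target[k] = a[i];
                i += 1;
            } else {
                target[k] = b[j];
                j += 1;
            }
            itemsMerged += 1;
        }
        return target;
    }

    public static void main(String[] args) {
        int[] items = new int[64];
        for (int k = 0; k < items.length; k += 1) {
            items[k] = 64 - k;                              // 64, 63, ..., 1
        }
        itemsMerged = 0;
        int[] sorted = mergeSort(items);
        System.out.println(sorted[0] + " ... " + sorted[63]);   // 1 ... 64
        System.out.println(itemsMerged);                        // 384
    }
}
```

For N = 64 there are 6 merge layers of 64 items each, so the counter ends at 6 × 64 = 384, matching the ~384 AU total.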
Linear vs. Linearithmic (N log N) vs. Quadratic Runtimes
Algorithm Design (Jon Kleinberg, Éva Tardos/Pearson Education)
?: How large of a difference is there between N and N log N? Between N log N and N^2?

Counting Calls vs. Work-per-Layer
f3. C(N): Count number of calls.
R(N) ∈ Θ(2^N), since work for each call is constant, Θ(1).
[Figure: call tree for f3(4), annotated with its number of layers.]
Merge sort. C(N): Count array accesses.
Work for each call is not the same. However, the work per layer is the same.
[Figure: merge sort call tree for N=64, with merges (M) of ~64, ~32, and ~16 AU, annotated with its number of layers.]
?: What is the difference in terminology between “the same” and “constant”?




?: When should we count calls? When should we analyze the work-per-layer?

Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.

Example: boolean dup1(int[] A) considers every pair of items.
Best case: the array contains a duplicate at the front. Constant time, Θ(1).
Worst case: the array contains no duplicate items. Quadratic time, Θ(N^2).
Overall: Ω(1) and O(N^2).
Today, we mostly focused on breaking down the modeling process with three additional strategies.

?: What were the three strategies we studied?

Summary
Theoretical analysis of algorithm performance requires careful thought.
There are no magic shortcuts for analyzing code. Use strategies!
Find the exact count of steps.
Write out examples.
Use a geometric argument (visualizations!).
Many runtime problems you’ll do in this class resemble one of the three problems from today. See textbook, study guide, and discussion for more practice.

Going from N^2 to N log N is an enormous difference.
This topic has one of the highest skill ceilings of all topics in the course.

In a software development job, the key use of runtime analysis is to be able to (1) evaluate trade-offs in time and space between different algorithms and (2) identify performance bugs by comparing the theoretical runtime analysis vs. the real-world running time.