Algorithm Analysis I

Complete the Reading Quiz by noon before lecture.

Table of contents

  1. Characterizing Runtime
  2. Counting Steps
    1. dup1
    2. dup2

In the lecture on Stacks and Queues, we briefly reviewed big-O notation to compare and contrast the running time of different data structures. Much of this course relies on comparing and contrasting different data structures and their implementation details, which is why we need to develop a more rigorous foundation for evaluating a program’s execution cost. Execution cost can be broken down into two categories.

Time complexity
How much time does it take for your program to execute?
Space complexity
How much memory does your program require?

In this course, we’ll mainly be determining the time complexity of different algorithms, which is also known as running time (or runtime) analysis.

Characterizing Runtime

Suppose we’re trying to determine if a sorted array contains duplicate values. Here are two ways to solve the problem.

dup1
Consider every pair, returning true if any match!
dup2
Take advantage of the sorted nature of our array.
  • We know that if there are duplicates, they must be next to each other.
  • Compare neighbors: return true first time you see a match! If no more items, return false.

We can see that dup1 seems like it’s doing a lot more unnecessary, redundant work than dup2. But how much more work? Ideally, we want our characterization to be simple and mathematically rigorous while also clearly demonstrating the superiority of dup2 over dup1.

Counting Steps

One characterization of runtime is by counting steps, or the number of operations executed by a program.

  1. Look at your code and the various operations that it uses (e.g., assignments, increments, comparisons).
  2. Count the number of times each operation is performed.

dup1

Let’s count the number of steps executed as a result of calling dup1 on an array of size N = 10000.

public static boolean dup1(int[] A) {
  // Consider every pair (i, j) with j > i; return true on the first match.
  for (int i = 0; i < A.length; i += 1) {
    for (int j = i + 1; j < A.length; j += 1) {
      if (A[i] == A[j]) {
        return true;
      }
    }
  }
  return false;
}
How many times is the operation i = 0 executed?

i = 0 executes only once, at the very beginning of the nested for loops.

The analysis gets more complicated due to the if statement. In the best case, the program could exit early if a duplicate is found near the beginning of the array. In the worst case, the program could continue until the return false statement at the end if the array does not contain any duplicates.

What are the minimum and maximum number of times that the operation j = i + 1 is executed?

1 to 10000 times.

This process gets tedious very quickly. Double check that the counts in the table below match what you expect.

Operation        | Count, N = 10000
less-than <      | 2 to 50,015,001
increment += 1   | 0 to 50,005,000
equals to ==     | 1 to 49,995,000
array accesses   | 2 to 99,990,000

Not only is computing these counts tedious, but it also doesn't tell us how the algorithm scales as N, the size of the array, increases. Rather than setting N = 10000, we can instead determine the count in terms of N.

Operation        | Count, N = 10000  | Symbolic Count
i = 0            | 1                 | 1
j = i + 1        | 1 to 10,000       | 1 to N
less-than <      | 2 to 50,015,001   | 2 to (N² + 3N + 2) / 2
increment += 1   | 0 to 50,005,000   | 0 to (N² + N) / 2
equals to ==     | 1 to 49,995,000   | 1 to (N² - N) / 2
array accesses   | 2 to 99,990,000   | 2 to N² - N
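
Where do the quadratic terms come from? In the worst case (no duplicates), the inner loop body runs (N - 1) + (N - 2) + ... + 1 = (N² - N) / 2 times, and each of those iterations performs one == comparison and two array accesses. If you want to double-check the table empirically, one option is to instrument the nested loops with counters. The sketch below is our own harness, not part of the reading (the class name CountDup1 and the counter variables are ours); it simply replays dup1's loops on a worst-case input.

public class CountDup1 {
  public static void main(String[] args) {
    int N = 10000;
    int[] A = new int[N];
    for (int i = 0; i < N; i += 1) {
      A[i] = i;  // sorted with no duplicates: the worst case for dup1
    }

    long equalsCount = 0;
    long arrayAccesses = 0;
    for (int i = 0; i < A.length; i += 1) {
      for (int j = i + 1; j < A.length; j += 1) {
        equalsCount += 1;    // one == comparison per inner iteration
        arrayAccesses += 2;  // A[i] and A[j]
        if (A[i] == A[j]) {
          break;             // dup1 would return true here
        }
      }
    }

    // Expect 49,995,000 and 99,990,000 for N = 10000, matching the
    // (N^2 - N) / 2 and N^2 - N entries in the table above.
    System.out.println("== comparisons: " + equalsCount);
    System.out.println("array accesses: " + arrayAccesses);
  }
}

The printed counts should land exactly on the worst-case entries in the table.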

dup2

Try to come up with rough estimates for the symbolic and exact counts for at least one of the operations for dup2, and check that the rest of the counts match what you expect.

Operation        | Count, N = 10000  | Symbolic Count
i = 0            | 1                 | 1
less-than <      |                   |
increment += 1   |                   |
equals to ==     |                   |
array accesses   |                   |

public static boolean dup2(int[] A) {
  // The array is sorted, so any duplicates must be adjacent: compare neighbors.
  for (int i = 0; i < A.length - 1; i += 1) {
    if (A[i] == A[i + 1]) {
      return true;
    }
  }
  return false;
}
Solution for dup2

Operation        | Count, N = 10000  | Symbolic Count
i = 0            | 1                 | 1
less-than <      | 1 to 10,000       | 1 to N
increment += 1   | 0 to 9,999        | 0 to N - 1
equals to ==     | 1 to 9,999        | 1 to N - 1
array accesses   | 2 to 19,998       | 2 to 2N - 2

In the worst case (no duplicates), the loop body runs N - 1 times, so the == check executes N - 1 times with two array accesses each, and the final failing i < A.length - 1 check brings the less-than count to N.

dup2 is better! But why?

An answer
It takes fewer operations to accomplish the same goal.
Better answer
dup2 scales better in the worst case: (N² + 3N + 2) / 2 vs. N less-than comparisons.
Even better answer
Parabolas grow faster than lines.

While the even better answer expresses the same idea as the better answer, it provides a more general geometric intuition. As the size of the array (N) grows, the parabolic N²-time algorithm will take much longer to execute than the linear N-time algorithm.
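
To make the comparison concrete: for N = 10000, the worst case is roughly 50 million comparisons for dup1 versus about 10 thousand for dup2. A rough timing harness like the sketch below (our own code, not part of the reading; wall-clock timings are noisy, so treat the output as a trend rather than a precise measurement) makes the difference in growth visible as N doubles.

public class CompareDup {
  // dup1 and dup2 copied from the reading above.
  public static boolean dup1(int[] A) {
    for (int i = 0; i < A.length; i += 1) {
      for (int j = i + 1; j < A.length; j += 1) {
        if (A[i] == A[j]) {
          return true;
        }
      }
    }
    return false;
  }

  public static boolean dup2(int[] A) {
    for (int i = 0; i < A.length - 1; i += 1) {
      if (A[i] == A[i + 1]) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    for (int N = 1000; N <= 32000; N *= 2) {
      int[] A = new int[N];
      for (int i = 0; i < N; i += 1) {
        A[i] = i;  // sorted with no duplicates: worst case for both methods
      }

      long start = System.nanoTime();
      dup1(A);
      long dup1Nanos = System.nanoTime() - start;

      start = System.nanoTime();
      dup2(A);
      long dup2Nanos = System.nanoTime() - start;

      System.out.println("N = " + N + ": dup1 " + dup1Nanos / 1_000_000.0
          + " ms, dup2 " + dup2Nanos / 1_000_000.0 + " ms");
    }
  }
}

Each doubling of N should roughly quadruple dup1's time while only roughly doubling dup2's, which is the parabola-versus-line picture in numbers.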

