Algorithm Analysis I

Complete the Reading Quiz by noon before lecture.

Characterizing Runtime
Counting Steps
1. dup1
2. dup2

In the lecture on Stacks and Queues, we briefly reviewed big-O notation to compare the running time of different data structures. Much of this course relies on comparing data structures and their implementation details, so we need a more rigorous foundation for evaluating a program’s execution cost. Execution cost can be broken down into two categories.

Time complexity: How much time does it take for your program to execute?
Space complexity: How much memory does your program require?

In this course, we’ll mainly be determining the time complexity of different algorithms, which is also known as running time (or runtime) analysis.

Characterizing Runtime

Suppose we’re trying to determine if a sorted array contains duplicate values. Here are two ways to solve the problem.

dup1

Consider every pair, returning true if any match!

dup2

Take advantage of the sorted nature of our array.

We know that if there are duplicates, they must be next to each other.
Compare neighbors: return true the first time you see a match. If no items remain, return false.

dup1 appears to do far more redundant work than dup2. But how much more? Ideally, we want our characterization to be simple and mathematically rigorous while also clearly demonstrating the superiority of dup2 over dup1.

Counting Steps

One characterization of runtime is by counting steps, or the number of operations executed by a program.

Look at your code and the various operations that it uses (e.g. assignments, increments, comparisons).
Count the number of times each operation is performed.

dup1

Let’s count the number of steps executed as a result of calling dup1 on an array of size N = 10000.

public static boolean dup1(int[] A) {
  for (int i = 0; i < A.length; i += 1) {
    for (int j = i + 1; j < A.length; j += 1) {
      if (A[i] == A[j]) {
         return true;
      }
    }
  }
  return false;
}

How many times is the operation i = 0 executed?

Once. i = 0 is executed a single time, when the outer for loop starts.

The analysis gets more complicated due to the if statement. In the best case, the program could exit early if a duplicate is found near the beginning of the array. In the worst case, the program could continue until the return false statement at the end if the array does not contain any duplicates.

What are the fewest and the most times that the operation j = i + 1 can be executed?

1 to 10000 times.
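
The two extremes can be observed directly. The sketch below is a copy of dup1 with one added counter; the class name `Dup1InnerInit`, the method `countInnerInits`, and the two test arrays are hypothetical names introduced here for illustration, not part of the lecture code.

```java
public class Dup1InnerInit {
  /** Same loops as dup1, but returns how many times `j = i + 1` was executed. */
  public static long countInnerInits(int[] A) {
    long inits = 0;
    for (int i = 0; i < A.length; i += 1) {
      inits += 1; // the inner loop is about to execute `j = i + 1`
      for (int j = i + 1; j < A.length; j += 1) {
        if (A[i] == A[j]) {
          return inits; // dup1 would return true here
        }
      }
    }
    return inits; // dup1 would return false here
  }

  public static void main(String[] args) {
    int n = 10000;
    int[] best = new int[n];  // 1, 1, 2, 3, ...: a duplicate at the very front
    int[] worst = new int[n]; // 0, 1, 2, 3, ...: no duplicates at all
    for (int k = 0; k < n; k += 1) {
      best[k] = Math.max(k, 1);
      worst[k] = k;
    }
    System.out.println(countInnerInits(best));  // prints 1
    System.out.println(countInnerInits(worst)); // prints 10000
  }
}
```

In the best case the very first comparison matches, so the inner loop is set up only once. In the worst case the outer loop runs to completion, and the inner loop is set up once per value of i.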

This process gets tedious very quickly. Double check that the counts in the table below match what you expect.

Operation	Count N = 10000
less-than `<`	2 to 50,015,001
increment `+= 1`	0 to 50,005,000
equals to `==`	1 to 49,995,000
array accesses	2 to 99,990,000
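
One way to double check the upper ends of these ranges is to let the program do the counting. The following is a minimal sketch, assuming we are willing to rewrite each loop condition as an explicit test so that a counter can sit next to it; the class `Dup1Counter`, the method `countDup1`, and the counter fields are hypothetical names, and the loops are intended to visit the same pairs in the same order as dup1.

```java
public class Dup1Counter {
  // Tallies for the most recent call to countDup1.
  public static long lessThan, increments, equalsTo, arrayAccesses;

  /** dup1 with a counter bumped next to each operation it performs. */
  public static boolean countDup1(int[] A) {
    lessThan = increments = equalsTo = arrayAccesses = 0;
    for (int i = 0; ; i += 1, increments += 1) {
      lessThan += 1;                 // the test i < A.length
      if (!(i < A.length)) break;
      for (int j = i + 1; ; j += 1, increments += 1) {
        lessThan += 1;               // the test j < A.length
        if (!(j < A.length)) break;
        equalsTo += 1;               // the test A[i] == A[j]
        arrayAccesses += 2;          // reads of A[i] and A[j]
        if (A[i] == A[j]) return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int n = 10000;
    int[] A = new int[n];
    for (int k = 0; k < n; k += 1) {
      A[k] = k; // sorted with no duplicates, so dup1 never exits early
    }
    countDup1(A);
    System.out.println("less-than <:     " + lessThan);
    System.out.println("increment += 1:  " + increments);
    System.out.println("equals to ==:    " + equalsTo);
    System.out.println("array accesses:  " + arrayAccesses);
  }
}
```

Running it on an array with no duplicates should reproduce the right-hand end of each range in the table. Calling `countDup1` on an array such as {1, 1, 2} should give the left-hand ends instead.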

Not only is computing these counts tedious, but it doesn’t tell us about how the algorithm scales as N, the size of the array, increases. Rather than setting N = 10000, we can instead determine the count in terms of N.

Operation	Count N = 10000	Symbolic Count
`i = 0`	1	1
`j = i + 1`	1 to 10,000	1 to N
less-than `<`	2 to 50,015,001	2 to (N² + 3N + 2) / 2
increment `+= 1`	0 to 50,005,000	0 to (N² + N) / 2
equals to `==`	1 to 49,995,000	1 to (N² - N) / 2
array accesses	2 to 99,990,000	2 to N² - N
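
As a check on the symbolic column, here is one way to derive the worst-case count for less-than by hand. With no duplicates, the outer test i < A.length runs N + 1 times (N successes and one failure), and for each value of i the inner test j < A.length runs N - i times.

```latex
\begin{align*}
\text{less-than}
  &= \underbrace{(N + 1)}_{\text{outer tests}}
   + \underbrace{\sum_{i=0}^{N-1} (N - i)}_{\text{inner tests}} \\
  &= (N + 1) + \frac{N(N + 1)}{2} \\
  &= \frac{N^2 + 3N + 2}{2}
\end{align*}
```

Substituting N = 10000 gives 50,015,001, which agrees with the exact count in the table.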

dup2

Try to come up with rough estimates for the symbolic and exact counts for at least one of the operations for dup2, and check that the rest of the counts match what you expect.

Operation	Count N = 10000	Symbolic Count
`i = 0`	1	1
less-than `<`		
increment `+= 1`		
equals to `==`		
array accesses		

public static boolean dup2(int[] A) {
  for (int i = 0; i < A.length - 1; i += 1) {
    if (A[i] == A[i + 1]) {
      return true;
    }
  }
  return false;
}

Solution for dup2

Operation	Count N = 10000	Symbolic Count
`i = 0`	1	1
less-than `<`	1 to 10000	1 to N
increment `+= 1`	0 to 9999	0 to N - 1
equals to `==`	1 to 9999	1 to N - 1
array accesses	2 to 19998	2 to 2N - 2
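
The worst-case column for dup2 can be checked by letting the program count. This is a sketch under the assumption that the loop condition is rewritten as an explicit test so it can be tallied; `Dup2Counter`, `countDup2`, and the counter fields are hypothetical names.

```java
public class Dup2Counter {
  // Tallies for the most recent call to countDup2.
  public static long lessThan, increments, equalsTo, arrayAccesses;

  /** dup2 with a counter bumped next to each operation it performs. */
  public static boolean countDup2(int[] A) {
    lessThan = increments = equalsTo = arrayAccesses = 0;
    for (int i = 0; ; i += 1, increments += 1) {
      lessThan += 1;                  // the test i < A.length - 1
      if (!(i < A.length - 1)) break;
      equalsTo += 1;                  // the test A[i] == A[i + 1]
      arrayAccesses += 2;             // reads of A[i] and A[i + 1]
      if (A[i] == A[i + 1]) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    int n = 10000;
    int[] A = new int[n];
    for (int k = 0; k < n; k += 1) {
      A[k] = k; // sorted with no duplicates, so dup2 never exits early
    }
    countDup2(A);
    System.out.println("less-than <:     " + lessThan);      // prints 10000
    System.out.println("increment += 1:  " + increments);    // prints 9999
    System.out.println("equals to ==:    " + equalsTo);      // prints 9999
    System.out.println("array accesses:  " + arrayAccesses); // prints 19998
  }
}
```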

dup2 is better! But why?

An answer: It takes fewer operations to accomplish the same goal.
Better answer: Algorithm scales better in the worst case: (N² + 3N + 2) / 2 vs. N.
Even better answer: Parabolas grow faster than lines.

While the even better answer expresses the same idea as the better answer, it provides a more general geometric intuition. As the size of the array (N) grows, the parabolic N²-time algorithm will take much longer to execute than the linear N-time algorithm.
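
To put numbers on that intuition, the sketch below tabulates the two worst-case less-than counts from the tables above for increasing N. The class `GrowthTable` and its two helper methods are hypothetical names; they evaluate the symbolic counts and do not run dup1 or dup2.

```java
public class GrowthTable {
  /** Worst-case less-than count for dup1: (N² + 3N + 2) / 2. */
  public static long dup1WorstLessThan(long n) {
    return (n * n + 3 * n + 2) / 2;
  }

  /** Worst-case less-than count for dup2: N. */
  public static long dup2WorstLessThan(long n) {
    return n;
  }

  public static void main(String[] args) {
    System.out.printf("%12s %22s %12s%n", "N", "dup1 less-than", "dup2 less-than");
    for (long n = 10; n <= 1_000_000; n *= 10) {
      System.out.printf("%,12d %,22d %,12d%n",
          n, dup1WorstLessThan(n), dup2WorstLessThan(n));
    }
  }
}
```

Each time N grows by a factor of 10, the dup2 column grows by a factor of 10 while the dup1 column grows by roughly a factor of 100.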
