Question: how fast is this method?
    public double sum(double[] data) {
        double result = 0.0;
        for (int k = 0; k < data.length; k++) {
            result = result + data[k];
        }
        return result;
    }
What is the question asking? How many microseconds? Compared to what?
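One tempting answer is to just time it. Here is a minimal timing sketch (the class name, array size, and fill value are ours); the number it prints varies with the machine, the JVM, JIT warmup, and whatever else is running, which is exactly why a raw microsecond count is a poor answer:

    // Naive timing of the summation loop. The measurement is real but
    // machine- and run-dependent, so it can't characterize the algorithm.
    public class TimeSum {
        public static void main(String[] args) {
            double[] data = new double[1_000_000];
            java.util.Arrays.fill(data, 1.0);

            long start = System.nanoTime();
            double result = 0.0;
            for (int k = 0; k < data.length; k++) {
                result = result + data[k];
            }
            long elapsed = System.nanoTime() - start;

            System.out.println("sum = " + result + ", elapsed = "
                    + (elapsed / 1000) + " microseconds");
        }
    }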
Goal: We'd like some way to compare the performance of two algorithms that do the same task, or of two data structures that implement the same abstraction using different algorithms. We want the comparison to be independent of any particular implementation or machine. It turns out that the same ideas and techniques work for comparing execution time, space, and other resource usage; we'll focus on execution time, since that's usually what we care about most for the kinds of problems we've been looking at.
To analyze an algorithm, we need to do the following:
- Decide what the problem size n is (for sum above, the length of the array).
- Count the number of steps the algorithm takes on a problem of size n, as a function of n.
What's a step? The idea is that we want to think abstractly about the elementary operations that a simple computing machine can perform. As a first approximation, a step is a simple operation or statement in a programming language like Java. Examples:
- evaluating a constant or a variable
- an arithmetic operation, a comparison, or an array access
- an assignment to a variable or an array element
- returning from a method
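For instance, here is how we might tally a few Java statements (the exact tallies are a judgment call; all that matters is that each simple statement costs some small constant):

    double[] data = {1.0, 2.0, 3.0};  // setup for the examples below
    int k = 1;
    int x = 17;                       // 1 step: an assignment
    x = x + 1;                        // 2 steps: an addition, then an assignment
    double d = data[k];               // 2 steps: an array access, then an assignment
    boolean b = (x < 100);            // 2 steps: a comparison, then an assignment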
The costs of more complex operations are typically the sum of the costs of their components. For example:
- the cost of a sequence of statements is the sum of the costs of the individual statements
- the cost of an if statement is the cost of the test plus the cost of whichever branch is taken (the more expensive branch, if we want a worst-case bound)
- the cost of a loop is the sum, over all iterations, of the cost of the test plus the cost of the body (plus one final test that fails)
- the cost of a method call is the cost of the method's body, plus a constant for the call itself
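Applying these rules to the sum method from the top of this section, charging one step per simple operation (your tallies might differ from ours by small constants, which turns out not to matter):

    public double sum(double[] data) {
        double result = 0.0;             // 1 step, executed once
        for (int k = 0;                  // 1 step, executed once
             k < data.length;            // 1 step, executed n+1 times
             k++) {                      // 1 step, executed n times
            result = result + data[k];   // 3 steps, executed n times
        }
        return result;                   // 1 step, executed once
    }
    // Total: 1 + 1 + (n+1) + n + 3n + 1 = 5n + 4 steps for an array of length n.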
Remember that all of these costs should be measured relative to the problem size. Some of the costs of an algorithm, even fairly large ones, won't depend on the problem size; others will.
Once we've done this analysis, we will wind up with a formula that says, for instance, it takes 25n + 3n² + 17 steps to solve a problem of size n. Another algorithm might take 42n + 6(n log n) + 300 steps to solve the same problem. Now, which is better?
Certainly for small values of n, the first algorithm requires fewer steps. But in general we're interested in how an algorithm behaves for large problems - after all, small problems can almost always be solved so quickly that it really doesn't matter. So what we're interested in is finding the asymptotic complexity of an algorithm - its cost as the problem size gets large. For this sort of analysis, only the high-order terms really matter.
Rule of thumb: to compare the asymptotic complexity of two algorithms, drop all but the high-order terms and ignore the constants. So, for the examples above, what matters about 25n + 3n² + 17 is that it's proportional to n², and 42n + 6(n log n) + 300 is proportional to n log n. For large values of n, then, the second one is faster.
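A quick way to see this is to tabulate both step counts (a throwaway sketch; we've taken log to mean log base 2, and the exact crossover point, around n = 20 here, matters much less than the trend):

    // Evaluate both hypothetical step counts at growing problem sizes.
    public class CompareCosts {
        public static void main(String[] args) {
            for (int n = 1; n <= 1_000_000; n *= 10) {
                double log2n = Math.log(n) / Math.log(2);
                double first = 25.0 * n + 3.0 * n * n + 17;       // 25n + 3n^2 + 17
                double second = 42.0 * n + 6.0 * n * log2n + 300; // 42n + 6(n log n) + 300
                System.out.printf("n = %8d   first = %16.0f   second = %14.0f%n",
                        n, first, second);
            }
        }
    }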
There is a standard notation in computer science that captures this idea: Big-O notation. Definition: if f(n) and g(n) are two complexity functions, we say that f(n) = O(g(n)) [pronounced "f(n) is order g(n)"] if there is some positive constant c such that f(n) ≤ c g(n) for all sufficiently large n.
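Before trying the exercises below, it may help to see the shape of such a proof once, for a function that isn't one of them (the choice of f here is ours):

    % Claim: 7n + 2 = O(n).
    % We must exhibit a constant c with 7n + 2 <= c*n for all sufficiently large n.
    \[
      7n + 2 \;\le\; 7n + 2n \;=\; 9n \qquad \text{for all } n \ge 1,
    \]
    % so c = 9 works, with "sufficiently large" meaning n >= 1.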
Exercise: give an informal proof that 5n+3 is O(n)
Exercise: give an informal proof that 5n² + 42n + 17 is O(n²)
Fine print:
- The "=" in f(n) = O(g(n)) is a one-way, historical abuse of notation: O(g(n)) is really a set of functions, and f(n) ∈ O(g(n)) would be more accurate. Don't read the equation right to left.
- Big-O gives only an upper bound. It's technically true (though unhelpful) that 5n + 3 is O(n²), or even O(2ⁿ); we normally state the tightest bound we can prove.
- "For all sufficiently large n" means there is some threshold n₀ such that the inequality holds for every n ≥ n₀.
Complexity classes. There are several common, basic complexity classes. You should know these in order, and be able to draw graphs of them:
- O(1): constant
- O(log n): logarithmic
- O(n): linear
- O(n log n): "n log n"
- O(n²): quadratic
- O(n³): cubic
- O(2ⁿ): exponential
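To get a feel for how differently these grow, here is a quick sketch tabulating each class at a few sizes (the class name and the particular sizes are ours; 2ⁿ is printed in scientific notation because it dwarfs the others almost immediately):

    // Tabulate each complexity class at a few problem sizes.
    public class GrowthTable {
        public static void main(String[] args) {
            System.out.printf("%6s %8s %10s %10s %12s %12s%n",
                    "n", "log n", "n log n", "n^2", "n^3", "2^n");
            for (int n : new int[] {10, 20, 40, 80}) {
                double log2n = Math.log(n) / Math.log(2);
                System.out.printf("%6d %8.1f %10.1f %10d %12d %12.3e%n",
                        n, log2n, n * log2n, (long) n * n, (long) n * n * n,
                        Math.pow(2, n));
            }
        }
    }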
Running times that are O(nᵏ) for some fixed k, or better, are called polynomial time. Algorithms that run in polynomial time are generally considered to be feasible; algorithms that require exponential time are not. To see why, compare n³ with 2ⁿ at n = 50: n³ is 125,000 steps, while 2ⁿ is about 10¹⁵, roughly two weeks of work even at a billion steps per second.