Cache-oblivious Algorithms for Dynamic Programming: are they necessary?

by
Jerry Pan

Dynamic programming is a widely used algorithmic technique for solving a variety of problems. Standard implementations of these algorithms often fail to exploit the temporal locality of data, which leads to poor cache performance for large problem sizes. Consequently, many cache-efficient algorithms based on the divide-and-conquer technique have been designed to outperform the standard algorithms. However, empirical timing studies on Intel Pentium 4/Xeon processors and various other modern processors, covering three algorithms that solve the all-pairs shortest paths problem (the standard Floyd-Warshall algorithm, the cache-oblivious Gaussian Elimination Paradigm algorithm, and the Transitive Closure algorithm), show that the standard algorithm outperforms all the cache-oblivious algorithms, despite the fact that cache simulations do not favor the standard algorithm. Similar results are obtained for Gaussian elimination without pivoting and for matrix multiplication. In contrast, the results for the Simple Dynamic Programming algorithms are reversed and consistent with previous studies. Further investigation reveals that the performance boost could be due to the hardware prefetchers present in modern processors. With this advanced hardware technology, certain algorithms such as Floyd-Warshall that once had poor cache performance can now be cache efficient, while other algorithms, such as the standard Simple Dynamic Programming algorithm, do not benefit from it.
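For readers unfamiliar with the standard algorithm discussed above, the following is a minimal sketch of Floyd-Warshall (not the authors' benchmarked implementation); its innermost loop streams through matrix rows sequentially, which is the kind of access pattern hardware prefetchers handle well:

```python
def floyd_warshall(dist):
    """All-pairs shortest paths, computed in place on an n x n distance matrix."""
    n = len(dist)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # The inner loop scans rows i and k of the matrix sequentially,
                # a streaming pattern that hardware prefetchers can anticipate.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

Cache-oblivious variants instead recurse on quadrants of the matrix to improve temporal locality, at the cost of a less regular access pattern.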

Advised by Richard Ladner

CSE 403
Tuesday
May 30, 2006
10:30 - 11:30 am