Retro school children University of Washington Computer Science & Engineering
 CSE 548: Computer Architecture - Winter 2006
  CSE Home   About Us    Search    Contact Info 

 Course Home
   

Class Project




When you've chosen a benchmark and are ready to start, read the Kahuna Tutorial. You will need the Kahuna Simulator and the Kahuna Workloads files to go through the tutorial.

Remember that there is a progress report due on Wednesday, February 1st at the beginning of class. Check out First Milestone Guidelines for a guide to what should be in the report.


Splash-2
The following applications are all part of the SPLASH-2 benchmark suite. They are good choices for small, simple programs that do a variety of tasks. These benchmarks vary in size, but none of them are particularly big. Each has at most 3 methods that take any significant time, so you can make big Amdahl's Law gains by hand-coding only a few routines.
I have compiled and run all of these applications on Linux. I modified some of the benchmarks to work since the originals were targeted toward Solaris. You will want to modify the Makefiles to add the "-pg" flag to the compiler and linker flags (CFLAGS and LDFLAGS) so that the application will output the gmon.out file for use with gprof.

 
Benchmark
Description
Source Code
Notes
Group
Barnes-Hut

Implements the Barnes-Hut method to simulate the interaction of a system of bodies (N-body problem). A general description of the Barnes-Hut method can be found HERE.

barnes.tar.gz README.barnes
Cholesky

Performs blocked Cholesky Factorization on a sparse matrix. The implementation contained in SPLASH-2 is described HERE.

cholesky.tar.gz README.cholesky
FMM

The FMM application implements a parallel adaptive Fast Multipole Method to simulate the interaction of a system of bodies (N-body problem). A description of this implementation can be found HERE.

fmm.tar.gz README.fmm
LU

Factors a dense matrix into the product of a lower triangular and an upper triangular matrix. The factorization uses blocking to exploit temporal locality on individual submatrix elements. The algorithm used in this implementation is described HERE.

lu.tar.gz README.lu Stef and Hoifung
Ocean

Simulates large-scale ocean movements based on eddy and boundary currents, and is an enhanced version of the SPLASH Ocean code. A description of the functionality of this code can be found HERE.

ocean.tar.gz README.ocean
Radiosity

Computes the equilibrium distribution of light in a scene using the hierarchical diffuse radiosity method. A description of the sequential hierarchical radiosity method can be found HERE.

radiosity.tar.gz README.radiosity Shobhit and Pravin
Radix

Implements an integer radix sort based on the method described HERE.

radix.tar.gz README.radix Nodira and Yaw
Raytrace

Renders a three-dimensional scene onto a two-dimensional image plane using optimized ray tracing. A hierarchical uniform grid is used to represent the scene for efficient access, and early ray termination and antialiasing are implemented. The best description of the algorithm can be found HERE.

raytrace.tar.gz README.raytrace Yael and Ravi
VolRend

Renders a three-dimensional volume onto a two-dimensional image plane using an optimized ray casting technique developed by Marc Levoy. A hierarchical octree data structure is used to represent the scene for efficient access, and early ray termination and antialiasing are implemented. The best description of the algorithm can be found HERE.

volrend.tar.gz README.volrend Jenny and Indri
Water

Solves the same molecular dynamics N-body problem as the original Water code in SPLASH (which is called WATER-NSQUARED in SPLASH-2), but uses a different algorithm. In particular, it imposes a 3-D spatial data structure on the cubical domain, resulting in a 3-D grid of boxes. Every box contains a linked list of the molecules currently in that box (in the current time-step). The advantage of the spatial grid is that a process that owns a box in the grid has to look at only its neighboring boxes for molecules that might be within the cutoff radius from a molecule in the box it owns. This makes the algorithm O(n) instead of O(n^2). For small problems (upto several hundred to a couple of thousand molecules) the overhead of the spatial data structure is not justified and WATER-NSQUARED might solve the problem faster. But for large systems this program is much better. That is why we provide both, since both small and large systems are interesting in this case.

water.tar.gz README.water-spatial
README.water-nsquared
Scott and Colin


Other Benchmarks
These are other programs that I have found that vary in size, but are still good candidates for the project.

 
Benchmark
Description
Source Code
Notes
Group
SpMV

Implements sparse matrix-vector multiplication.

SPMV.tar.gz This is a large benchmark, but it is one of the newest. There are a lot of people who would like to see this code run fast.
ASA

Adaptive Simulated Annealing.

ASA.tar.gz ASA-README.pdf



CSE logo Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
[comments to aputnam]