Sim3mp

Sim3mp is a reimplementation of SimpleScalar in C++, written by Mark Oskin. Its level of detail is roughly that of sim-cache, but it models a bus-based multiprocessor system with a three-state cache coherency protocol. It does not simulate an actual out-of-order pipeline, and there is no virtual memory.

Each processor has L1 instruction and data caches and a unified L2 cache. The L1 caches do not have to snoop because the L2 caches do: if a processor is writing to a line in its L1 data cache, the corresponding line in its L2 cache will be in the exclusive state. This means that the caches must have the inclusion property, and the L1 caches must be write-through because they do not store coherency state. It may prove necessary to modify the L1 caches to store coherency state when changing the existing three-state protocol into a four-state protocol.

This is how to build sim3mp:
Configuration parameters

Fortunately, there are far fewer configuration options than in sim-outorder, and you can probably leave them all at their default values (except for the number of processors). Parameters are set using the format -Parameter:value. Run sim3mp by typing:

sim3mp parameter-strings program-to-simulate parameters-for-simulated-program
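
To make the -Parameter:value format concrete, the following C sketch splits one such string into its name and its value. It is not taken from sim3mp's source: the function name split_parameter and the caller-supplied buffers are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of sim3mp: splits "-Name:value" into
 * name and value. Returns 0 on success, -1 if the string is malformed
 * or either piece does not fit in its buffer. */
int split_parameter(const char *arg, char *name, size_t name_len,
                    char *value, size_t value_len)
{
    const char *colon;
    size_t n, v;

    if (arg == NULL || arg[0] != '-')
        return -1;                      /* must start with '-' */

    colon = strchr(arg + 1, ':');
    if (colon == NULL || colon == arg + 1)
        return -1;                      /* missing ':' or empty name */

    n = (size_t)(colon - (arg + 1));    /* length of the name */
    v = strlen(colon + 1);              /* length of the value */
    if (n >= name_len || v >= value_len)
        return -1;                      /* would overflow a buffer */

    memcpy(name, arg + 1, n);
    name[n] = '\0';
    memcpy(value, colon + 1, v + 1);    /* copies the terminating NUL too */
    return 0;
}
```

Given the string -Processors:4, this helper stores Processors as the name and 4 as the value.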
Modifying sim3mp

The first task you will have to perform is to add code to gather statistics on the number of cache line state transitions. To this end, some of the files you should modify are:
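
Independently of which files end up being modified, one way to gather state transition statistics is a small matrix of counters indexed by old and new state. The sketch below is not sim3mp code and is written in C for brevity. The state names (INVALID, SHARED, EXCLUSIVE) are an assumed reading of the three-state protocol, and record_transition is a hypothetical hook that would be called wherever the simulator changes a line's state.

```c
#include <stdio.h>

/* Assumed state names for a three-state protocol; sim3mp's own
 * identifiers may differ. */
enum line_state { INVALID = 0, SHARED = 1, EXCLUSIVE = 2, NUM_STATES = 3 };

/* transition_count[from][to] = number of times a line moved from -> to. */
static unsigned long transition_count[NUM_STATES][NUM_STATES];

/* Hypothetical hook: call this wherever a cache line's state is updated. */
void record_transition(enum line_state from, enum line_state to)
{
    if (from != to)                     /* a line staying put is not a transition */
        transition_count[from][to]++;
}

/* Print one line per possible transition between distinct states. */
void print_transitions(FILE *out)
{
    static const char *names[NUM_STATES] = { "I", "S", "E" };
    int from, to;

    for (from = 0; from < NUM_STATES; from++)
        for (to = 0; to < NUM_STATES; to++)
            if (from != to)
                fprintf(out, "%s -> %s: %lu\n", names[from], names[to],
                        transition_count[from][to]);
}
```

A single global matrix keeps the sketch short. If per-processor numbers are wanted, the same idea extends to one matrix per processor.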
Writing multiprocessor programs

Use C, not C++. All simulated processors run a copy of the same program; they share global variables but have separate stacks and local variables. Be sure to link in the pre-built MPLib.o library found in the examples subdirectory, which replaces some standard system calls with multiprocessor-compatible versions. The Makefile in that subdirectory shows how to do so.

synch.c is a trivial multiprocessor example program. Semaphores can be used for more complicated tasks than the simple lock scheme I've implemented. Note that every processor starts executing a copy of the program in a separate process, so there is no need to call fork() to initialize the processes on each processor. Download synch.c into the examples subdirectory. Concatenate add-to-makefile to examples/Makefile and build synch.ss with the following commands:
cd examples
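
The programming model described in this section (shared globals, private locals, a lock around shared updates) can be illustrated with a short C sketch. This is not synch.c and does not use the MPLib.o interface, which is not shown here. The lock is built from C11 atomics purely as a stand-in, so the sketch is meant for reading or for a host compiler and is not assumed to build with the simulator's toolchain. The proc_id and num_procs parameters are hypothetical ways of telling each copy of the program which processor it is.

```c
#include <stdatomic.h>

/* Globals: in the model described above, these are shared by all processors. */
static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static long shared_total = 0;

/* Stand-in spin lock built on test-and-set; not the MPLib.o lock. */
void acquire_lock(void)
{
    while (atomic_flag_test_and_set(&lock_flag)) {
        /* spin until the previous holder clears the flag */
    }
}

void release_lock(void)
{
    atomic_flag_clear(&lock_flag);
}

/* Each processor sums its own slice of data and adds it to the shared total.
 * proc_id and num_procs are hypothetical parameters for illustration. */
void add_partial_sum(int proc_id, int num_procs, const int *data, int n)
{
    long local_sum = 0;                 /* local: private to this processor */
    int i;

    for (i = proc_id; i < n; i += num_procs)
        local_sum += data[i];           /* no lock needed for private data */

    acquire_lock();
    shared_total += local_sum;          /* shared update, so hold the lock */
    release_lock();
}
```

Called once per processor with ids 0 through num_procs - 1, the calls together add every element of data to shared_total exactly once. Accumulating into a local variable first keeps the locked region down to a single addition.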
Benchmark programs

Unfortunately, there is something odd about the way multiprocessor programs are compiled that breaks the passing of command line options to the simulated binaries. You will have to modify the programs to hardcode the options before recompiling and simulating them with sim3mp.

Concatenate add-to-makefile-2 to examples/Makefile. Download sor.c into the examples subdirectory. When running sor.ss, be sure to pass the parameter -Processors: to sim3mp with a value matching the value of the variable NUM_PROCS.

Concatenate add-to-makefile-3 to examples/Makefile. Download quicksort.c and quicksort.h into the examples subdirectory. When running quicksort.ss, be sure to pass the parameter -Processors: to sim3mp with a value matching the value of the variable NUM_PROCS.
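
Hardcoding options, as described for the benchmark programs, could look roughly like the C sketch below. Only the name NUM_PROCS comes from the benchmarks; defining it as a macro, the value 4, the ARRAY_SIZE option, the options struct, and load_options are all hypothetical and stand in for whatever a given program would normally read from its command line.

```c
/* NUM_PROCS is named in the text above; the value is only an example and
 * must match the value passed to sim3mp with -Processors:. */
#define NUM_PROCS 4

/* Hypothetical option that a program might otherwise read from argv. */
#define ARRAY_SIZE 1024

/* Hypothetical container for a program's options. */
struct options {
    int num_procs;
    int array_size;
};

/* Fill in the options from compile-time constants instead of argc/argv. */
void load_options(struct options *opts)
{
    opts->num_procs = NUM_PROCS;
    opts->array_size = ARRAY_SIZE;
}
```

With this arrangement, changing an option means editing the constant and rebuilding the binary before simulating it again.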
Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350. (206) 543-1695 voice, (206) 543-2969 FAX [comments to douglas]