CSE 378 Homework #6 Stalling

HW6 Task 2 - Implement Stalling

Supplement to the hw6 main assignment.

Overview
The skeletal ICache and DCache implementations create output ports labelled Stall, which will be used to indicate cache misses. However, the base cache implementations never miss -- they simply (magically) perform the requested memory operation in less than a cycle time. Because they never miss, they assign the value 0 to the Stall ports each cycle.
Eventually your implementation will use the output ports to indicate that a cache miss has occurred, and that the data path should stall for that cycle. You can, and should, make and test the datapath changes required to get stalling working even before you start implementing real caches. You do this by taking the other simple approach -- always miss.
Change ICache.cpp and DCache.cpp so that every memory operation request results in a fixed number of stall cycles - two would be a reasonable value for testing (on the theory that as far as code bugs are concerned, the non-negative integers are 0, 1, and "other"). Once your cache component has stalled for that number of cycles, honor the memory request (by simply executing the original skeletal code).
Each stall cycle, the cache component should set its Stall output port to 1. When a memory request is finally honored (or when no memory operation is requested) the component should set its Stall output to 0. Add logic to your data path that prevents the current instruction from completing execution whenever either cache signals a stall (has Stall output of 1). Be careful that no part of the stalled instruction completes. It would be easy for a machine that allowed register writing during a stall cycle, say, to pass a simple test program, since the same value might be written into the same register once the stalling is over. However, while a machine like that might pass some tests, it is broken, and is unlikely to run the full benchmark suite correctly.
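One way to picture the always-miss behavior is as a small countdown that is re-armed after every honored request. The sketch below is a stand-alone model, not the skeleton's code: the names AlwaysMissCache, kStallCycles, tick, and honored are made up for illustration, and it assumes the stalled instruction keeps presenting the same request every cycle until it is honored.

```cpp
// Hypothetical stand-alone model of an "always miss" cache front end.
// None of these names come from the course skeleton.
constexpr int kStallCycles = 2;  // fixed miss penalty used for testing

class AlwaysMissCache {
public:
    // Called once per clock cycle. `request` is true when a memory
    // operation is requested this cycle. Returns the Stall output (0 or 1).
    int tick(bool request) {
        if (!request) {
            remaining_ = kStallCycles;  // no operation requested: never stall
            return 0;
        }
        if (remaining_ > 0) {
            --remaining_;               // still "missing": one more stall cycle
            return 1;
        }
        // The stall is over: this is where the original skeletal code would
        // perform the memory operation. Re-arm for the next request.
        ++honored_;
        remaining_ = kStallCycles;
        return 0;
    }

    // Number of requests honored so far (for testing only).
    int honored() const { return honored_; }

private:
    int remaining_ = kStallCycles;
    int honored_ = 0;
};
```

With kStallCycles set to 2, each request should produce Stall values of 1, 1, 0 on three consecutive cycles, with the operation performed only on the third.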
Having done this, run an application (not the OS, just an app) on your machine and compare the number of cycles required for it to finish with the number you obtain using the Cebollita simulator (or your HW5 machine). Your machine with stalls doesn't just take a factor of three more cycles; it's even slower than that. (Why?)
A Detail
It's a lot easier to get through this task by testing only a single cache stalling at a time -- if both stall, you have to worry about the complicated situation in which there may be no cycle where neither of them is stalling (so perhaps no progress is ever made).
Test by running with only one of the two caches stalling. When that works, test with the other. This isn't quite a complete test, but since all you're really doing is OR'ing these outputs (in most cases) you should be okay to move on.
How To Stall
When either cache asserts its Stall output, you want to make sure that the machine's state doesn't change:

disable writing to the PC
disable writing to the register file
disable writing within the exception container:

disable the input to Write Requested
disable the input to Exception
disable the input to IsRFE

disable writing to memory (through the DCache)
If W is the original write enable line, the easiest way to do this is to replace that input with:
not(ICache_stall || DCache_stall) && W
There is a small snag with this, though -- when disabling memory writes, you create a loop of combinational components (from DCache back to itself). This is easy enough to fix, though -- DCache doesn't need to be told that it is asserting Stall, as it already knows that, so don't feed that input back to it. Instead, use
not ICache_stall && W
(The DCache code will prevent it from writing if it itself needs to stall.)
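The two gating expressions above can be written as plain boolean functions, which is handy for checking the truth table before wiring anything up. This is only a sketch: the function names gatedWrite and gatedMemWrite are hypothetical, and your datapath may well express the same logic with gate components rather than code.

```cpp
// Hypothetical helper names; the skeleton's real signal names may differ.

// Gate for the PC, register file, and exception container inputs:
// the write goes through only when neither cache is stalling.
inline bool gatedWrite(bool w, bool icacheStall, bool dcacheStall) {
    return !(icacheStall || dcacheStall) && w;
}

// Gate for the write request sent to the DCache: DCache_stall is left out
// to avoid the combinational loop, since the DCache itself refuses to
// write while it is stalling.
inline bool gatedMemWrite(bool w, bool icacheStall) {
    return !icacheStall && w;
}
```

Note that gatedMemWrite can still be true while the DCache is stalling; that is intentional, since it is the DCache's job to ignore the request in that case.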