Concerning Caches

Caches in the MIPS Pipeline

Memory is accessed through the caches in two places in the MIPS pipeline, at the instruction fetch and the memory stages.

Fetch (IF) ___ Instruction Decode (ID) ___ Execute ___ Memory ___ Write Back
    |                                                    |
L1 I-cache                                           L1 D-cache
    |                                                    |
                       L2 Unified Cache

Cache CPI Contributions

Ignore the L2 cache for this problem. Suppose our D-cache miss rate is 0.05 and our I-cache miss rate is 0.01. The cache miss penalty is 10 cycles. 20% of our instructions are loads or stores. CPI_base of the pipelined machine is 1 (this makes the math easy; it is not realistic).

CPI_real = CPI_base + CPI_I-cache miss + CPI_D-cache miss

CPI_I-cache miss = miss rate x penalty = 0.01 x 10 = 0.1

CPI_D-cache miss = miss rate x penalty x load/store frequency = 0.05 x 10 x 0.20 = 0.1

CPI_real = 1 + 0.1 + 0.1 = 1.2
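
The same arithmetic can be written as a short Python sketch. The function name effective_cpi and its parameter names are made up for illustration; the model is just the one above (a single miss penalty, no L2, every instruction fetched through the I-cache, only loads and stores touching the D-cache).

```python
def effective_cpi(cpi_base, icache_miss_rate, dcache_miss_rate,
                  miss_penalty, mem_op_fraction):
    """Return (cpi_real, icache_stall_cpi, dcache_stall_cpi)."""
    # Every instruction is fetched, so every instruction can miss in the I-cache.
    icache_cpi = icache_miss_rate * miss_penalty
    # Only loads and stores access the D-cache, so scale by their frequency.
    dcache_cpi = dcache_miss_rate * miss_penalty * mem_op_fraction
    return cpi_base + icache_cpi + dcache_cpi, icache_cpi, dcache_cpi


# Numbers from this problem: CPI base 1, I-miss 0.01, D-miss 0.05,
# penalty 10 cycles, 20% loads/stores.
cpi_real, i_cpi, d_cpi = effective_cpi(1.0, 0.01, 0.05, 10, 0.20)
print(round(cpi_real, 2), round(i_cpi, 2), round(d_cpi, 2))  # 1.2 0.1 0.1
```

Changing one argument at a time shows which term dominates for a given workload.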

Notice that although the miss penalty is high (10 cycles), the caches hide most of the memory latency: the CPI rises by only 0.2, from 1 to 1.2.

Addressing Caches


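 31            16 15             6 5            0
+----------------+----------------+--------------+
|      tag       |     index      | displacement |
+----------------+----------------+--------------+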

Suppose we have 64-byte blocks. We need six bits to select a byte within a cache block (2^6 = 64). This is the displacement. We can further divide it into a block offset (four bits, because our blocks are 16 words wide) and a byte offset (two bits, because our words are four bytes wide).

If our cache size is 64 kbytes, then there are 64 kbytes / 64 bytes = 1024 blocks in total. That means 10 bits are needed to select a block in the cache (2^10 = 1024). This is the index.

The remaining 16 bits of the address form the tag.
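
As a rough sketch, the field extraction for this cache can be done with shifts and masks. The constant names, the function split_address, and the sample address 0x1234ABCD are chosen only for illustration; the field widths are the ones worked out above for a 32-bit address.

```python
BYTE_OFFSET_BITS = 2     # 4-byte words
BLOCK_OFFSET_BITS = 4    # 16 words per block
INDEX_BITS = 10          # 1024 blocks
DISPLACEMENT_BITS = BYTE_OFFSET_BITS + BLOCK_OFFSET_BITS  # 6 bits total


def split_address(addr):
    """Split a 32-bit address into (tag, index, block_offset, byte_offset)."""
    byte_offset = addr & ((1 << BYTE_OFFSET_BITS) - 1)
    block_offset = (addr >> BYTE_OFFSET_BITS) & ((1 << BLOCK_OFFSET_BITS) - 1)
    index = (addr >> DISPLACEMENT_BITS) & ((1 << INDEX_BITS) - 1)
    # Whatever is left above the index is the tag (16 bits here).
    tag = addr >> (DISPLACEMENT_BITS + INDEX_BITS)
    return tag, index, block_offset, byte_offset


tag, index, block_offset, byte_offset = split_address(0x1234ABCD)
print(hex(tag), hex(index), block_offset, byte_offset)  # 0x1234 0x2af 3 1
```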

The cache capacity counts only the actual data, not the control information such as valid bits, tag bits and dirty bits, which are stored with each cache block. You can think of this as asking what the maximum occupancy of a hotel is: you would not count the elevators or hallways as places for guests to stay.
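
To get a feel for how much storage the control information takes, here is a small sketch. It assumes one valid bit and one dirty bit per block (the handout does not say how many there are) and uses the 16-bit tag of the direct-mapped cache above; cache_storage_bits is a made-up helper name.

```python
def cache_storage_bits(num_blocks, block_bytes, tag_bits,
                       valid_bits=1, dirty_bits=1):
    """Return (data_bits, control_bits) for the whole cache."""
    # Data bits are what the quoted "capacity" counts.
    data_bits = num_blocks * block_bytes * 8
    # Control bits are kept per block (assumed: 1 valid + 1 dirty + tag).
    control_bits = num_blocks * (tag_bits + valid_bits + dirty_bits)
    return data_bits, control_bits


data_bits, control_bits = cache_storage_bits(1024, 64, 16)
print(data_bits // 8, "data bytes")        # 65536 data bytes (the 64 kbytes)
print(control_bits // 8, "control bytes")  # 2304 control bytes
```

Under these assumptions the control information adds about 3.5% on top of the 64 kbytes of data.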


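 31            14 13             6 5            0
+----------------+----------------+--------------+
|      tag       |   set index    | displacement |
+----------------+----------------+--------------+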

Keeping the rest of the cache parameters the same, we now make the cache 4-way set associative. There are still 1024 blocks but only 256 sets since each set has four blocks. Thus we only need eight bits to select a set.

Note that the tag is now longer: 18 bits instead of 16, because the two bits no longer needed for the index become part of the tag.
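
The effect of associativity on the field widths can be sketched as a small calculation. The helper field_widths is hypothetical, and it assumes a 32-bit address and that the cache size, block size and number of ways are all powers of two.

```python
def field_widths(cache_bytes, block_bytes, ways, addr_bits=32):
    """Return (tag_bits, index_bits, displacement_bits)."""
    num_blocks = cache_bytes // block_bytes
    num_sets = num_blocks // ways
    # For a power of two n, (n - 1).bit_length() equals log2(n).
    displacement_bits = (block_bytes - 1).bit_length()
    index_bits = (num_sets - 1).bit_length()
    tag_bits = addr_bits - index_bits - displacement_bits
    return tag_bits, index_bits, displacement_bits


print(field_widths(64 * 1024, 64, ways=1))  # (16, 10, 6)  direct mapped
print(field_widths(64 * 1024, 64, ways=4))  # (18, 8, 6)   4-way set associative
```

Each doubling of the associativity halves the number of sets, which moves one bit from the index to the tag.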

Fully Associative

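 31                              6 5            0
+---------------------------------+--------------+
|               tag               | displacement |
+---------------------------------+--------------+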

There are no index bits, just a wide 26-bit tag.

CSE 378 Spring 2002 - Section 9