CSE 378, Autumn 1997 | Assignment #5 | Due: Monday, Nov 17th, 1997
You'll hand in this assignment on paper. However, we request that you use your favorite word processor, as hand-written assignments are difficult to read. Please put the section name (AA or AB) on your assignment, along with your name.
1. Consider the following idea: we will modify the MIPS ISA to remove
the ability to specify an offset in memory access instructions.
Specifically, all load and store instructions with non-zero offsets
will become pseudo-instructions, each implemented using two
instructions. For example:
addi $at, $t1, 104 # add the offset to a temporary
lw $t0, $at # new way of doing lw $t0, 104($t1)
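The expansion in the example above can be sketched as a small assembler helper. This is only an illustration in Python: the function name `expand_mem_pseudo` is hypothetical, it handles only `lw` and `sw` with decimal offsets, and it assumes the offset fits in the 16-bit immediate of `addi`.

```python
import re

# Matches "lw $rt, offset($rs)" or "sw $rt, offset($rs)" with a decimal offset.
_MEM_RE = re.compile(r"^\s*(lw|sw)\s+(\$\w+)\s*,\s*(-?\d+)\((\$\w+)\)\s*$")

def expand_mem_pseudo(instr):
    """Expand an offset load/store into the offset-free form used in the example.

    Hypothetical helper: returns a list of one or two instruction strings.
    """
    m = _MEM_RE.match(instr)
    if m is None:
        raise ValueError(f"unsupported instruction: {instr!r}")
    op, rt, offset, base = m.group(1), m.group(2), int(m.group(3)), m.group(4)
    if offset == 0:
        # A zero offset needs no extra instruction.
        return [f"{op} {rt}, {base}"]
    if not -32768 <= offset <= 32767:
        # Assumption: larger offsets are out of scope for this sketch.
        raise ValueError("offset does not fit in addi's 16-bit immediate")
    # Add the offset into the assembler temporary, then access memory through it.
    return [f"addi $at, {base}, {offset}", f"{op} {rt}, $at"]
```

For instance, `expand_mem_pseudo("lw $t0, 104($t1)")` yields the two-instruction sequence shown in the example.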
a) What changes would you make to the single-cycle datapath and control
if this simplified architecture were to be used?
b) The operation times for the major functional units in an implementation
of the architecture are:
Describe how your change in part (a) of this question would
affect the cycle time.
2. Here is a series of memory address references from MIPS lw instructions: 4, 16, 32, 20, 80, 68, 76, 224, 36, 44, 16, 172, 20, 24, 36, 68. Assuming a 16-entry direct-mapped cache with one-word lines that is initially empty, label each reference as a hit or a miss, and show the final contents of the cache.
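As a reminder of how a direct-mapped cache locates a reference, the following Python sketch splits an address into tag, index, and word offset. It assumes the references are byte addresses and that a word is 4 bytes; the name `split_address` and its parameters are illustrative only.

```python
WORD_BYTES = 4  # assumption: byte addresses, 4-byte MIPS words

def split_address(byte_addr, num_lines, words_per_line=1):
    """Return (tag, index, word_offset) for a direct-mapped cache."""
    word_addr = byte_addr // WORD_BYTES          # which word in memory
    block_addr = word_addr // words_per_line     # which line-sized block
    word_offset = word_addr % words_per_line     # position of the word within its line
    index = block_addr % num_lines               # cache line the block maps to
    tag = block_addr // num_lines                # distinguishes blocks sharing a line
    return tag, index, word_offset
```

For example, with 16 one-word lines, `split_address(224, 16)` gives word address 56, which maps to index 8 with tag 3.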
3. Using the reference stream from question 2, show the hits and misses and final cache contents for a direct-mapped cache with four-word lines and a total size of 16 words.
4. Using the reference stream from question 2, show the hits and misses and final cache contents for a two-way set-associative cache with one-word lines and a total size of 16 words. Assume LRU replacement.
5. Using the reference stream from question 2, show the hits and misses and final cache contents for a fully associative cache with four-word lines and a total size of 16 words. Assume LRU replacement.
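One way to check hand-worked answers to questions 2 through 5 is a small simulator. The Python sketch below is not part of the assignment; it assumes byte addresses and 4-byte words, and the function name `simulate` and its parameters are hypothetical. It models a cache as `num_sets` sets of `ways` lines each, with LRU replacement inside each set.

```python
from collections import OrderedDict

def simulate(byte_addrs, num_sets, ways, words_per_line=1, word_bytes=4):
    """Return ('hit'/'miss' labels, final tags per set ordered LRU -> MRU)."""
    # Each set maps tag -> None; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    labels = []
    for addr in byte_addrs:
        block = (addr // word_bytes) // words_per_line
        index, tag = block % num_sets, block // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)         # mark as most recently used
            labels.append("hit")
        else:
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = None
            labels.append("miss")
    return labels, [list(s) for s in sets]
```

Under these assumptions, a direct-mapped cache is the case `ways=1`, and a fully associative cache is the case `num_sets=1`. For a total size of 16 words, `num_sets * ways * words_per_line` should equal 16.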
6. Cache C1 is direct-mapped with 16 one-word lines. Cache C2 is direct-mapped
with 4 four-word lines. Assume that the miss penalty for C1 is 8 clock
cycles and the miss penalty for C2 is 11 clock cycles.
a) Assuming that the caches are initially empty, find a reference stream
for which C2 has a lower miss rate but spends more cycles on cache misses
than C1. Clearly state your calculation of the miss rate and the total
memory stall cycles for each cache.
b) Show a series of references for which cache C2 has more misses than
cache C1. Label each reference as a hit or a miss.
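For part (a) of question 6, the two quantities to report can be computed from a list of hit/miss labels as sketched below. The helper name `miss_stats` is hypothetical, and the sketch assumes that the only memory stall cycles are the miss penalties, so that stall cycles equal the number of misses times the miss penalty.

```python
def miss_stats(labels, miss_penalty):
    """Return (miss rate, total memory stall cycles) for a labeled reference stream."""
    misses = labels.count("miss")
    miss_rate = misses / len(labels)       # fraction of references that miss
    stall_cycles = misses * miss_penalty   # each miss costs one full penalty
    return miss_rate, stall_cycles
```

With the penalties given above, C1 would use `miss_penalty=8` and C2 would use `miss_penalty=11`.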