

## Issues With the Single Cycle Implementation

- Every instruction takes one full clock cycle (CPI = 1), even though some need much less time than others...
  - ADD uses Instruction Memory, Register File, ALU, Register File
  - LW uses Instruction Memory, Register File, ALU, Data Memory, Register File again...
- The cycle time of the machine has to be the cycle time of the "longest" instruction
- We are violating an important principle: Make the common case fast.
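
The ADD and LW paths listed above can be turned into a quick latency estimate. The sketch below assumes illustrative functional-unit delays (2 ns for each memory and the ALU, 1 ns for a register file access). These delays are not given on this slide, so treat `UNIT_DELAY_NS` as a placeholder to be replaced with real figures.

```python
# Assumed functional-unit delays in ns (illustrative; not from the slide).
UNIT_DELAY_NS = {
    "Instruction Memory": 2,
    "Register File": 1,
    "ALU": 2,
    "Data Memory": 2,
}

# Units each instruction passes through, in order (taken from the bullets above).
PATHS = {
    "ADD": ["Instruction Memory", "Register File", "ALU", "Register File"],
    "LW": ["Instruction Memory", "Register File", "ALU", "Data Memory", "Register File"],
}

def path_delay_ns(instr):
    """Sum the delays of every unit the instruction uses."""
    return sum(UNIT_DELAY_NS[unit] for unit in PATHS[instr])

delays = {instr: path_delay_ns(instr) for instr in PATHS}

# A single-cycle machine must clock at the slowest instruction's delay.
cycle_time_ns = max(delays.values())

for instr, d in delays.items():
    print(f"{instr}: needs {d} ns, idles {cycle_time_ns - d} ns of each cycle")
```

Under these assumed delays, ADD needs 6 ns but is charged the 8 ns that LW requires, which is the waste the last bullet refers to.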




5/10/2002

## Thought Experiment 3

- A given benchmark (say GCC) has this mix: 20% loads, 10% stores, 50% R-format, 20% branches
  - What's the single-cycle time?
  - What's the vari-cycle time?
  - What's the speedup?

- Suppose we add floating point instructions, and it takes 8 ns to do an FP add and 16 ns to do an FP mult.
  - New cycle time of single cycle machine?
  - Another mix: 25% loads, 15% stores, 30% R-format, 10% branches, 10% FP add, 10% FP mult
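
One way to work through the questions above is sketched here. The per-class latencies in `BASE_NS` are assumptions (the slide does not state them), as is the choice to add 4 ns of fetch, register read, and write-back time on top of the 8 ns and 16 ns FP figures. If those figures are meant as whole-instruction times, set the hypothetical `FP_OVERHEAD_NS` to 0.

```python
# Assumed per-class latencies in ns (illustrative; not stated on the slide).
BASE_NS = {"load": 8, "store": 7, "r_format": 6, "branch": 5}

# Assumption: an FP instruction still pays 4 ns for fetch, register read and
# write-back, and the FP unit time from the slide (8 or 16 ns) replaces the ALU.
FP_OVERHEAD_NS = 4
WITH_FP_NS = {**BASE_NS, "fp_add": FP_OVERHEAD_NS + 8, "fp_mult": FP_OVERHEAD_NS + 16}

# Instruction mixes from the slide, in percent.
MIX_1 = {"load": 20, "store": 10, "r_format": 50, "branch": 20}
MIX_2 = {"load": 25, "store": 15, "r_format": 30, "branch": 10,
         "fp_add": 10, "fp_mult": 10}

def single_cycle_ns(latency_ns):
    """Clock period is set by the slowest instruction the machine supports."""
    return max(latency_ns.values())

def vari_cycle_ns(latency_ns, mix_pct):
    """Average time per instruction if each took only as long as it needs."""
    assert sum(mix_pct.values()) == 100
    return sum(pct * latency_ns[cls] for cls, pct in mix_pct.items()) / 100

def speedup(latency_ns, mix_pct):
    """How much faster the vari-cycle machine is than the single-cycle one."""
    return single_cycle_ns(latency_ns) / vari_cycle_ns(latency_ns, mix_pct)

for name, latency, mix in [("no FP", BASE_NS, MIX_1), ("with FP", WITH_FP_NS, MIX_2)]:
    print(f"{name}: single={single_cycle_ns(latency)} ns, "
          f"vari={vari_cycle_ns(latency, mix):.2f} ns, "
          f"speedup={speedup(latency, mix):.2f}")
```

With these assumed numbers the first mix averages 6.3 ns against an 8 ns single cycle (speedup of about 1.27), while the FP mix averages 8.55 ns against a 20 ns single cycle (speedup of about 2.34). Different latency assumptions will change the figures but not the method.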


## Making it Better

- Of course, it's impractical to build a vari-cycle machine.
- What to do?
  - Multiple cycle implementation (section 5.4): Approximate the effect of a variable clock by letting instructions take different numbers of cycles to complete.
  - *Pipelining*: Observation: We're underutilizing functional units (e.g. the ALU is idle while we're accessing memory). Find a way to work on multiple instructions at the same time.
- CISCs pretty much require a multi-cycle implementation. Why?
- RISCs are amenable to pipelining...
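
The multiple cycle idea can be sketched numerically: pick a short clock and charge each instruction a whole number of cycles. The 2 ns clock and the per-class cycle counts in `CYCLES` below are assumptions for illustration (one cycle per functional-unit step), not values from these slides.

```python
# Assumed clock period and cycles per instruction class (illustrative only).
CLOCK_NS = 2
CYCLES = {"load": 5, "store": 4, "r_format": 4, "branch": 3}

# Instruction mix in percent (the first mix from Thought Experiment 3).
MIX = {"load": 20, "store": 10, "r_format": 50, "branch": 20}

def average_cpi(cycles, mix_pct):
    """Weighted average of cycles per instruction over the mix."""
    assert sum(mix_pct.values()) == 100
    return sum(pct * cycles[cls] for cls, pct in mix_pct.items()) / 100

def average_time_ns(cycles, mix_pct, clock_ns):
    """Average time per instruction = CPI x clock period."""
    return average_cpi(cycles, mix_pct) * clock_ns

print(f"CPI = {average_cpi(CYCLES, MIX):.2f}")
print(f"avg time = {average_time_ns(CYCLES, MIX, CLOCK_NS):.2f} ns")
```

Because every step is rounded up to a full cycle, the approximation is coarse. Under these particular assumptions the average works out to 4 cycles, or 8 ns per instruction, so the benefit depends heavily on how uneven the instruction latencies are.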
