





Show the dependencies that exist in the following code:

    lw   $t0, 8($fp)
    addi $at, $0, 2
    sllv $t0, $t0, $at
    lw   $t1, 68($fp)
    add  $t1, $t1, $t0
    lw   $t0, 72($fp)
    sw   $t0, 0($t1)

- 3 kinds of dependencies: read-after-write (data), write-after-read (anti), write-after-write (output)
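One way to check an answer to this exercise is to let a short script enumerate the register dependencies. The sketch below is only an illustration: the `PROGRAM` encoding and the `find_dependencies` helper are hypothetical names, the read/write sets were filled in by hand from the seven instructions, and only dependencies through registers are considered (memory is ignored). A pair is reported only when no instruction in between overwrites the register.

```python
# Each entry: (text, registers written, registers read).
# sw stores to memory, so it writes no register.
PROGRAM = [
    ("lw   $t0, 8($fp)",   ["$t0"], ["$fp"]),
    ("addi $at, $0, 2",    ["$at"], ["$0"]),
    ("sllv $t0, $t0, $at", ["$t0"], ["$t0", "$at"]),
    ("lw   $t1, 68($fp)",  ["$t1"], ["$fp"]),
    ("add  $t1, $t1, $t0", ["$t1"], ["$t1", "$t0"]),
    ("lw   $t0, 72($fp)",  ["$t0"], ["$fp"]),
    ("sw   $t0, 0($t1)",   [],      ["$t0", "$t1"]),
]


def find_dependencies(program):
    """Return (kind, earlier, later, register) tuples; instructions are numbered from 1."""
    deps = []
    for j, (_, writes_j, reads_j) in enumerate(program):
        for i in range(j):
            _, writes_i, reads_i = program[i]
            # Registers overwritten strictly between instruction i and instruction j.
            between = {reg for _, w, _ in program[i + 1:j] for reg in w}
            for reg in writes_i:
                if reg in between:
                    continue
                if reg in reads_j:
                    deps.append(("RAW", i + 1, j + 1, reg))  # data dependency
                if reg in writes_j:
                    deps.append(("WAW", i + 1, j + 1, reg))  # output dependency
            for reg in reads_i:
                if reg in writes_j and reg not in between:
                    deps.append(("WAR", i + 1, j + 1, reg))  # anti dependency
    return deps


for kind, earlier, later, reg in sorted(find_dependencies(PROGRAM)):
    print(f"{kind}: instruction {later} depends on instruction {earlier} through {reg}")
```

The anti and output dependencies all involve the reuse of `$t0` and `$t1`, which is why they can be removed by renaming registers, while the read-after-write dependencies cannot.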





## What do the units do?

- Instruction queue: holds a pile of to-be-executed instructions. These may come from different paths on a branch.
- *Dispatch unit*: tries to find the best set of instructions to send to the functional units. It may also rename registers.
- Functional units: integer ALU, FP ALU, branch, load/store
- Branch predictor: maintains branch history
- Instructions wait at *reservation stations* until their operands are ready
- The *commit unit* makes sure instructions change the register file in an orderly manner (or maybe not at all if they were on the wrong side of a branch).
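The commit unit's job can be illustrated with a toy model. This is a rough sketch under simplifying assumptions, not a description of any real processor: the `CommitUnit` class and its method names are hypothetical, each instruction writes at most one register, and a mispredicted branch simply marks every younger instruction as squashed.

```python
from collections import deque


class CommitUnit:
    """Toy in-order commit buffer (hypothetical model for illustration)."""

    def __init__(self):
        self.buffer = deque()  # entries in program order, oldest first
        self.registers = {}    # architectural register file

    def issue(self, tag, dest):
        """Add an instruction in program order; dest is None if it writes no register."""
        entry = {"tag": tag, "dest": dest, "value": None, "done": False, "squashed": False}
        self.buffer.append(entry)
        return entry

    def complete(self, entry, value):
        """A functional unit finished this instruction (possibly out of order)."""
        entry["value"] = value
        entry["done"] = True

    def squash_after(self, branch_entry):
        """Mark every instruction younger than a mispredicted branch as wrong-path."""
        seen = False
        for entry in self.buffer:
            if seen:
                entry["squashed"] = True
            if entry is branch_entry:
                seen = True

    def commit(self):
        """Retire entries from the head only; return the tags that changed state."""
        committed = []
        while self.buffer and (self.buffer[0]["done"] or self.buffer[0]["squashed"]):
            entry = self.buffer.popleft()
            if entry["squashed"]:
                continue  # wrong-path work never reaches the register file
            if entry["dest"] is not None:
                self.registers[entry["dest"]] = entry["value"]
            committed.append(entry["tag"])
        return committed


cu = CommitUnit()
add = cu.issue("add", "$t0")
beq = cu.issue("beq", None)
sub = cu.issue("sub", "$t1")   # fetched past the branch, speculatively

cu.complete(sub, 5)            # finishes first, but must wait its turn
print(cu.commit())             # nothing commits: the older add is not done
cu.complete(add, 7)
cu.complete(beq, None)
cu.squash_after(beq)           # the branch turned out to be mispredicted
print(cu.commit(), cu.registers)
```

The design choice worth noticing is that results are held in the buffer rather than written to the register file on completion, so undoing wrong-path work only requires dropping buffer entries.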

5/21/2002

## **Branch Prediction**

- Keep a table mapping branches to history (did we take the branch last time?).
- With 256 entries: If we find a branch at address N, we locate its entry in the table like this:

- index = N % 256 (or easier: index = N & 0x000000FF)
- Simplest predictors are one bit: taken/not-taken
- Think about a loop that goes around 10 times.
  - How accurate is the one bit scheme?
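The loop question can be explored with a small simulation. This is a sketch under stated assumptions rather than a model of real hardware: the loop's branch sits at the bottom of the loop (taken 9 times, then not taken once to exit), the loop is entered many times, table entries start out as not-taken, and the branch address `0x00400120` is made up. The names `table_index` and `simulate_one_bit` are hypothetical.

```python
TABLE_SIZE = 256  # number of entries in the history table


def table_index(branch_address):
    """Map a branch address to a table entry; same result as branch_address % 256."""
    return branch_address & (TABLE_SIZE - 1)


def simulate_one_bit(outcomes, branch_address):
    """Return how many outcomes a one-bit predictor gets right for one branch."""
    table = [False] * TABLE_SIZE  # assumption: every entry starts as not-taken
    idx = table_index(branch_address)
    correct = 0
    for taken in outcomes:
        if table[idx] == taken:   # predict whatever the branch did last time
            correct += 1
        table[idx] = taken        # remember the actual outcome
    return correct


# A loop that goes around 10 times, entered 100 times in a row.
one_pass = [True] * 9 + [False]
outcomes = one_pass * 100
correct = simulate_one_bit(outcomes, 0x00400120)
print(f"{correct}/{len(outcomes)} correct")  # prints "800/1000 correct"
```

Under these assumptions the one-bit scheme mispredicts twice per pass: once on the loop exit, and once more on the first iteration of the next pass because the exit flipped the bit. That gives 80% accuracy even though the branch is taken 90% of the time.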




## Athlon Specifics

- 30+ million transistors; clock speeds > 1 GHz
- Front end of the processor translates incoming CISC stream into RISC86/MacroOp instructions
- RISC86 instructions are passed to the instruction control unit, which manages up to 72 instructions at once. The instructions are scheduled here. Up to 9 instructions may issue per cycle.
- There are 9 pipelines: 3 integer; 3 address calculation; 3 FP/MMX. Integer pipelines have 10 stages.
- The instruction control unit also handles commits (up to 9 per cycle).
- Branch prediction table has 2048 entries.
