# Memory & Caches III

CSE 351 Autumn 2021

| Instructor: | Teaching Assistants: |                |                      |
|-------------|----------------------|----------------|----------------------|
| Justin Hsia | Allie Pfleger        | Anirudh Kumar  | Assaf Vayner         |
|             | Atharva Deokar       | Celeste Zeng   | Dominick Ta          |
|             | Francesca Wang       | Hamsa Shankar  | Isabella Nguyen      |
|             | Joy Dang             | Julia Wang     | Maggie Jiang         |
|             | Monty Nitschke       | Morel Fotsing  | Sanjana Chintalapati |

[Comic: xkcd, "The Cloud"]

http://xkcd.com/908/

# **Relevant Course Information**

- Lab 3 due tonight
- Lab 4 released Monday, due after Thanksgiving
  - Can do Part 1 after today; will need Lecture 20 to do Part 2
- hw19 due Wednesday (11/17)
  - Covers the major cache mechanics (BIG homework)
- hw20 due Friday (11/19)
  - Preparation for Lab 4

# **Mid-quarter Survey Debrief**

- Pace is a little fast (lecture pace, lots of assignments)
- Some readings are too dense/confusing
  - Ask on the Ed lecture post!
- We have e-flashcards? (Ed post #131)
  - #.2 lecture lesson → Terminology slide → .apkg file
  - https://courses.cs.washington.edu/courses/cse351/21au/flashcards/
- HW: formatting is frustrating, could use explanations
- Midterm:
  - 72-hour window was good (maybe include a weekend day?)
  - GDB/Stack question was long/difficult
  - Love and hate for the design/explanation questions

# Making memory accesses fast!

- Cache basics
- Principle of locality
- Memory hierarchies
- Cache organization
  - Direct-mapped (sets; index + tag)
  - Associativity (ways)
  - Replacement policy
  - Handling writes
- Program optimizations that consider caches

## **Reading Review**

- Terminology:
  - Associativity: sets, fully-associative cache
  - Replacement policies: least recently used (LRU)
  - Cache line: cache block + management bits (valid, tag)
  - Cache misses: compulsory, conflict, capacity
- Questions from the Reading?

#### **Review: Direct-Mapped Cache**



#### **Direct-Mapped Cache Problem**



# Associativity

- What if we could store data in any place in the cache?
  - More complicated hardware = more power consumed, slower
- So we combine the two ideas:
  - Each address maps to exactly one set
  - Each set can store a block in more than one way



# **Cache Organization (3)**

- Associativity (E): # of ways for each set
  - Such a cache is called an "E-way set associative cache"
  - We now index into cache *sets*, of which there are S = C/K/E
  - Use lowest  $\log_2(C/K/E) = s$  bits of block address
    - <u>Direct-mapped</u>: E = 1, so  $s = \log_2(C/K)$  as we saw previously
    - <u>Fully associative</u>: E = C/K, so s = 0 bits
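As a rough illustration of the formulas above, the sketch below computes the number of sets $S = C/K/E$ and the index width $s$ for several associativities. The function names `num_sets` and `index_bits` and the sample sizes (C = 128 B, K = 16 B) are assumptions chosen for illustration.

```python
from math import log2

def num_sets(C, K, E):
    """Number of sets S = C/K/E (sizes in bytes, assumed to be powers of 2)."""
    return C // K // E

def index_bits(C, K, E):
    """Index width s = log2(C/K/E)."""
    return int(log2(num_sets(C, K, E)))

# Hypothetical cache: C = 128 B, K = 16 B, so C/K = 8 blocks.
C, K = 128, 16
for E in (1, 2, 4, 8):
    print(f"E = {E}: S = {num_sets(C, K, E)}, s = {index_bits(C, K, E)} bits")
```

With these sizes, E = 1 is the direct-mapped case (s = 3) and E = 8 = C/K is the fully associative case (s = 0).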



[Figure: cache sets labeled by index, (000) 0 through (111) 7]

# **Example Placement**

| block size: | 16 B     |
|-------------|----------|
| capacity:   | 8 blocks |
| address:    | 16 bits  |

- Where would data from address 0x1833 be placed?
  - Binary: 0b 0001 1000 0011 0011

*m*-bit address:

| Tag (**t**) | Index (**s**) | Offset (**k**) |
|-------------|---------------|----------------|

$t = m - s - k$, $s = \log_2(C/K/E)$, $k = \log_2(K)$

| Cache type            | Index width | Sets                    |
|-----------------------|-------------|-------------------------|
| Direct-mapped         | **s** = ?   | (000) 0 through (111) 7 |
| 2-way set associative | **s** = ?   | (00) 0 through (11) 3   |
| 4-way set associative | **s** = ?   | (0) 0 and (1) 1         |
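One way to check the placement question is to split the address into its fields in code. The following is a minimal sketch using the parameters given above (16-bit addresses, 16 B blocks, 8 blocks of capacity). The helper name `split_address` is hypothetical.

```python
def split_address(addr, m, s, k):
    """Split an m-bit address into (tag, index, offset) fields."""
    assert 0 <= addr < (1 << m)
    offset = addr & ((1 << k) - 1)          # lowest k bits
    index = (addr >> k) & ((1 << s) - 1)    # next s bits
    tag = addr >> (s + k)                   # remaining t = m - s - k bits
    return tag, index, offset

m, k = 16, 4        # 16-bit addresses; K = 16 B, so k = log2(16) = 4
num_blocks = 8      # capacity: 8 blocks
for E in (1, 2, 4):
    S = num_blocks // E
    s = S.bit_length() - 1   # log2(S) when S is a power of 2
    tag, index, offset = split_address(0x1833, m, s, k)
    print(f"E = {E}: s = {s}, set = {index}, tag = {tag:#x}, offset = {offset:#x}")
```

Under these assumptions the block lands in set 3 for the direct-mapped and 2-way caches and in set 1 for the 4-way cache. The offset is 0x3 in every case, since the offset width depends only on the block size.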

# **Block Placement and Replacement**

- Any empty block in the correct set may be used to store block
  - Valid bit for each cache block indicates if valid (1) or mystery (0) data
- If there are no empty blocks, which one should we replace?
  - No choice for direct-mapped caches
  - Caches typically use something close to *least recently used (LRU)* (hardware usually implements "not most recently used")
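The replacement decision within a single set can be sketched as follows. This models true LRU for clarity, whereas (as noted above) real hardware usually approximates it. The class name `LRUSet` and the sample tags are assumptions for illustration only.

```python
from collections import OrderedDict

class LRUSet:
    """One cache set with E ways; tags are kept in least-to-most recently used order."""

    def __init__(self, E):
        self.E = E
        self.lines = OrderedDict()   # tag -> None; a tag being present means the line is valid

    def access(self, tag):
        """Return True on a hit, False on a miss (filling an empty line or evicting)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)       # mark as most recently used
            return True
        if len(self.lines) == self.E:         # no empty line left: evict the LRU tag
            self.lines.popitem(last=False)
        self.lines[tag] = None
        return False

cache_set = LRUSet(E=2)
results = [cache_set.access(tag) for tag in (0xA, 0xB, 0xA, 0xC, 0xB)]
print(results)   # [False, False, True, False, False]
```

In this trace the access to 0xC evicts 0xB (the least recently used tag at that point), so the final access to 0xB misses again.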



# **Polling Questions**

- We have a cache of size 2 KiB with block size of 128 B. If our cache has 2 sets, what is its associativity?
  - Vote in Ed Lessons
  - **A.** 2
  - **B.** 4
  - **C.** 8
  - **D.** 16
  - **E.** We're lost...
  - $C = 2^{11}$ B and $K = 2^7$ B, so the cache holds $C/K = 2^{11-7} = 2^4 = 16$ blocks
  - $S = C/K/E$, so $E = (C/K)/S = 16/2 = 8$: each set has 8 blocks (answer **C**)
- If addresses are 16 bits wide ($m = 16$), how wide is the Tag field?
  - $k = \log_2(K) = 7$ bits, $s = \log_2(S) = 1$ bit, $t = m - s - k = 8$ bits



#### **Notation Review**

- We just introduced a lot of new variable names!
  - Please be mindful of block size notation when you look at past exam questions or are watching videos

| Parameter          | Variable                 |
|--------------------|--------------------------|
| Block size         | K (B in book)            |
| Cache size         | C                        |
| Associativity      | E                        |
| Number of Sets     | S                        |
| Address space      | M                        |
| Address width      | **m**                    |
| Tag field width    | **t**                    |
| Index field width  | **s**                    |
| Offset field width | **k** (**b** in book)    |

Formulas:

- $M = 2^{m} \leftrightarrow m = \log_2 M$
- $S = 2^{s} \leftrightarrow s = \log_2 S$
- $K = 2^{k} \leftrightarrow k = \log_2 K$
- $C = K \times E \times S$
- $s = \log_2(C/K/E)$
- $m = t + s + k$


#### **Example Cache Parameters Problem**

1 KiB address space ($2^{10}$ B ⇔ $m = 10$ bits), 125 cycles to go to memory (MP).
Fill in the following table:

| Variable / Formula                 | Parameter     | Value                          |
|------------------------------------|---------------|--------------------------------|
| $C$                                | Cache Size    | 64 B ($2^6$ B)                 |
| $K$                                | Block Size    | 8 B ($2^3$ B)                  |
| $E$                                | Associativity | 2-way ($2^1$)                  |
| HT                                 | Hit Time      | 3 cycles                       |
| MR                                 | Miss Rate     | 20%                            |
| $t = m - s - k$                    | Tag Bits      | 5                              |
| $s = \log_2(C/K/E)$                | Index Bits    | 2 ($\log_2(2^6/2^3/2^1)$)      |
| $k = \log_2(K)$                    | Offset Bits   | 3                              |
| AMAT = HT + MR $\times$ MP         | AMAT          | 3 + 0.2(125) = 28 clock cycles |
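The AMAT row can be checked with a short calculation. This sketch plugs in the hit time, miss rate, and miss penalty from the problem statement. The function name `amat` is an assumption.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: AMAT = HT + MR * MP (all times in clock cycles)."""
    return hit_time + miss_rate * miss_penalty

# Values from the problem: HT = 3 cycles, MR = 20%, MP = 125 cycles.
print(amat(3, 0.20, 125))   # 28.0
```

Because the miss penalty is so much larger than the hit time, even a 20% miss rate makes misses account for 25 of the 28 cycles.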

#### **Cache Read**

1) Locate set
2) Check if any line in set is valid and has a matching tag: hit



### **Example:** Direct-Mapped Cache (*E* = 1)

Direct-mapped: One line per set

Block Size K = 8 B






#### **Example: Set-Associative Cache (*E* = 2)**




#### No match?

- One line in set is selected for eviction and replacement
- Replacement policies: random, least recently used (LRU), ...

# **Types of Cache Misses: 3 C's!**

- Compulsory (cold) miss
  - Occurs on first access to a block
- Conflict miss
  - Conflict misses occur when the cache is large enough, but multiple data objects all map to the same slot
    - e.g., referencing blocks 0, 8, 0, 8, ... could miss every time
  - Direct-mapped caches have more conflict misses than *E*-way set-associative (where *E* > 1)
- Capacity miss
  - Occurs when the set of active cache blocks (the *working set*) is larger than the cache (it just won't fit, even if the cache were *fully-associative*)
  - **Note:** *Fully-associative* only has Compulsory and Capacity misses
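To see the conflict-miss example concretely, the sketch below counts misses for the block trace 0, 8, 0, 8, ... at different associativities. The 8-block cache size, the LRU policy, and the function name `count_misses` are assumptions made for illustration.

```python
from collections import OrderedDict

def count_misses(block_addrs, num_blocks, E):
    """Count misses in an E-way set-associative cache with LRU replacement.

    block_addrs: sequence of block addresses (byte address // K)
    num_blocks:  total number of blocks the cache holds (C/K)
    """
    S = num_blocks // E
    sets = [OrderedDict() for _ in range(S)]   # per set: tags in LRU order
    misses = 0
    for block in block_addrs:
        index, tag = block % S, block // S
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)             # hit: mark most recently used
        else:
            misses += 1
            if len(lines) == E:                # set is full: evict the LRU line
                lines.popitem(last=False)
            lines[tag] = None
    return misses

trace = [0, 8] * 4   # blocks 0, 8, 0, 8, ...
for E in (1, 2, 8):
    print(f"E = {E}: {count_misses(trace, 8, E)} misses out of {len(trace)}")
```

With E = 1, blocks 0 and 8 map to the same set and evict each other, so all 8 accesses miss. With E = 2 or E = 8 (fully associative), only the 2 compulsory misses remain.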


#### **Example Code Analysis Problem**

Assuming the cache starts <u>cold</u> (all blocks invalid) and sum, i, and j are stored in registers, calculate the miss rate:
m = 10 bits, C = 64 B, K = 8 B, E = 2

