# Memory & Caches III

CSE 351 Spring 2024

#### **Instructor:**

Elba Garza

### **Teaching Assistants:**

Ellis Haker, Adithi Raghavan, Aman Mohammed, Brenden Page, Celestine Buendia, Chloe Fong, Claire Wang, Hamsa Shankar, Maggie Jiang, Malak Zaki, Naama Amiel, Nikolas McNamee, Shananda Dokka, Stephen Ying, Will Robertson

Playlist: CSE 351 24Sp Lecture Tunes!

### **Relevant Course Information**

- HW 15 due tonight! HW 16 due Monday
- HW 17/18 due following Friday (10 May)
  - Covers the major cache mechanics: big homework, start soon!
- Take-home Midterm, May 6<sup>th</sup> to May 7<sup>th</sup>
  - 48 hours, but should take 1-3 hours to complete
  - No in-person lecture on Monday the 6<sup>th</sup>; I will post a new recording instead
- Mid-Course Canvas Survey due May 6<sup>th</sup> by 11:59 PM
- Lab 3 due Wednesday, May 8<sup>th</sup>
- Lab 4 releasing soon afterward!
  - Can do Part 1 after today; will need Lecture 19 to do Part 2

## Making memory accesses fast!

- Cache basics
- Principle of locality
- Memory hierarchies
- Cache organization
  - Direct-mapped (sets; index + tag)
  - Associativity (ways)
  - Replacement policy
  - Handling writes
- Program optimizations that consider caches

# **Reading Review**

### Terminology:

- Associativity: sets, fully-associative cache
- Replacement policies: least recently used (LRU)
- Cache line: cache block + management bits (valid, tag)
- Cache misses: compulsory, conflict, capacity

## **Review: Direct-Mapped Cache**

## **Direct-Mapped: A Problem!**

# **Associativity: A Solution!**

- What if we could store **any** data in **any** place in the cache?
  - But: requires more complicated hardware $\Rightarrow$ more power consumed, slower
- Let's combine the two ideas:
  - Each address maps to exactly one set, but each set can store a block in more than one way!
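
The combined idea above can be sketched in a few lines of Python. This is only an illustration, and it rests on assumptions: the function name `candidate_lines` is hypothetical, and so is the flat line numbering (set index times ways, plus the way). It also assumes a block's set is its block address modulo the number of sets. For an 8-block cache it lists the lines where block 13 could legally be stored as associativity grows.

```python
def candidate_lines(block_addr, num_blocks, ways):
    """Return the cache line numbers where a block may be stored.

    Hypothetical helper: lines are numbered set_index * ways + way.
    """
    num_sets = num_blocks // ways          # fewer sets as associativity grows
    set_index = block_addr % num_sets      # each block maps to exactly one set
    first_line = set_index * ways
    return list(range(first_line, first_line + ways))


# Same 8-block cache, same block, increasing associativity.
for ways in (1, 2, 4, 8):
    print(f"{ways}-way: block 13 may go in lines {candidate_lines(13, 8, ways)}")
```

With one way there is a single legal line (direct-mapped); with eight ways every line is legal (fully associative). The cases in between trade placement freedom against the number of lines that must be checked on each access.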
# **Cache Organization (3)**

**Note:** The textbook uses "b" for offset bits

- Associativity (E): number of ways to store in each set
  - Such a cache is called an "E-way set associative cache"
  - We now index into cache sets, of which there are S = C/K/E
  - Use lowest $\log_2(C/K/E) = s$ bits of block address
    - <u>Direct-mapped</u>: E = 1, so $s = \log_2(C/K)$ as we saw previously
    - <u>Fully associative</u>: E = C/K, so s = 0 bits

# **Example Placement**

- Block size K: 16 B
- Capacity C/K: 8 blocks
- **Address** *m*: 16 bits

- Where would data from address `0x1833` be placed?
  - Binary: 0b 0001 1000 0011 0011

| | $t$ | $s$ | $k$ |
|---|---|---|---|
| $m$-bit address: | Tag (t) | Index (s) | Offset (k) |

$t = m - s - k$

$s = \log_2(C/K/E)$

$k = \log_2(K)$

E = 1, s = ? (direct-mapped)

| Set | Tag | Data |
|-----|-----|------|
| 0 | | |
| 1 | | |
| 2 | | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |

E = 2, s = ? (2-way set associative)

| Set | Tag | Data |
|-----|-----|------|
| 0 | | |
| 1 | | |
| 2 | | |
| 3 | | |

E = 4, s = ? (4-way set associative)

## **Block Placement and Replacement**

- Any empty block in the correct set may be used to store the block
  - Valid bit for each cache block indicates if valid (1) or mystery (0) data
- If there are no empty blocks, which one should we replace? i.e. replacement policy
  - No choice for direct-mapped caches: gotta replace what's there. Super easy.
  - Otherwise, caches typically use something close to least recently used (LRU) (hardware usually implements "not most recently used")

Direct-mapped

| Set | V | Tag | Data |
|-----|---|-----|------|
| 0 | | | |
| 1 | | | |
| 2 | | | |
| 3 | | | |
| 4 | | | |
| 5 | | | |
| 6 | | | |
| 7 | | | |

2-way set associative

| Set | V | Tag | Data |
|-----|---|-----|------|
| 0 | | | |
| | | | |
| 1 | | | |
| | | | |
| 2 | | | |
| | | | |
| 3 | | | |
| | | | |

4-way set associative

| Set | V | Tag | Data |
|-----|---|-----|------|
| 0 | | | |
| | | | |
| | | | |
| | | | |
| 1 | | | |
| | | | |
| | | | |
| | | | |

# **Polling Questions**

- We have a cache of size 2 KiB with block size of 128 B. If our cache has 2 sets, what is its associativity?
  - A. 2
  - B. 4
  - C. 8
  - D. 16
  - E. We're lost...
- If addresses are 16 bits wide, how wide is the Tag field?

## General Cache Organization (S, E, K)

### **Notation Review**

- We just introduced a lot of new variable names!
  - Please be mindful of block size notation when you look at past exam questions or are watching videos

| Parameter | Variable |
|--------------------|---------------|
| Block size | K (B in book) |
| Cache size | C |
| Associativity | E |
| Number of Sets | S |
| Address space | M |
| Address width | m |
| Tag field width | t |
| Index field width | s |
| Offset field width | k (b in book) |

Formulas:

- $M = 2^m \leftrightarrow m = \log_2 M$
- $S = 2^s \leftrightarrow s = \log_2 S$
- $K = 2^k \leftrightarrow k = \log_2 K$
- $C = K \times E \times S$
- $s = \log_2(C/K/E)$
- $m = t + s + k$

## **Example Cache Parameters Problem**

1 KiB address space, 125 cycles to go to memory.
Fill in the following table:

| Parameter | Value |
|-------------------|----------|
| Cache Size C | 64 B |
| Block Size K | 8 B |
| Associativity E | 2-way |
| Hit Time | 3 cycles |
| Miss Rate | 20% |
| Address width (m) | |
| Tag Bits (t) | |
| Index Bits (s) | |
| Offset Bits (k) | |
| AMAT | |

# Read: Direct-Mapped Cache (E = 1)

Direct-mapped: One line per set

Block Size K = 8 B

1. Locate set
2. Check if <u>any line</u> in set is valid and has matching tag: **hit!**
3. Locate data starting at offset

No match? Then old line/block gets evicted and replaced!

# Read: Set-Associative Cache (E = 2)

2-way: Two lines per set

Block Size K = 8 B

1. Locate set
2. Check if <u>any line</u> in set is valid and has matching tag: **hit!**
3. Locate data starting at offset

L18: Caches III

#### No match?

- One line in set is selected for eviction and replacement
- Replacement policies: random, least recently used (LRU), ...

## **Cache Read**

1. Locate set
2. Check if any line in set is valid and has matching tag: hit!
3. Locate data starting at offset

## Types of Cache Misses: 3 C's!
- Compulsory (cold) miss
  - Occurs on first access to a block
- Conflict miss
  - Occurs when the cache is large enough, but multiple data objects all map to the same slot
    - e.g., referencing blocks 0, 8, 0, 8, ... could miss every time
  - Direct-mapped caches have more conflict misses than E-way set-associative caches (where E > 1)
- Capacity miss
  - Occurs when the set of active cache blocks (the working set) is larger than the cache (just won't fit, even if the cache were fully associative)
  - Note: A fully associative cache only has compulsory and capacity misses
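
The 0, 8, 0, 8, ... pattern can be checked with a small simulation. This is a minimal sketch under stated assumptions, not how real hardware behaves: it assumes an 8-block cache and true LRU replacement, whereas hardware usually only approximates LRU. The function name `count_misses` is made up for illustration.

```python
from collections import OrderedDict


def count_misses(block_refs, num_blocks, ways):
    """Count misses for a cache with `num_blocks` blocks and `ways` lines per set.

    Assumes true LRU replacement within each set (an idealization).
    """
    num_sets = num_blocks // ways
    # One OrderedDict per set; key order runs from least to most recently used.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_refs:
        index = block % num_sets       # low-order bits of the block address
        tag = block // num_sets        # remaining high-order bits
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)     # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)  # set is full: evict the LRU line
            lines[tag] = True
    return misses


refs = [0, 8] * 4  # blocks 0, 8, 0, 8, 0, 8, 0, 8
direct_mapped_misses = count_misses(refs, num_blocks=8, ways=1)
two_way_misses = count_misses(refs, num_blocks=8, ways=2)
print(direct_mapped_misses, two_way_misses)  # prints "8 2"
```

In the direct-mapped case, blocks 0 and 8 both land in set 0 and keep evicting each other, so all eight accesses miss even though seven of the eight lines are never used. With two ways, only the two compulsory misses remain.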