Week
|
Topic
|
Reading
|
1 9/28 Basics of Computer Architecture
|
Architecture overview
|
Let your eyes float over chapter 1. We won't cover this in class,
but it is good for your general background in computer architecture.
|
|
Instruction set design
|
Speedread chapter 2 (3rd edition) or Appendix B (4th edition). This
is a good summary of background instruction set design material, but is more
detailed than we will cover in class. Omit 2.4, 2.6,
2.11, 2.13, and 2.16 (3rd edition) or B.8 and B.12 (4th edition).
|
| | |
2 10/5 Pipelining
|
Instruction-level parallelism
|
Read section 3.1 (3rd edition) or section 2.1 (4th edition).
|
|
Basics of pipelining
|
Sections A.1 - A.2 (both editions) are a review of the basics of pipelining and
could replace reading in the undergraduate text. Only read them if you need
them.
|
Dynamic branch prediction
|
Read section 4.2, section 3.4, section 3.5, pp. A-24 to A-26, and pp. 245 to 249 (3rd edition) or
section 2.3, section 2.9 up to p. 127, pp. A-25 to A-26, and pp. 160 to 162 (4th edition).
|
Predicated execution
|
Read pp. 340-344, 356, and 358 (3rd edition) or section G.4, p. G-38, and p. G-40 (4th edition).
|
Exceptions & pipelining
|
Read pp. A-37 to A-45 and A-54 to A-56 (both editions).
|
| | |
3 10/12
|
No class
NAE induction
|
|
|
|
|
|
|
| | |
4 10/19 Dynamic Execution Cores
|
Overview of multiple issue processors & static scheduling
|
Read pp. 215-220 (3rd edition) or section 2.7 (4th edition).
See pp. 304-312 (3rd edition) or section 2.2 (4th edition) for a
discussion on loop unrolling.
|
|
Overview of dynamic scheduling
|
Read pp. 181-184 and 220-224 (3rd edition) or pp. 89-92 and section 2.8 (4th edition).
|
Tomasulo's algorithm
|
Read pp. 184-196 (3rd edition) or pp. 92-104 (4th edition).
|
| | |
5 10/26 (midterm starts at 6:30; class starts at 7:30) Dynamic Execution Cores
|
R10000-style dynamic scheduling
|
Read the Smith/Sohi article on superscalars.
In the R10000 article, read from "Register
mapping," p. 32, through "Register files," p. 35.
|
|
| | |
6 11/2 Static Execution Cores
|
Software techniques to exploit ILP
|
We have already discussed loop unrolling. We'll briefly touch upon two
other techniques on pp. 329-340 (3rd edition) or G-12 to G-21 (4th edition).
|
VLIW machines
|
Read pp. 315-319 (3rd edition) or pp. 114-118 (4th edition).
I've also included two supplementary papers on the IA-64.
In the HP/Intel architecture paper omit the memory model,
software pipelining, & floating point. In the
Intel implementation paper, omit floating point again,
IA-32 compatibility & machine resources per port. Both of these articles
contain too much detail, but they are better than the text (section 4.7 (3rd edition)
or G-6 (4th edition)). Let my lecture be your guide for what is important
for us. There is also a critique by a rival which should give you
a sense of how and why architects can disagree.
|
| | |
7 11/9 Caching
|
Basics of caches
|
This is standard undergraduate material. You might skip the reading and just
look at the slides for a review. But read pp. 390-410, 423-430 (3rd edition) or
section 5.1, pp. C-1 to C-19, C-22 to C-29 (4th edition) if the slides seem incomprehensible.
|
|
Advanced caching techniques
|
Read pp. 410-413, section 5.4, pp. 430-435, sections 5.6, 5.7 (3rd edition) or
C-19 to C-21, C-29 to C-38, pp. 293-309 (4th edition).
|
Main memory
|
Read sections 5.8, 5.9 to p. 457 (3rd edition) or pp. 310-312 (4th edition).
|
| | |
8 11/16
|
Overview of multiprocessing
|
Read section 6.1 (3rd edition) or section 4.1 (4th edition).
|
|
Cache coherence, snooping and directory protocols
|
Read sections 6.3 - 6.6 (3rd edition) or sections 4.2 - 4.4 (4th edition).
|
Synchronization
|
Read section 6.7 (3rd edition) or section 4.5 & H.4 (4th edition).
This includes slightly more than we will cover in class, so let the
class notes be your guide as to what is important for us.
|
| | |
9 11/23
|
No class
Thanksgiving
|
|
|
|
|
|
|
| | |
10 11/30
|
Tera-style multithreading
|
Read the Tera paper.
Tera's runtime system (not required - this is just in
case the OS/RT students are interested).
|
|
Simultaneous multithreading
|
Read section 6.9 (3rd edition) or section 3.5 (4th edition)
and the SMT paper.
|
| | |
11 12/7 Dataflow Machines
|
Content of the final.
|
|
Dataflow machines and WaveScalar.
|
After looking them over, I don't like any of the papers on the early dataflow machines.
Just listen to the lecture. Read "The WaveScalar Architecture"
for an overview of WaveScalar and "Area-Performance Trade-offs in
Tiled Dataflow Architectures" for an implementation.
|
|
Course evaluations.
|
| | |
12 12/13
|
Final from 8:30pm to 10:30pm.
|