The final will cover material from the lectures, slides, assignments, and the midterm. The list of topics here is in bottom-up order of the material, not slide order, following the review session classes.
Logic
- gates: and, or, xor, not, [nor, nand, ...]
- truth tables
- boolean circuits
- sequential and combinational components
- synthesizing a boolean circuit from a truth table
- synthesizing a truth table from a boolean circuit
- half adder circuit
- full adder built from half adders
- d-latch circuit
- role of the clock
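The half adder and full adder above can be sketched in C with bitwise operators (a minimal software simulation of the truth tables, not hardware; function names are illustrative):

```c
#include <assert.h>

/* Half adder: sum = a XOR b, carry = a AND b (a and b are 0 or 1). */
void half_adder(int a, int b, int *sum, int *carry) {
    *sum   = a ^ b;
    *carry = a & b;
}

/* Full adder built from two half adders plus an OR of the carries. */
void full_adder(int a, int b, int cin, int *sum, int *cout) {
    int s1, c1, c2;
    half_adder(a, b, &s1, &c1);
    half_adder(s1, cin, sum, &c2);
    *cout = c1 | c2;   /* the two half-adder carries can never both be 1 */
}
```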
Instruction Set Architecture (ISA)
- Why define an ISA?
- ISA approaches: CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set Computer)
- RISC-V: an open source ISA
- flexible, extensible
- ISA designed to allow very fast implementations; power efficient implementations
Elements of an ISA
- CPU resources
- registers: how many? how wide?
- why would you want a lot of registers? why wouldn't you?
- relationship of register width to size of memory (virtual address space size)
- PC (program counter)
- system use registers: e.g., memory context, trap handler address
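The relationship between register width and virtual address space size can be checked with a one-line computation (a sketch; the helper name is illustrative, and byte addressing is assumed):

```c
#include <stdint.h>

/* With N-bit registers holding addresses, the virtual address space
   spans 2^N bytes (byte addressing assumed). Keep N < 64 to avoid
   an undefined shift. */
uint64_t vas_bytes(unsigned reg_width_bits) {
    return (uint64_t)1 << reg_width_bits;
}
```

For example, 32-bit registers give a 4 GiB virtual address space.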
Instructions
- What can they do?
- We know about RISC-style instruction sets
- "load-store" architectures allow you to operate on values only in registers, not in memory
- Memory addressing modes
- byte addressing
- Instruction encoding in binary
- How long are instructions? Why might you want them short? Why might you not?
- Fixed length or variable length?
- Decoding: opcode in fixed place
- Does instruction set include instructions with immediate data?
- what is "immediate data"?
- branch addressing: PC-relative
- leave out of the instruction any bits whose values are constant: e.g., the low-order bit of a branch address,
since branch addresses must be even
- Instructions are encoded in a small number of formats; e.g., R-type
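Encoding into a fixed format can be sketched by packing the R-type fields into a 32-bit word (a minimal sketch; the function name is illustrative):

```c
#include <stdint.h>

/* Pack the fields of a RISC-V R-type instruction:
   funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0] */
uint32_t encode_rtype(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                      uint32_t funct3, uint32_t rd, uint32_t opcode) {
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}
```

Because the opcode sits in a fixed place (the low 7 bits), decode can always find it first.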
Base-Offset Addressing
- if offset is 0, then you're using the address in the base register
- you can compute any address into a register
- natural for accessing arrays
- base register holds the address of the start of the array; the offset is computed from the index of the element you want
- for this to work, arrays must be allocated contiguously in memory
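The array-address arithmetic above can be written out directly (a sketch; the function name is illustrative):

```c
#include <stdint.h>

/* Address of a[i] = base + i * element_size, assuming the array
   is allocated contiguously in memory. */
uint64_t element_addr(uint64_t base, uint64_t index, uint64_t elem_size) {
    return base + index * elem_size;
}
```

E.g., with a 4-byte element size, element 3 of an array starting at 0x1000 lives at 0x100C.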
Pointers: base-offset where offset is 0
- E.g., Instruction fetch: pc is the "base register" that points to the next instruction
- In C: int x; int *p; p = &x; *p = 4; // sets x to 4
- In Java: Person p = new Person(); p.getName();
- In any language, what does x = y mean? In particular, if I next modify y, is x modified as well?
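In C, x = y copies a value, so a later change to y does not affect x; writing through a pointer is how two names can refer to the same storage. A small self-checking sketch:

```c
#include <assert.h>

/* Value vs. pointer semantics in C. */
void demo(void) {
    int x = 1, y = 2;

    x = y;           /* copies the VALUE of y into x */
    y = 99;
    assert(x == 2);  /* x is unaffected by the later change to y */

    int *p = &x;     /* p holds the ADDRESS of x */
    *p = 4;          /* write through the pointer ... */
    assert(x == 4);  /* ... so x itself changes */
}
```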
Higher Level Language Programs and Execution
- compiling: translating from an HLL to another language (e.g., machine code) in a way that computes the same thing
in both
- representing values using bits
- unsigned int - place value representation
- signed int - 2's complement representation
- char - ASCII / Unicode
- colors - red/green/blue
- etc.
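Two's complement can be sketched by interpreting the same bit pattern two ways (a minimal 8-bit sketch; the function name is illustrative):

```c
#include <stdint.h>

/* Interpret an 8-bit pattern as a two's-complement signed value:
   the high bit contributes -128 instead of +128. */
int twos_complement_value(uint8_t bits) {
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}
```

So the pattern 0xFF is 255 as an unsigned int but -1 in two's complement.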
Compiling / Generating Instructions
- statement in higher level language is a specification for what to compute, not how to compute it
- expressions: e.g., sum = 4 + 3 - ( 4 + 3)
- many different sequences of hardware instructions can be used for that
- if...then...else
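That an expression specifies what to compute, not how, can be illustrated by two different-looking computations of the same statement (a sketch; both functions are illustrative stand-ins for generated instruction sequences):

```c
/* The statement sum = 4 + 3 - (4 + 3) specifies WHAT to compute;
   either sequence below is a valid translation. */
int eval_direct(void) {
    return 4 + 3 - (4 + 3);   /* a compiler may fold this to a constant */
}

int eval_with_temps(void) {
    int t1 = 4 + 3;           /* reuse the common subexpression */
    return t1 - t1;
}
```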
Procedure Call
- have to get args from caller to callee
- have to get return value to caller
- exactly how this is done is a system-wide convention (depends on OS and CPU)
- typically leave args in registers, find return value there as well
- caller has to tell callee where to branch to when done
- need some way to store the PC somewhere the callee can find the value
- on RISC-V, jal instruction
- need some agreement about who saves which registers
- s0, s1, ... are saved by callee
- t0, t1, ... are saved by caller
- local variables: allocated on stack
- amount of space required can be computed at compile time (so can generate code to do allocation)
- global variables: in data segment
- new'ed variables: in heap
- compiler and system need to agree on what memory is going to look like at runtime:
- stack at top
- instructions at bottom
- data segment above that
- heap above that
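The storage classes above can be sketched in one small C fragment; the comments label which runtime segment each object lives in (a sketch; names are illustrative):

```c
#include <stdlib.h>

int global_counter = 0;            /* global: lives in the data segment */

int add_one(int x) {               /* code: lives in the text (instruction) segment */
    int local = x + 1;             /* local: allocated in this call's stack frame */
    return local;                  /* return value handed back per the calling convention */
}

void run(void) {
    int *heap_val = malloc(sizeof *heap_val);  /* new'ed storage: the heap */
    if (heap_val == NULL) return;
    *heap_val = add_one(41);
    global_counter = *heap_val;
    free(heap_val);
}
```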
Virtual Memory
- one VAS (Virtual Address Space) per process (running program)
- simplifies compiler's job
- provides protection of one process from others
- Address translation
- solves (external) fragmentation issue for memory allocation
- fixed size pages of virtual memory and equal/fixed size frames of real memory
- program running on CPU issues virtual addresses
- MMU (memory management unit) translates from virtual address to physical address, and then reads/writes real memory
- MMU uses a page table to do the translation
- some data structure where the virtual page number is the key and the physical frame number is the value
- PTE (page table entry) also stores access rights: r(ead), w(rite), (e)x(ecute)
- and a "valid bit": if off, means the corresponding page in virtual memory doesn't exist at all
- Page Tables
- multi-level page table is an approach to reducing the size of the page table
- rather than proportional to the maximum possible size of the VAS, it's more like proportional to the
amount of VAS actually used by the process
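The core translation step can be sketched with a single-level table (a minimal sketch: 4 KiB pages assumed, valid and permission bits omitted; names are illustrative):

```c
#include <stdint.h>

#define PAGE_BITS 12                       /* 4 KiB pages */
#define PAGE_MASK ((1u << PAGE_BITS) - 1)

/* Single-level page table: index = virtual page number (VPN),
   value = physical frame number. The page offset passes through
   unchanged. */
uint32_t translate(const uint32_t *page_table, uint32_t vaddr) {
    uint32_t vpn    = vaddr >> PAGE_BITS;
    uint32_t offset = vaddr & PAGE_MASK;
    uint32_t frame  = page_table[vpn];
    return (frame << PAGE_BITS) | offset;
}
```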
CPU Implementation
- (Yes, this isn't bottom-up, as this is way at the bottom)
- Five phases: instruction fetch (IF), register read (REG), ALU (arithmetic/logic unit) (EX), data memory (MEM),
and register write (WB - write-back)
- The datapath has "pipes" that connect these components so that every flow of information among them needed by
every instruction type can be performed
- multiplexors are switches that "activate" the portion of the full datapath required for an individual instruction
(and cause other paths to be ignored)
- control figures out what the instruction is and sets the switches appropriately
- (Note: control controls more than just multiplexors, like the W bit on the register file)
Pipelining / Parallelism
- pipelining is used to execute a single instruction stream in parallel
- the pipelined execution is required to get results identical to a sequential execution (executing only
one instruction at a time)
- General idea in parallelism: dependences
- dependences restrict the order in which instructions from the sequential stream can be done,
relative to each other
- RAW (read-after-write), WAR (write-after-read), and WAW (write-after-write)
- Pipelining specific:
- WAR and WAW are not issues (because the pipeline always respects those dependences)
- RAW dependences can cause hazards
- hazards on data produced by an R-type instruction can be resolved by "forwarding"
- the value needed on the read exists somewhere in the pipeline; send it from there to the ALU
- if a LW is followed immediately by an R-type instruction that uses the just loaded value, that
value is not available in time to be fed to the ALU
- resolved by introducing "a bubble" (a NOP (no operation)) into the pipeline to separate the LW and the R-type
- who introduces the bubble?
- the hardware may have complicated enough control to do so, or
- the compiler does so if the hardware can't
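The load-use check above can be sketched as a small predicate over adjacent instructions (a sketch; the function name and argument shape are illustrative):

```c
#include <stdbool.h>

/* Load-use hazard: instruction i is a load, and the instruction
   immediately after it reads the just-loaded register. Forwarding
   cannot help here, so one bubble must be inserted. */
bool needs_bubble(bool i_is_load, int i_dest, int next_src1, int next_src2) {
    return i_is_load && (i_dest == next_src1 || i_dest == next_src2);
}
```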
- Control Hazard: data hazard on the PC
- don't know if a conditional branch will be taken or not for a few cycles after the branch is fetched,
but need to fetch another instruction the very next cycle
- approaches:
   (a) insert bubbles -- do nothing until you figure out whether or not to take the branch
   (b) assume not taken -- just update the PC with PC+4 and keep fetching. If the branch is taken, convert
       those incorrectly fetched instructions into NOPs (in the pipeline registers)
(c) branch prediction: hardware "remembers" what happens (branch taken or not) for each value of the
PC that fetches a branch, and uses that information to predict taken or not taken next time it fetches
that branch
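Approach (c) can be sketched as a tiny 1-bit predictor: remember the last outcome per (hashed) PC and predict the same outcome next time (a minimal sketch; real predictors use larger tables and 2-bit counters, and all names here are illustrative):

```c
#include <stdbool.h>

#define TABLE_SIZE 16   /* illustrative: tiny direct-mapped table */

/* One taken/not-taken bit per table slot, indexed by a hash of the PC. */
typedef struct { bool taken[TABLE_SIZE]; } Predictor;

bool predict(const Predictor *p, unsigned pc) {
    return p->taken[(pc >> 2) % TABLE_SIZE];
}

void update(Predictor *p, unsigned pc, bool was_taken) {
    p->taken[(pc >> 2) % TABLE_SIZE] = was_taken;
}
```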
NOT ON FINAL
- Interrupts/Exceptions/Traps
- Protected mode execution of CPU
- Context switching