The final will cover material from the lectures, slides, assignments, and the midterm. The list of topics here is in bottom-up order of the material, not slide order, following the review session classes.
Logic
- gates: and, or, xor, not, [nor, nand, ...]
- truth tables
- boolean circuits
- sequential and combinational components
- synthesizing a boolean circuit from a truth table
- synthesizing a truth table from a boolean circuit
- half adder circuit
- full adder built from half adders
- d-latch circuit
- role of the clock

Instruction Set Architecture (ISA)
- Why define an ISA?
- ISA approaches: CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set Computer)
- RISC-V: an open source ISA
  - flexible, extensible
  - ISA designed to allow very fast implementations; power-efficient implementations

Elements of an ISA
- CPU resources
  - registers: how many? how wide?
    - why would you want a lot of registers? why wouldn't you?
    - relationship of register width to size of memory (virtual address space size)
  - PC (program counter)
  - system use registers: e.g., memory context, trap handler address

Instructions
- What can they do?
  - We know about RISC-style instruction sets
  - "load-store" architectures allow you to operate on values only in registers, not in memory
- Memory addressing modes
  - byte addressing
- Instruction encoding in binary
  - How long are instructions? Why would you want them short? Why wouldn't you?
  - Fixed length or variable length?
  - Decoding: opcode in fixed place
  - Does the instruction set include instructions with immediate data?
    - what is "immediate data"?
  - branch addressing: PC-relative
    - leave out of the instruction bits whose values are constant: e.g., the low-order bit of a branch address, since branch addresses must be even
  - Instructions are encoded in a small number of formats; e.g., R-type

Base-Offset Addressing
- if the offset is 0, then you're using the address in the base register
- you can compute any address into a register
- natural for accessing arrays
  - base register holds the address of the start of the array; the offset is computed from the index of the element you want
  - for this to work, arrays must be allocated contiguously in memory

Pointers: base-offset where the offset is 0
- E.g., instruction fetch: the PC is the "base register" that points to the next instruction
- In C: int x; int *p; p = &x; *p = 4; // sets x to 4
- In Java: Person p = new Person(); p.getName();
- In any language, what does x = y mean? In particular, if I next modify y, is x modified as well?

Higher Level Language Programs and Execution
- compiling: translating from an HLL to another language (e.g., machine code) in a way that you compute the same thing in both
- representing values using bits
  - unsigned int - place value representation
  - signed int - 2's complement representation
  - char - ASCII / Unicode
  - colors - red/green/blue
  - etc.

Compiling / Generating Instructions
- a statement in a higher level language is a specification for what to compute, not how to compute it
- expressions: e.g., sum = 4 + 3
  - many different sequences of hardware instructions can be used to compute (4 + 3)
- if...then...else

Procedure Call
- have to get args from the caller to the callee
- have to get the return value back to the caller
- exactly how this is done is a system-wide convention (depends on OS and CPU)
  - typically leave args in registers, find the return value there as well
- caller has to tell the callee where to branch to when done
  - need some way to store the PC somewhere the callee can find the value
  - on RISC-V, the jal instruction
- need some agreement about who saves which registers
  - s0, s1, ... are saved by the callee
  - t0, t1, ... are saved by the caller
- local variables: allocated on the stack
  - the amount of space required can be computed at compile time (so the compiler can generate code to do the allocation)
- global variables: in the data segment
- new'ed variables: in the heap
- compiler and system need to agree on what memory is going to look like at runtime:
  - stack at top
  - instructions at bottom
  - data segment above that
  - heap above that

Virtual Memory
- one VAS (Virtual Address Space) per process (running program)
  - simplifies the compiler's job
  - provides protection of one process from others
- Address translation
  - solves the (external) fragmentation issue for memory allocation
  - fixed-size pages of virtual memory and equal/fixed-size frames of real memory
  - the program running on the CPU issues virtual addresses
  - the MMU (memory management unit) translates from virtual address to physical address, and then reads/writes real memory
  - the MMU uses a page table to do the translation
    - some data structure where the virtual page number is the key and the physical frame number is the value
    - a PTE (page table entry) also stores access rights: r(ead), w(rite), (e)x(ecute)
      - and a "valid bit": if off, the corresponding page in virtual memory doesn't exist at all
- Page Tables
  - a multi-level page table is an approach to reducing the size of the page table
    - rather than proportional to the maximum possible size of the VAS, its size is more like proportional to the amount of VAS actually used by the process

CPU Implementation
- (Yes, this isn't bottom-up, as this is way at the bottom)
- Five phases: instruction fetch (IF), register read (REG), ALU (arithmetic/logic unit) (EX), data memory (MEM), and register write (WB - write-back)
- The datapath has "pipes" that connect these components so that every flow of information among them needed by every instruction type can be performed
- multiplexors are switches that "activate" the portion of the full datapath required for an individual instruction (and cause other paths to be
ignored)
- control figures out what the instruction is and sets the switches appropriately
- (Note: control controls more than just the multiplexors, like the W bit on the register file)

Pipelining / Parallelism
- pipelining is used to execute a single instruction stream in parallel
- the pipelined execution is required to get results identical to a sequential execution (executing only one instruction at a time)
- General idea in parallelism: dependences
  - dependences restrict the order in which instructions from the sequential stream can be done, relative to each other
  - RAW (read-after-write), WAR (write-after-read), and WAW (write-after-write)
- Pipelining specific:
  - WAR and WAW are not issues (because the pipeline always respects those dependences)
  - RAW dependences can cause hazards
    - hazards on data produced by an R-type instruction can be resolved by "forwarding"
      - the value needed on the read exists somewhere in the pipeline; send it from there to the ALU
    - if a LW is followed immediately by an R-type instruction that uses the just-loaded value, that value is not available in time to be fed to the ALU
      - resolved by introducing "a bubble" (a NOP (no operation)) into the pipeline to separate the LW and the R-type
      - who introduces the bubble?
        - the hardware may have complicated enough control to do so, or
        - the compiler does so if the hardware can't
- Control Hazard: a data hazard on the PC
  - don't know if a conditional branch will be taken or not for a few cycles after the branch is fetched, but need to fetch another instruction the very next cycle
  - approaches:
    (a) insert bubbles -- do nothing until you figure out whether or not to take the branch
    (b) assume not taken -- just update the PC with PC+4 and keep fetching. If the branch is taken, convert those incorrectly fetched instructions into NOPs (in the pipeline registers)
    (c) branch prediction: hardware "remembers" what happens (branch taken or not) for each value of the PC that fetches a branch, and uses that information to predict taken or not taken the next time it fetches that branch

NOT ON FINAL
- Interrupts/Exceptions/Traps
- Protected mode execution of the CPU
- Context switching