**University of Washington**

# **Machine Programming I: Basics**

- What is an ISA (Instruction Set Architecture)
- A brief history of Intel processors and architectures
  - Intel processors (Wikipedia)
  - Intel microarchitectures
- C, assembly, machine code
- Assembly basics: registers, operands, move instructions

Autumn 2012 Instruction Set Architecture

# What should the HW/SW interface be?

# **General ISA Design Decisions**

### Instructions

- What instructions are available? What do they do?
- How are they encoded?

### Registers

- How many registers are there?
- How wide are they?

### Memory

- How do you specify a memory location?

# **Executing Programs Fast!**

- The time required to execute a program depends on:
  - The program (as written in C, for instance)
  - The compiler: what set of assembler instructions it translates the C program into
  - The ISA: what set of instructions it makes available to the compiler
  - The hardware implementation: how much time it takes to execute an instruction
- There is a complex interaction among these

# **Intel x86 Processors**

- Totally dominate the server/laptop market
- Evolutionary design
  - Backwards compatible all the way back to the 8086, introduced in 1978
  - Added more features as time goes on
- Complex instruction set computer (CISC)
  - Many different instructions with many different formats
    - But only a small subset is encountered in Linux programs
  - Hard to match performance of Reduced Instruction Set Computers (RISC)
  - But Intel has done just that!

# **Intel x86 Evolution: Milestones**

- **8086** (1978, 29K transistors, 5-10 MHz)
  - First 16-bit Intel processor. Basis for IBM PC & DOS
  - 1MB address space
- **386** (1985, 275K transistors, 16-33 MHz)
  - First 32-bit Intel processor, referred to as IA32
  - Added "flat addressing"
  - Capable of running Unix
  - 32-bit Linux/gcc uses no instructions introduced in later models
- **Pentium 4F** (2005, 230M transistors, 2800-3800 MHz)
  - First 64-bit Intel x86 processor
  - Meanwhile, Pentium 4s (Netburst arch.) phased out in favor of "Core" line

# **Intel x86 Processors**

### Machine Evolution

| Name | Date | Transistors |
|-------------|------|------|
| 486 | 1989 | 1.9M |
| Pentium | 1993 | 3.1M |
| Pentium/MMX | 1997 | 4.5M |
| PentiumPro | 1995 | 6.5M |
| Pentium III | 1999 | 8.2M |
| Pentium 4 | 2001 | 42M |
| Core 2 Duo | 2006 | 291M |

### Added Features

- Instructions to support multimedia operations
  - Parallel operations on 1, 2, and 4-byte data, both integer & FP
- Instructions to enable more efficient conditional operations

### Linux/GCC Evolution

- Very limited impact on performance; most gains came from hardware

# **x86 Clones: Advanced Micro Devices (AMD)**

### Historically

- AMD has followed just behind Intel
- A little bit slower, a lot cheaper

### Then

- Recruited top circuit designers from Digital Equipment and other downward trending companies
- Built Opteron: tough competitor to Pentium 4
- Developed x86-64, their own extension to 64 bits

# **Intel's 64-Bit**

- Intel attempted radical shift from IA32 to IA64
  - Totally different architecture (Itanium) and ISA
  - Executes IA32 code only as legacy
  - Performance disappointing
- AMD stepped in with evolutionary solution
  - x86-64 (now called "AMD64")
- Intel felt obligated to focus on IA64
  - Hard to admit mistake or that AMD is better
- 2004: Intel announces EM64T extension to IA32
  - Extended Memory 64-bit Technology
  - Almost identical to x86-64!
- Meanwhile: EM64T has been slow to be adopted by OSes and programs

# **Our Coverage in 351**

- IA32
  - The traditional x86
- x86-64/EM64T
  - The emerging standard; all labs use a 64-bit platform!

# **Definitions**

- Architecture: (also instruction set architecture or ISA) The parts of a processor design that one needs to understand to write assembly code ("what is directly visible to SW")
- Microarchitecture: Implementation of the architecture
- How about CPU frequency?
- The number of registers?
- Is the cache size "architecture"?

# **Compiling Into Assembly**

### C Code

```
int sum(int x, int y)
{
  int t = x+y;
  return t;
}
```

### Generated IA32 Assembly

```
sum:
   pushl %ebp             # save caller's frame pointer
   movl %esp,%ebp         # set up this function's frame pointer
   movl 12(%ebp),%eax     # load one argument from the stack
   addl 8(%ebp),%eax      # add the other argument; result in %eax
   movl %ebp,%esp         # restore stack pointer
   popl %ebp              # restore caller's frame pointer
   ret
```

### Obtain with command

```
gcc -O -S code.c
```

Produces file code.s

# **Three Basic Kinds of Instructions**

- Perform arithmetic function on register or memory data
- Transfer data between memory and register
  - Load data from memory into register
  - Store register data into memory
- Transfer control (control flow)
  - Unconditional jumps to/from procedures
  - Conditional branches

# **Assembly Characteristics: Data Types**

- "Integer" data of 1, 2, 4 (IA32), or 8 (just in x86-64) bytes
  - Data values
  - Addresses
- Floating point data of 4, 8, or 10 bytes
- What about aggregate types such as arrays or structures?

# **Assembly Characteristics: Data Types**

- "Integer" data of 1, 2, 4 (IA32), or 8 (just in x86-64) bytes
  - Data values
  - Addresses (unsigned pointers)
- Floating point data of 4, 8, or 10 bytes
- No aggregate types such as arrays or structures
  - Just contiguously allocated bytes in memory

# **Object Code**

### Code for sum

`0x401040 <sum>:` `0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3`

- Total of 13 bytes
- Each instruction 1, 2, or 3 bytes
- Starts at address 0x401040
- Not at all obvious where each instruction starts and ends

### Assembler

- Translates .s into .o
- Binary encoding of each instruction
- Nearly-complete image of executable code
- Missing links between code in different files

### Linker

- Resolves references between files
- Combines with static run-time libraries
  - E.g., code for malloc, printf
- Some libraries are dynamically linked
  - Linking occurs when program begins execution

# **Example**

### C Code

- `int t = x+y;`
- Add two signed integers

### Assembly

- `addl 8(%ebp),%eax`
- Add two 4-byte integers
  - "Long" words in GCC speak
  - Same instruction whether signed or unsigned
- Operands:
  - x: Register %eax
  - y: Memory M[%ebp+8]
  - t: Register %eax
  - Return function value in %eax
- Similar to expression: `x += y`
- More precisely: `int eax; int *ebp; eax += ebp[2]`

### Object Code

- `0x401046: 03 45 08`
- 3-byte instruction
- Stored at address 0x401046

# **Disassembling Object Code**

### Disassembled

`00401040 <_sum>:`

| Offset | Bytes | Instruction | Operands |
|--------|----------|------|----------------|
| 0: | 55 | push | %ebp |
| 1: | 89 e5 | mov | %esp,%ebp |
| 3: | 8b 45 0c | mov | 0xc(%ebp),%eax |
| 6: | 03 45 08 | add | 0x8(%ebp),%eax |
| 9: | 89 ec | mov | %ebp,%esp |
| b: | 5d | pop | %ebp |
| c: | c3 | ret | |
| d: | 8d 76 00 | lea | 0x0(%esi),%esi |

### Disassembler

`objdump -d p`

- Useful tool for examining object code
- Analyzes bit pattern of series of instructions (delineates instructions)
- Produces approximate rendition of assembly code
- Can be run on either `a.out` (complete executable) or a .o file

# **Alternate Disassembly**

### Object

```
0x401040:
    0x55
    0x89
    0xe5
    0x8b
    0x45
    0x0c
    0x03
    0x45
    0x08
    0x89
    0xec
    0x5d
    0xc3
```

### Disassembled

```
0x401040 <sum>:     push   %ebp
0x401041 <sum+1>:   mov    %esp,%ebp
0x401043 <sum+3>:   mov    0xc(%ebp),%eax
0x401046 <sum+6>:   add    0x8(%ebp),%eax
0x401049 <sum+9>:   mov    %ebp,%esp
0x40104b <sum+11>:  pop    %ebp
0x40104c <sum+12>:  ret
0x40104d <sum+13>:  lea    0x0(%esi),%esi
```

### Within gdb Debugger

```
gdb p
disassemble sum     (disassemble procedure)
x/13b sum           (examine the 13 bytes starting at sum)
```

# **What Can be Disassembled?**

- Anything that can be interpreted as executable code
- Disassembler examines bytes and reconstructs assembly source

# **x86-64 Integer Registers**

| 64-bit | 32-bit | 64-bit | 32-bit |
|------|------|------|-------|
| %rax | %eax | %r8 | %r8d |
| %rbx | %ebx | %r9 | %r9d |
| %rcx | %ecx | %r10 | %r10d |
| %rdx | %edx | %r11 | %r11d |
| %rsi | %esi | %r12 | %r12d |
| %rdi | %edi | %r13 | %r13d |
| %rsp | %esp | %r14 | %r14d |
| %rbp | %ebp | %r15 | %r15d |

- Twice the number of registers, accessible as 8, 16, 32, 64 bits

# **x86-64 Integer Registers: Usage Conventions**

| Register | Usage | Register | Usage |
|------|---------------|------|--------------|
| %rax | Return value | %r8 | Argument #5 |
| %rbx | Callee saved | %r9 | Argument #6 |
| %rcx | Argument #4 | %r10 | Caller saved |
| %rdx | Argument #3 | %r11 | Caller saved |
| %rsi | Argument #2 | %r12 | Callee saved |
| %rdi | Argument #1 | %r13 | Callee saved |
| %rsp | Stack pointer | %r14 | Callee saved |
| %rbp | Callee saved | %r15 | Callee saved |