#### University of Washington

# The Hardware/Software Interface

CSE351 Winter 2013

### **Basics of Machine Programming**

#### Roadmap

- Data & addressing
- Integers & floats
- Machine code & C
- x86 assembly programming
- Procedures & stacks
- Arrays & structs
- Memory & caches
- Processes
- Virtual memory
- Memory allocation
- Java vs. C

[Figure: the same program shown at each level of the system.]

- C: `car *c = malloc(sizeof(car)); c->miles = 100; c->gals = 17; float mpg = get_mpg(c); free(c);`
- Java: `Car c = new Car(); c.setMiles(100); c.setGals(17); float mpg = c.getMPG();`
- Assembly language: `get_mpg: pushq %rbp ... %rsp, %rbp ... popq %rbp ret`
- Machine code: `0111010000011000 100011010000010000000010 1000100111000010 110000011111101000011111`
- OS: Windows 8, Mac
- Computer system

### Themes of CSE 351

#### Interfaces and abstractions

- So far: some abstractions in C code
  - e.g. various data types: ints, floats, pointers, arrays
- Today: what interface does the hardware present?

#### Representation

- So far: integers, floating point numbers, addresses
- Understanding what's below the C abstractions makes you a better programmer

#### Translation

- Today: how do we get from C code to machine code? What machine code should you expect to be produced from your C code?

#### Control flow

Winter 2013, Instruction Set Architecture

### **Today's Topics**

- What is an ISA (Instruction Set Architecture)?
- A brief history of Intel processors and architectures
- C, assembly, machine code
- x86 basics: registers

### **Translation**

What makes programs run fast?

### **Instruction Set Architectures**

- The ISA defines:
  - The system's state (e.g. registers, memory, program counter)
  - The instructions the CPU can execute
  - The effect that each of these instructions will have on the system state

### **Translation Impacts Performance**

- The time required to execute a program depends on:
  - The program (as written in C, for instance)
  - The compiler: what set of assembler instructions it translates the C program into
  - The instruction set architecture (ISA): what set of instructions it makes available to the compiler
  - The hardware implementation: how much time it takes to execute an instruction
- There is a complex interaction among these

### **General ISA Design Decisions**

#### Instructions

- What instructions are available? What do they do?
- How are they encoded?

#### Registers

- How many registers are there?
- How wide are they?

#### Memory

- How do you specify a memory location?

### **x86**

- Processors that implement the x86 ISA completely dominate the server, desktop and laptop markets
- Evolutionary design
  - Backwards compatible all the way back to the 8086, introduced in 1978
  - Added more features as time goes on
- Complex instruction set computer (CISC)
  - Many different instructions with many different formats
  - But only a small subset is encountered in Linux programs
  - (as opposed to Reduced Instruction Set Computers (RISC), which use simpler instructions)

### **Intel x86 Processors**

#### Machine Evolution

| Name | Date | Transistors |
|---|---|---|
| 486 | 1989 | 1.9M |
| Pentium | 1993 | 3.1M |
| Pentium/MMX | 1997 | 4.5M |
| PentiumPro | 1995 | 6.5M |
| Pentium III | 1999 | 8.2M |
| Pentium 4 | 2001 | 42M |
| Core 2 Duo | 2006 | 291M |
| Core i7 | 2008 | 731M |

[Figure: Intel Core i7]

#### Added Features

- Instructions to support multimedia operations
  - Parallel operations on 1, 2, and 4-byte data
- Instructions to enable more efficient conditional operations
- More cores!
### **Intel x86 Evolution: Milestones**

| Name | Date | Transistors | MHz |
|---|---|---|---|
| 8086 | 1978 | 29K | 5-10 |
| 386 | 1985 | 275K | 16-33 |
| Pentium 4F | 2005 | 230M | 2800-3800 |

- 8086
  - First 16-bit Intel processor. Basis for IBM PC & DOS
  - 1 MB address space
- 386
  - First 32-bit x86 processor, referred to as IA32
  - Added "flat addressing"
  - Capable of running Unix
  - 32-bit Linux/gcc targets i386 by default
- Pentium 4F
  - First 64-bit Intel x86 processor, referred to as x86-64

### **More information**

- References for Intel processor specifications:
  - Intel's "automated relational knowledgebase":
    - http://ark.intel.com/
  - Wikipedia:
    - http://en.wikipedia.org/wiki/List_of_Intel_microprocessors

### **x86 Clones: Advanced Micro Devices (AMD)**

#### Historically

- AMD has followed just behind Intel
- A little bit slower, a lot cheaper

#### Then

- Recruited top circuit designers from Digital Equipment and other downward-trending companies
- Built Opteron: tough competitor to Pentium 4
- Developed x86-64, their own extension of x86 to 64 bits

### **Our Coverage in 351**

#### IA32

- The traditional x86

#### x86-64

- The emerging standard: all lab assignments use x86-64!

### **Intel's Transition to 64-Bit**

- Intel attempted a radical shift from IA32 to IA64 (2001)
  - Totally different architecture (Itanium) and ISA than x86
  - Executes IA32 code only as legacy
  - Performance disappointing
- AMD stepped in with an evolutionary solution (2003)
  - x86-64 (also called "AMD64")
- Intel felt obligated to focus on IA64
  - Hard to admit mistake or that AMD is better
- Intel announces "EM64T" extension to IA32 (2004)
  - Extended Memory 64-bit Technology
  - Almost identical to AMD64!
- Today: all but low-end x86 processors support x86-64
  - But, lots of code out there is still just IA32

### **Definitions**

- Architecture: (also instruction set architecture or ISA) The parts of a processor design that one needs to understand to write assembly code
  - "What is directly visible to software"
  - Includes: instruction set specification, registers, memory model
- Microarchitecture: Implementation of the architecture
  - Includes: CPU frequency, cache sizes, other implementation details
- The ISA is an abstraction of the microarchitecture

### **Assembly Programmer's View**

[Figure: assembly programmer's view of the CPU and memory. Surviving annotations: "Used for conditional branching"; "procedures (we'll come back to that)".]

### **Compiling Into Assembly**

#### C Code

```
int sum(int x, int y)
{
    int t = x+y;
    return t;
}
```

#### Generated IA32 Assembly

```
sum:
    pushl %ebp
    movl %esp,%ebp
    movl 12(%ebp),%eax
    addl 8(%ebp),%eax
    movl %ebp,%esp
    popl %ebp
    ret
```

#### Obtain with command

```
gcc -O1 -S code.c
```

Produces file code.s

### **Turning C into Object Code**

- Code in files p1.c p2.c
- Compile with command: `gcc -O1 p1.c p2.c -o p`
  - Use basic optimizations (-O1)
  - Put resulting binary in file p

### **Three Basic Kinds of Instructions**

- Perform arithmetic function on register or memory data
- Transfer data between memory and register
  - Load data from memory into register
  - Store register data into memory
- Transfer control
  - Unconditional jumps to/from procedures
  - Conditional branches

### **Assembly Characteristics: Data Types**

- "Integer" data of 1, 2, 4 (IA32), or 8 (just in x86-64) bytes
  - Data values
  - Addresses (untyped pointers)
- Floating point data of 4, 8, or 10 bytes
- What about "aggregate" types such as arrays or structs?
  - No aggregate types, just contiguously allocated bytes in memory

### **Machine Instruction Example**

- C Code: add two signed integers
  - `x += y`
  - More precisely: `int eax; int *ebp; eax += ebp[2]`
- Assembly: `addl 8(%ebp),%eax`
  - Add two 4-byte integers
    - "Long" words in GCC speak
    - Same instruction whether signed or unsigned
  - Operands:
    - x: Register `%eax`
    - y: Memory `M[%ebp+8]`
    - t: Register `%eax` (return function value in `%eax`)
- Object Code: `0x401046: 03 45 08`
  - 3-byte instruction
  - Stored at address 0x401046

### **Object Code**

#### Code for sum

`0x401040 <sum>:` `0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3`

- Total of 13 bytes
- Each instruction 1, 2, or 3 bytes
- Starts at address 0x401040
- Not at all obvious where each instruction starts and ends

#### Assembler

- Translates .s into .o
- Binary encoding of each instruction
- Nearly-complete image of executable code
- Missing links between code in different files

#### Linker

- Resolves references between object files and (re)locates their data
- Combines with static run-time libraries
  - E.g., code for malloc, printf
- Some libraries are dynamically linked
  - Linking occurs when program begins execution

### **Disassembling Object Code**

#### Disassembled

`00401040 <_sum>:`

| Offset | Bytes | Instruction | Operands |
|---|---|---|---|
| 0: | 55 | push | %ebp |
| 1: | 89 e5 | mov | %esp,%ebp |
| 3: | 8b 45 0c | mov | 0xc(%ebp),%eax |
| 6: | 03 45 08 | add | 0x8(%ebp),%eax |
| 9: | 89 ec | mov | %ebp,%esp |
| b: | 5d | pop | %ebp |
| c: | c3 | ret | |

#### Disassembler

`objdump -d p`

- Useful tool for examining object code (man 1 objdump)
- Analyzes bit pattern of series of instructions (delineates instructions)
- Produces near-exact rendition of assembly code
- Can be run on either p (complete executable) or p1.o/p2.o file

### **Alternate Disassembly**

#### Object

```
0x401040:
    0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03
    0x45 0x08 0x89 0xec 0x5d 0xc3
```

#### Disassembled

```
0x401040 <sum>:     push %ebp
0x401041 <sum+1>:   mov  %esp,%ebp
0x401043 <sum+3>:   mov  0xc(%ebp),%eax
0x401046 <sum+6>:   add  0x8(%ebp),%eax
0x401049 <sum+9>:   mov  %ebp,%esp
0x40104b <sum+11>:  pop  %ebp
```

Within gdb debugger

```
gdb p
disassemble sum    (disassemble function)
x/13b sum          (examine the 13 bytes starting at sum)
```

### **What Is A Register?**

- A location in the CPU that stores a small amount of data, which can be accessed very quickly (once every clock cycle)
- Registers are at the heart of assembly programming
  - They are a precious commodity in all architectures, but especially x86

### **What Can be Disassembled?**

```
% objdump -d WINWORD.EXE

WINWORD.EXE: file format pei-i386

No symbols in "WINWORD.EXE".
Disassembly of section .text:

30001000 <.text>:
30001000: 55              push %ebp
30001001: 8b ec           mov  %esp,%ebp
30001003: 6a ff           push $0xffffffff
30001005: 68 90 10 00 30  push $0x30001090
3000100a: 68 91 dc 4c 30  push $0x304cdc91
```

- Anything that can be interpreted as executable code
- Disassembler examines bytes and reconstructs assembly source

### **Integer Registers (IA32)**

| 32-bit | 16-bit | 8-bit high | 8-bit low | Origin (mostly obsolete) | Use |
|---|---|---|---|---|---|
| %eax | %ax | %ah | %al | accumulate | general purpose |
| %ecx | %cx | %ch | %cl | counter | general purpose |
| %edx | %dx | %dh | %dl | data | general purpose |
| %ebx | %bx | %bh | %bl | base | general purpose |
| %esi | %si | | | source index | general purpose |
| %edi | %di | | | destination index | general purpose |
| %esp | %sp | | | stack pointer | |
| %ebp | %bp | | | base pointer | |

The 16-bit names are virtual registers (kept for backwards compatibility).

### **Summary: Machine Programming**

- What is an ISA (Instruction Set Architecture)?
  - Defines the system's state and instructions that are available to the software
- History of Intel processors and architectures
  - Evolutionary design leads to many quirks and artifacts
- C, assembly, machine code
  - Compiler must transform statements, expressions, procedures into low-level instruction sequences
- x86 registers
  - Very limited number
  - Not all general-purpose

### **x86-64 Integer Registers**

64 bits wide

| 64-bit | 32-bit | 64-bit | 32-bit |
|---|---|---|---|
| %rax | %eax | %r8 | %r8d |
| %rbx | %ebx | %r9 | %r9d |
| %rcx | %ecx | %r10 | %r10d |
| %rdx | %edx | %r11 | %r11d |
| %rsi | %esi | %r12 | %r12d |
| %rdi | %edi | %r13 | %r13d |
| %rsp | %esp | %r14 | %r14d |
| %rbp | %ebp | %r15 | %r15d |

- Extend existing registers, and add 8 new ones; *all* accessible as 8, 16, 32, 64 bits.