CSE 413 Autumn 2007

x86 Overview

General

The 32-bit subset of the x86 instruction set is a fairly clean assembly language. It's difficult to tell this from the various Intel and Microsoft documentation, because the x86 is backwards compatible with 25-year old processors and operating systems, and all of that is described in these sources. This note is an attempt to highlight what we need for the compiler project. It describes the basic x86 instruction set and assembly language.

The standard Intel/Microsoft assembler notation is used here, since that is consistent with most of the available sources and published documentation. The GNU assembler, which we will use in our project, has differences in syntax, the principle one being the order of source and destination operands in instructions are reversed.

Full details about the x86 instruction set are available in the Intel Architecture reference manuals.  The two most useful are Volume I, which gives an overview of the architecture, and Volume II (now split into two volumes, IIA and IIB), which describes the individual instructions in excruciating detail.

Statement Format

An assembly language statement has the following format

   optLabel:  opcode  operands   ; comment

optLabel: is an optional label. Opcode and operands form the assembly language instructions. A semicolon indicates the beginning of a comment, which continues to the end of the source line.

The layout of an assembly language program is fairly free-form. A label may appear on a line by itself, as may a comment. An opcode and its operands should appear on the same line.

   aLabel:
        mov  eax,7
                      ; meaningful comment

There are no particular formatting requirements. For readability, it's best to start labels in the first column, indent opcodes and operands, and line these up in columns, but don't obsess over this in your compiler output.

x86 Memory Model

Memory is composed of bytes, each of which has 8 bits. A 32-bit integer is stored in 4 consecutive bytes, and it's best if the address of the integer or pointer is a multiple of 4 (there is a huge performance penalty on modern processors for fetching misaligned data). This alignment restriction is observed by default and you shouldn't have to worry about it. Just be sure not to alter it by, for example, adding or subtracting a value from the stack pointer that is not a multiple of 4. The address of an integer is the address of the low-order byte, i.e., an integer at location 1000 consists of bytes 1000-1003; where 1000 contains the low-order bits and 1003 contains the high-order bits. Again, you probably don't have to worry about this.

For historical reasons, the Intel x86 descriptions refer to 4-byte quantities as "double words".

x86 Processor Registers and the Fetch-Execute Cycle

There are 8 registers that can be specified in assembly-language instructions: eax, ebx, ecx, edx, esi, edi, ebp, and esp. Register esp points to the "top" (double) word currently in use on the stack (which grows down). It is unwise to use esp for anything else. Register ebp is typically used as a pointer to a location in the stack frame of the currently executing function.

In the CSE 413 compiler project, which will have a simple code generator, use register eax as the primary arithmetic register. Register ecx can be used in binary arithmetic operations to hold the second operand. In a industrial-strength compiler, it's important to optimize the use of registers, but that is beyond the scope of project for this course.

There are two registers that are used implicitly in x86 programs and cannot be referenced by name in an assembly language program. These are eip, the "instruction pointer" or "program counter"; and eflags, which contains bits indicating the result of arithmetic and compare instructions. (More details about eflags appears below in the description of conditional jump instructions.)

The basic operation of the processor is to repeatedly fetch and execute instructions.

   while (running) {
      fetch instruction beginning at address in eip;
      eip <- eip + length of instruction;
      execute fetched instruction;
   }

Execution continues sequentially unless execution of an instruction causes a jump, which is done by storing the target address in eip (this is how conditional and unconditional jumps, and function call and return are implemented).

Assembly Language Instructions and Operands

The format of a typical data manipulation instruction is

   opcode  dst,src

Normally, one of the operands may be a register, memory location, or an integer constant; the other should be a register.  In particular, it is normally not possible for both operands to refer to a memory location.

All of the variables in our compiled programs will be allocated in stack frames and referenced by addresses relative to ebp. So the following operand formats should be enough for our purposes (using mov as an example):

   mov  eax,17        ; store 17 in eax
   mov  eax,ecx       ; copy contents of ecx into eax
   mov  eax,[ebp-12]  ; copy contents of memory location with address 
                      ;   ebp-12 into eax
   mov  [ebp+8],eax   ; copy contents of eax into memory location with
                      ;   address ebp+8

Additional address formats are sometimes used, and sometimes additional information needs to be provided to the Microsoft assembler so it can pick the correct instruction.  The most likely way to run into this is with instructions that use pointers to access data, when there might not be enough information for the assembler to decide whether the pointer is referring to an 8-bit byte, a 16-bit word, or a 32-bit "double" word.  If the assembler complains about this, add the string "dword ptr" in front of a pointer operand to specify that you want to reference a 32-bit value.  (But don't bother with this unless you need it.)

   mov  dword ptr [ebp+8],eax   ; same as last example above,
                                ; but data length specified explicitly
This is not an issue with the GNU assembler, because different forms of instructions like mov have different names, like movl to indicate a 4-byte longword move.

Basic Data Manipulation Instructions

This list describes the basic x86 data manipulation instructions, including some that you probably won't need for the compiler project.

   mov  dst,src       ; dst <- src
   add  dst,src       ; dst <- dst+src
   sub  dst,src       ; dst <- dst-src
   imul dst,src       ; dst <- dst*src (32-bit product; dst must be
                      ;                 a register)
   idiv dst           ; divide edx:eax by dst (edx:eax = 64-bit signed
                      ;    integer); eax <- quotient, edx <- remainder
   cdq                ; edx:eax <- sign-extended copy of eax (useful prep for idiv)
   inc  dst           ; dst <- dst + 1
   dec  dst           ; dst <- dst - 1
   neg  dst           ; dst <- - dst (2's complement arithmetic negation)
   and  dst,src       ; dst <- dst & src (logical and)
   or   dst,src       ; dst <- dst | src (logical or)
   xor  dst,src       ; dst <- dst ^ src (logical exclusive or)
   test dst,src       ; same as and, but only sets cond codes; dst not changed
   not  dst           ; dst <- ~ dst (logical complement)
   shl  dst,count     ; dst <- dst shifted left count bits
   shr  dst,count     ; dst <- dst shifted right count bits, 0 fill
   sar  dst,count     ; dst <- dst shifted right count bits, sign bit fill
   shld dst,fill,cnt  ; dst <- dst shift left cnt bits; fill from fill (dbl shift)
   shrd dst,fill,cnt  ; dst <- dst shift right cnt bits; fill from fill (dbl shift)
   rol  dst,count     ; dst <- dst rotated left count bits
   ror  dst,count     ; dst <- dst rotated right count bits
   push src           ; esp <- esp-4; memory[esp] <- src
                      ;    (i.e., push src onto stack)
   pop  dst           ; dst <- memory[esp]; esp <- esp+4
                      ;    (i.e., pop top of stack into dst
                      ;     and remove it from the stack)
   lea  dst,src       ; dst <- address of src (including any indexing
                           or offset computation; dst must be a register)

Control Flow

The only control flow instructions at the assembly language level are unconditional and conditional jumps (goto).  Higher-level control structures like loops and conditional statements are synthesized from these.

Unconditional Jump

   jmp  label         ; jump to instruction prefixed with label

A jmp instruction is executed by storing the address of label in register eip, which is then used to fetch the next instruction to be executed.

Note: For both unconditional and conditional branches, the assembly language label need not be on the same source line as the target instruction. For example, if the code contains

  label1:
  ; a comment
  label2:  ; more commentary
  label3:  push eax

a jump to either label1, label2, or label3 will cause execution to continue at the push instruction.  This freedom can be helpful when writing the code generation part of the compiler, because it allows us to place labels in the assembly language program separately from the instructions that they label. 

Conditional Jumps

Many arithmetic instructions set bits in register eflags , as does the cmp (compare) instruction.

The following instructions can be used after an arithmetic instruction (add, sub, and, or, etc., but not imul or idiv) to conditionally branch depending on the result of the arithmetic operation.

  jz   label    ; jump to label if result == 0
  jnz  label    ; jump to label if result != 0
  jg   label    ; jump to label if result > 0
  jng  label    ; jump to label if result <= 0
  jge  label    ; jump to label if result >= 0
  jnge label    ; jump to label if result < 0
  jl   label    ; jump to label if result < 0
  jnl  label    ; jump to label if result >= 0
  jle  label    ; jump to label if result <= 0
  jnle label    ; jump to label if result > 0

More useful for our project are conditional jumps following a cmp (compare) instruction. The idea is to use cmp to set bits in eflags, then execute an appropriate jump that branches or not depending on the bits set by the cmp instruction.

A compare and jump works like this

   cmp  op1,op2      ; compare op1 to op2 and set bits in eflags
   jcc  label        ; jmp to label if the condition cc matches
                     ;   the current bit values in eflags

The various jcc instructions are defined as follows, when executed following a cmp instruction.

  je   label    ; jump to label if op1 == op2
  jne  label    ; jump to label if op1 != op2
  jg   label    ; jump to label if op1 > op2
  jng  label    ; jump to label if op1 <= op2
  jge  label    ; jump to label if op1 >= op2
  jnge label    ; jump to label if op1 < op2
  jl   label    ; jump to label if op1 < op2
  jnl  label    ; jump to label if op1 >= op2
  jle  label    ; jump to label if op1 <= op2
  jnle label    ; jump to label if op1 > op2

In several cases, a single machine instruction has more than one name in assembly language. For instance, jz and je are different names for a single instruction; the different names are intended to be used in different contexts in the hopes of making the code more readable or suggestive of the operation being done. Pick a name that seems appropriate and don't worry about it; but if you see the other name in a debugger, for example, you'll know what's going on.

Function Call and Return

The call and ret instructions provide the basic mechanism for transferring control. A call pushes the instruction pointer eip (which contains the address of the instruction following the call), then jumps. A ret pops the value on the top of the stack into eip, which means execution continues at the popped address, which should be the return address pushed by the call instruction.

  call label    ; push current instruction pointer and jump to label
                ; i.e., esp <- esp-4; memory[esp] <- eip;
                ;       eip <- memory address of label
  ret           ; pop top of stack into eip (jumping to that location)
                ; i.e., eip <- memory[esp]; esp <- esp+4

Win32 C Calling Conventions

The call and ret instructions allow us to jump to a function and get back, but additional conventions are needed to specify how registers are saved and restored, how arguments are transmitted, and how a function result is made available to the caller.

Caller

To call a function named, for example, f, the function arguments should be pushed onto the stack, then a call instruction is executed to branch to a label placed at the first instruction of the function. The instruction following a call should adjust esp to pop the arguments off the stack after the function returns (the called function does not pop the arguments). The scheme is

   push arguments
   call f
   add  esp,"size of arguments in bytes"

In a program with only integer and pointer variables, the quantity added to esp after the function returns should be 4 times the number of arguments.  Each argument is a 32-bit quantity occupying 4 bytes.

The standard Win32 C calling sequence requires pushing the function arguments from right to left.  For the CSE 413 project, it is probably easier to push the arguments from left to right.  That will work fine provided that you do not attempt to call an external C routine that has more than one argument. 

Callee

A function needs to begin with code to save the contents of some of the registers it is going to use, set up ebp to point to a location in the stack frame, and allocate space for local variables by decrementing the stack pointer appropriately.  The code generated for a return statement should restore the stack and associated registers to the state that existed just before the first instruction of the function was executed.

Here is a template for a function named f

   f: push ebp         ; save ebp on stack
      mov  ebp,esp     ; set frame pointer to current stack location
      sub  esp,"size of locals in bytes"  ; allocate local variables
      
      <code generated for the body of function f goes here>

      mov  esp,ebp     ; restore esp to value before local variables
                       ;   were allocated (i.e., pop locals)
      pop  ebp         ; restore ebp to value saved at beginning of f
      ret              ; return to caller (instruction following call)

When the function returns, registers esp, ebp, ebx, esi, and edi must have their original values. The above code properly saves and restores esp and ebp. In your project, you shouldn't need to use any of the other registers in the must-be-preserved list, but if you do, you need to save and restore them (use a push at the beginning of the function and a pop at the end).

Function Result

For functions that return an int or pointer result, the value should be in eax at the time the function returns.

Example

Here is a short example of a Win32 C function call and function definition, with source lines included as comments in the assembly language code.  The code generated by your compiler won't necessarily look like this; in particular, you will probably push arguments from left to right (not right to left), which affects the offsets of parameters in the function stack frame, and there are likely to be lots of extra push and pop instructions, among other things, in the compiler-generated code.

Caller:

   ; n = sumOf(17,42);
   push 42            ; push arguments (Win32 convention, R-to-L)
   push 17
   call sumOf         ; push return address and jump to label sumOf
   add  esp,8         ; pop arguments after return
   mov  [ebp+??],eax  ; store function result in n (where ?? is the
                      ;   offset of n in the caller's stack frame)

Function sumOf

 ; int sumOf(int x, int y) {
 ;    int a, b;
 sumOf:                 ; function label
     push ebp           ; function prologue (see above)
     mov  ebp,esp
     sub  esp,8         ; allocate local variables
 ; a = y;
     mov  eax,[ebp+12]  ; load parameter y
     mov  [ebp-4],eax   ; store local variable a
 ; b = x;
     mov  eax,[ebp+8]   ; load x
     mov  [ebp-8],eax   ; store b
 ; return a+b
     mov  eax,[ebp-4]   ; calculate a+b
     push eax
     mov  eax,[ebp-8]
     pop  ecx
     add  eax,ecx       ; result in eax
     mov  esp,ebp       ; function return epilogue
     pop  ebp
     ret