CSE 413 Winter 2002

x86 and MASM Overview

General

The 32-bit subset of the x86 instruction set is a fairly clean assembly language. It's difficult to tell this from the various Intel and Microsoft documentation, because the x86 is backwards compatible with 20-year old processors and operating systems, and all of that is described in these sources. This note is an attempt to highlight what we need for the compiler project. It describes the basic x86 instruction set and basic information about the Microsoft Assembler (MASM) that is installed in the lab, which we will use to assemble the generated code.

Full details about the x86 instruction set are available in the Intel Architecture reference manuals. The two most useful are Volume I, which gives an overview of the architecture, and Volume II, which describes the individual instructions in excruciating detail.

MASM is no longer available as a separate product from Microsoft, but there are several ways to get it. More information can be found on the index page for the course notes on compilers.

There are other assemblers for the x86. If you use one of them, you'll have to make the appropriate adjustments in some of the syntax and other details given here.

Statement Format

An assembly language statement has the following format

   optLabel:  opcode  operands   ; comment

optLabel: is an optional label. opcode and operands form the assembly language instructions. A semicolon indicates the beginning of a comment, which continues to the end of the source line.

The layout of an Assembly Language program is fairly free-form. A label may appear on a line by itself, as may a comment. An opcode and its operands should appear on the same line.

   aLabel:
        mov  eax,7
                      ; meaningful comment

There are no particular formatting requirements. For readability, it's best to start labels in the first column, indent opcodes and operands, and line them up in columns, but don't obsess over this in your compiler output.

x86 Memory Model

Memory is composed of bytes, each of which has 8 bits. A 32-bit integer is stored in 4 consecutive bytes, and it's best if the address of the integer or pointer is a multiple of 4 (there is a huge performance penalty on modern processors for fetching misaligned data). This alignment restriction is the default setup and you shouldn't have to worry about it. Just be sure not to alter it by, for example, adding or subtracting a value from the stack pointer that is not a multiple of 4. The address of an integer is the address of the low-order byte, i.e., an integer at location 1000 consists of bytes 1000-1003; where 1000 contains the low-order bits and 1003 contains the high-order bits. Again, you probably don't have to worry about this.

For historical reasons, the Intel x86 descriptions refer to 4-byte quantities as "double words".

x86 Processor Registers and Fetch-Execute Cycle

There are 8 registers that can be specified in assembly-language instructions: eax, ebx, ecx, edx, esi, edi, ebp, and esp. Register esp points to the "top" word currently in use on the stack (which grows down). It is unwise to use esp for anything else. Register ebp is typically used as a pointer to a location in the stack frame of the currently executing function.

In the CSE 413 compiler project, which will have a simple code generator, use register eax as the primary arithmetic register. Register ecx can be used in binary arithmetic operations to hold the second operand. In a industrial-strength compiler, it's important to optimize the use of registers, but that is beyond the scope of project for this course.

There are two registers that are used implicitly in x86 programs and cannot be referenced by name in an assembly language program. These are eip, the "instruction pointer" or "program counter"; and eflags, which contains bits indicating the result of arithmetic and compare instructions. (More details about eflags appears below in the description of conditional jump instructions.)

The basic operation of the processor is to repeatedly fetch and execute instructions.

   while (running) {
      fetch instruction beginning at address in eip;
      eip <- eip + length of instruction;
      execute fetched instruction;
   }

Execution continues sequentially unless execution of an instruction causes a jump, which is done by storing the target address in eip (this is how conditional and unconditional jumps, and function call and return are implemented).

Assembly Language Instructions and Operands

The format of a typical data manipulation instruction is

   opcode  dst,src

Normally, one of the operands may be a register, memory location, or an integer constant; the other should be a register. In particular, it is normally not possible for both operands to refer to a memory location.

All of the variables in our compiled programs will be allocated in stack frames and referenced by addresses relative to ebp. So the following operand formats should be enough for our purposes (using mov as an example):

   mov  eax,17        ; store 17 in eax
   mov  eax,ecx       ; copy contents of ecx into eax
   mov  eax,[ebp-12]  ; copy contents of memory location with address 
                      ;   ebp-12 into eax
   mov  [ebp+8],eax   ; copy contents of eax into memory location with
                      ;   address ebp+8

Additional address formats are sometimes used, and sometimes additional information needs to be provided so the assembler can pick the correct instruction. The only time we might run into this is with instructions that use pointers to access data, when there isn't enough information for the assembler to decide whether the pointer is referring to an 8-bit byte, a 16-bit word, or a 32-bit "double" word. If the assembler complains about this, add the string "dword ptr" in front of a pointer operand to specify that you want to reference a 32-bit value. (But don't bother with this unless you need it.)

   mov  dword ptr [ebp+8],eax   ; same as last example above,
                                ; but data length specified explicitly

Basic Data Manipulation Instructions

This list describes the basic x86 data manipulation instructions, including some that you probably won't need for the compiler project.

   mov  dst,src       ; dst <- src
   add  dst,src       ; dst <- dst+src
   sub  dst,src       ; dst <- dst-src
   imul dst,src       ; dst <- dst*src (32-bit product; dst must be
                      ;                 a register)
   idiv dst           ; divide edx:eax by dst (edx:eax = 64-bit signed
                      ;    integer); eax <- quotient, edx <- remainder
   cdq                ; edx:eax <- sign-extended copy of eax (useful for idiv)
   inc  dst           ; dst <- dst + 1
   dec  dst           ; dst <- dst - 1
   neg  dst           ; dst <- - dst (2's complement arithmetic negation)
   and  dst,src       ; dst <- dst & src (logical and)
   or   dst,src       ; dst <- dst | src (logical or)
   xor  dst,src       ; dst <- dst ^ src (logical exclusive or)
   test dst,src       ; same as and, but only sets cond codes; dst not changed
   not  dst           ; dst <- ~ dst (logical complement)
   shl  dst,count     ; dst <- dst shifted left count bits
   shr  dst,count     ; dst <- dst shifted right count bits, 0 fill
   sar  dst,count     ; dst <- dst shifted right count bits, sign bit fill
   shld dst,fill,cnt  ; dst <- dst shift left cnt bits; fill from fill (dbl shift)
   shrd dst,fill,cnt  ; dst <- dst shift right cnt bits; fill from fill (dbl shift)
   rol  dst,count     ; dst <- dst rotated left count bits
   ror  dst,count     ; dst <- dst rotated right count bits
   push src           ; esp <- esp-4; memory[esp] <- src
                      ;    (i.e., push src onto stack)
   pop  dst           ; dst <- memory[esp]; esp <- esp+4
                      ;    (i.e., pop top of stack into dst
                      ;     and remove it from the stack)
   lea  dst,src       ; dst <- address of src (including any indexing
                           or offset computation; dst must be a register)

Control Flow

The only control flow instructions at the assembly language level are unconditional and conditional jumps (goto). Higher-level control structures like loops and conditional statements are synthesized from these.

Unconditional Jump

   jmp  label         ; jump to instruction prefixed with label

Note: For both unconditional and conditional branches, the assembly language label need not be on the same source line as the target instruction. For example, if the code contains

  label1:
  ; a comment
  label2:  ; more commentary
  label3:  push eax

a jump to either label1, label2, or label3 will cause execution to continue at the push instruction. This freedom can be helpful when writing the code generation part of the compiler, because it allows us to place labels in the assembly language program independently of the instructions that they label.

Conditional Jumps

Many arithmetic instructions set bits in register eflags , as does the cmp (compare) instruction.

The following instructions can be used after an arithmetic instruction (add, sub, and, or, etc., but not imul or idiv) to conditionally branch depending on the result of the arithmetic operation.

  jz   label    ; jump to label if result == 0
  jnz  label    ; jump to label if result != 0
  jg   label    ; jump to label if result > 0
  jng  label    ; jump to label if result <= 0
  jge  label    ; jump to label if result >= 0
  jnge label    ; jump to label if result < 0
  jl   label    ; jump to label if result < 0
  jnl  label    ; jump to label if result >= 0
  jle  label    ; jump to label if result <= 0
  jnle label    ; jump to label if result > 0

More useful for our project are conditional jumps following a cmp (compare) instruction. The idea is to use cmp to set bits in eflags, then execute an appropriate jump that branches or not depending on the bits set by the cmp instruction.

A compare and jump works like this

   cmp  op1,op2      ; compare op1 to op2 and set bits in eflags
   jcc  label        ; jmp to label if the condition cc matches
                     ;   the current bit values in eflags

The various jcc instructions are defined as follows, when executed following a cmp instruction.

  je   label    ; jump to label if op1 == op2
  jne  label    ; jump to label if op1 != op2
  jg   label    ; jump to label if op1 > op2
  jng  label    ; jump to label if op1 <= op2
  jge  label    ; jump to label if op1 >= op2
  jnge label    ; jump to label if op1 < op2
  jl   label    ; jump to label if op1 < op2
  jnl  label    ; jump to label if op1 >= op2
  jle  label    ; jump to label if op1 <= op2
  jnle label    ; jump to label if op1 > op2

Function Call and Return

The call and ret instructions provide the basic mechanism for transferring control. A call pushes the instruction pointer eip (which contains the address of the next instruction following the call), then jumps. A ret pops the value on the top of the stack into eip, which means execution continues at the popped address, which should be the return address pushed by the call instruction.

  call label    ; push current instruction pointer and jump to label
                ; i.e., esp <- esp-4; memory[esp] <- eip;
                ;       eip <- memory address of label
  ret           ; pop top of stack into eip (jumping to that location)
                ; i.e., eip <- memory[esp]; esp <- esp+4

Win32 C Calling Conventions

The call and ret instructions allow us to jump to a function and get back, but additional conventions are needed to specify how registers are saved and restored, how arguments are transmitted, and how a function result is made available to the caller.

Caller

To call a function named, for example, f, the function arguments should be pushed onto the stack, then a call instruction is executed to branch to a label placed at the first instruction of the function. The instruction following a call should adjust esp to pop the arguments off the stack after the function returns (the called function does not pop the arguments). The scheme is

   push arguments
   call f
   add  esp,"size of arguments in bytes"

In a program with only integer and pointer variables, the quantity added to esp after the function returns should be 4*(# of arguments). Each argument is a 32-bit quantity occupying 4 bytes.

The standard Win32 C calling sequence requires pushing the function arguments from right to left. For the CSE 413 project, it is probably easier to push the arguments from left to right. That will work fine provided that you do not attempt to call an external C routine that has more than one argument.

Callee

A function needs to begin with code to save the contents of some of the registers it is going to use, set up ebp to point to the stack frame, and allocate space for local variables by decreasing the stack pointer appropriately. The code generated for a return statement should restore the stack and associated registers to the state that existed just before the first instruction of the function was executed.

Here is a template for a function named f

   f: push ebp         ; save ebp on stack
      mov  ebp,esp     ; set frame pointer to current stack location
      sub  esp,"size of locals in bytes"  ; allocate local variables
      
      <code generated for the body of function f goes here>

      mov  esp,ebp     ; restore esp to value before local variables
                       ;   were allocated (i.e., pop locals)
      pop  ebp         ; restore ebp to value saved at beginning of f
      ret              ; return to caller (instruction following call)

When the function returns, registers esp, ebp, ebx, esi, and edi must have their original values. The above code properly saves and restores esp and ebp. You shouldn't need to use any of the other registers in the must-be-preserved list, but if you do, you need to save and restore them (use a push at the beginning of the function and a pop at the end).

Function Result

For functions that return an int or pointer result, the value should be in eax at the time the function returns.

Example

Here is a short example of a Win32 C function call and function definition, with source lines included as comments in the assembly language code. The code generated by your compiler won't necessarily look like this; in particular, you will probably push arguments from left to right (not right to left), which affects the offsets of parameters in the function stack frame, and there are likely to be lots of extra push and pop instructions, among other things, in the compiler-

generated code.

Caller:

   ; n = sumOf(17,42);
   push 42            ; push arguments (Win32 convention, R-to-L)
   push 17
   call sumOf         ; push return address and jump to label sumOf
   add  esp,8         ; pop arguments after return
   mov  [ebp+??],eax  ; store function result in n (where ?? is the
                      ;   offset of n in the caller's stack frame)

Function sumOf

 ; int sumOf(int x, int y) {
 ;    int a, b;
 sumOf:                 ; function label
     push ebp           ; function prologue (see above)
     mov  ebp,esp
     sub  esp,8         ; allocate local variables
 ; a = y;
     mov  eax,[ebp+12]
     mov  [ebp-4],eax
 ; b = x;
     mov  eax,[ebp+8]
     mov  [ebp-8],eax
 ; return a+b
     mov  eax,[ebp-4]   ; calculate a+b
     push eax
     mov  eax,[ebp-8]
     pop  ecx
     add  eax,ecx       ; result in eax
     mov  esp,ebp       ; function return epilogue
     pop  ebp
     ret

MASM Assembly Language Code

For our purposes, every x86 MASM assembly language file should have this structure.

  .386
  .model flat, c
  public d$main
  extern get:near, put:near
  .code

    <your generated code goes here>

end

The .386 directive at the beginning specifies that the program uses the 32-bit instruction and register set introduced in the 80386, which is still the core instruction set of current x86 processors. The .model directive specifies that code should use a 32-bit non-segmented memory address space (i.e., Win32, not DOS).

The public and extern directives specify connections between routines (labels) defined in one file and used in another. Labels defined in this assembly language file and available to callers in other files must be listed following the keyword public. Labels defined in other files and referenced here are specified as extern. The get and put labels will provide us with access to simple routines for integer I/O - more details later. The ":near" option specifies that we're using a simple flat memory model, i.e., not calling functions in other memory segments. (The default is "far", which is left over from DOS days when programs had to be chopped up into multiple segments, each no larger than 64K. The 32-bit flat memory model, in contrast, includes all of the program's code in one large segment, and jumps in that segment are considered "near", even if the target address is a gigabyte away.)