Full details about the x86 instruction set are available in the Intel Architecture reference manuals. The two most useful are Volume I, which gives an overview of the architecture, and Volume II, which describes the instructions in excruciating detail.
MASM is no longer available as a separate product from Microsoft, but there are two ways to get it. If you are a registered owner of Visual Studio, MASM is available as a "fulfillment option" for a small fee. It is also available for free as part of the Windows 98 device driver kit on Microsoft's web site. The full MASM package is a complete development environment for 16-bit MS-DOS programs(!). All we need from it is the ML.exe file, which is the executable part of the assembler. More details about how to use it with Visual C++ will be provided in another handout.
There are other assemblers for the x86. If you use one of them, you'll have to make the appropriate adjustments in some of the syntax and other details given here.
An assembly language statement has the following format
optLabel: opcode operands ; comment
optLabel:
is an optional label. opcode
and operands
form the assembly
language instructions. A semicolon indicates the beginning of a
comment, which continues to the end of the source line.
The layout of an Assembly Language program is fairly free-form. A
label may appear on a line by itself, as may a comment. An opcode
and
its operands should appear on the same line.
aLabel: mov eax,7 ; meaningful comment
There are no particular formatting requirements. For readability, it's best to start labels in the first column, indent opcodes and operands, and line them up in columns if you can (but don't obsess over this).
Memory is composed of bytes, each of which has 8 bits. A 32-bit integer is stored in 4 consecutive bytes, and it's best if the address of the integer or pointer is a multiple of 4 (there is a big performance penalty on modern processors for fetching mis-aligned data). This alignment restriction is the default setup and you shouldn't have to worry about it. Just be sure not to alter it by, for example, adding or subtracting a value from the stack pointer that is not a multiple of 4. The address of an integer is the address of the low-order byte, i.e., an integer at location 1000 consists of bytes 1000-1003; where 1000 contains the low-order bits and 1003 contains the high-order bits. Again, you probably don't have to worry about this.
For historical reasons, the Intel x86 descriptions refer to 4-byte quantities as "double words".
There are 8 registers that can be specified in assembly-language
instructions: eax
, ebx
, ecx
, edx
,
esi
, edi
, ebp
, and esp
. Register
esp
points to the "top" word currently in use on the stack
(which grows down). It is unwise to use esp
for anything else.
Register ebp
is typically used as a pointer into the stack frame of
the currently executing function.
In the CSE 432 compiler project, which will have a simple code generator, use register eax
as the primary arithmetic
register. Register ecx
can be used in binary arithmetic operations to hold
the second
operand. In a industrial-strength compiler, it's important to optimize the use of registers, but that
is beyond the scope of project for this course.
There are two registers that are used implicitly in x86 programs and cannot be referenced
by name in an assembly language program.
These are eip
, the "instruction pointer" or "program counter"; and
eflags
, which contains bits indicating the result of arithmetic and
compare instructions. (More details about eflags
appears below in the description
of conditional jump instructions.)
The basic operation of the processor is to repeatedly fetch and execute instructions.
while (running) { fetch instruction beginning at address in eip; eip <- eip + length of instruction; execute fetched instruction; }
Execution continues sequentially unless execution of an instruction
causes a jump, which is done by altering eip
(this is how conditional
and unconditional jumps, and function call and return are implemented).
The format of a typical data manipulation instruction is
opcode dst,src
One of the operands may be a register, memory location or an integer constant; the other generally should be a register.
All of the variables in our programs are allocated in stack frames and are referenced by addresses relative to ebp
. So the
following operand formats should be enough for our purposes (using mov
as an example):
mov eax,17 ; store 17 in eax mov eax,ecx ; copy contents of ecx into eax mov eax,[ebp-12] ; copy contents of memory location with address ; ebp-12 into eax mov [ebp+8],eax ; copy contents of eax into memory location with ; address ebp+8
Additional address formats are needed for pointers, for example.
This list summarizes the main data manipulation instructions available, including some that won't be needed for the CSE 432 project. Use what you need.
mov dst,src ; dst <- src add dst,src ; dst <- dst+src sub dst,src ; dst <- dst-src imul dst,src ; dst <- dst*src (32-bit product; dst must be ; a register) idiv dst ; divide edx:eax by dst (edx:eax = 64-bit signed ; integer); eax <- quotient, edx <- remainder cdq ; edx:eax <- sign-extended copy of eax (useful for idiv) inc dst ; dst <- dst + 1 dec dst ; dst <- dst - 1 neg dst ; dst <- - dst (2's complement negation) and dst,src ; dst <- dst & src (logical and) or dst,src ; dst <- dst | src (logical or) xor dst,src ; dst <- dst ^ src (logical exclusive or) test dst,src ; same as and, but only sets cond codes; dst not changed not dst ; dst <- ~ dst (logical complement) shl dst,count ; dst <- dst shifted left count bits shr dst,count ; dst <- dst shifted right count bits, 0 fill sar dst,count ; dst <- dst shifted right count bits, sign bit fill shld dst,fill,cnt ; dst <- dst shift left cnt bits; fill from fill (dbl shift) shrd dst,fill,cnt ; dst <- dst shift right cnt bits; fill from fill (dbl shift) rol dst,count ; dst <- dst rotated left count bits ror dst,count ; dst <- dst rotated right count bits push src ; esp <- esp-4; memory[esp] <- src ; (i.e., push src onto stack) pop dst ; dst <- memory[esp]; esp <- esp+4 ; (i.e., pop top of stack into dst ; and remove it from the stack) lea dst,src ; dst <- address of src (including any indexing or offset computation; dst must be a register)
jmp label ; jump to instruction prefixed with label
Note: For both unconditional and conditional branches, the MASM Assembly Language label need not be on the same source line as the target instruction. For example, if the code contains
label1: ; a comment label2: ; more commentary label3: push eax
a jump to either label1
, label2
, or label3
will cause execution to
continue at the push instruction. This freedom can be useful when writing
the code generation part of the compiler, since it is possible to generate
labels in the assembly program independently of the instructions that they
label.
Many arithmetic instructions set bits in register eflags
, as
does the cmp
(compare) instruction.
The following instructions
can be used after an arithmetic instruction (add
, sub
,
and
, or
, etc., but *not* imul
or idiv
)
to conditionally branch depending on the result of the arithmetic operation.
jz label ; jump to label if result == 0 jnz label ; jump to label if result != 0 jg label ; jump to label if result > 0 jng label ; jump to label if result <= 0 jge label ; jump to label if result >= 0 jnge label ; jump to label if result < 0 jl label ; jump to label if result < 0 jnl label ; jump to label if result >= 0 jle label ; jump to label if result <= 0 jnle label ; jump to label if result > 0
More useful for our project are conditional jumps following a cmp
(compare) instruction. The idea is to use cmp
to set bits in eflags
,
then execute an appropriate jump that branches or not depending on the
flags set by the cmp
instruction.
A compare and jump works like this
cmp op1,op2 ; compare op1 to op2 and set bits in eflags jcc label ; jmp to label if the condition cc matches ; the current bit values in eflags
The various jcc
instructions are defined as follows
je label ; jump to label if op1 == op2 jne label ; jump to label if op1 != op2 jg label ; jump to label if op1 > op2 jng label ; jump to label if op1 <= op2 jge label ; jump to label if op1 >= op2 jnge label ; jump to label if op1 < op2 jl label ; jump to label if op1 < op2 jnl label ; jump to label if op1 >= op2 jle label ; jump to label if op1 <= op2 jnle label ; jump to label if op1 > op2
The call
and ret
instructions provide the basic mechanism for
transferring control. A call
pushes the
instruction pointer eip
(which contains the address of the instruction
following the call), then jumps. A ret
pops the value
on the top of the stack into eip
, which means execution continues at
the popped address.
call label ; push current instruction pointer and jump to label ; i.e., esp <- esp-4; memory[esp] <- eip; ; eip <- memory address of label ret ; pop top of stack into eip (jumping to that location) ; i.e., eip <- memory[esp]; esp <- esp+4
The call
and ret
instructions provide the basic mechanism to jump to a
function and get back, but additional conventions are needed to
specify how registers are saved and restored, how arguments are
transmitted, and how a function result is transferred back to the
caller.
To call a function named, for example, f
, the function arguments are
pushed on the stack, then a call
instruction is executed to branch to
a label placed at the first instruction of the function. The instruction
following a call should adjust esp
to pop the arguments off the stack
after the function returns (the called function does not pop the arguments). The scheme is
push arguments call f add esp,"size of arguments in bytes"
In a program with only integer and pointer variables, the quantity added to
esp
after the function returns
should be 4*(# of arguments). Each argument is a 32-bit quantity occupying 4 bytes.
The standard Win32 C calling sequence requires pushing the function arguments from right to left. For the CSE 413 project, it is probably easier to push the arguments from left to right. That will work fine provided that you do not attempt to call an external C routine that has more than one argument.
A function needs to begin with code to save the contents of some of the registers
it is going to use, set up ebp
to point to the stack frame, and
allocate space for local variables by decreasing the stack pointer
appropriately. The code generated for a return
statement
restores the conditions expected by the caller.
Here is a template for a function named f
f: push ebp ; save ebp on stack mov ebp,esp ; set frame pointer to current stack location sub esp,"size of locals in bytes" ; allocate local variables <code generated for the body of function f goes here> mov esp,ebp ; restore esp to value before local variables ; were allocated (i.e., pop locals) pop ebp ; restore ebp to value saved at beginning of f ret ; return to caller (instruction following call)
When the function returns, registers esp
, ebp
, ebx
,
esi
, and edi
should
have their original values. The above code properly saves and
restores esp
and ebp
. You shouldn't need to use any of the other
registers mentioned, but if you do, you need to save and restore them
(with a push
at the beginning of the function and a pop
at the end).
For functions that return int
or pointer values,
the value should be in eax
at the time the function returns.
Here is a short example of a C function call and function definition, with
source lines included as comments in the assembly language code. The code
generated by your compiler won't necessarily look like this; in particular, you
will probably want to push arguments from left to right (not right to left),
which affects the offsets of parameters in the function code, and there are
likely to be lots of extra generated push
and pop
instructions, among other things.
; n = sumOf(17,42); push 42 ; push arguments (C convention, R-to-L) push 17 call sumOf ; push return address and jump to label sumOf add esp,8 ; pop arguments after return mov [ebp+?],eax ; store function result in n (where ? is the ; offset of n in the caller's stack frame)
; int sumOf(int x, int y) { ; int a, b; sumOf: ; function label push ebp ; function prologue (see above) mov ebp,esp sub esp,8 ; allocate local variables ; a = y; mov eax,[ebp+12] mov [ebp-4],eax ; b = x; mov eax,[ebp+8] mov [ebp-8],eax ; return a+b mov eax,[ebp-4] ; a+b (value in eax) push eax mov eax,[ebp-8] pop ecx add eax,ecx ; result in eax mov esp,ebp ; function return epilogue pop ebp ret
For our purposes, every x86 MASM assembly language file should have this structure.
.386 .model flat, c public d_main extern get:near, put:near .code
<your generated code goes here>
end
The .386
directive at the beginning specifies that the program uses
the 32-bit instruction and register set introduced in the 80386, which is still
the core instruction set of the current Pentium processors. The .model
directive specifies that code should use a 32-bit
non-segmented memory address space (i.e., Win32, not DOS).
The public
and extern
directives specify
connections between routines (labels) defined in one file and used in
another. Labels defined in
this assembly language file and available to callers in other files must be
listed following the keyword public
. Labels defined in other files and referenced
here are specified as extern
. The get
and put
labels
will provide us with
access to simple routines for integer I/O - more details later. The ":near
" option specifies that
we're using a simple flat memory model, i.e., not calling functions in
other memory segments. (The default is "far
", which is left over from
DOS days when programs had to be chopped up into multiple segments,
each no larger than 64K. The 32-bit flat memory model, in contrast, includes all of
the program's code in one large segment, and jumps in that segment
are considered "near", even if they are a gigabyte away.)