CSE410 Assembler

Assembler Overview
The assembler does the tedious chore of converting assembly language programs into machine code (bit strings). It takes one or more assembler files and produces a single executable file. By convention, assembler files have the extension .asm, and the executable is named a.exe by default.

Here is an example assembler file:


        # This is a comment.
        
        # Labels don't need to occur before uses.  (They can even
        # be in a separate file, so long as it's in one of the files
        # assembled together.)

        addi   r1 r0 datalabel   # this immediate has value equal to the
                                 # location of the label
        lw     r2 r0 r1          # fetch data
        printr r2

        # for (r2=0; r2<10; r2++) { printf("%d", r2); }
        
        addi   r1 r0 $10   # r1 = 10
        xor    r2 r0 r0    # r2 = 0
top:    printr r2
        addi   r2 r2 $1    # r2++
        cmp    r0 r2 r1    # r2 < 10 ?
        blt    :top        # the leading ':' means "subtract current PC
                           # from value of the label"
        
        # immediates can be 'label' or ':label' or
        # hex or decimal constants
        addi   r1  r0 $99        # decimal 99
        addi   r1  r0 0xff       # hex 0x00ff (the immediate is 12 bits)
        addi   r1  r0 0xfff      # hex 0xffff (the immediate is sign extended)

        stop

datalabel:
        nop                # a single byte of value 0x04
        stop               # a single byte of value 0x00
        
Labels
If the first string on a line ends with a ':', it is a label. The value of the label is the address at which the next byte will be loaded into memory when the executable file being produced is run. So, if the first lines of the first file given to the assembler on the command line were
first:
      xor  r1 r1 r1
second:
the value of first would be 0x0010 (the default initial load address) and the value of second would be 0x0012 (since an xor instruction is encoded in 2 bytes).
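The rule above can be sketched as a first pass that walks the source lines and tracks the address of the next byte. This is a minimal illustration in Python, not the actual code of assembler.pl. The names label_values, SIZES and LOAD_ADDRESS are made up for this sketch, and the size table holds only the three instruction sizes stated in this document (xor, nop, stop).

```python
# Sketch: assign label values in a first pass over the source lines.
LOAD_ADDRESS = 0x0010  # default initial load address, per the text

# Encoded sizes in bytes. Only these three are stated in this document;
# any other mnemonic would need its real size added here.
SIZES = {"xor": 2, "nop": 1, "stop": 1}

def label_values(lines, load_address=LOAD_ADDRESS):
    """Return {label: address of the next byte emitted after the label}."""
    labels = {}
    address = load_address
    for line in lines:
        tokens = line.split("#", 1)[0].split()  # drop the comment, split on whitespace
        if tokens and tokens[0].endswith(":"):
            labels[tokens[0][:-1]] = address    # label = address of the next byte
            tokens = tokens[1:]                 # an instruction may follow on the same line
        if tokens:
            address += SIZES[tokens[0]]         # KeyError for a mnemonic not in SIZES
    return labels

source = ["first:", "      xor  r1 r1 r1", "second:"]
for name, value in label_values(source).items():
    print(f"{name} = 0x{value:04x}")
```

Run on the three lines above, the sketch reports 0x0010 for first and 0x0012 for second, matching the values given in the text.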
Specifying Immediate Values
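Immediates can be specified in four formats, each of which appears in the example file above: a label (datalabel), whose value is the address of the label; a label with a leading ':' (:top), whose value is the address of the label minus the current PC; a decimal constant with a leading '$' ($99); or a hex constant with a leading '0x' (0xff). The immediate is 12 bits and is sign extended, so 0xfff yields 0xffff.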
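The immediate forms used in the example file (label, ':label', '$' decimal and '0x' hex) can be evaluated along the following lines. This is a hedged sketch in Python rather than the assembler's real logic: the function names are hypothetical, the 16-bit word width is inferred from the four-digit hex values in this document, truncating out-of-range values to 12 bits is an assumption (the real tool may reject them), and pc stands for whatever address the assembler treats as the current PC.

```python
def sign_extend_12(field):
    """Sign-extend a 12-bit field to a 16-bit word (16 bits is an assumption)."""
    field &= 0xFFF
    return (field | 0xF000) if (field & 0x800) else field

def immediate_value(text, labels, pc):
    """Evaluate one immediate operand and return its raw 12-bit field."""
    if text.startswith("$"):        # decimal constant, e.g. $99
        value = int(text[1:], 10)
    elif text.startswith("0x"):     # hex constant, e.g. 0xff
        value = int(text, 16)
    elif text.startswith(":"):      # label value minus the current PC
        value = labels[text[1:]] - pc
    else:                           # plain label: its address
        value = labels[text]
    return value & 0xFFF            # assumption: truncate to the 12-bit field

labels = {"top": 0x0016}            # hypothetical address, for illustration only
for operand in ["$99", "0xff", "0xfff", ":top"]:
    field = immediate_value(operand, labels, pc=0x001C)
    print(f"{operand:>6} -> 0x{sign_extend_12(field):04x}")
```

With these definitions, 0xff extends to 0x00ff and 0xfff extends to 0xffff, in line with the comments in the example file.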
Assembler Invocation
The assembler is launched like this:
$ ./assembler.pl a.asm b.asm c.asm ...
This assembles all files on the command line, exactly as if the lines they contain were in one big file (so labels defined in one file can be used in another). It produces the file a.exe. Specify the --output filename.exe switch to produce filename.exe instead. Specify the --help switch to see a full list of options.
Known Bugs