University of Washington Department of Computer Science & Engineering
 CSE 378, Winter 2007
 Machine Organization and Assembly Language Programming

Homework #2: Linking / Loading

Due: Wednesday, 1/17/2007, beginning of class

FAQ

Consult this Wiki page to post a question, or to help others out by answering a question.

A page of additional information that should help with this assignment has been created.

Assignment Overview

There is no programming to be done in this assignment. Instead, you'll be performing the tasks of the linker and loader, by hand, on a tiny program. The goal is to make sure you understand what linkers and loaders do, and why patching is necessary.

The C Programs

The program consists of two C files, main.c and sub.c, shown here:
// main.c

// This line declares a C global variable.
// It is static, meaning it is allocated at load time
// (in the memory area pointed at by $gp).
int total;

// This line tells the C compiler that a variable named
// 'result' will be declared as a global static in some
// other file (so, when compiling this routine, don't
// allocate space for result, as some other file will do that).
extern int result;

// sub() is defined in sub.c.  Declare it here so the compiler
// knows the name when it reaches the call in main().
void sub();

// main() should return an int.  But, because we haven't
// yet gotten to how to call subroutines or how to pass back
// values, we're going to gloss over how those things are
// done in this assignment.  In particular, I won't return
// a value from main().  (C actually doesn't care, it turns
// out...)
int main() {

  total = 0;
  sub();  // sub will set the value of the external variable, result
  total = total + result;

}

// sub.c

// result is a static, global.  (See the comments at the top
// of main.c)
int result;

// Because we haven't reached the material on subroutine call
// and function value return, this routine simply puts its result
// in a global variable (result) and main fetches it from there.
// P.S. NEVER DO THIS - it's horrible, horrible coding practice.
//      Return a value.
void sub() {
  result = 10;
}
This program doesn't do anything useful, obviously; it's the minimum plausible code that involves compiling two files and then linking.

It's important to remember when doing this assignment that the compiler completely compiles main.c by looking only at main.c, and similarly for sub.c. That is, the compiler does not examine any code other than that contained in the single file it is currently compiling.

The Corresponding Assembler Programs

Below are main.s and sub.s, implementations of the C files above.

We have not yet gotten to procedure call and function value return, the details of which are irrelevant to this assignment. For that reason, I created the .s files by hand, performing procedure call and return in an unrealistically simplified way. The code here works, for this tiny example, but shouldn't be used as an indication of how those things are actually implemented.

# main.s

# static variables in the C program go in the .data
# section of the assembler program
.data
total:  .word   0      # initialized to 0, gratuitously

.text
        .global main   # this symbol will be available to the linker

#       assembler                 corresponding main.c code
#       ---------------------     --------------------------
main:   sw      $0, total($gp)    # total = 0 ;
        jal     sub               # sub() ;
        lw      $t0, total($gp)   #    fetch total
        lw      $t1, result($gp)  #    fetch result
        add     $t0, $t0, $t1     #    total + result
        sw      $t0, total($gp)   # total = total + result
        jr      $ra               # '}'

# sub.s

# static variables in the C program go in the .data
# section of the assembler program
.data
result:  .word   0      # initialized to 0, gratuitously

.text
        .global sub   # this symbol will be available to the linker

#       assembler                 corresponding sub.c code
#       ---------------------     --------------------------
sub:    ori     $t0, $0, 10       #     10
        sw      $t0, result($gp)  # result = 10 ;
        jr      $ra               # '}'
We didn't talk about jal in class, and the details of what it does aren't important to understanding this assignment. You can just think of it as a j instruction (which is in fact basically what it does). What is important is jal's 32-bit encoding: it's identical to that of j, except that it has a distinct opcode.

What To Do

Answer the questions contained in the sections that follow. All of them are about what is required to completely encode assembler instructions into the 32-bit binary format required for execution by the processor. More particularly, they're about what information is missing at each step, resulting in instructions that are at best only partially encoded.

Remember that when the loader is done, and execution is about to begin, all instructions must be fully encoded. (So, any information not available must be filled in by some step. Part of the issue is which step.)

Assembling

Suppose main.s and sub.s are each assembled, resulting in corresponding .o files.
  1. Which instructions are not completely encoded in the .o files, and (very briefly) why?
    For all questions in this assignment, to identify an instruction use a format like this:
        main: jr $ra
    meaning the 32-bit instruction resulting from the jr in main.s.

Linking

Suppose the two .o files are linked like this:
$ ceblink main.o sub.o
  2. Which instructions, if any, are not completely encoded in the .exe file that is output, and (briefly) why?

Loading

Suppose a user requests execution of the .exe produced by the linker, and that the loader finds room for it in memory starting at address 0x1000.
  3. For each instruction identified as only partially encoded in your answers to questions 1 and 2, give the 32-bit encoding of the instruction as loaded into memory (i.e., at the point when execution is just about to begin). Your answers should be in hex.

Note

The link command above won't actually work. That is intentional: I'd like you to work out the answers by hand, not just copy the output of the tools onto a piece of paper.

Resources

The Cebollita Instruction Encoding page shows the encodings for all the instructions used above. They are also listed in the official MIPS Architecture Manual. The Cebollita page is a lot smaller; the MIPS manual has more thorough discussion.

Instruction encodings, and explanations, can also be found in the readings from the text.

Turnin

Hand in your answers, on paper, at the beginning of class on Wednesday.


This work is licensed under a Creative Commons Attribution-Share Alike 2.5 License.
[comments to zahorjan]