CSE 413 Winter 2001

Assignment 7 -- D Code Generation

Electronic turnin of programming part due Thursday, March 8, by 10:00 pm.
Turnin receipt, test output, and written report due in class Friday, March 9.

You may work with a partner on this assignment. If you have a partner from the previous compiler assignments, you should continue working with that person. 

Overview

For this assignment, add code generation to your compiler. When the finished compiler is executed, it should open a D source program file, compile it, and produce a .asm text file containing an x86 assembly language version of the D program. Source lines from the D file, including comments and whitespace, should appear as comments in the .asm code, with each source line near the code generated from it.
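One way to picture the requirement that source lines appear as comments is the sketch below. It relies only on the fact that MASM treats everything after a semicolon as a comment. The class name SourceEcho and its methods are hypothetical, and a real compiler would more likely echo each line from the scanner as it is read rather than collect lists first.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: echoes D source lines into the .asm output as MASM comments. */
final class SourceEcho {
    /** Prefix a source line with ';' so the assembler ignores it. */
    static String asComment(String sourceLine) {
        return "; " + sourceLine;
    }

    /**
     * Place each source line (as a comment) directly before the code
     * generated from it. codePerLine.get(i) holds the instructions
     * produced for sourceLines.get(i) and may be empty.
     */
    static List<String> interleave(List<String> sourceLines,
                                   List<List<String>> codePerLine) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            out.add(asComment(sourceLines.get(i)));
            out.addAll(codePerLine.get(i));
        }
        return out;
    }
}
```

Blank and comment-only source lines simply produce a comment with no code after it, which keeps the .asm file aligned with the D source.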

Most of the work in this assignment consists of adding new code to the existing parser. While you might find it useful to create a few new classes for various data structures or utility routines, the bulk of the changes will be additions to existing parser methods.  A separate handout contains details about code generation for D programs.

Executing Compiled Code

The easiest way to run the compiled x86 code is to call it from a trivial C program as if it were an ordinary function.  That ensures that the stack is properly set up when the compiled code begins executing, and provides a convenient place to put functions that provide an interface between the compiled code and the external world (get and put).  Here are the details of how that works.

MASM Source File

Recall that for our purposes, every x86 MASM assembly language file should have this structure.

  .386
  .model flat, c
  public d$main
  extern get:near, put:near
  .code
    <your generated code goes here>
  end

The .386 and .model directives specify that this assembly program uses the flat, 32-bit address space and instruction set introduced with the 80386 many years ago.  The key directives for linking and running your program are public and extern.
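As a rough illustration, the compiler could wrap whatever code it generates in this fixed header and trailer. The directives are copied from the structure shown above; the class name AsmFile and the wrap method are hypothetical.

```java
import java.util.List;

/** Hypothetical helper: surrounds generated code with the required MASM directives. */
final class AsmFile {
    /** Return the complete text of a .asm file containing the given code lines. */
    static String wrap(List<String> generatedCode) {
        StringBuilder sb = new StringBuilder();
        sb.append("  .386\n");
        sb.append("  .model flat, c\n");
        sb.append("  public d$main\n");
        sb.append("  extern get:near, put:near\n");
        sb.append("  .code\n");
        for (String line : generatedCode) {
            sb.append("    ").append(line).append('\n');
        }
        sb.append("  end\n");
        return sb.toString();
    }
}
```

Keeping the header in one place makes it easy to hand-check a small .asm file before the rest of the code generator exists.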

C Bootstrap

The bootstrap program is named dtest.c (right click on the file name to download a copy). You should use this program to run your compiled code.  The bootstrap program is very small; here is a listing.

#include <stdio.h>

extern int d$main();   /* main function in compiled code */

/* Prompt for input, then return next integer from standard input. */
int get() {
  int k;
  printf("get: ");
  scanf("%d", &k);
  return k;
}

/* Write x to standard output with a title and yield value of x */
int put(int x) {
   printf("put: %d\n", x);
   return x;
}

/* Execute D program d$main and print value returned */
int main(void) {
   printf("\nValue returned from D main: %d.\n", d$main());
   return 0;
}

This test program takes advantage of a non-standard addition to C supported by Visual C++: the character $ is used in the function name d$main.

Executing MASM Code with Visual C++

Your compiler should produce a text file with a name ending in .asm containing the assembly language version of a D program. To execute this code, you will need to create a Visual C++ project.  Add to the project the C (not C++) main program (dtest.c) and your assembly language file. The resulting program can be run and debugged using Visual C++.

MASM is a complete programming environment for 16-bit assembly language programs. The assembler itself (ml.exe) can also assemble 32-bit code, but cannot link or execute it. For that, we need to use the regular Visual C++ environment. You may find it easiest to use the assembler from a command prompt window, but it is also possible to configure Visual C++ to use MASM to assemble the .asm file containing the translated D program. In any case, you'll need to use the normal Visual C++ 32-bit linker and debugger to execute the resulting program. Here's how to configure Visual C++ to use MASM to assemble and run your generated code:

  1. Create a new project in Visual C++ by selecting File>New; click on the Projects tab; select Win32 Console Application for the project type and Win32 for the project platform. Enter a name for your project and select the desired directory, then click OK.
  2. Add file dtest.c (which you get from us) and the .asm file generated by your compiler to the project. Pick Project>Select Files, then use the Insert Files into Project dialog to add the files. (You may have to change the type of files displayed to "all files" to see the .asm file.)
  3. Configure the project to use MASM to assemble the .asm file. Select Project>Settings. In the dialog box that appears, be sure that Win32 Debug is displayed in the Settings: field. Expand the file list if needed, then select your .asm file -- and only this file. Click on the Custom Build tab. In the first line of the Build Command(s) field, enter the MASM command to be used to assemble the file. This includes the full path name for the MASM assembler, the assembly options to be used, and a macro that specifies the input path. If MASM has been installed in its default directory, this command should work:
      c:\masm611\bin\ml.exe /c /Cx /coff /Zi ${InputPath}

    If MASM has been installed in a different directory, you'll need to change the path name (c:\masm611) to whatever is appropriate. In the MSCC lab, ml.exe is located in c:\program files\98ddk\bin\win98\ml.exe. (The executable file name ml.exe has a letter l in it, not a digit 1. The InputPath macro can be entered by clicking on the Files button and selecting Input Path in the menu that appears. If it doesn't work on your setup, you might find it useful to substitute the actual file name for ${InputPath}.)

    Finally, you need to specify the output file name that MASM should use for the assembled object code. In the Output File(s) field, enter filename.obj, where filename is the name of your assembly source file (without the .asm suffix).

You should now be able to compile, link, and execute your program with the normal Visual C++ Build commands. Visual C++ will use MASM to assemble the .asm file as needed. You can even use the symbolic debugger to step through the assembly language code, set breakpoints in it, etc.

In the past some people have had trouble setting up MASM to work on their machine, but there doesn't seem to be any systematic reason (it's a Windows thing).  Try this early using a small hand-written .asm file to be sure you've got the configuration right, so it doesn't become a problem at the last minute.

Where to Start

As with previous assignments, it's helpful to figure out what piece of the compiler can be done first, without having to finish everything at once. Here are some suggestions.

The first thing that needs to be done is to figure out the details of stack frame layouts and the offsets assigned to parameters and local variables.  Then add code to the compiler to process parameter lists and variable declarations. Check your work by compiling some sample programs, printing the local symbol tables, and verifying that the offsets are correct.
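The sketch below shows one possible offset assignment. It assumes a conventional 32-bit frame that the code generation handout may or may not match: every value is 4 bytes, the saved ebp is at [ebp+0], the return address is at [ebp+4], the first parameter is at [ebp+8], and locals grow downward from [ebp-4]. The class name FrameLayout and its method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: assigns ebp-relative offsets to parameters and locals. */
final class FrameLayout {
    /**
     * Assumed layout (check it against the handout):
     *   parameters at ebp+8, ebp+12, ... in declaration order
     *   locals     at ebp-4, ebp-8,  ... in declaration order
     */
    static Map<String, Integer> assign(List<String> params, List<String> locals) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        int offset = 8;              // skip saved ebp and return address
        for (String p : params) {
            offsets.put(p, offset);
            offset += 4;
        }
        offset = -4;                 // first local sits just below saved ebp
        for (String v : locals) {
            offsets.put(v, offset);
            offset -= 4;
        }
        return offsets;
    }
}
```

Printing the resulting map for a few sample functions is a quick way to confirm the offsets before any instructions are generated.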

There are several possible ways to go from here. Probably the most useful is to get function prologues and return working. That gives you enough to generate a working program with a d$main function that can be called and returns properly. (If you haven't implemented code generation for expressions yet, the value returned by main will be whatever random bits happen to be in eax, but that's ok.)
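A minimal sketch of prologue and return code follows. It assumes the usual ebp-based frame and that the caller removes arguments from the stack, so a plain ret is enough; the handout may prescribe a different sequence. The class name FunctionFrame and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: emits entry and exit code for one function. */
final class FunctionFrame {
    /** Label the function, save the caller's frame pointer, reserve locals. */
    static List<String> prologue(String label, int localBytes) {
        List<String> code = new ArrayList<>();
        code.add(label + ":");
        code.add("    push ebp");
        code.add("    mov ebp, esp");
        if (localBytes > 0) {
            code.add("    sub esp, " + localBytes);
        }
        return code;
    }

    /** Release locals, restore the caller's frame pointer, and return. */
    static List<String> epilogue() {
        List<String> code = new ArrayList<>();
        code.add("    mov esp, ebp");
        code.add("    pop ebp");
        code.add("    ret");
        return code;
    }
}
```

A d$main built from just these two pieces should already assemble, link with dtest.c, and return to the C bootstrap.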

It's probably useful to tackle factor at this point, followed by assignment. That gives you enough to generate code for a=b; or x=17;. Extend this to handle code generation for arithmetic expressions, including function calls. You've now got enough to execute straight-line programs, complete with input and output and functions -- even recursive ones.
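For example, code for x=17; and a=b; might be produced along these lines. The sketch assumes that every factor leaves its value in eax and that variables are addressed by ebp-relative offsets; the offsets used here are illustrative only. The class name SimpleGen and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: code for constant and variable factors and for assignment. */
final class SimpleGen {
    /** Format an ebp-relative memory operand, e.g. [ebp-4] or [ebp+8]. */
    static String operand(int offset) {
        return offset >= 0 ? "[ebp+" + offset + "]" : "[ebp" + offset + "]";
    }

    /** Factor that is an integer constant: value ends up in eax. */
    static String loadConst(int value) {
        return "mov eax, " + value;
    }

    /** Factor that is a variable: value ends up in eax. */
    static String loadVar(int offset) {
        return "mov eax, " + operand(offset);
    }

    /** Assignment: store the expression value in eax into the variable. */
    static String storeVar(int offset) {
        return "mov " + operand(offset) + ", eax";
    }

    /** x = constant, with x at the given offset. */
    static List<String> assignConst(int targetOffset, int value) {
        return Arrays.asList(loadConst(value), storeVar(targetOffset));
    }

    /** a = b, with both variables at the given offsets. */
    static List<String> assignVar(int targetOffset, int sourceOffset) {
        return Arrays.asList(loadVar(sourceOffset), storeVar(targetOffset));
    }
}
```

With x at offset -4, assignConst(-4, 17) yields mov eax, 17 followed by mov [ebp-4], eax.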

Finally, look at code generation for conditions (rel-exp and bool-exp) and if and while statements. The issues here are getting the labels planted in the right place, and picking the right conditional jumps to the correct labels.
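The sketch below illustrates both issues for a simple if statement. It assumes the two operands of the comparison are already in eax and ecx, that comparisons are signed, and that the relational operators are the six listed in the switch; none of that is taken from the handout. The class name Labels and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: unique labels and jumps for conditions. */
final class Labels {
    private int next = 0;

    /** Return a label that has not been used before: L0, L1, ... */
    String fresh() {
        return "L" + (next++);
    }

    /** Jump taken when the condition is FALSE, so control skips the body. */
    static String jumpIfFalse(String relop) {
        switch (relop) {
            case "==": return "jne";
            case "!=": return "je";
            case "<":  return "jge";
            case "<=": return "jg";
            case ">":  return "jle";
            case ">=": return "jl";
            default:
                throw new IllegalArgumentException("unknown operator " + relop);
        }
    }

    /** if (eax relop ecx) body: compare, skip the body if false, plant the label. */
    List<String> ifThen(String relop, List<String> body) {
        String end = fresh();
        List<String> code = new ArrayList<>();
        code.add("cmp eax, ecx");
        code.add(jumpIfFalse(relop) + " " + end);
        code.addAll(body);
        code.add(end + ":");
        return code;
    }
}
```

A while loop follows the same pattern with one more label at the top of the loop and an unconditional jmp back to it after the body.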

Test Programs

Several D test programs are available from the course web.  Feel free to create additional tests to demonstrate or debug your compiler.  If you create any new test programs, please feel free to share them with others or contribute them to the collection on the course web (send mail to cse413-staff@cs).

Project Turnin

Turn in your program electronically using this turnin form.

The electronic turnin should include all of the Java source code for your compiler.  It does not need to include test programs or output.

Turn in the following at the beginning of class, Friday, March 9: