CSE P 501 Project V - Code Generation

Due: By Sunday, December 13, at 11:00 pm. No late assignments will be accepted after that time so that all groups have the same cutoff regardless of when their project conference is scheduled, and so the staff has the opportunity to look at projects before conferences.

Turn in your project using the assignment drop box as before (links on the project page).

Overview

The purpose of this final part of the project is to complete the compiler by adding code generation and implementing the runtime support needed to execute the generated x86 assembly code. We suggest that you use the simple code generation strategy outlined in class to be sure of finishing the project, although you are free to do something different (i.e., better) if you have time. Whatever strategy you use, remember that simple, correct, and working is better than clever, complex, and not done. You will also get more out of the project if you have simple implementations of most of the language rather than an optimized implementation of a small part.

Implementation Strategy 

Code generation incorporates many more-or-less independent tasks. One of the first things to do is figure out what to implement first, what to put off, and how to test your code as you go along. The following sections outline one reasonable way to break the job down into smaller parts. We suggest that you tackle the job in roughly this order so you can implement the central part of the code generator first, and put off more peripheral topics until the core parts are done. Your experience implementing the first parts of the code generator should also give you insights that will ease implementation of the rest. 

Integer Expressions & System.out.println

Implement code generation for arithmetic expressions involving integer constants, and the MiniJava "System.out.println" statement, plus the basic prologue and return statement code for the MiniJava main method. This will give you enough to compile and run main programs that print out the value of an integer expression.
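
As an illustration of this first step, here is a minimal sketch of one simple stack-based scheme for integer expressions. It may or may not match the strategy outlined in class. The AST classes (IntLit, BinOp), the method names, and the runtime function name put are all hypothetical; use your own AST and whatever output function your bootstrap program provides. The output uses Intel/MASM operand order.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: stack-based code generation for integer expressions. */
class ExprCodegenSketch {
    // Hypothetical AST node types; a real compiler would use its own AST classes.
    interface Exp {}

    static final class IntLit implements Exp {
        final int value;
        IntLit(int value) { this.value = value; }
    }

    static final class BinOp implements Exp {
        final char op;
        final Exp left, right;
        BinOp(char op, Exp left, Exp right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    /** Appends instructions that leave the value of e in eax. */
    static void genExp(Exp e, List<String> out) {
        if (e instanceof IntLit) {
            out.add("mov eax, " + ((IntLit) e).value);
            return;
        }
        BinOp b = (BinOp) e;
        genExp(b.left, out);     // left operand -> eax
        out.add("push eax");     // save left operand on the stack
        genExp(b.right, out);    // right operand -> eax
        out.add("pop edx");      // left operand -> edx
        switch (b.op) {
            case '+':
                out.add("add eax, edx");
                break;
            case '-':
                out.add("sub edx, eax");   // edx = left - right
                out.add("mov eax, edx");
                break;
            case '*':
                out.add("imul eax, edx");
                break;
            default:
                throw new IllegalArgumentException("unsupported operator: " + b.op);
        }
    }

    /**
     * Appends a cdecl call to a hypothetical C runtime function named put.
     * The exact symbol name depends on your assembler and platform.
     */
    static List<String> genPrintln(Exp e) {
        List<String> out = new ArrayList<>();
        genExp(e, out);
        out.add("push eax");     // pass the value as the only argument
        out.add("call put");     // hypothetical function in the bootstrap
        out.add("add esp, 4");   // caller removes the argument (cdecl)
        return out;
    }

    public static void main(String[] args) {
        // System.out.println(1 + 2 * 3)
        Exp e = new BinOp('+', new IntLit(1),
                new BinOp('*', new IntLit(2), new IntLit(3)));
        for (String line : genPrintln(e)) {
            System.out.println("    " + line);
        }
    }
}
```

Keeping every intermediate result on the stack is wasteful, but it makes each expression node independent of its neighbors, which fits the advice to get something simple working first.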

Object Creation and Method Calls

Next, try implementing objects with methods, but without instance variables, method parameters, or local variables. This includes:

Once you've gotten this far, you should be able to run programs that create objects and call their methods. These methods can contain System.out.println statements to verify that objects are created and that evaluation and printing of arithmetic expressions works in this context.

Variables, Parameters, & Assignment 

Next try adding:

Control Flow

This involves:

Classes and Instance Variables

Add the remaining code for classes that don't extend other classes, including calculating object sizes and assigning offsets to instance variables, and access to instance variables in expressions and as the targets of assignments. At this point, you should be able to compile and execute substantial programs.
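
The size and offset calculation might look something like the sketch below. It rests on two assumptions that are not requirements of the assignment: every instance variable occupies one 32-bit word, and the first word of each object is reserved for a pointer to the class's method table. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: object sizes and instance variable offsets for a class with no superclass. */
class ObjectLayoutSketch {
    // Assumption: every field (int, boolean, or reference) takes one 32-bit word.
    static final int WORD = 4;

    /**
     * Maps each field name to its byte offset from the start of the object.
     * Assumption: offset 0 holds the method table pointer, so fields start at 4.
     */
    static Map<String, Integer> fieldOffsets(List<String> fields) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        int next = WORD;                 // skip the method table pointer slot
        for (String f : fields) {
            offsets.put(f, next);
            next += WORD;
        }
        return offsets;
    }

    /** Total object size in bytes: one word for the table pointer plus one per field. */
    static int objectSize(List<String> fields) {
        return WORD + WORD * fields.size();
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("x", "y");
        System.out.println("size = " + objectSize(fields));
        for (Map.Entry<String, Integer> entry : fieldOffsets(fields).entrySet()) {
            System.out.println(entry.getKey() + " at offset " + entry.getValue());
        }
    }
}
```

With offsets like these, a read of an instance variable becomes a load from the object pointer plus a constant, and the object size is the number of bytes to request when the object is created.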

Extended Classes

The main issue here is generating the right object layouts and method tables for extended classes, including handling method overriding properly. Once you've done that, dynamic dispatching of method calls should work, and you will have almost all of MiniJava working.
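
One common way to lay out method tables, sketched here under stated assumptions, is to copy the superclass table, replace the slot of each overridden method in place, and append new methods at the end. A method then has the same slot index in a class and in all of its subclasses. The Class$method label convention and all names below are hypothetical, and the sketch ignores overloading because MiniJava does not have it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: building method tables so that overriding methods reuse the inherited slot. */
class VtableSketch {
    /**
     * Returns the method table for className, given the superclass table
     * (empty if there is no superclass) and the methods declared in className.
     * Each entry is a hypothetical assembly label of the form Class$method.
     */
    static List<String> buildVtable(List<String> superTable, String className,
                                    List<String> methods) {
        List<String> table = new ArrayList<>(superTable);  // inherit every slot
        for (String m : methods) {
            String label = className + "$" + m;
            int slot = slotOf(table, m);
            if (slot >= 0) {
                table.set(slot, label);   // override: same slot, new code label
            } else {
                table.add(label);         // new method: next free slot
            }
        }
        return table;
    }

    /** Returns the slot index of the named method, or -1 if it is not in the table. */
    static int slotOf(List<String> table, String method) {
        for (int i = 0; i < table.size(); i++) {
            String label = table.get(i);
            if (label.substring(label.indexOf('$') + 1).equals(method)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> a = buildVtable(new ArrayList<String>(), "A", Arrays.asList("f", "g"));
        List<String> b = buildVtable(a, "B", Arrays.asList("g", "h"));
        System.out.println("A: " + a);
        System.out.println("B: " + b);
    }
}
```

Because g occupies the same slot in both tables, a call through a variable of type A can load the table pointer from the object and jump through that slot without knowing whether the object is really an A or a B.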

Arrays

We suggest you leave this until the end, since you can get everything else working without it.

The Rest 

Whatever is left, including any extensions you've added to the project, or items like storable Boolean values, which are not essential to the rest of the project.

C Bootstrap

As discussed in class, the easiest way to run the compiled x86 code is to call it from a trivial C program.  That ensures that the stack is properly set up when the compiled code begins, and provides a convenient place to put other functions that provide an interface between the compiled code and the external world. 

We have provided a small bootstrap program, boot.c, which we suggest you start with. Feel free to embellish this code as you wish. In particular, you may find that it is sometimes easier to have your compiler generate code that calls a C runtime function to do something instead of generating the full sequence of instructions directly in the .asm file.

Executing MASM Code with Visual Studio

To execute the .asm file produced by your compiler, you will need to create a Visual Studio project with the C (not C++) main program and the assembler code from the compiler. The resulting program can be run and debugged using Visual Studio.

The MASM assembler ml.exe has been included in Visual Studio starting with VS .NET through Visual Studio 2005 Professional, and it should exist in VS 2008, although we haven't verified that. MASM can assemble 32-bit code, which can then be linked and executed with other programs, in particular C code. You may find it easiest to use the assembler from a command line, but it is also fairly easy to configure Visual Studio to use MASM to assemble the .asm file containing the compiled program. Here are instructions that have worked in the past.

  1. Create a new Win32 Console Application project in Visual Studio
  2. Add file boot.c (or whatever main program you have created) and the .asm file generated by your compiler to the project. (You may have to change the type of files displayed in the dialog to "all files" to see the .asm file.)
  3. Configure the project to use MASM to assemble the .asm file. Select Project>Settings. In the dialog box that appears, be sure that Win32 Debug is displayed in the Settings: field. Expand the file list if needed, then select your .asm file -- and only this file. Click on the Custom Build tab. In the first line of the Build Command(s) field, enter the MASM command to be used to assemble the file.
      ml.exe /c /Cx /coff /Zi $(InputPath)

    (The executable file name ml.exe has a letter l in it, not a digit 1. The InputPath macro can be entered by clicking on button Files and selecting Input Path in the menu that appears.)

    Finally, you need to specify the output file name that MASM should use for the assembled object code. In the Output File(s) field, enter filename.obj, where filename is the name of your assembly source file (without the .asm suffix).

You should now be able to compile, link, and execute your program with the normal Visual Studio Build commands. Visual Studio will use MASM to assemble the .asm file as needed. You can use the symbolic debugger to step through the assembly language code, set breakpoints in it, etc.

Please post any corrections or suggestions on how to do this better on the class discussion board.

Executing x86 Code with gcc

If your compiler targets the GNU assembler (as), you can compile and execute your generated code and the bootstrap program using gcc, and you can use gdb to debug it at the x86 instruction level.

What to Hand In

Your online turnin should include:

Your group will meet with the instructor after the project is done to discuss it. This isn't a formal presentation (i.e., don't waste time preparing PowerPoint slides or anything like that). At or before this meeting, you should hand in a brief written report summarizing what your compiler does, what was implemented and what was omitted, any extra features you added, and, if you are working in a group, a summary of how the work was divided and who was responsible for what.

If you are working with others, you should turn in only one assignment per group, and all group members should plan to attend the meeting if possible. More details about the writeup and this meeting will be posted separately.