CSE P 501 Project IV - Code Generation

Due: By Sunday, December 11, at 11:00 pm. No late assignments will be accepted after that time so that all groups have the same cutoff regardless of when their project conference is scheduled, and so the staff has the opportunity to look at projects before conferences.

Turn in your project using the assignment drop box as before (links on the project page).

Overview

The purpose of this final part of the project is to complete the compiler by adding code generation and implementing the runtime support needed to execute the generated x86-64 assembly code. We suggest that you use the simple code generation strategy outlined in class to be sure of finishing the project, although you are free to do something different (i.e., better) if you have time. Whatever strategy you use, remember that simple, correct, and working is better than clever, complex, and not done. You will get much more out of the project if you have a correct, simple implementation of most of the language rather than a broken, optimized implementation of a fragment.

Implementation Strategy 

Code generation incorporates many more-or-less independent tasks. One of the first things to do is figure out what to implement first, what to put off, and how to test your code as you go. The following sections outline one reasonable way to break the job down into smaller parts. We suggest that you tackle the job in roughly this order so you can get something working quickly, and add to it incrementally until you're done. Your experience implementing the first parts of the code generator also should give you insights that will ease implementation of the rest. 

Integer Expressions & System.out.println

Implement code generation for arithmetic expressions involving integer constants, the MiniJava System.out.println statement, and the basic prologue and return code for the MiniJava main method. This will give you enough to compile and run main programs that print the value of an integer expression.
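As a rough illustration of this first step, the sketch below generates code for integer expressions by leaving each result in %rax and saving the left operand on the stack while the right operand is evaluated. It is only one possible approach, not necessarily the strategy presented in class. The mini-AST classes, the entry label asm_main, and the runtime print function put are assumptions made for this example; check boot.c for the names it actually expects.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: "result in %rax" code generation for integer expressions (AT&T syntax). */
class IntExprCodeGen {
    // Hypothetical mini-AST; the project's real AST classes will differ.
    static abstract class Expr {}
    static final class Num extends Expr {
        final long value;
        Num(long value) { this.value = value; }
    }
    static final class BinOp extends Expr {
        final char op;
        final Expr left, right;
        BinOp(char op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
    }

    final List<String> out = new ArrayList<>();

    void emit(String instruction) { out.add("    " + instruction); }

    /** Emits code that leaves the value of e in %rax. */
    void gen(Expr e) {
        if (e instanceof Num) {
            emit("movq $" + ((Num) e).value + ", %rax");
            return;
        }
        BinOp b = (BinOp) e;
        gen(b.left);
        emit("pushq %rax");              // save the left operand
        gen(b.right);                    // right operand is now in %rax
        emit("popq %rdx");               // left operand is now in %rdx
        switch (b.op) {
            case '+': emit("addq %rdx, %rax"); break;
            case '-': emit("subq %rax, %rdx");   // %rdx = left - right
                      emit("movq %rdx, %rax"); break;
            case '*': emit("imulq %rdx, %rax"); break;
            default: throw new IllegalArgumentException("unsupported operator " + b.op);
        }
    }

    /** Emits a complete main body that prints e. asm_main and put are assumed names. */
    List<String> genPrintMain(Expr e) {
        emit(".text");
        emit(".globl asm_main");
        out.add("asm_main:");
        emit("pushq %rbp");              // prologue
        emit("movq %rsp, %rbp");
        gen(e);
        emit("movq %rax, %rdi");         // first integer argument (System V convention)
        emit("call put");
        emit("movq %rbp, %rsp");         // epilogue
        emit("popq %rbp");
        emit("ret");
        return out;
    }

    public static void main(String[] args) {
        // 1 + 2 * 3
        Expr e = new BinOp('+', new Num(1), new BinOp('*', new Num(2), new Num(3)));
        for (String line : new IntExprCodeGen().genPrintMain(e)) {
            System.out.println(line);
        }
    }
}
```

Because every push in gen is matched by a pop, the stack pointer at the call has the same alignment it had right after the prologue.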

Object Creation and Method Calls

Next, try implementing objects with methods, but without instance variables, method parameters, or local variables. This includes:

Once you've gotten this far, you should be able to run programs that create objects and call their methods. These methods can contain System.out.println statements to verify that objects are created and that evaluation and printing of arithmetic expressions works in this context.

Variables, Parameters, & Assignment 

Next try adding:

Suggestion: part of the complexity of the project is figuring out how to handle the register-based parameter conventions for methods in 64-bit code. We suggest you do this incrementally - first a single simple parameter, then multiple parameters, then parameters that themselves include a method call.
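One way to keep register-based parameter passing manageable is sketched below, assuming the System V AMD64 convention used on Linux (integer arguments in %rdi, %rsi, %rdx, %rcx, %r8, %r9) and assuming the receiver object is passed as the first argument. Each argument is evaluated into %rax and pushed; only after all arguments have been evaluated are they popped into the argument registers. That way an argument that itself contains a method call cannot clobber registers that were already loaded. The class name, method name, and call target label are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: evaluate arguments onto the stack, then pop them into argument registers. */
class CallCodeGen {
    // System V AMD64 integer argument registers, in order.
    static final String[] ARG_REGS = {"%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"};

    /**
     * argCode.get(0) is the code for the receiver ("this"); the remaining entries are
     * the explicit arguments. Each code sequence must leave its value in %rax.
     */
    static List<String> genCall(List<List<String>> argCode, String callTarget) {
        if (argCode.size() > ARG_REGS.length) {
            throw new IllegalArgumentException("this sketch handles at most 6 register arguments");
        }
        List<String> out = new ArrayList<>();
        for (List<String> code : argCode) {
            out.addAll(code);            // may contain nested calls; nothing is held in registers
            out.add("pushq %rax");
        }
        for (int i = argCode.size() - 1; i >= 0; i--) {
            out.add("popq " + ARG_REGS[i]);   // last argument pushed is popped first
        }
        out.add("call " + callTarget);
        return out;
    }

    public static void main(String[] args) {
        List<String> receiver = Arrays.asList("movq -8(%rbp), %rax");  // hypothetical local slot
        List<String> argument = Arrays.asList("movq $42, %rax");
        for (String line : genCall(Arrays.asList(receiver, argument), "Foo$bar")) {
            System.out.println(line);
        }
    }
}
```

This sketch does not address 16-byte stack alignment when a nested call occurs while an odd number of values is pushed; that needs separate handling. If you target MASM on Windows, the argument registers are different.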

Control Flow

This involves:

Classes and Instance Variables

Add the remaining code for classes that don't extend other classes, including calculating object sizes and assigning offsets to instance variables, and access to instance variables in expressions and as the targets of assignments. At this point, you should be able to compile and execute substantial programs.
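The size and offset calculation can be sketched as follows. This example assumes that every instance variable occupies one 8-byte word and that offset 0 of each object is reserved for a pointer to the class's method table; both are assumptions of this sketch, and the ObjectLayout class is hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: assign instance-variable offsets and compute the object size for one class. */
class ObjectLayout {
    static final int WORD = 8;           // assumed size of every field, in bytes

    final Map<String, Integer> fieldOffsets = new LinkedHashMap<>();
    final int sizeInBytes;

    ObjectLayout(List<String> fieldNames) {
        int offset = WORD;               // offset 0 is assumed to hold the method table pointer
        for (String name : fieldNames) {
            fieldOffsets.put(name, offset);
            offset += WORD;
        }
        sizeInBytes = offset;
    }

    int offsetOf(String field) {
        Integer offset = fieldOffsets.get(field);
        if (offset == null) throw new IllegalArgumentException("unknown field " + field);
        return offset;
    }

    /** Loads a field into %rax, given the register that holds the object pointer. */
    String loadField(String field, String objReg) {
        return "movq " + offsetOf(field) + "(" + objReg + "), %rax";
    }

    /** Stores %rax into a field, given the register that holds the object pointer. */
    String storeField(String field, String objReg) {
        return "movq %rax, " + offsetOf(field) + "(" + objReg + ")";
    }

    public static void main(String[] args) {
        ObjectLayout point = new ObjectLayout(Arrays.asList("x", "y"));
        System.out.println("size = " + point.sizeInBytes);
        System.out.println(point.loadField("y", "%rdi"));
        System.out.println(point.storeField("x", "%rdi"));
    }
}
```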

Extended Classes

The main issue here is generating the right object layouts and method tables for extended classes, including handling method overriding properly. Once you've done that, dynamic dispatching of method calls should work, and you will have almost all of MiniJava working.
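A minimal sketch of method table construction for extended classes follows. The subclass table starts as a copy of the parent's table, an overriding method replaces the label in the inherited slot, and a new method is appended at the end, so a given method has the same offset in every subclass. The label scheme (Class$method, Class$$), the table layout with the first method at offset 0, and the MethodTable class itself are assumptions of this example. It also assumes method names are unique within a class (no overloading).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: build a method table so that overriding keeps inherited slot offsets. */
class MethodTable {
    final String className;
    final List<String> slotNames = new ArrayList<>();   // method name in each slot
    final List<String> slotLabels = new ArrayList<>();  // assembly label in each slot

    MethodTable(String className, MethodTable parent, List<String> declaredMethods) {
        this.className = className;
        if (parent != null) {                 // inherit the parent's slots in the same order
            slotNames.addAll(parent.slotNames);
            slotLabels.addAll(parent.slotLabels);
        }
        for (String method : declaredMethods) {
            String label = className + "$" + method;
            int slot = slotNames.indexOf(method);
            if (slot >= 0) {
                slotLabels.set(slot, label);  // override: reuse the inherited slot
            } else {
                slotNames.add(method);        // new method: append a slot
                slotLabels.add(label);
            }
        }
    }

    int offsetOf(String method) {
        int slot = slotNames.indexOf(method);
        if (slot < 0) throw new IllegalArgumentException("unknown method " + method);
        return 8 * slot;
    }

    /** Data-section definition of the table. */
    List<String> emitData() {
        List<String> out = new ArrayList<>();
        out.add("    .data");
        out.add(className + "$$:");
        for (String label : slotLabels) out.add("    .quad " + label);
        return out;
    }

    /** Dynamic dispatch, assuming the object pointer is in %rdi and its word 0 is the table pointer. */
    List<String> emitDispatch(String method) {
        return Arrays.asList("movq (%rdi), %rax", "call *" + offsetOf(method) + "(%rax)");
    }

    public static void main(String[] args) {
        MethodTable a = new MethodTable("A", null, Arrays.asList("f", "g"));
        MethodTable b = new MethodTable("B", a, Arrays.asList("g", "h"));
        for (String line : b.emitData()) System.out.println(line);
        for (String line : b.emitDispatch("g")) System.out.println(line);
    }
}
```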

Arrays

We suggest you leave this until the end, since you can get everything else working without it.

The Rest 

Whatever is left, including any extensions you've added to the project, or items like storable Boolean values, which are not essential to the rest of the project.

C Bootstrap

As discussed in class, the easiest way to run the compiled x86-64 code is to call it from a trivial C program.  That ensures that the stack is properly set up when the compiled code begins, and provides a convenient place to put other functions that provide an interface between the compiled code and the external world. 

We have provided a small bootstrap program, boot.c, that we suggest you start with. Feel free to embellish this code as you wish. In particular, you may find that it is sometimes easier to have your compiler generate code that calls a C runtime function to do something instead of generating the full sequence of instructions directly in the assembly code. You can add such functions to the .c file.
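For example, object allocation is a natural candidate for a runtime call. The sketch below shows compiler code that emits a call to a C allocation helper rather than generating allocation logic inline. The helper name runtime_alloc (taking a byte count and returning a pointer to zeroed memory), the method table label scheme, and the RuntimeCallGen class are all hypothetical; you would add a function like it to the .c file yourself or use whatever boot.c already provides.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: emit a call to a C runtime helper to allocate a new object. */
class RuntimeCallGen {
    /**
     * Emits code for "new ClassName()" that leaves the object pointer in %rax.
     * Assumes System V argument passing and a method table pointer stored at offset 0.
     */
    static List<String> genNew(String className, int sizeInBytes) {
        List<String> out = new ArrayList<>();
        out.add("movq $" + sizeInBytes + ", %rdi");      // byte count is the first argument
        out.add("call runtime_alloc");                   // hypothetical C helper; result in %rax
        out.add("leaq " + className + "$$(%rip), %rdx"); // address of the class's method table
        out.add("movq %rdx, (%rax)");                    // store it in word 0 of the new object
        return out;
    }

    public static void main(String[] args) {
        for (String line : genNew("Foo", 24)) {
            System.out.println(line);
        }
    }
}
```

The generated code must still follow the C calling convention at the call, including keeping the stack 16-byte aligned and treating caller-saved registers as clobbered.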

Executing x86-64 Code With gcc

If your group is generating Linux/GNU assembler code, it should produce a .s output file containing x86-64 assembly language code suitable as input to the GNU assembler as. (It's fine if you just write the compiled code to standard output.) You can compile and execute your generated code and the bootstrap program using gcc, and you can use gdb to debug it at the x86-64 instruction level.

There is a sample assembler file demo.s that demonstrates the linkage between boot.c and assembler code. This demo file does not contain a full MiniJava program, and the code produced by your compiler will be different, but it should give you a better idea of how the setup is designed to work. Use this and boot.c as input to gcc to generate an executable demo program. You can also use gcc to generate additional examples of x86-64 assembly code. If foo.c contains C code, gcc -S foo.c will compile it and create a file foo.s with the corresponding x86-64 code.

If your compiler generates Linux/GNU assembler code, it should compile and run on 64-bit Linux systems. Our baseline system for testing is attu, which is the same setup as the Linux workstations in the CSE labs. If you would like to run this environment on your own machine, the CSE lab provides a downloadable virtual machine image that you can use (Take a Lab Machine Home!). See the CSE Home Virtual Machines page for details. The image contains some very large files, over 5GB total. You will probably want to download it in the labs over a fast connection and copy it to a USB drive if you need to transport a copy to your home machine.

Executing MASM Code with Visual Studio

If you are using Visual Studio to develop your compiler you can still generate Linux/GNU assembler code and execute it as described above. If you wish, however, you can instead generate Microsoft MASM assembler code and run it under Visual Studio. Instructions we have used in previous years are located on this page. We have not had a chance to verify the instructions using 64-bit code on Visual Studio, but they should be reasonably close. Please post corrections, hints, and questions on the discussion board.

If you use MASM you will want to follow the strategy outlined above for gcc, using the boot.c program to start execution and transfer control to your compiled code. When you submit your project, you should include instructions for running your compiler to translate MiniJava programs to assembler and to run the generated code.

What to Hand In

As usual, if you are using Java to implement your compiler, run ant clean, then bundle up your compiler directory in a tar file and turn that in. The code should run on attu when built with ant. In addition, please be sure that you do the following:

If you are implementing your compiler in a different language, you should do something similar for your particular language and environment.

Your online turnin should include:

Your group will meet with the instructor after the project is done to discuss it. This isn't a formal presentation (i.e., don't waste time preparing PowerPoint slides or anything like that). At or before this meeting, you should hand in a brief written report summarizing what your compiler does, what was implemented and what was omitted, any extra features you added, and, if you are working in a group, a summary of how the work was divided and who was responsible for what. Details of this writeup and the meeting will be posted separately.

If you are working with others, you should turn in only one assignment per group with your names listed in the same order as usual in the INFO file.