
CSE 401 15wi - Project IV - Code Generation
Due: Tuesday, March 3 at 11:00 pm. You should turn in your project using the assignment drop box (see link on the course home page). Further instructions are at the bottom of this writeup.
Added 2/27: Place your version of the boot.c file in src/boot.c so that we can find it and use it to run your compiled code.
Overview
The purpose of this part of the project is to add code generation to the compiler so that it can produce x86-64 assembly code, and add the runtime support needed to execute compiled programs.
We suggest that you use the simple code generation strategy outlined in lectures and sections to be sure you get running code on time, although you are free to do something different (i.e., better) if you have time. Whatever strategy you use, remember that simple, correct, and working is better than clever, complex, and not done.
We also strongly suggest thorough testing after you implement each part of the code generator. Debugging code generators can be difficult, and you will make your life easier if you find bugs early, before your generator is too complex. A test-driven development approach has also been effective for some groups in the past, i.e., writing tests for particular language features before writing the code generation that implements them.
Modify your MiniJava main program so that when it is executed using the command java MiniJava filename.java with no options, it will read the MiniJava program from the named input file, parse it and perform semantics checks, then print on standard output an x86-64 gcc-compatible assembly-language translation of the input program.
If translation is successful, the compiler should terminate with System.exit(0). If any errors are detected in the input program, including static semantics or type-checking errors, the compiler should terminate using System.exit(1). If errors are detected, the compiler does not need to produce any assembly language code.
If no filename is specified on the command, the compiler should read from standard input as it has done in previous parts of the project.
The java command shown above will also need a -cp argument or CLASSPATH variable as before to locate the compiled .class files and libraries. See the scanner assignment if you need a refresher on the details.
Your MiniJava compiler should still be able to print out scanner tokens if the -S option is used; the -P and -A options should continue to print the AST; and -T should still cause the compiler to print symbol tables with information gathered during the static semantics phase. There is no requirement for how your compiler should behave if more than one of -A, -P, -S or -T is specified at the same time. That is up to you.
Implementation Strategy
Code generation incorporates many more-or-less independent tasks. One of the first things to do is figure out what to implement first, what to put off, and how to test your code as you go. The following sections outline one reasonable way to break the job down into smaller parts. We suggest that you tackle the job in roughly this order so you can get something compiled and running quickly, and add to it incrementally until you're done. Your experience implementing the first parts of the code generator also should give you insights that will ease implementation of the rest.
Integer Expressions & System.out.println
Get a main program containing System.out.println(17) to run. Then add code generation for basic arithmetic expressions including only integer constants, +, -, * and parentheses. You will also need the basic prologue and return code for the MiniJava main method, which uses the x86-64 C language conventions.
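As one possible starting point, here is a minimal sketch of a stack-based scheme for integer expressions: each expression leaves its value in %rax, and a binary operator saves the left operand on the stack while the right operand is evaluated. The classes ArithExpr, IntLit, and BinOp are hypothetical stand-ins for illustration, not the AST or visitor classes in your project, and the sketch covers only constants, +, - and *. A real generator must also keep the stack 16-byte aligned at call instructions, which this sketch ignores because it makes no calls.

```java
// Hypothetical mini-AST for illustration only; not the project's AST classes.
abstract class ArithExpr {
    /** Append code that leaves the value of this expression in %rax. */
    abstract void gen(StringBuilder out);
}

class IntLit extends ArithExpr {
    final int value;

    IntLit(int value) { this.value = value; }

    void gen(StringBuilder out) {
        out.append("    movq $").append(value).append(", %rax\n");
    }
}

class BinOp extends ArithExpr {
    final char op;
    final ArithExpr left;
    final ArithExpr right;

    BinOp(char op, ArithExpr left, ArithExpr right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    void gen(StringBuilder out) {
        left.gen(out);
        out.append("    pushq %rax\n");      // save left operand
        right.gen(out);
        out.append("    popq %rdx\n");       // %rdx = left, %rax = right
        switch (op) {
            case '+':
                out.append("    addq %rdx, %rax\n");   // %rax = right + left
                break;
            case '*':
                out.append("    imulq %rdx, %rax\n");  // %rax = right * left
                break;
            case '-':
                out.append("    subq %rax, %rdx\n");   // %rdx = left - right
                out.append("    movq %rdx, %rax\n");   // result back in %rax
                break;
            default:
                throw new IllegalArgumentException("unsupported operator: " + op);
        }
    }
}
```

For example, calling gen on new BinOp('-', new IntLit(5), new IntLit(2)) appends six instructions that end with the difference in %rax. Parentheses need no code of their own because the tree shape already encodes them.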
Object Creation and Method Calls
Next, try implementing objects with methods, but without instance variables, method parameters, or local variables. This includes:
- Operator new (i.e., allocate an object with a method table pointer, but no fields)
- Generation of method tables for simple classes that don't extend other classes
- Methods with no parameters or local variables.
Once you've gotten this far, you should be able to run programs that create objects and call their methods. These methods can contain System.out.println statements to verify that objects are created and that evaluation and printing of arithmetic expressions works in this context.
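The three items above might be sketched as follows for a class with no superclass and no fields. Everything named here is an assumption made for illustration: the label scheme (Class$$ for the method table, Class$method for each method), the allocator name mjcalloc (substitute whatever allocation function your boot.c provides), and the choice to pass the object pointer in %rdi. The object is 8 bytes, holding only the method table pointer at offset 0.

```java
import java.util.List;

// Sketch only: the "Class$$" / "Class$method" labels and the mjcalloc
// allocator name are assumptions, not requirements of the assignment.
class SimpleClassSketch {
    final String name;
    final List<String> methods;   // declaration order fixes the slot order

    SimpleClassSketch(String name, List<String> methods) {
        this.name = name;
        this.methods = methods;
    }

    /** Method table: one 8-byte code pointer per method. */
    String emitVtable() {
        StringBuilder out = new StringBuilder();
        out.append("    .data\n");
        out.append(name).append("$$:\n");
        for (String m : methods) {
            out.append("    .quad ").append(name).append('$').append(m).append('\n');
        }
        return out.toString();
    }

    /** new C(): allocate 8 bytes, then store the table address at offset 0. */
    String emitNew() {
        return "    movq $8, %rdi\n"
             + "    call mjcalloc\n"                       // assumed runtime function
             + "    leaq " + name + "$$(%rip), %rdx\n"
             + "    movq %rdx, (%rax)\n";                  // object pointer stays in %rax
    }

    /** Call a method on the object whose pointer ("this") is in %rdi. */
    String emitCall(String method) {
        int slot = methods.indexOf(method);
        if (slot < 0) {
            throw new IllegalArgumentException("no method " + method);
        }
        return "    movq (%rdi), %rax\n"                   // load method table pointer
             + "    call *" + (8 * slot) + "(%rax)\n";     // indirect call through the slot
    }
}
```

Calling through the table even for simple classes means the same call sequence keeps working once extended classes and overriding are added.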
Variables, Parameters, & Assignment
Next try adding:
- Integer parameters and variables in methods, including assigning stack frame locations for variables.
- Parameters and variables in expressions
- Assignment statements involving parameters and local variables
Suggestion: Some of the complexity dealing with methods is handling registers during method calls. It can help to develop and test this incrementally -- first a single, simple function argument, then multiple arguments, then arguments that require evaluation of nested method calls.
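One simple way to assign stack frame locations is to give every parameter and local variable its own 8-byte slot below %rbp. The sketch below assumes that layout and a hypothetical class name, FrameLayoutSketch; it also assumes that incoming register arguments are copied into their slots at method entry, which is a design choice rather than a requirement. The frame size is rounded up to a multiple of 16 so that %rsp stays 16-byte aligned after the prologue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: one 8-byte slot per parameter or local, addressed off %rbp.
class FrameLayoutSketch {
    private final Map<String, Integer> offsets = new LinkedHashMap<String, Integer>();
    private int next = 0;   // most recently assigned offset from %rbp

    /** Reserve the next 8-byte slot below %rbp and return its offset. */
    int declare(String name) {
        if (offsets.containsKey(name)) {
            throw new IllegalArgumentException("duplicate variable: " + name);
        }
        next -= 8;
        offsets.put(name, next);
        return next;
    }

    /** Assembly operand for a declared variable, e.g. "-16(%rbp)". */
    String operand(String name) {
        Integer off = offsets.get(name);
        if (off == null) {
            throw new IllegalArgumentException("unknown variable: " + name);
        }
        return off + "(%rbp)";
    }

    /** Bytes to subtract from %rsp, rounded up to a multiple of 16. */
    int frameSize() {
        return (-next + 15) / 16 * 16;
    }

    String prologue() {
        return "    pushq %rbp\n"
             + "    movq %rsp, %rbp\n"
             + "    subq $" + frameSize() + ", %rsp\n";
    }
}
```

With this, an assignment such as x = y can be generated as a load from operand("y") into %rax followed by a store from %rax to operand("x").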
Control Flow
This includes:
- While loops
- If statements
- Boolean expressions, but only in the context of controlling conditional statements and loops.
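For control flow, a Boolean expression that only steers a loop or conditional can be compiled straight into a compare and a conditional jump, without ever producing a 0/1 value. The sketch below shows this for a while loop whose condition is left < right; the class ControlFlowSketch and its method names are hypothetical, and the operands and loop body are passed in as already-generated text to keep the example short. The test is placed at the bottom of the loop so that each iteration executes a single jump.

```java
// Sketch only: label generation plus a while loop with a "<" condition.
class ControlFlowSketch {
    private final StringBuilder out = new StringBuilder();
    private int counter = 0;

    /** Fresh assembler-local label such as ".Ltest0". */
    String newLabel(String hint) {
        return ".L" + hint + (counter++);
    }

    /** while (left < right) body; operands are memory operands or immediates. */
    void genWhileLess(String leftOperand, String rightOperand, String bodyAsm) {
        String test = newLabel("test");
        String body = newLabel("body");
        out.append("    jmp ").append(test).append('\n');   // check condition first
        out.append(body).append(":\n");
        out.append(bodyAsm);
        out.append(test).append(":\n");
        out.append("    movq ").append(leftOperand).append(", %rax\n");
        out.append("    cmpq ").append(rightOperand).append(", %rax\n");
        out.append("    jl ").append(body).append('\n');    // loop while left < right
    }

    String result() {
        return out.toString();
    }
}
```

An if statement follows the same pattern with the jump condition inverted so that a false condition skips to the else branch or to the end of the statement.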
Classes and Instance Variables
Add the remaining code for classes that don't extend other classes, including calculating object sizes and assigning offsets to instance variables, and using instance variables in expressions and as the target of assignments. At this point, you should be able to compile and execute substantial programs.
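Object sizes and instance variable offsets can be computed in one pass over a class's fields. The sketch below assumes that every field occupies 8 bytes, that offset 0 holds the method table pointer, and that the object pointer is in %rdi when a field is accessed; the class name ObjectLayoutSketch is hypothetical. It handles only classes that do not extend other classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: 8-byte fields laid out after the method table pointer.
class ObjectLayoutSketch {
    static final int WORD = 8;
    private final Map<String, Integer> fieldOffsets = new LinkedHashMap<String, Integer>();

    ObjectLayoutSketch(List<String> fields) {
        int offset = WORD;               // offset 0 is the method table pointer
        for (String f : fields) {
            fieldOffsets.put(f, offset);
            offset += WORD;
        }
    }

    /** Number of bytes to allocate for one object of this class. */
    int sizeInBytes() {
        return WORD + WORD * fieldOffsets.size();
    }

    int offsetOf(String field) {
        Integer off = fieldOffsets.get(field);
        if (off == null) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        return off;
    }

    /** Load a field of the object pointed to by %rdi into %rax. */
    String emitLoad(String field) {
        return "    movq " + offsetOf(field) + "(%rdi), %rax\n";
    }

    /** Store %rax into a field of the object pointed to by %rdi. */
    String emitStore(String field) {
        return "    movq %rax, " + offsetOf(field) + "(%rdi)\n";
    }
}
```

For a class with fields x and y this yields a 24-byte object with x at offset 8 and y at offset 16.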
Extended Classes
The main issue here is generating the right object layouts and method tables for extended classes, including handling method overriding properly. Once you've done that, dynamic dispatching of method calls should work, and you will have almost all of MiniJava working.
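A common way to lay out method tables for extended classes is to start from a copy of the parent's table, replace the entry for each overridden method in place, and append new methods at the end. That keeps every inherited method at the same slot in the subclass, which is what makes dynamic dispatch work. The sketch below assumes this scheme; the class ExtClassSketch and the Class$method label format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: method table layout with overriding for single inheritance.
class ExtClassSketch {
    final String name;
    final ExtClassSketch parent;     // null if the class extends nothing
    final List<String> declared;     // methods declared in this class

    ExtClassSketch(String name, ExtClassSketch parent, List<String> declared) {
        this.name = name;
        this.parent = parent;
        this.declared = declared;
    }

    /** Slot list; each entry is {methodName, implementingLabel}. */
    List<String[]> slots() {
        // parent.slots() builds a fresh list, so modifying it here is safe.
        List<String[]> table =
            (parent == null) ? new ArrayList<String[]>() : parent.slots();
        for (String m : declared) {
            String[] entry = { m, name + "$" + m };
            int i = indexOf(table, m);
            if (i >= 0) {
                table.set(i, entry);   // override: same slot, new code
            } else {
                table.add(entry);      // new method: next free slot
            }
        }
        return table;
    }

    private static int indexOf(List<String[]> table, String method) {
        for (int i = 0; i < table.size(); i++) {
            if (table.get(i)[0].equals(method)) {
                return i;
            }
        }
        return -1;
    }

    /** Labels in slot order, ready to be emitted as .quad entries. */
    List<String> labels() {
        List<String> result = new ArrayList<String>();
        for (String[] entry : slots()) {
            result.add(entry[1]);
        }
        return result;
    }
}
```

If class A declares f and g, and class B extends A and declares g and h, then the table for B is A$f, B$g, B$h: g stays in slot 1, so a call through an A reference reaches B's version when the object is a B. Instance variables can be handled the same way, with the subclass's fields placed after the inherited ones.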
Arrays
We suggest you leave this until late in the project, since you can get most everything else working without arrays.
The Rest
Whatever is left, including any extensions you've added to the project, and items like storable Boolean values, which are not essential to the rest of the project.
C Bootstrap
As discussed in class, the easiest way to run the compiled code is to call it from a trivial C program. That ensures that the stack is properly set up when the compiled code begins execution, and provides a convenient place to put other functions that provide an interface between the compiled code and the external world.
We have provided a small bootstrap program, boot.c, which we suggest you start with. Feel free to embellish this code as you wish. In particular, you may find that it is sometimes easier to have your compiler generate code that calls a C runtime function to do something instead of generating the full sequence of instructions directly in the assembly code. You can add such functions to the .c file.
New 2/27: Please store the version of boot.c used by your compiled code, with any additions or changes you've made, in src/boot.c in your project tree. We will use this file when running and testing your compiled code.
Executing x86-64 Code with gcc
Your compiler should produce output containing x86-64 assembly language code suitable as input to the GNU assembler as. You can compile and execute your generated code and the bootstrap program using gcc, and you can use gdb to debug it at the x86-64 instruction level.
There is a sample assembler file demo.s that demonstrates the linkage between boot.c and assembler code. This demo file does not contain a full MiniJava program, and the code produced by your compiler will be different, but it should give you a decent idea of how the setup is designed to work. Use this and boot.c as input to gcc to generate an executable demo program. You can also use gcc to generate additional examples of x86-64 assembly code. If foo.c contains C code, gcc -S foo.c will compile it and create a file foo.s with the corresponding x86-64 code.
The output produced by your compiler should compile and run on 64-bit Linux systems. Our baseline system for testing is attu, which is the same setup as the Linux workstations in the CSE labs. You can use a CSE Linux VM to test code on your own computer (see the CSE Home Virtual Machines page for details).
You should test your compiler by processing several MiniJava programs. By the time you're done you should be able to compile any of the MiniJava example programs distributed with the starter code. Since every legal MiniJava program is also a legal full Java program, you can compare the behavior of programs compiled and executed by your MiniJava compiler with the results produced when the same program is compiled and executed using javac/java.
As before, your compiler should only use language features available in Java 7, which is the environment that will be used to test your code.
You should continue to use your CSE 401 gitlab repository to store the code for this and remaining parts of the compiler project.
What to Hand In
As usual, your code should run on the Linux lab machines or attu when built with ant. You should do an ant clean, then bundle up your compiler directory in a tar file and turn that in. That will ensure that we have all the pieces of your compiler if we need to check something.
To create the tar file, run the following commands starting in your main project directory (the one that contains build.xml):
    ant clean
    cd ..
    tar cvfz codegen.tar.gz your_project_directory_name
Then turn in the codegen.tar.gz file.
Be sure that your boot.c runtime code is in src/boot.c and that it is included in the tar file you submit.
You and your partner should turn in only a single copy of the project using one of your UW netids, preferably the same one you used for previous parts of the project, although this is not required. Your INFO file should include your names and UW netids so we can correctly identify everyone involved in the group and get feedback to you. Multiple turnins are fine, as usual; we'll grade the last one you give us. In particular, if you plan on adding any extra features, turn in a copy of the working, basic assignment first before you add these, then turn in the enhanced compiler later once your additions are working.
Your INFO file should describe anything unusual about your project, including notes about extensions, clever code generation strategies, or other interesting things in this phase of the compiler. You should be sure to describe how much is working and any major surprises (either good or bad) you encountered along the way. In particular, if this phase of the project required going back and making changes to previously implemented parts, give a brief description of what was done and why it was needed.
To be sure that everything is in working order, we strongly suggest that before you create the tar file you first run ant clean; ant to rebuild your project from scratch, then run any tests you want, then run the commands given above to create the actual tar file to be turned in. That will also verify that your code will compile using Java 7, assuming that you run this on a lab machine that has Java 7 installed.