Mini Java Programming Project
CSE 401 Winter 2014
Robert R. Henry
rrh@cs.washington.edu
cse401-staff@cs.washington.edu
Note
This document is subject to change, especially early on in the quarter. We’ll try our best to let you know via email the gist of the changes.
The master copy of this document is held in a docs.google.com document, so it may render a little funny after it is copied to the CSE course web.
Overview
The course project is to implement an extended version of the MiniJava language specified in the Appendix of Appel's Modern Compiler Implementation in Java, 2nd edition, with amendments and modifications given in this assignment, or in modifications to this assignment.
We will supply you with the Java and C source code for a seed for your project, which includes:
- An implementation of a mostly full AST; you will have to make some obvious changes and extensions to support some of the extensions to the language.
- An implementation of a very simple source code pretty printer.
- An implementation of a very simple expression code generator, for a tiny impure subset of Mini Java. It can generate code for function entry, function exit, the scaffolding “display” statement, and integer expression evaluation using addition.
- Some sample programs, including ones implementing classical data structures and algorithms, others doing integer numerical computations, and others doing double precision numerical computations.
- A Makefile to compile the compiler, as well as compile and run the test program(s).
- A startup and runtime library written in the C in the file “boot.c” and “number_converter.c”. You should know enough C code by now to understand all that goes on in “boot.c”, but you are not responsible for the heavy numerics in the other files.
The basic grammar is described on the MiniJava web site. We will implement more than this grammar as we have additional extensions. MiniJava is a subset of Java and the meaning of a MiniJava program is given by its meaning as a Java program.
As compared to full Java, a MiniJava program consists of a single class containing a static main method followed by zero or more other classes.
- There are no other static methods or variables
- Classes may extend other classes.
- Method overriding is included, but method overloading is not supported.
- All classes making up a MiniJava program are included in a single source file.
- The only types available are int, double, boolean, int[] and double[].
- "System.out.println(...);" is a statement and can print either integers or doubles: it is not a normal method call, and does not refer to a static method of a class.
- All methods are value-returning and must end with a single return statement. (You will find this awkward to program with.)
- The only other available statements are if, while, and assignment.
The basic MiniJava grammar is given on the project web site and in the Appendix of Appel's Modern Compiler Implementation in Java (2nd ed). Look at the grammar carefully to see what is and is not included in the language subset.
You will have to implement several binary operators over and beyond what’s given in the basic MiniJava grammar, just because it will make programming real useful programs in MiniJava that much easier.
You should implement (or not implement…) the full MiniJava as described there, but with these changes:
- You must implement // style comments (// to end of line is a comment)
- You do NOT need to implement /* comments */.
- Implement “int” as if it were a “long”, eg all ints (and pointers and doubles…) are 64 bits long.
- You must implement both short circuit “and” (&&) and short circuit “or” (||) operators; the core MiniJava has only &&.
- You must implement all 6 relational operators: <, <=, ==, !=, >= and >; once you figure out how to do one of them, the others fall into place. Note that the core MIniJava has only <. Note also that comparing doubles is tricky; see below.
- You must implement binary operators for +, -, *, / and %. (The core MiniJava has only +, - and *; the division operators were originally omitted because they would require exceptions, and the x86_64 ISA is tricky.)
- You do NOT need to worry about integer division by zero, and do not need to implement exception throwing just to support integer division by zero.
- You must implement basic support for double precision floating point numbers, using IEEE-754 64-bit floating point numbers, corresponding to the “double” type in Java. You will need to implement double precision literals, the “double” keyword, double arrays, evaluate expressions involving double arithmetic, assign to doubles, pass double arguments to functions, and return doubles from functions. Do NOT use the obsolete x87 instruction set.
- You do NOT need to implement explicit or implicit type conversions from ints to doubles.
- Generating code to compare doubles is tricky, due to bizarre asymmetry in the x86_64 SSE instruction set. Instead, call a library function to do the work, and rely on gcc to use the right set of tricks. You’ll need 6 library functions which you should put into boot.c.
- There is a function in boot.c named “put_double” which prints a double precision number, using the Java rules here http://docs.oracle.com/javase/6/docs/api/java/lang/Double.html#toString(double). You do not need to understand what happens in “put_double”, because it is very tricky to write (and read) floating point numbers accurately without loss of precision. The code in “put_double” should produce identical results to what Java does for all numbers you are likely to encounter in practice; it is “off” for very small denormalized floating point numbers.
- You do NOT need to worry about functions with more than 6 parameters, including the implicit “this” parameter.
- You must implement array bounds checking, and can just exit the program when the index is out of bounds (eg, don’t worry about exceptions).
- You must implement a statement counting profiler; see below.
Implementation
- Please do your best to follow the Google Java coding guidelines: http://google-styleguide.googlecode.com/svn/trunk/javaguide.html#s3.3-import-statements; the code you are given may deviate, like any hypocritical set of rules.
- You must use a source code control system. We encourage you to use “git” since that’s how we’re going to be distributing the base version of the compiler.
- Your MiniJava compiler should be implemented in Java using the JFlex/CUP scanner/parser tools.
- Your MiniJava compiler should generate x86_64 assembly code that when executed produces the expected output, where “expected” is defined by the standard Java execution.
- You must follow the x86_64 ABI conventions for calling and returning from functions.
- We don’t care how efficient your generated code is, just that it is correct. Memory is cheap, the stack is unbounded, etc. Calling library functions may be appropriate.
- You are free to use any development platform and environment you wish, but your resulting project should build (using ant) and run on attu, the lab linux machines.
- If you use Eclipse, you should be able to create a suitable project from the starter code by selecting a new "Java Project from Existing Ant Buildfile". Don't create a new generic Java project - it doesn't seem to set things up correctly.
- You must be able to compare the reference presumed-good output of standard Java’s execution of a Mini Java program with your execution of the same program. Naturally, we expect the outputs to be identical. (You may find that using GNU make is better suited for this than using ant.)
Suggested General Implementation Order
This is merely a suggestion. You may find that you bounce around the matrix of (extensions * compiler phases), based on what you are working on at the time, and what you know, or what’s fun at the moment.
- Implement // style comments.
- Get expression evaluation to work for integers; leave doubles for last.
- Generate and use values for comparison operators.
- Implement control flow statements.
- Implement short circuit operators.
- Implement scanning and parsing for the complete grammar, using existing AST.
- Implement class and method interface symbol tables or equivalent.
- Implement function call/return without type checking and without any vtbls.
- Add type checking of all expression operators for ints and bools.
- Add type checking of function call/return across arguments.
- Implement stack frame layout for local variables.
- Implement data layout for objects.
- Implement object data member references.
- Implement new.
- Implement vtbls and dynamic dispatch.
- Implement the “extends” keyword to implement subclassing.
- Implement array allocation, rval indexing and lval indexing, without bounds checking.
- Implement array bounds checking; just exit on failure.
- Add statement counting.
- Add doubles; by now you’ve done everything for integers, so know how to proceed.
- Implement comparison of doubles by function calls to C functions in the library.
Statement Counting Profiler
You must implement a statement counting profiler. A collaboration between your compiler, your runtime system, and your post execution system must be able to print out the original source code of the compiled Mini Java program, with annotations showing how many times each statement was executed. It is OK to assume the Mini Java program has a maximum number of lines. If there is more than one statement per line, just show the counts for one of the statements on the line. (But who would write code like that anyway, since it violates the Java style guidelines here?)
Extensions
The basic project requirements are small enough to be tractable in a one-quarter course project, but include enough to cover the core ideas of compiling an object-oriented (as well as procedural) language.
If you are feeling ambitious and have the time, you are invited to add additional parts of Java to the language. While this sounds daunting now, once you figure out the basics of the compiler architecture, it becomes a simple matter of adding a handful of lines of Java code to your compiler in each of half a dozen places, emulating what’s already there.
Here are a few suggestions.
Some Simple Ideas for Extensions
- Add /* ... */ comments; don’t bother with nested comments, as they aren’t in Java.
- Add a for statement.
- Add null as a constant expression and support == and != for object reference types.
- Allow return statements anywhere in a method.
- Allow calls of local methods without the explicit use of this.
- Allow explicit p.v references to instance variables.
- Relax the ordering of declarations so that variables can be declared anywhere in a method body.
- Allow initialization of variables in declarations.
- Add void methods, a return statement with no expression, and appropriate type checking rules.
- Support public and private declarations on both methods and instance variables, and check to ensure that access restrictions are not violated.
- Add op= operators, such as +=, *=, etc.
- Add a unary negation operator (note there’s already a unary “not” operator for booleans)
- Support implicit automatic promotion of integers to doubles.
Interesting and more Sophisticated ideas for Extensions
- Support instanceof and type casts (which can require a runtime instanceof check).
- Support super. as a prefix in ordinary method calls.
- Add constructor declarations with optional parameters. To make this useful, you probably want to include the use of super(...) at the beginning of the body of a constructor.
- Support arrays of objects (no harder to implement than arrays of ints, except the type checking rules are more interesting).
Of the suggested extensions, adding instanceof, type casts, and super are particularly instructive.
Some small amount of extra credit will be awarded to projects that go beyond the basic requirements.
Grading
Evaluating and grading the project will be done with a combination of code reading by the TAs; code walk throughs with the TAs and possibly the instructor; and end of quarter face to face discussions between the instructor, the TAs and members of each group.