CSE 401 15wi - Project V - Compiler Additions

Due: Thursday, March 12 at 11 pm. You should turn in your project using the assignment drop box (see link on the course home page). Further instructions are at the bottom of this writeup.

Overview

For the final part of the project, extend your MiniJava compiler by adding a new data type to MiniJava: either String or double. You should implement one or the other, not both. As with the rest of MiniJava, the resulting language will be a proper subset of standard Java, and programs executed with your extended MiniJava compiler should have the same behavior that they have when executed using javac/java.

You should make appropriate changes to your compiler to scan, parse, type-check, and generate code for programs that use the new type.

As before, your compiler should only use language features available in Java 7, which is the environment that will be used to test your code.

You should continue to use your CSE 401 gitlab repository to store the code for this part of the compiler project.

Choice I. Add `String`s

Add basic support for data of type String to MiniJava including constants, output, assignment, and String concatenation using +.

Requirements

Add Type ::= "String" as a new production. For MiniJava we will treat "String" as a reserved word instead of the name of a class in order to simplify the implementation.
Add Expression ::= <STRING_LITERAL> as a new production.
Allow parameters, variables, and fields of type String, and use of String values in assignment statements and as arguments and return values in method calls.
Overload the operator "+" to support String concatenation.
Overload System.out.println so it can print String values as well as integers.

Restrictions and simplifying assumptions

These changes simplify the implementation of MiniJava, while still retaining source compatibility with full Java.

"String" is treated as a reserved word
A <STRING_LITERAL> consists of a sequence of characters surrounded by double quotes (") and containing only printable ASCII characters rather than 16-bit Unicode characters. You do not need to support escape codes like \", \n or \t (although you may add these if you like).
Strings are not objects and do not have member functions such as s.length().

Implementation

We suggest you store Strings in memory as ordinary \0-terminated C Strings. A MiniJava String value can then be represented as a 64-bit pointer to the character array.
<STRING_LITERAL>s can be included in assembly code using the .ascii or .asciz assembler directives. These constants should go in the .data segment.
A reasonable way to implement "+" for Strings is to allocate space on the heap and store the result there.
You will definitely want to add code to boot.c to implement System.out.println for String values. You should also consider adding routines to boot.c to implement operations like "+" using C function(s) that can be called from compiled code.
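
The runtime support suggested above might look roughly like the following sketch. The function names mj_println_string and mj_string_concat are hypothetical (they are not part of the provided boot.c), and the sketch assumes Strings are ordinary \0-terminated C strings whose concatenation results are never freed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper names: choose whatever fits your own boot.c. */

/* Print a MiniJava String followed by a newline, like System.out.println. */
void mj_println_string(const char *s) {
    puts(s);
}

/* Concatenate two \0-terminated strings into newly allocated heap storage.
   The operands are left untouched; the result is never freed in this sketch. */
char *mj_string_concat(const char *a, const char *b) {
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    char *result = malloc(len_a + len_b + 1);
    if (result == NULL) {
        fprintf(stderr, "out of memory during String concatenation\n");
        exit(1);
    }
    memcpy(result, a, len_a);
    memcpy(result + len_a, b, len_b + 1);  /* also copies b's terminating \0 */
    return result;
}
```

Compiled code would then evaluate s + t by placing the two String pointers in the first two integer argument registers, calling the concatenation routine, and using the pointer returned in %rax as the value of the expression.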

Choice II. Add `double`s

Add support for IEEE-754 64-bit floating point numbers, which in Java have type double. The support should include constants, output, assignment, and basic operations. You do not need to implement conversions between ints and doubles; doubles essentially live in a parallel numeric universe similar to that occupied by ints, but not interacting with them.

Requirements

Add Type ::= "double" as a new production. "double" is a new reserved word.
Add Expression ::= <DOUBLE_LITERAL> as a new production.
Allow parameters, variables, and fields of type double, and use of double values in assignment statements and as arguments and return values in method calls.
Overload the operators "+", "-", "*" and "<" to support double values.
Overload System.out.println so it can print double values as well as integers. Numeric output should be formatted as it is in standard Java. We have provided a set of C routines to convert values of type double to strings in the proper format (see the implementation section below).

Restrictions and simplifying assumptions

A <DOUBLE_LITERAL> should have the form of a Java decimal floating-point literal containing decimal digits, an optional decimal point, and an optional exponent part consisting of the letter "e" or "E" followed by a signed integer exponent. As in Java, there must be at least one digit and either a decimal point or an exponent (or both). You are not required to implement Java’s "float suffix" ("f", "F", "d", or "D" following the rest of the literal).
You do not need to support implicit or explicit conversions between doubles and integers, including implicit conversions in assignments or method calls, or implicit conversions needed to support mixed-mode arithmetic such as 1 + 2.0.
You do not need to deal with unusual double values like NaN.
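
To make the literal rule concrete, here is a small C restatement of it as a recognizer. This is only an illustration: in your compiler the rule would normally be a regular expression in the scanner specification, and the function name is_double_literal is made up for this sketch.

```c
#include <ctype.h>

/* Returns 1 if the whole string s has the form of a <DOUBLE_LITERAL> as
   described above (no float suffix), and 0 otherwise. */
int is_double_literal(const char *s) {
    int digits = 0;        /* digits seen before the exponent part */
    int has_point = 0;
    int has_exponent = 0;

    while (isdigit((unsigned char)*s)) { s++; digits++; }
    if (*s == '.') {
        has_point = 1;
        s++;
        while (isdigit((unsigned char)*s)) { s++; digits++; }
    }
    if (digits == 0) return 0;            /* rejects "." and "e5" */

    if (*s == 'e' || *s == 'E') {
        has_exponent = 1;
        s++;
        if (*s == '+' || *s == '-') s++;
        if (!isdigit((unsigned char)*s)) return 0;   /* exponent needs digits */
        while (isdigit((unsigned char)*s)) s++;
    }
    /* Must consume the whole string and contain a point or an exponent. */
    return *s == '\0' && (has_point || has_exponent);
}
```

Under this rule 1.5, 1., .5, 1e10 and 2.5E-3 are accepted, while 42 is rejected because it is an ordinary integer literal.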

Implementation

You should use the x86-64 SSE registers and instructions to implement doubles using 64-bit IEEE floating-point arithmetic. The web site for the Bryant/O’Hallaron textbook used in CSE 351 has a good introduction to and description of this part of the x86-64 architecture. See http://csapp.cs.cmu.edu/public/waside/waside-sse.pdf. You also may find it useful to write small C functions using doubles and look at the assembly code generated by gcc -S.
Generating code to compare doubles can be somewhat tricky, although since you are not required to deal with NaNs, it may not be much more difficult than dealing with integers. However, one easy way to handle comparison is to add a library function to boot.c that does the work.
You should use the x86-64 C language calling conventions for functions with double-precision floating point values in their argument lists. Again, see the Bryant/O’Hallaron SSE floating-point discussion and, if needed, the AMD64 Application Binary Interface documents (linked on the project web page) for details. Note that doubles have a separate set of registers and you will need to be careful with parameter lists that have a mixture of doubles and other values.
Output of doubles is tricky because the formatting needed to convert a double to a string that matches the one used in Java is quite complex. We have provided two files, number_converter.h and number_converter.c, that contain a function convert_double that creates the string representation of a double value using the rules defined by Java. These rules are documented at http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#toString(double). You do not need to understand how this code works, but it is interesting to see how tricky it is to do these conversions correctly.
You will definitely want to add code to boot.c to implement System.out.println for doubles.
In your compiler code, take advantage of Java library methods to do the conversion from <DOUBLE_LITERAL> strings to binary values of type double if you need to do this. If you need to include a <DOUBLE_LITERAL> value in generated assembly code, take advantage of the assembler’s .double directive to do the conversion for you.
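
As a sketch of the library-function approach to comparison, and of how mixed parameter lists are passed, consider the two small C functions below. Both names are hypothetical. Compiling them with gcc -S is one way to see the SSE instructions and register assignments involved.

```c
/* Hypothetical boot.c helper: returns 1 if a < b, otherwise 0.
   Under the x86-64 System V calling convention, a arrives in %xmm0,
   b arrives in %xmm1, and the integer result is returned in %rax. */
long mj_double_less(double a, double b) {
    return a < b ? 1 : 0;
}

/* Illustration of a mixed parameter list. Integer and double arguments
   are numbered separately: i is in %rdi, x in %xmm0, j in %rsi, and
   y in %xmm1. The double result is returned in %xmm0. */
double mj_mixed_demo(long i, double x, long j, double y) {
    return (i < j) ? x : y;
}
```

Since NaN values are out of scope for this assignment, the ordinary C comparison in mj_double_less gives the same answer that Java's < operator gives on doubles.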

Extra credit

A small amount of extra credit will be awarded for extensions that go beyond these basic requirements. However, do not attempt extra credit extensions until you have your chosen compiler addition working properly. Extra credit will not be awarded to projects that do not include a substantially correct implementation of one of String or double.

Some possibilities

Implement both Strings and doubles
Add some simple String member functions like s.length(), or more complicated ones like s.substring(start,end), s.concat(other), and so on.
Support mixed-mode concatenation of a String and an int so that "thing" + 42 yields "thing42".
Add arrays of Strings. Then, once you have that, you could go further and provide a mechanism to pass command-line arguments to main as a String array, and allow code to access elements of that array or pass it as a parameter to some method.
Add arrays of doubles. (This one should be quite simple, actually.)
Support mixed-mode arithmetic using doubles and ints, and widening conversions from int to double in assignment statements and method argument lists.
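
For the mixed-mode concatenation idea, one possible runtime helper is sketched below. It assumes the \0-terminated String representation suggested earlier and a 32-bit Java int; the name mj_string_concat_int is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new heap-allocated string holding s
   followed by the decimal representation of n. */
char *mj_string_concat_int(const char *s, int n) {
    /* A 32-bit int needs at most 11 characters ("-2147483648"),
       plus one more for the terminating \0. */
    size_t len = strlen(s);
    char *result = malloc(len + 12);
    if (result == NULL) {
        fprintf(stderr, "out of memory during String concatenation\n");
        exit(1);
    }
    memcpy(result, s, len);
    snprintf(result + len, 12, "%d", n);  /* writes the digits and the \0 */
    return result;
}
```

The type checker would also need to accept String + int and select this routine instead of integer addition or plain String concatenation.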

What to Hand In

As usual, your code should run on the linux lab machines or attu when built with ant. You should do an ant clean, then bundle up your compiler directory in a tar file and turn that in. That will ensure that we have all the pieces of your compiler if we need to check something. To create the tar file, run the following commands starting in your main project directory (the one that contains build.xml):

  ant clean
  cd ..
  tar cvfz additions.tar.gz your_project_directory_name

Then turn in the additions.tar.gz file.

Be sure that your boot.c runtime code is in src/boot.c and that it is included in the tar file you submit.

If you add doubles to your compiler, be sure that the number_converter.h and number_converter.c files are also in your src directory.

You and your partner should turn in only a single copy of the project using one of your UW netids, preferably the same one you used for previous parts of the project, although this is not required. Your INFO file should include your names and UW netids so we can correctly identify everyone involved in the group and get feedback to you. Multiple turnins are fine, as usual: we'll grade the last one you give us. In particular, if you plan on adding any extra features, turn in a copy of the working, basic assignment first before you add these, then turn in the enhanced compiler later once your additions are working.

Your INFO file should describe anything unusual about your project, including notes about extensions, clever code generation strategies, or other interesting things in this phase of the compiler. You should be sure to describe how much is working and any major surprises (either good or bad) you encountered along the way. In particular, if this phase of the project required going back and making changes to previously implemented parts, give a brief description of what was done and why it was needed.

To be sure that everything is in working order, we strongly suggest that before you create the tar file you first run ant clean; ant to rebuild your project from scratch, then run any tests you want, then run the commands given above to create the actual tar file to be turned in. That will also verify that your code will compile using Java 7, assuming that you run this on a lab machine that has Java 7 installed.
