
CSE 401 15wi - Project V - Compiler Additions
Due: Thursday, March 12 at 11 pm. You should turn in your project using the assignment drop box (see link on the course home page). Further instructions are at the bottom of this writeup.
Overview
For the final part of the project, extend your MiniJava compiler by
adding a new data type to MiniJava: either String or double. You should
implement one or the other, not both. As with the rest of MiniJava, the
resulting language will be a proper subset of standard Java, and programs
executed with your extended MiniJava compiler should have the same
behavior that they have when executed using javac/java.
You should make appropriate changes to your compiler to scan, parse, type-check, and generate code for programs that use the new type.
As before, your compiler should only use language features available in Java 7, which is the environment that will be used to test your code.
You should continue to use your CSE 401 gitlab repository to store the code for this part of the compiler project.
Choice I. Add Strings
Add basic support for data of type String to MiniJava, including
constants, output, assignment, and String concatenation using +.
Requirements
- Add Type ::= "String" as a new production. For MiniJava we will treat "String" as a reserved word instead of the name of a class in order to simplify the implementation.
- Add Expression ::= <STRING_LITERAL> as a new production.
- Allow parameters, variables, and fields of type String, and use of String values in assignment statements and as arguments and return values in method calls.
- Overload the operator "+" to support String concatenation.
- Overload System.out.println so it can print String values as well as integers.
Restrictions and simplifying assumptions
These changes simplify the implementation of MiniJava, while still retaining source compatibility with full Java.
- "String" is treated as a reserved word.
- A <STRING_LITERAL> consists of a single string surrounded by double quotes (") and consisting only of printable ASCII characters rather than 16-bit Unicode characters. You do not need to support escape codes like \", \n or \t (although you may add these if you like).
- Strings are not objects and do not have member functions such as s.length().
Implementation
- We suggest you store Strings in memory as ordinary \0-terminated C strings. A MiniJava String value can then be represented as a 64-bit pointer to the character array.
- <STRING_LITERAL>s can be included in assembly code using the .ascii or .asciz assembler directives. These constants should go in the .data segment.
- A reasonable way to implement "+" for Strings is to allocate space on the heap and store the result there.
- You will definitely want to add code to boot.c to implement System.out.println for String values. You should also consider adding routines to boot.c to implement operations like "+" using C function(s) that can be called from compiled code.
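As a rough illustration of the last two suggestions, here is a minimal sketch of runtime routines that could be added to boot.c. The names mj_string_concat and mj_put_string are hypothetical (they are not part of the provided boot.c), and the sketch assumes Strings are \0-terminated character arrays as suggested above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for String "+": returns a newly heap-allocated,
   \0-terminated string holding the characters of a followed by those of b. */
char *mj_string_concat(const char *a, const char *b) {
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *result = malloc(la + lb + 1);   /* +1 for the terminating \0 */
    if (result == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    memcpy(result, a, la);
    memcpy(result + la, b, lb + 1);       /* also copies b's terminating \0 */
    return result;
}

/* Hypothetical helper for System.out.println(String): prints s and a newline. */
void mj_put_string(const char *s) {
    printf("%s\n", s);
}
```

Compiled code would call these like any other C function, passing the String pointers as ordinary 64-bit integer arguments and receiving the result pointer as the return value.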
Choice II. Add doubles
Add support for IEEE-754 64-bit floating point numbers, which in
Java have type double. The support should include constants, output,
assignment, and basic operations. You do not need to implement
conversions between ints and doubles; doubles essentially live in a
parallel numeric universe similar to that occupied by ints, but not
interacting with them.
Requirements
- Add Type ::= "double" as a new production. "double" is a new reserved word.
- Add Expression ::= <DOUBLE_LITERAL> as a new production.
- Allow parameters, variables, and fields of type double, and use of double values in assignment statements and as arguments and return values in method calls.
- Overload the operators "+", "-", "*" and "<" to support double values.
- Overload System.out.println so it can print double values as well as integers. Numeric output should be formatted as it is in standard Java. We have provided a set of C routines to convert values of type double to strings in the proper format (see the implementation section below).
Restrictions and simplifying assumptions
- A <DOUBLE_LITERAL> should have the form of a Java decimal floating-point literal containing decimal digits, an optional decimal point, and an optional exponent part consisting of the letter "e" or "E" followed by a signed integer exponent. As in Java, there must be at least one digit and either a decimal point or an exponent (or both). You are not required to implement Java’s "float suffix" ("f", "F", "d", or "D" following the rest of the literal).
- You do not need to support implicit or explicit conversions between doubles and integers, including implicit conversions in assignments or method calls, or implicit conversions needed to support mixed-mode arithmetic such as 1 + 2.0.
- You do not need to deal with unusual double values like NaN.
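To make the <DOUBLE_LITERAL> rule concrete, here is a small sketch of a recognizer for the literal form described above. It is written in C only for illustration; in your compiler this check would normally live in the scanner specification. The function name is_double_literal is hypothetical, and the sketch assumes the whole string must be the literal (no float suffix).

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true if s has the form
       digits? [ '.' digits? ] [ ('e' | 'E') sign? digits ]
   with at least one digit before the exponent part, and with a decimal
   point or an exponent (or both) present. */
bool is_double_literal(const char *s) {
    int mantissa_digits = 0;
    bool has_point = false;
    bool has_exponent = false;

    while (isdigit((unsigned char)*s)) { s++; mantissa_digits++; }
    if (*s == '.') {
        has_point = true;
        s++;
        while (isdigit((unsigned char)*s)) { s++; mantissa_digits++; }
    }
    if (mantissa_digits == 0) {
        return false;                     /* "." or "e5" alone is not a literal */
    }
    if (*s == 'e' || *s == 'E') {
        has_exponent = true;
        s++;
        if (*s == '+' || *s == '-') s++;  /* optional sign */
        if (!isdigit((unsigned char)*s)) {
            return false;                 /* exponent needs at least one digit */
        }
        while (isdigit((unsigned char)*s)) s++;
    }
    /* Nothing may follow, and a plain integer such as "42" is rejected. */
    return *s == '\0' && (has_point || has_exponent);
}
```

Under these rules 3.14, 1., .5, 1e10 and 2.5E-3 are accepted, while 42, a lone ".", 1e and 1.0f are rejected.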
Implementation
- You should use the x86-64 SSE registers and instructions to implement doubles using 64-bit IEEE floating-point arithmetic. The web site for the Bryant/O’Hallaron textbook used in CSE 351 has a good introduction to and description of this part of the x86-64 architecture. See http://csapp.cs.cmu.edu/public/waside/waside-sse.pdf. You also may find it useful to write small C functions using doubles and look at the assembly code generated by gcc -S.
- Generating code to compare doubles can be somewhat tricky, although since you are not required to deal with NaNs, it may not be much more difficult than dealing with integers. However, one easy way to handle comparison is to add a library function to boot.c that does the work.
- You should use the x86-64 C language calling conventions for functions with double-precision floating point values in their argument lists. Again, see the Bryant/O’Hallaron SSE floating-point discussion and, if needed, the AMD64 Application Binary Interface documents (linked on the project web page) for details. Note that doubles have a separate set of registers and you will need to be careful with parameter lists that have a mixture of doubles and other values.
- Output of doubles is tricky because the formatting needed to convert a double to a string that matches the one used in Java is quite complex. We have provided two files, number_converter.h and number_converter.c, that contain a function convert_double that creates the string representation of a double value using the rules defined by Java. These are found in http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#toString(double). You do not need to understand how this code works, but it is interesting to see how tricky it is to do these conversions correctly.
- You will definitely want to add code to boot.c to implement System.out.println for doubles.
- In your compiler code, take advantage of Java library methods to do the conversion from <DOUBLE_LITERAL> strings to binary values of type double if you need to do this. If you need to include a <DOUBLE_LITERAL> value in generated assembly code, take advantage of the assembler’s .double directive to do the conversion for you.
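The comparison and calling-convention points above can be illustrated with a short sketch of functions that might be added to boot.c. The names mj_double_less and mj_mixed_example are hypothetical, and the register assignments in the comments assume the System V x86-64 convention used by gcc on Linux; compiling this with gcc -S is a good way to confirm them.

```c
/* Hypothetical helper for "<" on doubles: returns 1 if a < b, otherwise 0.
   The caller places a in %xmm0 and b in %xmm1; the integer result comes
   back in %rax. */
long mj_double_less(double a, double b) {
    return a < b ? 1 : 0;
}

/* Illustration of a mixed parameter list. Integer and floating-point
   arguments are assigned to registers independently of each other:
       n -> %rdi, x -> %xmm0, m -> %rsi, y -> %xmm1
   and the double result is returned in %xmm0. The int-to-double
   conversions here are ordinary C and are only used to combine the values;
   MiniJava itself does not need them. */
double mj_mixed_example(long n, double x, long m, double y) {
    return (double)n * x + (double)m * y;
}
```

Delegating comparison to a function like this trades a call for simpler code generation: the compiled code only has to load the two operands into the first two SSE argument registers and test the integer result.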
Extra credit
A small amount of extra credit will be awarded for extensions that
go beyond these basic requirements. However, do not
attempt extra credit extensions until you have your chosen compiler
addition working properly. Extra credit will not be awarded to
projects that do not include a substantially correct implementation of
one of String or double.
Some possibilities
- Implement both Strings and doubles.
- Add some simple String member functions like s.length(), or more complicated ones like s.substring(start,end), s.concat(other), and so on.
- Support mixed-mode concatenation of a String and an int so that "thing" + 42 yields "thing42".
- Add arrays of Strings. Then, once you have that, you could go further and provide a mechanism to pass command-line arguments to main as a String array, and allow code to access elements of that array or pass it as a parameter to some method.
- Add arrays of doubles. (This one should be quite simple, actually.)
- Support mixed-mode arithmetic using doubles and ints, and widening conversions from int to double in assignment statements and method argument lists.
What to Hand In
As usual, your code should run on the Linux lab machines
or attu when built with ant. You should do an ant clean, then
bundle up your compiler directory in a tar file and turn that in.
That will ensure that we have all the pieces of your compiler if we
need to check something. To create the tar file, run the following
commands starting in your main project directory (the one that
contains build.xml):
    ant clean
    cd ..
    tar cvfz additions.tar.gz your_project_directory_name
Then turn in the additions.tar.gz file.
Be sure that your boot.c runtime code is in src/boot.c
and that it is included in the tar file you submit.
If you add doubles to your compiler, be sure that the
number_converter.h and number_converter.c files
are also in your src directory.
You and your partner should turn in only a single copy of the
project using one of your UW netids, preferably the same one you
used for previous parts of the project, although this is not
required. Your INFO file should include your names and
UW netids so we can correctly identify everyone involved in the
group and get feedback to you. Multiple turnins are fine, as usual -
we'll grade the last one you give us. In particular, if you plan on
adding any extra features, turn in a copy of the working, basic
assignment first before you add these, then turn in the enhanced
compiler later once your additions are working.
Your INFO file should describe anything unusual about
your project, including notes about extensions, clever code
generation strategies, or other interesting things in this phase
of the compiler. You should be sure to describe how much is
working and any major surprises (either good or bad) you
encountered along the way. In particular, if this phase of the
project required going back and making changes to
previously implemented parts, give a brief description of what was
done and why it was needed.
To be sure that everything is in working order, we strongly suggest
that before you create the tar file you first run ant clean; ant
to rebuild your project from scratch, then run any tests
you want, then run the commands given above to create the actual tar
file to be turned in. That will also verify that your code will
compile using Java 7, assuming that you run this on a lab machine
that has Java 7 installed.
Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 FAX