
CSE 401 15wi - Project III - Semantics+Types
Due: Thursday, February 19 at 11:00 pm. You should turn in your project using the assignment drop box (see link on the course home page). Further instructions are at the bottom of this writeup.
Overview
Add static semantics checking to your compiler. In particular, you should do the following:
- Add global symbol table(s) storing information about classes and their members, and local symbol tables for each method to store information about parameters and local variables.
- Calculate and store type information for classes, class members, parameters, and variables.
- Calculate type information for expressions and other appropriate parts of the abstract syntax.
- Add error checking to verify at least the following properties:
- Every name is properly declared.
- Components of expressions have appropriate types
(e.g.,
+
is only applied to values of typeint
,&&
is only applied to values of typeboolean
, the expression in parentheses in anif
orwhile
statement isboolean
, etc.). - If a method is selected from a value with a reference type, then that name is defined as a member of that type.
- Methods are called with the correct number of arguments.
- In assignment statements and method call parameter lists, the values being assigned have appropriate types (i.e., the value either has the same type as the variable or is a subclass of the variable's class).
- If a method in a subclass overrides one in any of its superclasses, the overriding method has the same parameter list as the original method, and the result type of the method is the same as the result type of the original one, or a subclass if the original result type is a reference type.
- If your parser grammar accepts input that is not part of MiniJava (i.e., uses a covering grammar or otherwise parses constructs that are syntactically legal but have semantic errors), check that the program is actually legal. If you implemented extensions like casts, you should add appropriate error and type checking.
- Print suitable messages describing any errors detected. It's fine to suppress useless error messages - for instance, feel free to complain only once about an undeclared variable instead of repeating the message each time the variable is used in the code.
- However, your semantics checking should generally continue to report multiple unrelated errors in a source program rather than stopping immediately after the first error is encountered. But feel free to put in some threshold if you want to terminate the compiler after it has produced a large number of error messages.
Modify your MiniJava main
program so that when it is
executed using the command
java MiniJava -T filename.javait will parse the MiniJava program in the named input file, perform semantic checks as described above, and print the contents of the compiler symbol tables. We do not specify the detailed format of the symbol table output, but there should be one table for each scope, clearly labeled to identify the scope (class or method in most cases), and showing the names declared in that scope, their types, and any other important information. The output should not be any more verbose than necessary.
If no filename is specified on the command, the compiler should read from standard input as it has done in previous parts of the project.
The java
command shown above will also need
a -cp
argument or CLASSPATH
variable as
before to locate the compiled .class
files and
libraries. See the scanner assignment if
you need a refresher on the details
You should also modify your MiniJava
main method so
that if any error is encountered in the source program being compiled
(including in the scanner, parser, or type checker) then when the
compiler terminates, it should use System.exit(1)
to
return a status code of 1 to the system. If no errors are found, the
compiler should terminate with
System.exit(0)
.
Your MiniJava
compiler should still be able to print
out scanner tokens if the -S
option is used instead
of -T
, and
-P
or -A
should continue to print the AST in
the requested format. There is no requirement for how your compiler
should behave if more than one
of -A
, -P
, -S
and -T
are specified at the same time. That is up to
you. You could treat that as an error, or, maybe more useful, the
compiler could print an appropriate combination of tokens, trees, and
symbol tables.
Details and Suggestions
It's probably easiest to collect the type information in muptiple passes over the AST. An initial pass should collect information about classes and fields (both data and methods), and build the global symbol tables. A later pass would then analyze method bodies, build the local symbol tables, and perform type and other error checking. You might find it more convenient to break this down into more passes each of which does fewer things, particularly for the initial pass, where it might be easier to build a global symbol table of class names before processing individual classes to build class symbol tables with information about variables, methods, and their types.
You should add appropriate fields in some or all AST nodes to store references to type and other information as necessary. But remember that you should have a separate data abstraction (ADT) to represent type information used for semantics checking in the compiler, and not confuse this information with the source program type declarations in the AST.
Use the visitor pattern! This is where it pays off to have gone to the trouble to set up the visitor machinery. Provide new implementations of the Visitor interface for the various semantics checks.
Take advantage of the standard library container classes and data
structures in Java to simplify your implementation.
Class HashMap
should be particularly useful for
symbol tables. Use the List
classes
(ArrayList
or LinkedList
) for things
like argument and parameter lists. Don't reinvent any more wheels
than necessary.
It can be useful to include a few auxiliary methods that perform common operations on types. Possibilities include a method that returns true if two types are the same, and a method that returns true if a value of one type is assignable to another. Also possibly useful: a method that tries to add an entry to a symbol table and reports an error if the name is already declared, and another that looks up an identifier and reports an error if it is not found (and maybe adds it to the symbol table with an "undefined" type, which can be used to suppress additional redundant error messages about the same identifier).
You should test your compiler by processing several MiniJava programs, both correct ones and ones with errors. Be sure to check some examples that are syntactically legal (i.e., can be parsed with no errors) but contain semantic errors.
As before, your compiler should only use language features available in Java 7, which is the environment that will be used to test your code.
You should continue to use your CSE 401 gitlab repository to store the code for this and remaining parts of the compiler project.
What to Hand In
As usual, your code should run on the linux lab machines
or attu
when built with ant
. You should do
an ant clean
, then bundle up your compiler directory in
a tar
file and turn that in. That will ensure that we
have all the pieces of your compiler if we need to check something.
To create the tar file, run the following commands starting in your
main project directory (the one that
contains build.xml
):
ant clean cd .. tar cvfz semantics.tar.gz your_project_directory_nameThen turn in the
semantics.tar.gz
file.
You and your partner should turn in only a single copy of the
project using one of your UW netids, preferably the same one you
used for the scanner and parser, although this is not
required. Your INFO
file should include your names and
uw netids so we can correctly identify everyone involved in the
group and get feedback to you. Multiple turnins are fine, as usual - we'll
grade the last one you give us. In particular, if you plan on adding
error handling or other extra features, turn in a copy of the
working, basic assignment first before you add these, then turn in
the enhanced compiler later once your additions are working.
Your INFO
file should describe anything unusual about
your project, including notes about extensions, additional checks
performed, or other interesting things in this phase of your
compiler. You should be sure to describe how much of the requested
checking was actually implemented, and any major surprises (either
good or bad) you encountered along the way. In particular, if this
phase of the project required going back and making changes to
previously previously implemented parts, give a brief description
of what was done and why it was needed.
To be sure that everything is in working order, we strongly suggest
that before you create the tar file you first run ant clean;
ant
to rebuild your project from scratch, then run any tests
you want, then run the commands given above to create the actual tar
file to be turned in. That will also verify that your code will
compile using Java 7, assuming that you run this on a lab machine that has Java 7 installed.
Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 FAX
Comments to adminanchor