Due: Thursday, November 14 at 11:59 pm. You will "turn in" your project as you did with previous assignments by pushing it to your GitLab repository and providing a suitable tag. See the end of this writeup for details.
During sections on Thursday, November 7, you and your partner should show your TA the core data structure APIs for this part of the project. These are the symbol table classes (global, class, and method) and the "Type ADT" classes that you plan to use to represent types and perform operations on them, such as checking for compatible types in assignment statements and elsewhere. You do not need to have implemented these APIs yet (although it would be good if you were reasonably far along at this point), but the specification needs to be plausible and reasonably complete (javadoc optional but very useful). If you and your partner normally attend different sections, both of you should try to go to the same section for the checkin, but if that is not possible, one of you can show your work for the group. A successful checkin will be awarded some sort of point(s), or appropriate other token to indicate successful completion. Of course, as you work on the rest of this part of the compiler you are free to change these APIs, but having them in a reasonable state by this time will help keep the project on pace.
Add static semantics checking to your compiler and add code to print the resulting symbol tables so the semantics and type information can be verified. In particular, you should do and check for the following:
+ is only applied to values of type int, && is only applied to values of type boolean, the expression in parentheses in an if or while statement has type boolean, etc.).
m in value.m(...)), then that name is defined as a member of that type (either locally or inherited from a superclass type).
stderr, the standard error stream (System.err in Java).
The MiniJava MainClass
class, which contains the public static void main method, is not an ordinary class and will need special handling. This class is included in MiniJava to provide a starting point for program execution using syntax that matches that needed for an ordinary main method in full Java, but this class does not have instance variables or methods besides main. MiniJava programs do not need to support creating new instances of this class or extending this class with subclasses. The main method itself cannot contain any declarations. Because of these differences, the semantic checking needed for this class will likely be more limited than the full checks required for other classes. The name of this class should, of course, be different from the names of all other classes in the program. The statement in the body of method main needs to be checked thoroughly to be sure it is correct. It's ok to ignore the String[] parameter to main since there is no way to use it because String values are not included in MiniJava.
Modify your MiniJava main program so that when it is executed using the command
java MiniJava -T filename.java
it will parse the MiniJava program in the named input file, perform semantic checks as described above, and print the contents of all compiler symbol tables (global, class, and method tables) to stdout (System.out).
Print as much of the symbol tables as possible even if there are semantic errors in the program, including cases where there are especially severe errors or so many errors that some error messages are suppressed. In other words, print everything that is discovered about the program's semantics; don't omit things even if there are errors in the program being compiled.
We do not specify the detailed format of the symbol table output, but there should be appropriate table(s) for each scope, clearly labeled to identify the scope (class or method in most cases, including the class or method name), and showing the names declared in that scope, their types, and any other important information. The symbol table output should not be more verbose than necessary, but it should be detailed enough and clearly formatted so that someone reading the table output can check the semantics and types of any MiniJava expression given the scope in which the expression appears and the information shown in the relevant symbol tables. It should not be necessary to view any of the remaining source code of the program to verify that the symbol tables contain the necessary information.
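Since the output format is left open, here is one possible layout, offered only as a sketch. The class name ScopePrinter, the method format, and the use of plain strings to describe each entry are assumptions made for illustration; your own tables will hold richer information.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of any starter code): renders one scope
// as a labeled header line followed by one "name : description" line per entry.
final class ScopePrinter {
    static String format(String kind, String name, Map<String, String> entries) {
        StringBuilder sb = new StringBuilder();
        sb.append(kind).append(' ').append(name).append('\n');
        for (Map.Entry<String, String> e : entries.entrySet()) {
            sb.append("  ").append(e.getKey())
              .append(" : ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps declaration order, which makes output easier to read.
        Map<String, String> members = new LinkedHashMap<>();
        members.put("total", "int (field)");
        members.put("add", "method (int) returns int");
        System.out.print(format("class", "Counter", members));
    }
}
```

Labeling each table with both the kind of scope and its name lets a reader find the right table for any expression without looking at the source program.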
The java command shown above may also need a -cp argument or CLASSPATH variable as before to locate the compiled .class files and libraries. See the scanner assignment if you need a refresher on the details.
As with previous parts of the compiler project, the compiler main method should return an exit or status code of 1 if any errors are detected in the source program being compiled (including errors detected by the scanner or parser, as well as semantics and type checking). If no errors are found, the compiler should terminate with a result code of 0.
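One way to keep the exit status consistent across phases is to route every error through a single object that counts them. The following is a minimal sketch under that assumption; ErrorReporter and its method names are hypothetical and not part of the project starter code.

```java
// Hypothetical error sink shared by the scanner, parser, and semantic passes.
final class ErrorReporter {
    private int errorCount = 0;

    // Records one error and writes the message to the standard error stream.
    void error(int line, String message) {
        errorCount++;
        System.err.println("Line " + line + ": " + message);
    }

    boolean hasErrors() {
        return errorCount > 0;
    }

    // 1 if any error was reported by any phase, otherwise 0.
    int exitStatus() {
        return hasErrors() ? 1 : 0;
    }
}
```

With something like this in place, the compiler's main method could finish with System.exit(reporter.exitStatus()) after the symbol tables have been printed.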
Your MiniJava compiler should still be able to print out scanner tokens if the -S option is used instead of -T, and the options -P and -A should continue to print the AST in the requested format. There is no requirement for how your compiler should behave if more than one of -A, -P, -S, or -T is specified at the same time. That is up to you. You could treat that as an error or, perhaps more usefully, the compiler could print an appropriate combination of tokens, trees, and symbol tables.
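If you decide to accept combinations of options, collecting the flags into a set keeps the main method simple. This is only a sketch of that one design choice; the CommandLine class, its Flag enum, and the decision to reject unknown arguments are all assumptions, not requirements.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical command-line holder: records which of -S, -P, -A, -T were given.
final class CommandLine {
    enum Flag { S, P, A, T }

    final Set<Flag> flags = EnumSet.noneOf(Flag.class);
    String filename; // stays null if no file name was supplied

    static CommandLine parse(String[] args) {
        CommandLine cl = new CommandLine();
        for (String arg : args) {
            switch (arg) {
                case "-S": cl.flags.add(Flag.S); break;
                case "-P": cl.flags.add(Flag.P); break;
                case "-A": cl.flags.add(Flag.A); break;
                case "-T": cl.flags.add(Flag.T); break;
                default:
                    // Anything else must be the single input file name.
                    if (arg.startsWith("-") || cl.filename != null) {
                        throw new IllegalArgumentException("unexpected argument: " + arg);
                    }
                    cl.filename = arg;
            }
        }
        return cl;
    }
}
```

The main method can then test cl.flags.contains(...) for each kind of output in turn, or report an error when the set has more than one element if you prefer to forbid combinations.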
Now would be a good time to go back and re-read the MiniJava project overview and also recheck the language grammar to remind yourself of what is and is not included in the MiniJava subset of Java. Your compiler may, of course, implement extensions to MiniJava, but be sure to refresh your memory about what is contained in the core language.
It's probably easiest to collect the type information in multiple passes over the AST. An initial pass could collect information about classes and fields (both data and methods), and build the global symbol tables. A later pass would then analyze method bodies, build the local symbol tables, and perform type and other error checking. You might find it more convenient to break this down into more passes, each of which does fewer things, particularly for the initial pass where it might be easier to build a global symbol table of class names before processing individual classes to build class symbol tables with information about variables, methods, and their types. Remember that classes can be declared in any order, and similarly methods inside classes can appear in any order, which means that method and class names can be used in parts of the code that appear before their actual declarations in the source file.
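The reason for separating the passes can be seen in a small sketch. It uses a simplified stand-in for class declarations (ClassDeclStub) rather than real AST nodes, and the class and method names are hypothetical. The first pass records every class name; only then does the second pass resolve superclass names, so declaration order does not matter.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a class declaration AST node.
final class ClassDeclStub {
    final String name;
    final String superName; // null when the class extends nothing

    ClassDeclStub(String name, String superName) {
        this.name = name;
        this.superName = superName;
    }
}

final class ClassCollector {
    // Pass 1: enter every class name in the global table, reporting duplicates.
    static Map<String, ClassDeclStub> collect(List<ClassDeclStub> decls, List<String> errors) {
        Map<String, ClassDeclStub> global = new HashMap<>();
        for (ClassDeclStub d : decls) {
            if (global.putIfAbsent(d.name, d) != null) {
                errors.add("duplicate class " + d.name);
            }
        }
        return global;
    }

    // Pass 2: all names are known now, so forward references resolve correctly.
    static void checkSuperclasses(Map<String, ClassDeclStub> global, List<String> errors) {
        for (ClassDeclStub d : global.values()) {
            if (d.superName != null && !global.containsKey(d.superName)) {
                errors.add("class " + d.name + " extends undefined class " + d.superName);
            }
        }
    }
}
```

A single pass would wrongly reject a class that extends another class declared later in the file; here that case is accepted and only a genuinely missing superclass is reported.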
You should add appropriate fields in some or all AST nodes to store references to type and other information as necessary. For example, it likely will be easier to process the AST later in the compiler if nodes for identifiers in the tree have a direct link to the information about that identifier that is found in the relevant symbol table. Similarly, each node in an expression or sub-expression should likely have a "type" field pointing to type information for that node. Remember that you should have a separate data abstraction (ADT) to represent type information used for semantics checking in the compiler, and not confuse this information with source program type declarations in the AST.
Use the visitor pattern! This is where it pays off to have gone to the trouble to set up the visitor machinery. Provide new implementations of the Visitor interface as needed to do the semantics checks.
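As an illustration of a visitor that performs type checks, here is a minimal sketch over a tiny stand-in expression AST. Everything in it is an assumption made for the example: the node classes, the generic ExpVisitor interface that returns a value, and the two-value Ty enum (a real compiler would use its Type ADT). Your project's Visitor interface will differ.

```java
import java.util.ArrayList;
import java.util.List;

enum Ty { INT, BOOLEAN } // stand-in for a real Type ADT

interface ExpVisitor<R> {
    R visit(IntLit e);
    R visit(BoolLit e);
    R visit(Plus e);
    R visit(And e);
}

interface Exp {
    <R> R accept(ExpVisitor<R> v);
}

final class IntLit implements Exp {
    final int value;
    IntLit(int value) { this.value = value; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

final class BoolLit implements Exp {
    final boolean value;
    BoolLit(boolean value) { this.value = value; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

final class Plus implements Exp {
    final Exp left, right;
    Plus(Exp left, Exp right) { this.left = left; this.right = right; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

final class And implements Exp {
    final Exp left, right;
    And(Exp left, Exp right) { this.left = left; this.right = right; }
    public <R> R accept(ExpVisitor<R> v) { return v.visit(this); }
}

// Computes the type of each expression and records operand type errors.
final class TypeCheckVisitor implements ExpVisitor<Ty> {
    final List<String> errors = new ArrayList<>();

    public Ty visit(IntLit e) { return Ty.INT; }
    public Ty visit(BoolLit e) { return Ty.BOOLEAN; }
    public Ty visit(Plus e) { return binary(e.left, e.right, Ty.INT, "+"); }
    public Ty visit(And e) { return binary(e.left, e.right, Ty.BOOLEAN, "&&"); }

    // Both operands must have the expected type; the result has that type too.
    private Ty binary(Exp left, Exp right, Ty expected, String op) {
        for (Exp operand : new Exp[] { left, right }) {
            Ty actual = operand.accept(this);
            if (actual != expected) {
                errors.add("operand of " + op + " must be " + expected + " but is " + actual);
            }
        }
        return expected;
    }
}
```

Returning the expected result type even after an error lets checking continue through the enclosing expression without producing a cascade of follow-on messages.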
Take advantage of the standard library container classes and data structures in Java to simplify your implementation. Class HashMap should be particularly useful for symbol tables. Use the List classes (ArrayList or LinkedList) for ordered collections like argument and parameter lists. Don't reinvent any more wheels than necessary.
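A symbol table built on HashMap can be quite small. The sketch below is one possible shape, not a required design: the Scope class, its parent link, and the use of strings for type information are assumptions for illustration. Linking a method scope to its class scope, and a class scope to its superclass scope, makes lookup of inherited names fall out of a simple loop.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical scope: one instance per class or method symbol table.
final class Scope {
    private final Scope parent; // enclosing or superclass scope; null at the top
    private final Map<String, String> entries = new HashMap<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    // Adds a name to this scope. Returns false, leaving the table unchanged,
    // if the name is already declared in this same scope.
    boolean declare(String name, String type) {
        return entries.putIfAbsent(name, type) == null;
    }

    // Searches this scope first, then each parent in turn; null if not found.
    String lookup(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            String type = s.entries.get(name);
            if (type != null) {
                return type;
            }
        }
        return null;
    }
}
```

For example, a method scope created as new Scope(classScope) finds its own locals and parameters first and then falls back to the fields of the class and its superclasses.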
It should be helpful to create a few auxiliary methods that perform common operations on types. Possibilities include a method that returns true if two types are the same, and a method that returns true if a value of one type is assignable to another. Also possibly useful: a method that tries to add an entry to a symbol table and reports an error if the name is already declared, and another that looks up an identifier and reports an error if it is not found (and maybe adds it to the symbol table with an "undefined" type, which can be used to suppress additional redundant error messages about the same identifier).
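To make the assignability idea concrete, here is a minimal sketch of a Type ADT with an "unknown" type used to suppress follow-on errors. The class names, the isAssignableFrom method, and the UNKNOWN constant are hypothetical, array types are omitted, and the sketch assumes the compiler creates exactly one ClassType object per declared class so that identity comparison is enough.

```java
// Hypothetical Type ADT; separate from type declarations in the AST.
abstract class Type {
    // True if a value of type 'source' may be assigned to a variable of this type.
    abstract boolean isAssignableFrom(Type source);
}

final class PrimitiveType extends Type {
    static final PrimitiveType INT = new PrimitiveType("int");
    static final PrimitiveType BOOLEAN = new PrimitiveType("boolean");
    // Given to names that failed lookup, so one mistake yields one message.
    static final PrimitiveType UNKNOWN = new PrimitiveType("<unknown>");

    private final String name;

    private PrimitiveType(String name) {
        this.name = name;
    }

    boolean isAssignableFrom(Type source) {
        return this == UNKNOWN || source == UNKNOWN || this == source;
    }

    @Override
    public String toString() {
        return name;
    }
}

final class ClassType extends Type {
    final String name;
    final ClassType superclass; // null if the class extends nothing

    ClassType(String name, ClassType superclass) {
        this.name = name;
        this.superclass = superclass;
    }

    // A class type accepts itself and any subclass, found by walking
    // up the superclass chain of the source type.
    boolean isAssignableFrom(Type source) {
        if (source == PrimitiveType.UNKNOWN) {
            return true;
        }
        for (Type t = source; t instanceof ClassType; t = ((ClassType) t).superclass) {
            if (t == this) {
                return true;
            }
        }
        return false;
    }

    @Override
    public String toString() {
        return name;
    }
}
```

A "same type" test is then just identity for these objects, and an assignment check reduces to target.isAssignableFrom(valueType).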
You should test your compiler by processing several MiniJava programs, both correct ones and ones with errors. Be sure to check some examples that are syntactically legal (i.e., can be parsed with no errors) but that contain semantic errors. Be sure that your compiler exits with the correct return code of 0 if no errors are detected and 1 if any errors are found in the source program by any phase of the compiler.
You should continue to use your CSE 401 GitLab repository to store the code for this and remaining parts of the compiler project.
For this phase of the project we will be looking to see if your compiler properly performs at least the semantics checks listed in the Overview section above, and can print the requested symbol tables and information to stdout in a reasonable format. We also will check whether your compiler can handle MiniJava programs containing errors as well as ones that are legal, and properly print any error messages to stderr.
You should include a brief semantics-notes.txt file in the Notes/ top-level directory of your project describing any additional checks or other extensions you included in this phase of the compiler. You should also give a brief explanation of any changes you needed to make in previous parts of the project (scanner, parser, ASTs) as you implemented the semantics checks.
As with previous parts of the project, your semantics-notes.txt file should describe briefly how you and your partner managed the work for this part of the project: how the work was organized, who did what, and how much of the work was done by each partner.
As before, you will submit this part of the project by pushing code to your GitLab repository. Once you are satisfied that everything is working properly, create a semantics-final tag and push that to the repository. Then we strongly suggest that you create a fresh clone of your repository in some completely different temporary directory, check out the semantics-final tag, and verify that everything works as expected. If necessary, fix any problems back in your regular working copy of the code, push the changes to the repository, and update the semantics-final tag to refer to the correct commit in the repository. Then make a fresh clone and double check that everything is correct. When you are satisfied that the semantics-final tag in the repository correctly identifies the finished project, you are done.