
CSE 401 15wi - Project I - Scanner
Due: Thursday, Jan. 22 at 11:00 pm. You should turn your project in using the assignment drop box (see link on the course home page) following the instructions at the bottom of this writeup.
Added 1/19: New information about class paths and other details needed to run the main compiler program from a command line. Also added more specific details about preparing the tar file to be handed in.
Added 1/20: Use Java 7 only, not Java 8. Your project will be tested using Java 7.
Overview
The purpose of this assignment is to construct a scanner for MiniJava. You should use the JFlex scanner generator, which is a Java version of the venerable Lex family of programs. The ideas behind JFlex and the input language it supports are taken directly from Lex and Flex, which are described in most compiler books and have extensive documentation online. Links to JFlex and other tools are available on the main CSE 401 project page, and there is a starter project there that you should use to see how these tools work together. These programs work with the CUP parser generator, which we will use for the next phase of the project. Although this phase of the project does not use the CUP grammar, it does require specifying the tokens in the CUP input file. You will need to update those definitions to ones appropriate for the full MiniJava language so they can be used by your scanner. Both JFlex and CUP are included in the starter code.
To get started, one person in your group should download the starter project, unpack the files, then add and push them to your group's gitlab repository. The other person should then pull from the repository to get their copy of the files. See the CSE 401 git tutorial for basic information about working with gitlab for the course project.
You will need to examine the MiniJava source grammar to decide which symbols are terminals and which are non-terminals (hint: be sure to include operators, brackets, and other punctuation -- but not comments and whitespace -- in the set of terminal symbols).
The starter code contains a TestScanner program that reads a file from standard input and prints a readable representation of the tokens in that file to standard output. You can run it with the command ant test-scanner, or the equivalent command from inside a programming environment like Eclipse. This test program is intended to show how to use a JFlex scanner, and you will want to study it to see how that works. But for the compiler itself you should create a more appropriate main program.
You should create a Java class named MiniJava with a main method that controls execution of your compiler. This method should examine its arguments (the String array parameter that is found in every Java main method) and work as follows. The idea is that when this method is executed using the command

    java MiniJava -S filename.java

the compiler should open the named input file and read tokens from it by calling the scanner repeatedly until the end of the input file is reached. The tokens should be printed on standard output (Java's System.out) using a format similar to the one produced by the TestScanner program in the starter code.
If the MiniJava main program is executed with the -S option but with no input filename, it should read from standard input (System.in) and print tokens to standard output as before.
In case it's useful, we've provided a small demonstration program OptionalFile.java as an example of how a program can read from standard input or a named file depending on whether a filename is provided.
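As a rough illustration of the argument handling described above, here is a minimal sketch of how a main program might choose between a named file and standard input. It is not the provided OptionalFile.java. The class and method names (InputSelectionSketch, inputFileName, openInput) are hypothetical, and the place where the JFlex-generated scanner would be constructed is left as a comment because it depends on the starter code. It uses only Java 7 features.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical sketch; in the project this logic would live in MiniJava.main.
class InputSelectionSketch {

    /** Returns the filename that follows -S, or null if none was given. */
    static String inputFileName(String[] args) {
        if (args.length >= 2 && args[0].equals("-S")) {
            return args[1];
        }
        return null;
    }

    /** Opens the named file, or standard input when fileName is null. */
    static Reader openInput(String fileName) throws IOException {
        if (fileName == null) {
            return new BufferedReader(new InputStreamReader(System.in));
        }
        return new BufferedReader(new FileReader(fileName));
    }

    public static void main(String[] args) {
        if (args.length == 0 || !args[0].equals("-S")) {
            System.err.println("usage: java MiniJava -S [filename.java]");
            System.exit(1);
        }
        try (Reader in = openInput(inputFileName(args))) {
            // Assumption: the real program would construct the scanner on
            // `in` here (see TestScanner) and print each token until end of
            // input. As a stand-in, this sketch only counts characters.
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            System.out.println("read " + count + " characters");
        } catch (IOException e) {
            System.err.println("cannot read input: " + e.getMessage());
            System.exit(1);
        }
    }
}
```

Keeping the file-or-stdin decision in one small method means the token-printing loop does not need to know where its input comes from.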
The source code for MiniJava.java should be in the top-level project src folder, and ant will compile it automatically along with all the other project files when needed.
The actual details of running MiniJava's main method from a command prompt are a bit more complicated, because the Java virtual machine needs to know where the compiled classes and libraries are located. The following commands should recompile any necessary files and run the scanner:

    ant
    java -cp build/classes:lib/java-cup-11b.jar MiniJava -S filename.java

If you set the CLASSPATH environment variable to point to the library jar file and compiled classes directory, you should not need to provide the -cp argument on the java command.
The build.xml file processed by ant already contains options to specify the class path, which is why you don't have to specify those things to run targets like test-scanner using ant. You can add similar targets to build.xml to run your MiniJava program or other test programs using ant, and you can use additional ant options in build.xml to specify program arguments like -S.
To test your scanner, you should use a variety of input files, including some that contain legal MiniJava programs and others that contain random input. Be sure your scanner does something reasonable if it encounters junk in the input file. (Crashing, halting immediately on the first error, or infinite looping is not reasonable; complaining, skipping the junk, and moving on is.) Remember, it is up to the parser to decide if the tokens in the input make up a well-formed MiniJava program; the scanner's job is simply to deliver the next token whenever it is called.
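To make the "complain, skip the junk, and move on" behavior concrete, here is a small hand-written sketch. It is not JFlex and not the project's scanner: the class name ErrorRecoverySketch, the token labels, and the deliberately incomplete punctuation set are all assumptions made for illustration. In a JFlex specification the same effect usually comes from a final catch-all rule that reports the unexpected character and continues scanning.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified scanner loop that illustrates error recovery only.
class ErrorRecoverySketch {

    // A small subset of punctuation, chosen for the example.
    private static final String PUNCTUATION = "+-*<=;(){}[].,!";

    /** Returns readable tokens; illegal characters are reported and skipped. */
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<String>();
        int i = 0;
        int line = 1;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\n') {
                line++;
                i++;
            } else if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i))
                            || input.charAt(i) == '_')) {
                    i++;
                }
                tokens.add("ID(" + input.substring(start, i) + ")");
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("INT(" + input.substring(start, i) + ")");
            } else if (PUNCTUATION.indexOf(c) >= 0) {
                tokens.add("PUNCT(" + c + ")");
                i++;
            } else {
                // Complain, skip the offending character, and keep going.
                System.err.println("line " + line + ": illegal character '"
                        + c + "' skipped");
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The '#' is reported on standard error; scanning then continues.
        System.out.println(scan("x = 1 # + y;"));
        // prints [ID(x), PUNCT(=), INT(1), PUNCT(+), ID(y), PUNCT(;)]
    }
}
```

The point of the design is that one bad character costs one error message and nothing more: every token after it is still delivered.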
This assignment only asks you to implement the scanner part of the project. The parser, abstract syntax trees, and CUP grammar rules will come in the next part.
Your code should only use language features available in Java 7, which is the environment that will be used to test your compiler project.
You should use your CSE 401 gitlab repository to store the code for this and remaining parts of the compiler project.
What to Hand In
The main information we will examine for this phase of the project is your JFlex and CUP specification files, your MiniJava class and main program, and your test input files. Include example source files that demonstrate the abilities of your scanner, including at least one with an error in the middle of the file. You should not hand in the intermediate file(s) produced by the JFlex scanner generator -- machine generated code is generally unenlightening, consisting of a bunch of tables and uncommented code, if it is readable at all.
Your code should run on the lab Linux machines (or attu) when built with ant. You should do an ant clean, then bundle up your compiler directory in a tar file and turn that in. That will ensure that we have all the pieces of your compiler if we need to check something, and we will use the same procedure for later phases of the project. To create the tar file, run the following commands starting in your main project directory (the one that contains build.xml):

    ant clean
    cd ..
    tar cvfz scanner.tar.gz your_project_directory_name

Then turn in the scanner.tar.gz file.
You and your partner should turn in only a single copy of the project using one of your UW netids. You should include a file named INFO at the top level of your directory with your names and UW netids so we can correctly identify everyone in the group and get feedback to you.
To be sure that everything is in working order, we strongly suggest that before you create the tar file you first run ant clean; ant to rebuild your project from scratch, then run any tests you want, then run the commands given above to create the actual tar file to be turned in.
Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 FAX