CSE P 501 Wi08 - Project I - Scanner

Due: Wednesday, January 23, at 11:00 pm. You should turn your project in using the assignment drop box (links on the assignments page).

Overview

The purpose of this assignment is to construct a scanner for MiniJava. We suggest you use the JFlex scanner generator, which is a Java version of the venerable Lex/Flex programs. Links to these tools as well as other resources are available on the main CSE P 501 project page. JFlex works with the CUP parser generator, which we will use for the next phase of the project. (You will want to download and install CUP now, since it is probably easiest to define the lexical classes in an otherwise empty CUP specification.) The ideas behind JFlex and the input language it supports are taken directly from Lex, which is described extensively in the dragon book.

You will need to examine the source grammar to decide which symbols are terminals and which are non-terminals (hint: be sure to include operators, brackets, and other punctuation -- but not comments and whitespace -- in the terminal symbols).

You should test your scanner on a variety of files, including some that contain legal programs and others that contain random input. Be sure your scanner does something reasonable if it encounters junk in the input file. (Crashing or infinite looping is not reasonable; complaining, skipping the junk, and moving on is.) Remember, it is up to the parser to decide if the tokens in the input make up a well-formed MiniJava program; the scanner's job is simply to deliver the next token whenever it is called.
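
As a rough illustration of the "complain, skip the junk, and move on" behavior described above, here is a small hand-written Java sketch. It is not JFlex output and not the required implementation. It recognizes only a handful of token kinds, and every name in it (ScannerSketch, Kind, Token, errors) is made up for this example rather than taken from JFlex, CUP, or the MiniJava grammar.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written stand-in for a generated scanner; all names are hypothetical. */
class ScannerSketch {
    /** A tiny, illustrative subset of token kinds (not the full MiniJava set). */
    enum Kind { IDENTIFIER, INTEGER_LITERAL, PLUS, LPAREN, RPAREN, SEMICOLON, EOF }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() {
            // Show the lexeme only where it carries information.
            boolean hasValue = kind == Kind.IDENTIFIER || kind == Kind.INTEGER_LITERAL;
            return hasValue ? kind + "(" + text + ")" : kind.toString();
        }
    }

    private final String input;
    private int pos = 0;
    private int line = 1;
    /** Complaints about junk characters, collected instead of thrown. */
    final List<String> errors = new ArrayList<>();

    ScannerSketch(String input) { this.input = input; }

    /** Delivers the next token; junk is reported and skipped, never fatal. */
    Token nextToken() {
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == '\n') { line++; pos++; continue; }
            if (Character.isWhitespace(c)) { pos++; continue; }
            if (Character.isLetter(c)) {
                int start = pos;
                while (pos < input.length()
                        && (Character.isLetterOrDigit(input.charAt(pos)) || input.charAt(pos) == '_')) {
                    pos++;
                }
                return new Token(Kind.IDENTIFIER, input.substring(start, pos));
            }
            if (Character.isDigit(c)) {
                int start = pos;
                while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
                return new Token(Kind.INTEGER_LITERAL, input.substring(start, pos));
            }
            pos++; // always consume the character, so the loop cannot stall
            switch (c) {
                case '+': return new Token(Kind.PLUS, "+");
                case '(': return new Token(Kind.LPAREN, "(");
                case ')': return new Token(Kind.RPAREN, ")");
                case ';': return new Token(Kind.SEMICOLON, ";");
                default:
                    // Complain, skip the junk, and keep scanning.
                    errors.add("line " + line + ": illegal character '" + c + "'");
            }
        }
        return new Token(Kind.EOF, "");
    }

    /** Minimal test driver: prints the token stream, then any complaints. */
    public static void main(String[] args) {
        ScannerSketch scanner = new ScannerSketch("foo + (42 # bar);");
        Token t;
        do {
            t = scanner.nextToken();
            System.out.println(t);
        } while (t.kind != Kind.EOF);
        for (String e : scanner.errors) System.err.println(e);
    }
}
```

On the input "foo + (42 # bar);" this driver prints the token stream IDENTIFIER(foo), PLUS, LPAREN, INTEGER_LITERAL(42), IDENTIFIER(bar), RPAREN, SEMICOLON, EOF, one token per line, and reports the stray '#' as an illegal character on line 1. The point to carry over to a generated scanner is that every input character is consumed by some rule, including a catch-all rule for junk, so the scanner can neither crash nor loop forever.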

What to Hand In

You should hand in your source files (JFlex specification), some sample source files, and the files containing the corresponding token streams that your scanner test program produces. You should not hand in the intermediate file(s) produced by the scanner generator -- machine-generated code is generally unenlightening, consisting of a bunch of tables and uncommented code, if it is readable at all.