CSE P 501 Au05 - Project I - Scanner

Due: Wednesday, October 26, at 11:00 pm. You should turn your project in using this online turnin form.

Overview

The purpose of this assignment is to construct a scanner for MiniJava. We suggest you use the JLex or JFlex scanner generator tools, which are Java versions of the venerable Lex/Flex programs. Links to these tools as well as other resources are available on the main CSE P 501 project page. These programs work with the CUP parser generator, which we will use for the next phase of the project. JLex is described in the first edition of the Appel book and handouts will be available in class on Tuesday, October 18. JFlex is a more recent tool, but the core ideas are the same from the user's perspective. Meanwhile, there are tons of resources about these tools on the CSE P 501 linked web pages as well as all over the web.

Alternatively, you might want to use the SableCC compiler generator described in the second edition of the book, although students in the past have found the documentation for this to be sketchy at best, requiring a fair amount of experimentation to figure it out. But the input language to SableCC is basically the same as JLex/JFlex and related tools, so it does not particularly matter which tool you use.

You will need to examine the source grammar to decide which symbols are terminals and which are non-terminals (hint: remember to include operators, brackets, and other punctuation in the terminal symbols).
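As a rough illustration of the hint above, the terminals can be collected into a single Java enumeration, with a small table separating reserved words from ordinary identifiers. This is only a sketch. The names TokenKinds, Kind, and classify are hypothetical, and the token list is an assumption based on a typical MiniJava grammar, so check every entry against the project grammar. In the actual project the symbol constants will normally come from the CUP specification rather than a hand-written enum.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one enum constant per terminal symbol.
// The exact set is an assumption; derive the real one from the project grammar.
final class TokenKinds {
    enum Kind {
        // reserved words
        CLASS, PUBLIC, STATIC, VOID, EXTENDS, RETURN, INT, BOOLEAN,
        IF, ELSE, WHILE, TRUE, FALSE, THIS, NEW,
        // operators
        PLUS, MINUS, TIMES, LESS, AND, NOT, ASSIGN,
        // brackets and other punctuation
        LPAREN, RPAREN, LBRACKET, RBRACKET, LBRACE, RBRACE,
        SEMICOLON, COMMA, DOT,
        // terminals that carry a value
        IDENTIFIER, INTEGER_LITERAL,
        // end of input
        EOF
    }

    private static final Map<String, Kind> RESERVED = new HashMap<>();
    static {
        RESERVED.put("class", Kind.CLASS);
        RESERVED.put("public", Kind.PUBLIC);
        RESERVED.put("static", Kind.STATIC);
        RESERVED.put("void", Kind.VOID);
        RESERVED.put("extends", Kind.EXTENDS);
        RESERVED.put("return", Kind.RETURN);
        RESERVED.put("int", Kind.INT);
        RESERVED.put("boolean", Kind.BOOLEAN);
        RESERVED.put("if", Kind.IF);
        RESERVED.put("else", Kind.ELSE);
        RESERVED.put("while", Kind.WHILE);
        RESERVED.put("true", Kind.TRUE);
        RESERVED.put("false", Kind.FALSE);
        RESERVED.put("this", Kind.THIS);
        RESERVED.put("new", Kind.NEW);
    }

    // Returns the reserved-word kind for a lexeme, or IDENTIFIER if it is not reserved.
    static Kind classify(String lexeme) {
        Kind kind = RESERVED.get(lexeme);
        return kind != null ? kind : Kind.IDENTIFIER;
    }

    public static void main(String[] args) {
        System.out.println(classify("while"));   // prints WHILE
        System.out.println(classify("counter")); // prints IDENTIFIER
    }
}
```

Words such as main, String, and length also appear literally in many versions of the MiniJava grammar. Whether to treat them as reserved words or as identifiers is a design decision this sketch leaves open.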

You should test your scanner on a variety of files, including some that contain legal programs and others that contain random input. Be sure your scanner does something reasonable if it encounters junk in the input file. (Crashing is not reasonable; complaining, skipping the junk, and moving on is.)
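The "complain, skip the junk, and move on" behavior can be sketched in plain Java as follows. This is a hand-written toy, not JLex/JFlex output. With a scanner generator the same effect is usually achieved by a final catch-all rule that matches any single character not claimed by an earlier rule. The class name JunkSkippingScanner, the token spellings, and the error message format are all made up for illustration. To keep it short, the sketch does not separate keywords from identifiers and omits multi-character operators such as &&.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy scanner showing error recovery: report an illegal
// character, skip it, and keep scanning instead of crashing.
final class JunkSkippingScanner {
    final List<String> tokens = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    void scan(String input) {
        int line = 1;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\n') {
                line++;
                i++;
            } else if (c == ' ' || c == '\t' || c == '\r') {
                i++; // whitespace produces no token
            } else if (isLetter(c)) {
                int start = i;
                while (i < input.length()
                        && (isLetter(input.charAt(i)) || isDigit(input.charAt(i))
                            || input.charAt(i) == '_')) {
                    i++;
                }
                tokens.add("ID(" + input.substring(start, i) + ")");
            } else if (isDigit(c)) {
                int start = i;
                while (i < input.length() && isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("INT(" + input.substring(start, i) + ")");
            } else if ("+-*<!=(){}[];,.".indexOf(c) >= 0) {
                tokens.add("'" + c + "'"); // single-character operator or punctuation
                i++;
            } else {
                // Catch-all case: complain, skip one character, move on.
                errors.add("line " + line + ": illegal character '" + c + "' skipped");
                i++;
            }
        }
        tokens.add("EOF");
    }

    public static void main(String[] args) {
        JunkSkippingScanner scanner = new JunkSkippingScanner();
        scanner.scan("x = 3 # y;\n@ z");
        scanner.tokens.forEach(System.out::println);
        scanner.errors.forEach(System.err::println);
    }
}
```

On the sample input in main, the two junk characters are reported with their line numbers and the remaining tokens are still produced, which is the kind of output worth checking when testing with random input.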

What to Hand In

You should hand in your source files (JLex/JFlex or equivalent specification), some sample source files, and the files containing the corresponding token streams that your scanner test program produces. You do not need to hand in the intermediate file(s) produced by the scanner generator -- machine-generated code is generally unenlightening, consisting of a bunch of tables and uncommented code, if it is readable at all.

A turnin form is available online (see link at the top of this page). You can either use it to turn in your files individually, or you can bundle all of your files in an archive (zip, jar, or tar format) and turn that in.