else
,
for
,
to
,
break
,
return
,
array
,
of
,
bool
,
true
,
false
,
and
,
or
.
[
and ]
.
-t
option that prints tokens as they are read, and
perhaps the -c
(print characters) and
-l
(print lines) options.
For all implementation projects, you will be graded on correctness of your implementation, on clarity and good design of your implementation, and on sufficiency of your test cases.
/cse/courses/cse401/CurrentQtr/pl0_lex
contains the few files that differ between the
pl0_base
scanner and the lex version. (A new
scanner.c
, Makefile
, the lex
source file pl0.l
, and a lex library file
ncform
. You should not change
ncform
.)
Set up a new directory "hw2b
", copy
pl0_base
into it, then copy pl0_lex
into it to replace those few files. Again, you can test your
scanner via plzero -T
, but note that the
-c
and -l
options do not work
in the lex version. (This is due the the kludgey way I grafted
the C code from lex to the C++ code of PL/0.)
Remaining parts of the project this quarter will not be based on the lex-scanner, so be careful to keep these Part B files separate from the main project files (Part A).
To really understand lex fully, you should read the (hardcopy) documentation on it in the Unix manuals. Fortunately, Section 3.5 and Question 3.10 in ASU provides a sufficient overview for this assignment. To make things even simpler, I'll describe here just enough that you should be able to manage this assignment.
The lex program you'll need looks like
%% <regular expression0> {return(token0_code);} <regular expression1> {return(token1_code);} ...The regular expressions are just like you've seen in class, using the extensions like "[a-z]" and "+". lex uses many punctuation marks to have special meanings in the regular expresssions; if you get a syntax error, you've probably inadvertently used one. To create a literal string, put it in quotes. For example, "<" will match the left-angle-bracket, whereas simply typing the left-angle-bracket as the pattern will result in a syntax error.
What lex does is to generate a C program, always called
lex.yy.c, that implements a finite state machine capable
of recognizing the regular expressions you've input. When
called at its main entry point yylex()
, the lexer
reads input until an expression is matched, then executes the C
code given after that expression, which in the sample above just
returns an integer code to the caller indicating which token was
found. The caller can see the actual lexeme matched by looking
in the char array yytext
. (The finite state
machine always makes the longest match possible from all the
regular expressions, and in the case of multiple matches of
identical length uses the rule given first in the
lex program.)
The supplied Makefile
should take care of
the details of building lex.yy.c from pl0.l, linking it to the
scanner, etc.
Electronically turn in your entire working directory, including
both the part A and part B code, and your test case files,
clearly labeled. Please "clean up" before you turn in: delete
backup versions of source files, all .o files, executable files,
etc. ("make clean
" might be hepful here.) Please
arrange your directories so that your RCS files are NOT
turned in; see instructions on
Turnin in the
Help section of the web.
Also, turn in hard copy printouts of the source files you
changed, with changes marked, as described in the
instructions on
Printing Code in the
Help section.
cse401-webmaster@cs.washington.edu (Last modified: 10/08/99)