Write a regular expression or set of regular expressions that generate C integer constants as described below.
An integer constant consisting of a sequence of digits is taken to be octal if it begins with 0 (digit zero), decimal otherwise. Octal constants do not contain the digits 8 or 9. A sequence of digits preceded by 0x or 0X (digit zero) is taken to be a hexadecimal integer. The hexadecimal digits include a or A through f or F with values 10 through 15.
An integer constant may be suffixed with the letter u or U, to specify that it is unsigned. It may also be suffixed by the letter l or L to specify that it is long.
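One way to check a candidate answer is to write it as Ruby regular expressions and try it on a few strings. The following is only a sketch: the constant names and the helper int_constant? are made up for illustration, and it assumes that the u and l suffixes may be combined in either order, which the description above does not state explicitly.

```ruby
# Sketch only: names are hypothetical, and the suffix rule is an assumption.
DECIMAL = /[1-9][0-9]*/          # does not begin with 0
OCTAL   = /0[0-7]*/              # begins with 0, digits 0-7 only
HEX     = /0[xX][0-9a-fA-F]+/    # 0x or 0X followed by hex digits

# Assumed: at most one u/U and at most one l/L, in either order.
SUFFIX  = /(?:[uU][lL]?|[lL][uU]?)?/

# \A and \z anchor the match to the whole string.
INT_CONSTANT = /\A(?:#{HEX}|#{DECIMAL}|#{OCTAL})#{SUFFIX}\z/

# Returns true if str is an integer constant under the rules above.
def int_constant?(str)
  INT_CONSTANT.match?(str)
end

%w[0 017 08 42 42UL 0x1F 0x 12uu].each do |s|
  puts "#{s}: #{int_constant?(s)}"
end
```

Note that 08 is rejected because a leading 0 makes the constant octal, and 0x is rejected because no hexadecimal digits follow the prefix.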
This is the first of two programming assignments to build an interpreter for the language specified in the Calculator Language handout. We will build the interpreter in two logical parts - a scanner that reads the calculator program from the input stream and breaks the input into tokens, and a parser/evaluator that parses the input token stream according to the specifications in the grammar and executes the program. The program should be implemented in Ruby. For the most part it will just be a collection of top-level functions, but you should create classes when these are helpful in organizing the code.
In this part of the program you should implement a scanner that provides a single function next_token. Each time next_token is called it should return a new Token object that describes the next terminal symbol read from the input. Objects of class Token should respond to the following messages:
kind - return the lexical class of the token as a string. This should be a distinct string for each lexical class in the program, possibly just the operator or keyword itself. However, all identifiers should be treated as instances of a single lexical class and the kind method should return the same value for every identifier. Similarly, all numbers should be treated as a single lexical class. You will also want to have a lexical class to represent the end of an input line, since end-of-line is semantically meaningful - it indicates the end of a statement.
value - if the token kind is either an identifier or number, then this message should return the actual identifier or floating-point value. Its value is not defined for other lexical classes.
to_s - the standard Ruby "to string" method. This should produce a descriptive string representation of the token, including the associated value if the token is an identifier or a number.
To test the scanner, you should write a small program that calls next_token repeatedly to get the next token from the input and prints the result. After reading and printing a quit or exit token, the test program should stop.
Feel free to take advantage of Ruby's string and regular expression classes and methods to chop the input into tokens.
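As an illustration of chopping input with Ruby's string and regular expression support, here is a minimal sketch built on the standard StringScanner class. Several things in it are assumptions rather than part of the assignment: the Scanner class wrapper, the kind strings "id", "number", "eol", "eof" and "error", the operator set, and the helper run_scanner_test. The real token set comes from the Calculator Language handout, and Token here is only a small stand-in so the example runs on its own.

```ruby
require "strscan"
require "stringio"

# Stand-in Token; value is nil unless the token is an identifier or number.
Token = Struct.new(:kind, :value) do
  def to_s
    value.nil? ? kind.to_s : "#{kind}: #{value}"
  end
end

# Hypothetical scanner that reads one line at a time from an IO object.
class Scanner
  KEYWORDS = %w[quit exit].freeze # keywords named in the assignment text

  def initialize(io)
    @io = io
    @ss = nil # StringScanner over the current line; nil means "read a new line"
  end

  def next_token
    if @ss.nil?
      line = @io.gets
      return Token.new("eof") if line.nil? # assumed kind for end of input
      @ss = StringScanner.new(line.chomp)
    end
    @ss.skip(/[ \t]+/)
    if @ss.eos?
      @ss = nil # force a fresh line on the next call
      return Token.new("eol")
    end
    if (text = @ss.scan(/\d+(?:\.\d+)?/))
      Token.new("number", text.to_f)
    elsif (text = @ss.scan(/[A-Za-z_]\w*/))
      KEYWORDS.include?(text) ? Token.new(text) : Token.new("id", text)
    elsif (text = @ss.scan(%r{[-+*/=()]})) # assumed operator set
      Token.new(text)
    else
      Token.new("error", @ss.getch) # unrecognized character
    end
  end
end

# Test driver: print tokens until quit, exit, or end of input.
def run_scanner_test(io, out = $stdout)
  scanner = Scanner.new(io)
  loop do
    tok = scanner.next_token
    out.puts tok.to_s
    break if %w[quit exit eof].include?(tok.kind)
  end
end

run_scanner_test(StringIO.new("x = 3.5 * (y + 2)\nquit\n"))
```

Passing the input as an IO object keeps the sketch testable with StringIO; the same driver could be handed $stdin to read a calculator program interactively.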
Be sure to include your name and other identifying information as comments in your code. There should also be descriptive comments as needed; in particular for your Token class to describe the possible values returned by the kind method.
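For example, the comments on a Token class might look something like the sketch below. The specific kind strings listed are assumptions chosen for illustration; the actual set should follow the Calculator Language handout.

```ruby
# Token: one terminal symbol read from the input.
#
# Possible values returned by kind (an assumed set, for illustration only):
#   "id"     - any identifier; value is the identifier's name (String)
#   "number" - any numeric constant; value is its floating-point value (Float)
#   "eol"    - end of an input line, which ends a statement
#   "quit", "exit" - keywords; kind is the keyword itself
#   "+", "-", "*", "/", "=", "(", ")" - kind is the operator itself
#
# value is nil for every kind other than "id" and "number".
class Token
  attr_reader :kind, :value

  def initialize(kind, value = nil)
    @kind = kind
    @value = value
  end

  # Descriptive string, including the value when there is one.
  def to_s
    value.nil? ? kind.to_s : "#{kind}: #{value}"
  end
end

puts Token.new("id", "total")
puts Token.new("number", 2.5)
puts Token.new("eol")
```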
Electronic: Turn in your program electronically using the link on the assignments page.
Paper: Turn in answers to the written problems, a copy of your code, and some examples of test input and output that demonstrate that your scanner works.