handout #15
CSE341—Programming Languages
Programming
Assignment #7 (Scheme)
due 11 pm Wednesday, 5/20/09
In this
assignment you will write a set of procedures for parsing various token
sequences that appear in programs written in BASIC. We will be using an approach known as
recursive descent parsing in which we define a series of procedures based on
the underlying grammar with one procedure for each production from the
grammar. The grammar we will be using
can be described using the following EBNF productions:
<test> ::= <expression>
("<" | ">" | "<=" | ">=" |
"=" | "<>") <expression>
<expression> ::= <term>
{("+" | "-") <term>}
<term> ::= <element>
{("*" | "/") <element>}
<element> ::= <factor>
{"^" <factor>}
<factor> ::= <number> | -
<number> | ( <expression> ) | <f> ( <expression> )
<f> ::= SIN | COS | TAN | ATN |
EXP | ABS | LOG | SQR | RND | INT
In EBNF the
pipe character (“|”) indicates “or” and curly braces indicate “0 or more
of.” For example, the production for
<expression> indicates that it is composed of a <term> followed by
0 or more “+ <term>” or “- <term>”.
So it would match sequences like this:
<term> + <term> + <term> - <term> + <term> - <term> - <term>
Of course,
like <sentence>, <term> is also a nonterminal,
so it would be filled in with appropriate terminal symbols.
You should
define a procedure to parse each of the five major elements described
above. The sixth rule is slightly
different (described below). Your
procedures will take a list of tokens as an argument. It should consume tokens from the list and
replace those tokens with the value obtained by evaluating the tokens. For example, parse-term evaluates the
multiplicative operators * and /. If it
is passed the list (2.5 * 3.4 / 2 THEN 210), it will evaluate 2.5 * 3.4 / 2,
obtain 4.25 and replace the tokens it parsed with 4.25, returning the list
(4.25 THEN 210). These procedures will
be “greedy” in the sense that they will consume as many tokens from the front
of the list as they can that can be part of the given grammar element. The token “THEN” can’t be part of a term, so
the procedure stops consuming tokens when it encounters it. Even though it is greedy, it should pay
attention only to tokens that appear at the front of the list.
Parentheses
are difficult to handle as symbols, so the lists you will be processing use the
tokens lparen and rparen
for left parenthesis and right parenthesis.
The sixth rule
of the grammar indicates a list of legal function names. Because it doesn’t mention any other grammar
elements, it doesn’t need its own parsing procedure. You are welcome to define a helper procedure
that processes it, but it’s not a requirement.
In evaluating expressions involving these functions, you will need to
translate the named function into a Scheme procedure. The skeleton file for the assignment includes
the following association list that will help you to accomplish this task:
(define functions
'((SIN . sin) (COS . cos) (TAN . tan) (ATN . atan) (EXP . exp) (ABS . abs)
(LOG . log) (SQR . sqrt) (RND . random) (INT . truncate)))
Each element
of the association list is a dotted pair where the car is a key and the cdr is a value (like a map in Java). Scheme has a standard procedure called assoc
that will search for a particular key.
For example, given the list above, (assoc ‘ATN functions) returns (ATN . atan). To get the “atan”
symbol from this, you’d take the cdr
of it. Because it’s a dotted pair and
not a list, you use cdr instead of cadr. If the key is
not found, as in the call (assoc ‘FOO functions), you get #f as a result.
You are to
write the following parsing procedures:
1.
(30 points) Write a procedure
parse-factor that parses a factor at the front of a list, replacing the tokens
that were part of the factor with the numeric value of the factor. As indicated in the grammar above, a factor
is either a simple number or a minus followed by a number (the negation of a
number) or a parenthesized expression or a parenthesized call on a
function. For example:
> (parse-factor '(3.5 2.9 OTHER STUFF))
(3.5 2.9 OTHER STUFF)
> (parse-factor '(- 7.9 3.4 * 7.2))
(-7.9 3.4 * 7.2)
> (parse-factor '(lparen 7.3 - 3.4 rparen + 3.4))
(3.9 + 3.4)
> (parse-factor '(SQR lparen 12 + 3 * 6 - 5 rparen))
(5)
2.
(15 points) Write a procedure
parse-element that parses an element at the front of a list, replacing the
tokens that were part of the element with the numeric value of the
element. As indicated in the grammar
above, an element is a series of factors separated by the token ^ which indicates
exponentiation. Your procedure should
consume as many tokens as possible that could be part of an element. The operators should be evaluated
left-to-right. You can call the standard
expt procedure to compute the exponent. For example:
> (parse-element '(2 ^ 2 ^ 3 THEN 450))
(64 THEN 450)
> (parse-element '(2.3 ^ 4.5 * 7.3))
(42.43998894277659 * 7.3)
> (parse-element '(7.4 + 2.3))
(7.4 + 2.3)
3.
(15 points) Write a procedure
parse-term that parses a term at the front of a list, replacing the tokens that
were part of the term with the numeric value of the term. As indicated in the grammar above, a term is
a series of elements separated by the tokens * and / (multiplication and
division). Your procedure should consume
as many tokens as possible that could be part of a term. The operators should be evaluated
left-to-right. Scheme
will potentially returning a rational when we use division, which we
don’t want. So for each division, return
the value obtained by passing the result to the standard Scheme procedure
exact->inexact, as in (exact->inexact 7/8), which returns 0.875. For example:
> (parse-term '(2.5 * 4 + 9.8))
(10.0 + 9.8)
> (parse-term '(38.7 / 2 / 3 THEN 210))
(6.45 THEN 210)
> (parse-term '(7.4 * lparen 2.4 - 3.8 rparen / 4 - 8.7))
(-2.59 - 8.7)
> (parse-term '(3 / 4 + 9.7))
(0.75 + 9.7)
4.
(15 points) Write a procedure
parse-expression that parses an expression at the front of a list, replacing
the tokens that were part of the expression with the numeric value of the
expression. As indicated in the grammar
above, an expression is a series of terms separated by the tokens + and –
(addition and subtraction). Your
procedure should consume as many tokens as possible that could be part of an
expression. The operators should be
evaluated left-to-right. For example:
> (parse-expression '(12.4 - 7.8 * 3.5 THEN 40))
(-14.9 THEN 40)
> (parse-expression '(2 + 3.4 - 7.9 <= 7.4))
(-2.5 <= 7.4)
> (parse-expression '(3 * 4 ^ 2 / 5 + SIN lparen 2 rparen))
(10.50929742682568)
5.
(25 points) Write a procedure
parse-test that parses a test at the front of the list, replacing the tokens
that were part of the test with the boolean
value of the test (#t or #f). As
indicated in the grammar above, a test is an expression followed by a
relational operator followed by an expression.
For example:
> (parse-test '(3.4 < 7.8 THEN 19))
(#t THEN 19)
> (parse-test '(2.3 - 4.7 ^ 2.4 <> SQR lparen 8 - 4.2 ^ 2 * 9 rparen FOO))
(#t FOO)
> (parse-test '(2 ^ 4 = 4 ^ 2))
(#t)
In writing
your parse routines, you should test for potential errors and should call the
error procedure if an error is encountered.
The call should indicate which parsing routine detected the error, so
you should execute one of the following five commands:
(error "illegal test")
(error "illegal expression")
(error "illegal term")
(error "illegal element")
(error "illegal factor")
Most errors
will be detected by parse-factor since it is the lowest level construct in the
grammar. In grading, we won’t be
concerned about which of your parsing methods detects the error. All that we care about is that your code
generates the error (versus, say, calling car on an empty list).
Below are some
examples of calls that should produce errors:
(parse-term '(2.5 * 4 *)) ; no number after second *
(parse-factor '(foo bar)) ; the symbol foo is not a legal
factor
(parse-factor '(- foo)) ; a minus must be followed by a number
(parse-term '(3 * foo)) ; no number after *
(parse-factor '(lparen rparen))
; needs an expression between parens
(parse-factor '(lparen
2 + 3 4)) ; no right paren
when we encounter 4
You are not
allowed to use list mutation in solving this problem.
Keep in mind
that your functions should be greedy, but not overly greedy. For example:
(parse-factor '(- 13 - 17 - 9))
returns (-13 - 17 - 9)
Only the first
combination of minus and number is turned into a negative number. The method should process just one factor at
the front of the list, not multiple factors.