handout #15
CSE341—Programming Languages
Programming
Assignment #7 (Scheme)
due 11 pm Thursday, 11/19/09
In this
assignment you will write a set of functions for parsing various token
sequences that appear in programs written in BASIC. We will be using an approach known as
recursive descent parsing in which we define a series of functions based on the
underlying grammar with one function for each production from the grammar. The grammar we will be using can be described
using the following EBNF productions:
<test> ::= <expression>
("<" | ">" | "<=" | ">=" |
"=" | "<>") <expression>
<expression> ::= <term>
{("+" | "-") <term>}
<term> ::= <element>
{("*" | "/") <element>}
<element> ::= <factor>
{"^" <factor>}
<factor> ::= <number> | -
<number> | ( <expression> ) | <f> ( <expression> )
<f> ::= SIN | COS | TAN | ATN |
EXP | ABS | LOG | SQR | RND | INT
In EBNF the
pipe character (“|”) indicates “or” and curly braces indicate “0 or more
of.” For example, the production for
<expression> indicates that it is composed of a <term> followed by
0 or more “+ <term>” or “- <term>”.
So it would match sequences like this:
<term> + <term> + <term> - <term> + <term> - <term> - <term>
Of course,
like <sentence>, <term> is also a nonterminal,
so it would be filled in with appropriate terminal symbols.
You should
define a function to parse each of the five major elements described
above. The sixth rule is slightly
different (described below). Your functions
will take a list of tokens as an argument.
It should consume tokens from the list and replace those tokens with the
value obtained by evaluating the tokens.
For example, parse-term evaluates the multiplicative operators * and
/. If it is passed the list (2.5 * 3.4 /
2 THEN 210), it will evaluate 2.5 * 3.4 / 2, obtain 4.25 and replace the tokens
it parsed with 4.25, returning the list (4.25 THEN 210). These functions will be “greedy” in the sense
that they will consume as many tokens from the front of the list as they can
that can be part of the given grammar element.
The token “THEN” can’t be part of a term, so the function stops
consuming tokens when it encounters it.
Even though it is greedy, it should pay attention only to tokens that
appear at the front of the list.
Parentheses
are difficult to handle as symbols, so the lists you will be processing use the
tokens lparen and rparen
for left parenthesis and right parenthesis.
The sixth rule
of the grammar indicates a list of legal function names. Because it doesn’t mention any other grammar
elements, it doesn’t need its own parsing function. You are welcome to define a helper function
that processes it, but it’s not a requirement.
In evaluating expressions involving these functions, you will need to
translate the named function into a Scheme function. The skeleton file for the assignment includes
the following association list that will help you to accomplish this task:
(define functions
'((SIN . sin) (COS . cos) (TAN . tan) (ATN . atan) (EXP . exp) (ABS . abs)
(LOG . log) (SQR . sqrt) (RND . rand) (INT . trunc)))
Each element
of the association list is a dotted pair where the car is a key and the cdr is a value (like a map in Java). Scheme has a standard function called assoc
that will search for a particular key.
For example, given the list above, (assoc ‘ATN functions) returns (ATN . atan). To get the “atan”
symbol from this, you’d take the cdr
of it. Because it’s a dotted pair and
not a list, you use cdr instead of cadr. If the key is
not found, as in the call (assoc ‘FOO functions), you get #f as a result.
There are
several cases where Scheme returns rational numbers when we would prefer to
obtain real numbers. For example,
division can produce such results as can exponentiation with integers where the
exponent is negative:
> (/ 3 4)
3/4
> (expt 3 -2)
1/9
For these
cases, you should call the function exact->inexact to convert the rational
to a real:
> (exact->inexact (/ 3 4))
0.75
> (exact->inexact (expt 3 -2))
0.1111111111111111
You are to
write the following parsing functions:
1.
(30 points) Write a function
parse-factor that parses a factor at the front of a list, replacing the tokens
that were part of the factor with the numeric value of the factor. As indicated in the grammar above, a factor
is either a simple number or a minus followed by a number (the negation of a
number) or a parenthesized expression or a parenthesized call on a
function. For example:
> (parse-factor '(3.5 2.9 OTHER STUFF))
(3.5 2.9 OTHER STUFF)
> (parse-factor '(- 7.9 3.4 * 7.2))
(-7.9 3.4 * 7.2)
> (parse-factor '(lparen 7.3 - 3.4 rparen + 3.4))
(3.9 + 3.4)
> (parse-factor '(SQR lparen 12 + 3 * 6 - 5 rparen))
(5)
Keep in mind
that your functions should be greedy, but not overly greedy. For example:
> (parse-factor '(- 13 - 17 - 9))
(-13 - 17 - 9)
Only the first
combination of minus and number is turned into a negative number. The method should process just one factor at
the front of the list, not multiple factors.
2.
(15 points) Write a function
parse-element that parses an element at the front of a list, replacing the
tokens that were part of the element with the numeric value of the
element. As indicated in the grammar
above, an element is a series of factors separated by the token ^ which
indicates exponentiation. Your function
should consume as many tokens as possible that could be part of an
element. The ^ operators should be
evaluated left-to-right. You can call
the standard expt function to compute the
exponent. As noted above, the result can
be a negative number when the exponent is negative, so you need to call exact->inexact
in that case to convert from a rational to a real. Below are several examples of parse-element’s
intended behavior:
> (parse-element '(2 ^ 2 ^ 3 THEN 450))
(64 THEN 450)
> (parse-element '(2 ^ 2 ^ -3 THEN 450))
(0.015625 THEN 450
> (parse-element '(2.3 ^ 4.5 * 7.3))
(42.43998894277659 * 7.3)
> (parse-element '(7.4 + 2.3))
(7.4 + 2.3)
> (parse-element '(2 ^ 3 ^ 4 ^ 5 2 ^ 3 4 ^ 5))
(1152921504606846976 2 ^ 3 4 ^ 5)
3.
(15 points) Write a function
parse-term that parses a term at the front of a list, replacing the tokens that
were part of the term with the numeric value of the term. As indicated in the grammar above, a term is
a series of elements separated by the tokens * and / (multiplication and
division). Your function should consume
as many tokens as possible that could be part of a term. The * and / operators should be evaluated
left-to-right. Scheme will potentially be
returning a rational when we use division, which we don’t want. So for each division, return the value
obtained by passing the result to the standard Scheme function
exact->inexact, as in (exact->inexact 7/8), which returns 0.875. For example:
> (parse-term '(2.5 * 4 + 9.8))
(10.0 + 9.8)
> (parse-term '(38.7 / 2 / 3 THEN 210))
(6.45 THEN 210)
> (parse-term '(7.4 * lparen 2.4 - 3.8 rparen / 4 - 8.7))
(-2.59 - 8.7)
> (parse-term '(3 / 4 + 9.7))
(0.75 + 9.7)
> (parse-term '(2 * 3 * 4 + 3 * 8))
(24 + 3 * 8)
> (parse-term '(24 / 2 - 13.4))
(12.0 - 13.4)
4.
(15 points) Write a function
parse-expression that parses an expression at the front of a list, replacing
the tokens that were part of the expression with the numeric value of the expression. As indicated in the grammar above, an
expression is a series of terms separated by the tokens + and – (addition and
subtraction). Your function should
consume as many tokens as possible that could be part of an expression. The + and - operators should be evaluated
left-to-right. For example:
> (parse-expression '(12.4 - 7.8 * 3.5 THEN 40))
(-14.9 THEN 40)
> (parse-expression '(2 + 3.4 - 7.9 <= 7.4))
(-2.5 <= 7.4)
> (parse-expression '(3 * 4 ^ 2 / 5 + SIN lparen 2 rparen))
(10.50929742682568)
> (parse-expression '(15 - 3 - 2 foo 2 + 2))
(10 foo 2 + 2)
5.
(25 points) Write a function
parse-test that parses a test at the front of the list, replacing the tokens
that were part of the test with the boolean
value of the test (#t or #f). As
indicated in the grammar above, a test is an expression followed by a
relational operator followed by an expression.
For example:
> (parse-test '(3.4 < 7.8 THEN 19))
(#t THEN 19)
> (parse-test '(2.3 - 4.7 ^ 2.4 <> SQR lparen 8 - 4.2 ^ 2 * 9 rparen FOO))
(#t FOO)
> (parse-test '(2 ^ 4 = 4 ^ 2))
(#t)
In writing
your parse routines, you should test for potential errors and should call the
error procedure if an error is encountered.
The call should indicate which parsing routine detected the error, so
you should execute one of the following five commands:
(error "illegal test")
(error "illegal expression")
(error "illegal term")
(error "illegal element")
(error "illegal factor")
Most errors
will be detected by parse-factor since it is the lowest level construct in the
grammar. In grading, we won’t be
concerned about which of your parsing methods detects the error. All that we care about is that your code
generates the error (versus, say, calling car on an empty list).
Below are some
examples of calls that should produce errors:
(parse-term '(2.5 * 4 *)) ; no number after second *
(parse-factor '(foo bar)) ; the symbol foo is not a legal
factor
(parse-factor '(- foo)) ; a minus must be followed by a number
(parse-term '(3 * foo)) ; no number after *
(parse-factor '(lparen rparen))
; needs an expression between parens
(parse-factor '(lparen
2 + 3 4)) ; no right paren
when we encounter 4
You are not
allowed to use list mutation or vectors in solving this problem.