handout #16
CS341—Programming Languages
Programming
Assignment #7 (Scheme)
due 5 pm Saturday, 3/3/07
In this assignment you
will implement a significant subset of the BASIC programming language as
described in the 1964 Dartmouth report available here:
http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf
You will be provided a
starter file called hw7.scm that has code for handling the top-level user
interaction and a procedure called tokenize that will convert user input into
lists of tokens. You should include your
answers in hw7.scm.
The first set of
procedures you are to write parse various token sequences. We will be using an approach known as
recursive descent parsing in which we define a series of procedures based on
the underlying grammar with one procedure for each production from the
grammar. The grammar we will be using
can be described using the following EBNF productions:
<test> ::= <expression>
("<" | ">" | "<=" | ">=" | "="
| "<>") <expression>
<expression> ::= <term>
{("+" | "-") <term>}
<term> ::= <factor>
{("*" | "/") <factor>}
<factor> ::= <number> | -
<number> | ( <expression> )
You should define a
procedure to parse each of the four elements described above. Your procedures will take a list of tokens as
an argument. It should consume tokens
from the list and replace those tokens with the value obtained by evaluating
the tokens. For example, parse-term
evaluates the multiplicative operators * and /.
If it is passed the list (2.5 * 3.4 / 2 THEN 210), it will evaluate 2.5
* 3.4 / 2, obtain 4.25 and replace the tokens it parsed with 4.25, returning
the list (4.25 THEN 210). These
procedures will be “greedy” in the sense that they will consume as many tokens
as they can that can be part of the given grammar element. The token “THEN” can’t be part of a term, so
the procedure stops consuming tokens when it encounters it.
Parentheses are
difficult to handle as symbols, so the lists you will be processing use the
tokens lparen and rparen for left parenthesis and right parenthesis.
You are to write the
following parsing procedures:
1.
(5 points) Write a procedure parse-factor
that parses a factor at the front of a list, replacing the tokens that were
part of the factor with the numeric value of the factor. As indicated in the grammar above, a factor
is either a simple number or a minus followed by a number (the negation of a
number) or a parenthesized expression.
For example:
> (parse-factor '(3.5 2.9 OTHER
STUFF))
(3.5 2.9 OTHER STUFF)
> (parse-factor '(- 7.9 3.4
* 7.2))
(-7.9 3.4 * 7.2)
> (parse-factor '(lparen 7.3
- 3.4 rparen + 3.4))
(3.9 + 3.4)
2.
(10 points) Write a procedure parse-term
that parses a term at the front of a list, replacing the tokens that were part
of the factor with the numeric value of the factor. As indicated in the grammar above, a term is
a series of factors separated by the tokens * and / (multiplication and
division). Your procedure should consume
as many tokens as possible that could be part of a factor. The operators should be evaluated
left-to-right. For example:
> (parse-term '(2.5 * 4 + 9.8))
(10.0 + 9.8)
> (parse-term '(38.7 / 2 / 3 THEN 210))
(6.45 THEN 210)
> (parse-term '(7.4 * lparen 2.4 - 3.8 rparen / 4 - 8.7))
(-2.59 - 8.7)
3.
(10 points) Write a procedure
parse-expression that parses an expression at the front of a list, replacing
the tokens that were part of the expression with the numeric value of the
factor. As indicated in the grammar
above, an expression is a series of terms separated by the tokens + and –
(addition and subtraction). Your
procedure should consume as many tokens as possible that could be part of an
expression. The operators should be
evaluated left-to-right. For example:
> (parse-expression '(12.4 - 7.8 * 3.5 THEN 40))
(-14.9 THEN 40)
4.
(10 points) Write a procedure parse-test
that parses a test at the front of the list, replacing the tokens that were
part of the test with the boolean value of the test
(#t or #f). As indicated in the grammar
above, a test is an expression followed by a relational operator followed by an
expression. For example:
In writing your parse
routines, you should test for potential errors and should call the error
procedure if an error is encountered.
The call should indicate which parsing routine detected the error, as
in:
(error
"illegal factor")
For the second part of
the assignment, you are going to complete a procedure called run-program. It will be passed a list of BASIC
instructions to execute. Each value in
the list will have three parts: a line number, the original text of the line
(something your code will ignore) and a list of tokens for the input line. For example, the following BASIC program:
10 LET X = 2.4 * 3.7
20 PRINT X
30 LET X = X/2
40 IF X<200 THEN 20
50 END
Would be represented
in list form as follows:
((10 " LET X = 2.4 * 3.7"
(LET X = 2.4 * 3.7))
(20 " PRINT X" (PRINT X))
(30 " LET X = X/2" (LET X = X / 2))
(40 " IF X<200 THEN 20" (IF X
< 200 THEN 20))
(50 " END" (END)))
You are to implement
the following BASIC commands:
5.
(5 points) REM. The REM command is for comments (short for
REMarks). The general form is:
REM <text>
A REM command does
nothing.
6.
(5 points) END. The END command terminates a program. The general form is:
END
Your program should
terminate execution when it encounters this command. The BASIC language description indicates that
this must be the last line of a BASIC program.
We will not enforce this rule (partly because we are not implementing
the STOP command).
7.
(15 points) LET. The LET command assigns a value to a
variable. The general form is:
LET <variable> =
<expression>
The expression will
always have a numeric result. The
expression might include references to other variables. The LET command can be used to define a new
variable or to reassign a variable that was previously assigned a value.
8.
(10 points) PRINT. The PRINT command prints the values of
expressions. The general form is:
PRINT <expression>,
<expression>,..., <expression>
Commas will be
represented with the token comma, as in (PRINT 3.4 comma 7.8). The PRINT command should evaluate each
expression, printing its value. Values
should be separated by a space and each PRINT command should produce a complete
line of output. You should not print a
trailing space on the
line. The expressions are
guaranteed to be numeric, although they might refer to variables.
9.
(10 points) GOTO. The GOTO command transfers control to another
part of the program. The general form
is:
GOTO <line number>
A GOTO command
transfers control to the given line number.
The line number will be a simple integer (not an expression). If the line number does not exist, your
program should stop executing with a message indicating the illegal line
number.
10. (5
points) IF. The IF command
is for conditionally transferring control.
The general form is:
IF <test> THEN <line
number>
The test will be of
the form described in the grammar. The
line number will be a simple integer. If
the test evaluates to #t, control should be transferred to the given line number. Otherwise it should proceed to the next
statement, as usual. If the program
attempts to transfer to a line number that does not exist, your program should
stop executing with a message indicating the illegal line number.
11. (10
points) GOSUB/RETURN. The GOSUB command
transfers control to a subroutine (a section of code that is terminated with a
RETURN command). The general form is:
GOSUB <line number>
The GOSUB command is
similar to GOTO in that it transfers control to another part of the
program. But in the case of GOSUB, a
subsequent call on RETURN will return the program to the statement after the
call on GOSUB. The basic form of the
RETURN command is:
RETURN
BASIC supports only a
single subroutine execution at any given time.
Your program should ignore a GOSUB command and should print an error
message if the program attempts to execute two GOSUB commands in a row without
a call on RETURN in between. Your
program should ignore a RETURN command and should print an error message if the
program attempts to execute a RETURN command when there is no active subroutine
execution.
12. (5
points) Program execution should proceed sequentially from the first line of
the program until it encounters an END command.
It should produce this message when the program finishes executing:
PROGRAM TERMINATED
If it executes all
statements of the program without executing an END command, your program should
produce this message instead:
PROGRAM TERMINATED WITHOUT
REACHING END
The first token after
the line number indicates the kind of command to execute (REM, END, LET, IF,
etc.). If your program encounters an
illegal token, it should ignore the command and should produce an error message
indicating the illegal command.
Other than the
specific error handling described in this writeup, you may assume in writing
your code that the program you are provided is a legal program.
You will notice that
the starter code defines a global variable called program that stores the
program. It is helpful to be able to
declare a small number of global variables, although you don’t want to overuse
the technique. You are allowed to
declare up to 5 global variables in writing your solution (Stuart’s has 3
globals other than program).
The starter code
provided in hw7.scm gives the following command options to the user:
·
LIST: show a listing of the current
program
·
RUN: run the current program
·
SAVE: save the current program in
program.bas
·
STOP: exit the BASIC interpreter
·
UNSAVE: restore a program from
program.bas
In addition, the user
can enter lines of the program, which will all begin with a line number. They can be entered in any order. The user can also replace a line. To delete a line, the user should type the
line number followed by enter.