handout #16

CS341—Programming Languages

Programming Assignment #7 (Scheme)

due 5 pm Saturday, 3/3/07

In this assignment you will implement a significant subset of the BASIC programming language as described in the 1964 Dartmouth report available here:

http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf

You will be provided a starter file called hw7.scm that has code for handling the top-level user interaction and a procedure called tokenize that will convert user input into lists of tokens.  You should include your answers in hw7.scm.

The first set of procedures you are to write parse various token sequences.  We will be using an approach known as recursive descent parsing in which we define a series of procedures based on the underlying grammar with one procedure for each production from the grammar.  The grammar we will be using can be described using the following EBNF productions:

<test> ::= <expression> ("<" | ">" | "<=" | ">=" | "=" | "<>") <expression>

<expression> ::= <term> {("+" | "-") <term>}

<term> ::= <factor> {("*" | "/") <factor>}

<factor> ::= <number> | - <number> | ( <expression> )

You should define a procedure to parse each of the four elements described above.  Your procedures will take a list of tokens as an argument.  It should consume tokens from the list and replace those tokens with the value obtained by evaluating the tokens.  For example, parse-term evaluates the multiplicative operators * and /.  If it is passed the list (2.5 * 3.4 / 2 THEN 210), it will evaluate 2.5 * 3.4 / 2, obtain 4.25 and replace the tokens it parsed with 4.25, returning the list (4.25 THEN 210).  These procedures will be “greedy” in the sense that they will consume as many tokens as they can that can be part of the given grammar element.  The token “THEN” can’t be part of a term, so the procedure stops consuming tokens when it encounters it.

Parentheses are difficult to handle as symbols, so the lists you will be processing use the tokens lparen and rparen for left parenthesis and right parenthesis.

You are to write the following parsing procedures:

1.      (5 points) Write a procedure parse-factor that parses a factor at the front of a list, replacing the tokens that were part of the factor with the numeric value of the factor.  As indicated in the grammar above, a factor is either a simple number or a minus followed by a number (the negation of a number) or a parenthesized expression.  For example:

> (parse-factor '(3.5 2.9 OTHER STUFF))

(3.5 2.9 OTHER STUFF)

> (parse-factor '(- 7.9 3.4 * 7.2))

(-7.9 3.4 * 7.2)

> (parse-factor '(lparen 7.3 - 3.4 rparen + 3.4))

(3.9 + 3.4)

2.      (10 points) Write a procedure parse-term that parses a term at the front of a list, replacing the tokens that were part of the factor with the numeric value of the factor.  As indicated in the grammar above, a term is a series of factors separated by the tokens * and / (multiplication and division).  Your procedure should consume as many tokens as possible that could be part of a factor.  The operators should be evaluated left-to-right.  For example:

> (parse-term '(2.5 * 4 + 9.8))

(10.0 + 9.8)

> (parse-term '(38.7 / 2 / 3 THEN 210))

(6.45 THEN 210)

> (parse-term '(7.4 * lparen 2.4 - 3.8 rparen / 4 - 8.7))

(-2.59 - 8.7)

3.      (10 points) Write a procedure parse-expression that parses an expression at the front of a list, replacing the tokens that were part of the expression with the numeric value of the factor.  As indicated in the grammar above, an expression is a series of terms separated by the tokens + and – (addition and subtraction).  Your procedure should consume as many tokens as possible that could be part of an expression.  The operators should be evaluated left-to-right.  For example:

> (parse-expression '(12.4 - 7.8 * 3.5 THEN 40))

(-14.9 THEN 40)

4.      (10 points) Write a procedure parse-test that parses a test at the front of the list, replacing the tokens that were part of the test with the boolean value of the test (#t or #f).  As indicated in the grammar above, a test is an expression followed by a relational operator followed by an expression.  For example:

In writing your parse routines, you should test for potential errors and should call the error procedure if an error is encountered.  The call should indicate which parsing routine detected the error, as in:

(error "illegal factor")

For the second part of the assignment, you are going to complete a procedure called run-program.  It will be passed a list of BASIC instructions to execute.  Each value in the list will have three parts: a line number, the original text of the line (something your code will ignore) and a list of tokens for the input line.  For example, the following BASIC program:

10 LET X = 2.4 * 3.7

20 PRINT X

30 LET X = X/2

40 IF X<200 THEN 20

50 END

Would be represented in list form as follows:

((10 " LET X = 2.4 * 3.7" (LET X = 2.4 * 3.7))

 (20 " PRINT X" (PRINT X))

 (30 " LET X = X/2" (LET X = X / 2))

 (40 " IF X<200 THEN 20" (IF X < 200 THEN 20))

 (50 " END" (END)))

You are to implement the following BASIC commands:

5.      (5 points) REM.  The REM command is for comments (short for REMarks).  The general form is:

REM <text>

A REM command does nothing.

6.      (5 points) END.  The END command terminates a program.  The general form is:

END

Your program should terminate execution when it encounters this command.  The BASIC language description indicates that this must be the last line of a BASIC program.  We will not enforce this rule (partly because we are not implementing the STOP command).

7.       (15 points) LET.  The LET command assigns a value to a variable.  The general form is:

LET <variable> = <expression>

The expression will always have a numeric result.  The expression might include references to other variables.  The LET command can be used to define a new variable or to reassign a variable that was previously assigned a value.

8.      (10 points) PRINT.  The PRINT command prints the values of expressions.  The general form is:

PRINT <expression>, <expression>,..., <expression>

Commas will be represented with the token comma, as in (PRINT 3.4 comma 7.8).  The PRINT command should evaluate each expression, printing its value.  Values should be separated by a space and each PRINT command should produce a complete line of output.  You should not print a trailing space on the  line.  The expressions are guaranteed to be numeric, although they might refer to variables.

9.      (10 points) GOTO.  The GOTO command transfers control to another part of the program.  The general form is:

GOTO <line number>

A GOTO command transfers control to the given line number.  The line number will be a simple integer (not an expression).  If the line number does not exist, your program should stop executing with a message indicating the illegal line number.

10.  (5 points) IF.  The IF command is for conditionally transferring control.  The general form is:

IF <test> THEN <line number>

The test will be of the form described in the grammar.  The line number will be a simple integer.  If the test evaluates to #t, control should be transferred to the given line number.  Otherwise it should proceed to the next statement, as usual.  If the program attempts to transfer to a line number that does not exist, your program should stop executing with a message indicating the illegal line number.

11.  (10 points) GOSUB/RETURN.  The GOSUB command transfers control to a subroutine (a section of code that is terminated with a RETURN command).  The general form is:

GOSUB <line number>

The GOSUB command is similar to GOTO in that it transfers control to another part of the program.  But in the case of GOSUB, a subsequent call on RETURN will return the program to the statement after the call on GOSUB.  The basic form of the RETURN command is:

RETURN

BASIC supports only a single subroutine execution at any given time.  Your program should ignore a GOSUB command and should print an error message if the program attempts to execute two GOSUB commands in a row without a call on RETURN in between.  Your program should ignore a RETURN command and should print an error message if the program attempts to execute a RETURN command when there is no active subroutine execution.

12.  (5 points) Program execution should proceed sequentially from the first line of the program until it encounters an END command.  It should produce this message when the program finishes executing:

PROGRAM TERMINATED

If it executes all statements of the program without executing an END command, your program should produce this message instead:

PROGRAM TERMINATED WITHOUT REACHING END

The first token after the line number indicates the kind of command to execute (REM, END, LET, IF, etc.).  If your program encounters an illegal token, it should ignore the command and should produce an error message indicating the illegal command.

Other than the specific error handling described in this writeup, you may assume in writing your code that the program you are provided is a legal program.

You will notice that the starter code defines a global variable called program that stores the program.  It is helpful to be able to declare a small number of global variables, although you don’t want to overuse the technique.  You are allowed to declare up to 5 global variables in writing your solution (Stuart’s has 3 globals other than program).

The starter code provided in hw7.scm gives the following command options to the user:

·        LIST: show a listing of the current program

·        RUN: run the current  program

·        SAVE: save the current program in program.bas

·        STOP: exit the BASIC interpreter

·        UNSAVE: restore a program from program.bas

In addition, the user can enter lines of the program, which will all begin with a line number.  They can be entered in any order.  The user can also replace a line.  To delete a line, the user should type the line number followed by enter.