handout #16

CSE341—Programming Languages

Programming Assignment #8 (Scheme)

due 11 pm Friday, 5/28/10

In this assignment you will implement a significant subset of the BASIC programming language as described in the 1964 Dartmouth report available here:

http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf

Your solution will build on the parsing functions of assignment 7 and you will be provided with a set of utilities called hw8-support.scm that has code for handling the top-level user interaction and a function called tokenize that will convert user input into lists of tokens.  Your solution should be called hw8.scm and should include the following calls:

(include "hw7.scm")

(include "hw8-support.scm")

A solution to hw7 will be emailed to all students once the final deadline has passed.

The primary task for this assignment is to write a function called run-program.  It will be passed a list of BASIC instructions to execute.  Each value in the list will have two parts: a line number and a list of tokens for the input line.  For example, the following BASIC program:

10 LET X = 203.4 * 37.7

20 PRINT X

30 LET X = X/2

40 IF X>200 THEN 20

50 END

Would be represented in list form as follows:

((10 (LET X = 203.4 * 37.7))

 (20 (PRINT X))

 (30 (LET X = X / 2))

 (40 (IF X > 200 THEN 20))

 (50 (END)))

Variable names in BASIC are very limited.  They can be one letter long or one letter followed by one digit.  The hw8-support file includes a predicate variable? that can be used to test whether something is a legal variable name.

This writeup indicates that you should generate specific error messages.  That means that you should call the error procedure with the given text.  In addition, unless otherwise specified, your error string must include the line number on which the error occurred, as in:

LINE 30: ILLEGAL FOR

You are not required to handle every potential error, although you can include extra error cases if you want to.  You will want to call the parse-test and parse-expression functions of assignment 7, but to do so, you should instead call the functions try-parse-test and try-parse-expression that are provided in hw8-support.  These versions take an extra parameter that specifies the line number so that if an error occurs, it will be reported with the line number, as in:

LINE 55: ILLEGAL EXPRESSION

Whenever a command involves parsing an expression that has an undefined variable in it, you can allow parse-expression to generate the error.

As noted above, the assignment involves writing a single function called run-program.  Obviously you will want to include many helper functions that are called by run-program.  You are allowed to define these in the top-level environment.  Your run-program function should properly implement the following BASIC commands:

1.      (5 points) REM.  The REM command is for comments (short for REMarks).  The general form is:

REM <text>

A REM command does nothing.

2.      (5 points) END.  The END command terminates a program.  The general form is:

END

The hw8-support code guarantees that there will be exactly one occurrence of this command and that it will be the last line of the program.  Your program should terminate execution when it encounters this command.

3.       (15 points) LET.  The LET command assigns a value to a variable.  The general form is:

LET <variable> = <expression>

The expression will always have a numeric result.  The expression might include references to other variables.  The LET command can be used to define a new variable or to reassign a variable that was previously assigned a value (just as in Java).  You should make sure that LET is followed by a legal variable name which is followed by an equals sign.  If not, you should generate the error message “ILLEGAL LET”.  The hw8-support file includes a predicate called variable? that can be used to test whether a variable name is legal in BASIC.  If the line begins with these three elements, you should call the try-parse-expression function to parse the expression (this will generate an error with the line number you provide if the expression is missing or illegal).  If the expression is legal, you will need to remember this value for the variable.  If the expression is followed by extraneous text, you should generate the error message “EXTRANEOUS TEXT AFTER LET EXPRESSSION.”

4.      (5 points) INPUT.  The INPUT command reads a number from the user and stores it in a variable.  The general form is:

INPUT <variable>

This command wasn’t part of the original BASIC specification, but it is obviously very useful.  If the variable is missing or if the variable name is not legal or if there is extraneous text after the variable name, you should generate the error message “ILLEGAL INPUT COMMAND” (as noted earlier, the support file has a predicate called variable? that can be used to test whether a variable name is legal).  Otherwise you should call Scheme’s read function to read the number (it takes no arguments, as in (read)).  You should make sure that the input you get is either an integer or a real.  If it is, you should assign the given variable that value.  Otherwise you should generate the error message “INPUT MUST BE A NUMBER”.

5.       (10 points) PRINT.  The PRINT command prints a sequence of expressions and strings separated by commas.  The general form is:

PRINT (<expression> | <string>) {"," (<expression> | <string>)}

Commas will be represented with the token comma, as in (PRINT “x =” comma 7.8).  Each string should be displayed and each expression should be evaluated and displayed (the display procedure works for both strings and numbers).  Values should be separated by a space.  For this general form of the PRINT command, there should be no extraneous space on the end of the line and each PRINT command should produce a complete line of output (the newline procedure goes to a new line of output).  The expressions are guaranteed to be numeric, although they might refer to variables.  You should call try-parse-expression to parse the expressions.

There are some special versions of PRINT that you also must handle.  PRINT might appear on a line by itself, in which case you should produce a blank line of output.  The final expression might also be followed by a comma that has nothing after it.  That certainly seems like a strange command, but that was how in BASIC you requested the equivalent of a System.out.print versus a System.out.println.  In particular, if the final expression of a PRINT command is followed by a comma without another expression, you should not call newline to complete that line of output and you should terminate the line with the space that would normally have separated the previous expression from what comes next.  For example, if you execute these lines of code:

10 PRINT 2 + 2,

20 PRINT 3,

30 PRINT 42, 15

The result would be a single line of output with a single space separating the items and no trailing space at the end:

4 3 42 15

If a legal expression is followed by something other than a comma, you should generate the error message “ILLEGAL PRINT”.

6.       (10 points) GOTO.  The GOTO command transfers control to another part of the program.  The general form is:

GOTO <line number>

A GOTO command transfers control to the given line number.  The line number will be a simple integer (not an expression).  If the line number does not exist, you should generate the error message “NO SUCH LINE NUMBER”.  If the GOTO command is not followed by an integer or if it has extraneous text after the integer, you should generate the error message “ILLEGAL GOTO.” 

7.      (5 points) IF.  The IF command is for conditionally transferring control.  The general form is:

IF <test> THEN <line number>

The test will be of the form described in the homework 7 grammar.  The line number will be a simple integer.  If the test evaluates to #t, control should be transferred to the given line number.  Otherwise it should proceed to the next statement, as usual.  If the word THEN is missing or if it is not followed by an integer or if the integer has extraneous text after it, you should generate the error message “ILLEGAL IF”.  If the program attempts to transfer to a line number that does not exist, you should generate the error message “NO SUCH LINE NUMBER”.  You should call try-parse-test to parse the test.

8.      (10 points) GOSUB/RETURN.  The GOSUB command transfers control to a subroutine (a section of code that is terminated with a RETURN command).  The general form is:

GOSUB <line number>

The GOSUB command is similar to GOTO in that it transfers control to another part of the program.  But in the case of GOSUB, a subsequent call on RETURN will return the program to the statement after the call on GOSUB.  The basic form of the RETURN command is:

RETURN

If the GOSUB command is not followed by an integer or if it has extraneous text after the integer, you should generate the error message “ILLEGAL GOSUB.”  If the RETURN has extraneous text after it, you should generate the error message “ILLEGAL RETURN”.  If the line number mentioned in GOSUB does not exist, you should generate the error message “NO SUCH LINE NUMBER”.  If the program reaches an END while a subroutine execution is active (i.e., without having encountered a RETURN statement) you should generate the error message “MISSING RETURN” (this should have the line number of the END command).

BASIC supports only a single subroutine execution at any given time.  Your program should generate the error message “ALREADY IN SUBROUTINE” if the program attempts to execute two GOSUB commands in a row without a call on RETURN in between.  Your program should also generate the error message “NOT IN SUBROUTINE” if the program attempts to execute a RETURN command when there is no active subroutine execution.

9.       (30 points) FOR/NEXT.  The FOR command is used for looping.  The general form is:

FOR <variable> = <expression> TO <expression>

The expressions represent the starting and ending values for the loop.  The expressions have to be numerical, but they need not be integers.  Each FOR command should be paired with a corresponding NEXT command using the same variable name:

NEXT <variable>

When the NEXT command is encountered, you examine the value obtained by incrementing the loop variable (adding 1 to it).  If that value is less than or equal to the final value for the loop, then you set the variable to that value and go back to the top of the loop.  If the value after incrementing would be larger than the final value, then you leave the variable unchanged and transfer control to the statement that comes after the NEXT command.

For example, consider the following loop:

100 FOR X = 0.5 TO 5

110 PRINT X

120 NEXT X

130 PRINT X

The loop produces the following output:

0.5

1.5

2.5

3.5

4.5

4.5

Notice that the loop stops when X becomes 4.5 because incrementing it again would cause it to be greater than the final value of 5.  Also notice that X has the value 4.5 after the loop.

The BASIC standard says that if the starting value for a loop is greater than the ending value, then the body of the loop should be skipped and afterwards the variable should be one less than the starting value.  For example:

10 FOR X = 5 to 3

20 PRINT X

30 NEXT X

40 PRINT X

The body of the loop is not execute (the PRINT in line 20) and after the loop the value of X is reported to be 4 (one less than the starting value).

Loops can be nested, as in:

100 FOR X = 1 TO 3

110 FOR Y = 1 TO X

120 PRINT X, Y, X + Y

130 NEXT Y

140 NEXT X

Loops are required to be properly nested.  For example, in the code above you would not be allowed to switch the order of lines 130 and 140 because then we would encounter the NEXT command for the outer loop before encountering the NEXT command for the inner loop.

The BASIC specification describes a loop in terms of the FOR command that begins the loop, the NEXT command that ends the loop and the body of commands that appear in between.  The nesting requirement says that if one FOR/NEXT loop is going to appear in the body of another, then the nested loop must appear in its entirety in the body of the outer loop (i.e., the FOR command, the body of the inner loop and the NEXT command).

There are some unusual cases that can arise if a programmer calls GOTO or GOSUB inside of a loop.  In general, a programmer should not escape a loop with a GOTO command.  A call on GOSUB can be okay, although there are some odd cases (for example, what if the GOSUB code includes a NEXT command for the original loop?).  In general, we aren’t going to worry about many of these subtle errors.

You should call try-parse-expression to parse the two expressions.  Your are required to generate some specific error messages:

·         If FOR is not followed by a legal variable name or if the variable is not followed by =, generate the error message “ILLEGAL FOR”

·         If the first expression is not followed by TO, generate the error message “MISSING TO”

·         If there is extraneous text after the second expression, generate the error message “EXTRANEOUS TEXT AFTER ENDING EXPRESSION”

·         The variable mentioned in a NEXT command should match the variable from the most recently encountered FOR command.  If it doesn’t, you should generate the error message “VARIABLE IN NEXT DOESN'T MATCH”

·         If a program uses the same variable in both an inner and outer loop, you should generate the error message “ILLEGAL NESTED LOOP”

·         If a NEXT command is not followed by a legal variable name and nothing else, generate the error message “ILLEGAL NEXT”

·         If a NEXT command does not have a corresponding FOR, generate the error message “NEXT WITHOUT FOR”

·         If you reach the END command without completing all loops (this will include catching the case where there is a for without a corresponding next), generate the error message “MISSING NEXT”.

10.   (5 points) These final requirements are behavior requirements more than commands to implement.  Program execution should proceed sequentially from the first line of the program until it encounters the END command.  It should produce this message when it encounters the END command and finds no errors to report:

PROGRAM TERMINATED

The first token after each line number indicates the kind of command to execute (REM, END, LET, IF, etc.).  If your program encounters an illegal token (something not mentioned in problems 1 through 9), it should generate the error message “ILLEGAL COMMAND”.

Other than the specific error handling described in this writeup, you may assume in writing your code that the program you are provided is a legal program.

It is helpful to be able to declare a small number of global variables, although you don’t want to overuse the technique.  You are allowed to declare up to 7 global variables in writing your solution (Stuart’s has 5 globals).

The starter code provided in hw8-support.scm gives the following command options to the user (most of which come from the BASIC specification):

·         LIST: show a listing of the current program

·         NEW erases the current program

·         OLD allows you to read in a previously saved program

·         RENAME allows you to change the name of the current program file (only takes effect for future calls on SAVE or UNSAVE)

·         RENUMBER renumbers the lines of the current program in increments of 10

·         RUN: run the current  program

·         SAVE: save the current program in program.bas

·         STOP: exits the BASIC interpreter

·         UNSAVE: restore a program from program.bas

For the commands that involve specifying a file name (NEW, OLD, RENAME), you can either type the command and hit enter or you can type the command followed by the file name.

In addition to the commands, the user can enter lines of the program, which will all begin with a line number.  They can be entered in any order.  The user can also replace a line.  To delete a line, the user should type the line number followed by enter.

The supporting code guarantees certain invariants for the program variable passed as a parameter to the call on run-program:

·         There will be exactly one legal call on END and it will be the last line in the program.

·         No line will be empty.

·         Line numbers do not necessarily increase in increments of 10, but they will be positive numbers and they will appear in increasing order.

There is a global variable called “debug” that can be helpful for debugging purposes.  After you give a STOP command to exit the interpreter, you can use the variable “debug” to access the last program you were running (i.e., the last value passed to your run-program function).