Retro school children University of Washington Computer Science & Engineering
 CSE 401: PL/0 - Project Description
  CSE Home  About Us    Search    Contact Info 

We have provided you with a working compiler (and interpreter) for a very simple language we call PL/0. We do not have a detailed description of this base language, but it is simple enough that it should be clear from studying a sample program, the base lexical and BNF descriptions, and the compiler itself. (Of course, ask questions if not.)

Your course project will be to modify the base PL/0 compiler (but not the interpreter) to implement the language extensions described below. This will happen in stages: lexing, parsing, semantic analysis, storage allocation, and code generation.

Comments
Extend the PL/0 language to allow comments. A comment begins with two slashes - do hackers call them whacks? - ("//") and extends to the end of the line. Just like in C++ and Java single-line comments. The following code illustrates some comments:
    i := x + y; // This is a comment. It's silly, but a comment nonetheless.
    j := a - b; // This computes "a + (-b)" (now /that/ was an useful comment).
    // a comment; still in the comment // and more comment
    k := i + // guess what, a comment can be interspersed with the computation!
        j; // tricky, but legal. Performs k := i + j;
If-Then-Else Statements
Replace the if statement with an if-then-else statement sporting the following syntax:
    <if stmt> ::= if <test> then <stmt list-1> [else <stmt list-2>] end
The semantics are as expected: if the <test> evaluates to true, execute the statements in <stmt list-1>; otherwise, execute the statements in <stmt list-2>, if any. (Note: the -1 and -2 do not represent new syntactic entitites, but rather distinguish two entities that have the same syntax.)

For Loops
Add a for loop to the language by augmenting the <stmt> production and by adding the production:
    <for stmt> ::= for <id> := <expr-1> to <expr-2> do <stmt list> end

The semantics of the for statement are identical to those of the following while statement:

    <id> := <expr-1>
    while <id> <= <expr-2> do
        <stmt list>
        <id> := <id> + 1;
    end
Note that this means that expr-1 is evaluated only once before entering the loop, and expr-2 is evaluated at each iteration. Also, the loop variable must be declared as an integer variable. It is legal for the initial expression to be greater than the final expression; in this case, the body of the loop is never executed. Assignments to the loop index variable within the loop body are allowed.

Continue Statements
The extended language supports the continue statement. The continue statement has the effect of skipping the rest of the body in a while or for loop, and starting a new iteration. It is an error for a continue statement to appear outside a loop. (It need not be directly inside a loop; e.g. it is legal inside an if that is itself inside a loop.)

Booleans
Add first-class Boolean types to the language. This means that the PL/0 programmer can declare variables to be Booleans, using them in essentially all the ways in which integers (the only previously available first-class type) can be used, except A type, bool, is supported as the type of these variables. Additionally, test expressions in if and while statements must be Boolean, not integer, and the result of the odd operator is of Boolean type, not integer. (The expression odd x is true if and only if the value of x is an odd integer.)

Two new Boolean constants are supported, true and false. In addition, four Boolean operators are introduced, and, or, and if, and or else. All these operators take Boolean expressions as arguments and return a Boolean value. The first two are not short-circuiting; they always evaluate both their sides. The last two are short-circuiting, meaning that the second (rightmost) argument is not evaluated if the result of the first argument fully determines the result. They are all left-associative. The two and operators have the same precedence. Similarly, the two or operators have the same precedence. The and operators have higher precedence than the or operators; but and and and if have lower precedence that all other operators. Integer operators do not apply to Booleans and vice versa, although = and <> may be applied if both operands are of the same type.

Examples:

    var a:int, x:int;
    var b:bool;
    if a = 0 or else x / a > 3 and b then <stmt list> end;
    // test parses as "(a = 0) or else (((x / a) > 3) and b)"
    // if a = 0, then don't evaluate the rest of the test
    a := b;             // illegal
    a := 3 * b;         // illegal
    while 1 do ... end; // illegal
    while 1=1 do ... end; // legal, heh heh

Constant Expressions
Named scalar (non-array) constants can be initialized with arbitrary integer and Boolean expressions involving integer literals, Boolean literals, and previously declared named constants. Expressions may not involve procedure parameters, function calls, nor the input primitive. The following examples illustrate some legal constant declarations:
    const a:int = 3 + -4;
    const b:int = 5 * a;
    const c:bool = a > b or odd b;

Arrays
The extended language supports arrays. The type of an array is described as:
    array [  <const expr>  ] of <type>
where the constant expression is constrained to evaluate to a positive (greater than 0) integer, indicating the size of the array for the given type. (Square brackets here and in the definition of array references below are required input tokens, not the meta-notation meaning "optional, " as in the rest of this document.) The type of the elements can be any type, including another array. For example, the following are some legal array variable declarations:
    var a:array [10] of int;
    var b:array [25] of array [100] of bool;
    const size: int = 5 * 100;
    var c:array [size] of array [size + 1] of int;

In the extended language, the following notation references an array element:

    <lvalue> [ <expr> ]
where the <lvalue> production has been modified to:
    <lvalue> ::= IDENT | <lvalue> [ <expr> ]
The <lvalue> should evaluate to some array a, of size n, and the index expression should evaluate to some integer i. This expression returns the i-th element of the array a, where the first element of the array is at index 0. It is an unchecked run-time error for i to be outside the range [0..n-1]. PL/0 does not have "multi-dimensional" arrays, but does allow arrays whose base type is an array (and so on, recursively). Hence, the <lvalue> may be another array indexing operation. Assignment to, arithmetic or Boolean operations on, and comparisons of arrays is disallowed. Use of whole arrays in input and output statements is also forbidden. However, arrays may be passed as (call-by-reference) parameters to procedures and functions (below). Examples:
    var a:array [5] of array [3] of int;
    var b:array [3] of int;
    a[1][i+j] := b[k-1] * 2;
    a[1] := b;      # This is illegal
    b := input;     # So is this
Functions
A procedure may optionally return a result; such a value-returning procedure is called a function. To indicate that a procedure returns a result, a result type declaration is added after the list of formal parameters in the procedure declaration. (This is optional, since procedures are still supported as well as functions.)

The following are legal function declarations:

    procedure foo(x:int, y:bool): int;
    begin
       <stmt list>
    end foo;

    procedure bar(): bool;
    var x:int;
    begin
       <stmt list>
    end bar;
Function calls can be used anywhere that an expression of the appropriate type can be used; syntactically, function call expressions have higher precedence than other operators. Functions may also be called like procedures, as a separate statement. For example, given the above function declarations, the following is a legal statement:
    foo(foo(3,bar()), i < foo(x+y,bar()));
The type of the result of a function must be a scalar type; whole arrays cannot be returned. The result is returned by value.

Return Statements
The return statement can be used to exit a procedure, immediately transferring control to the end of the procedure body. It is legal for a procedure (but not a function) to return by "falling off" the procedure body without encountering an explicit return statement.

To return a value from a function, the return statement is used with an expression. Both forms are captured by

    <return stmt> ::= return [ <expr> ]
The type of the expression returned must match the type declared for the enclosing function. It is illegal for a value-returning return statement to appear in a procedure not declared to return a result. It is similarly illegal for a non-value-returning return statement to appear in a function. It is an error if it is possible for execution to reach the end of a function's body without executing a return statement. In particular, a function whose only return statement is in a then clause of an if statement is invalid, since if the then clause is not taken the function will not return anything.

The following illustrates some legal uses of return statements:

    procedure foo(x:int,y:bool):int;
    begin
       if y then return x; end;
       return x * 5;
    end foo;
Your Pet Feature Here

Yes, that's right! Ever felt frustrated that your favorite language doesn't offer that one feature that would improve your productivity tenfold? From tri-state Booleans to non-local returns to exceptions to interval arithmetic to hygienic macros to the legendary comefrom statement, this is your chance to make the world a better place!

Each team should choose a feature and implement it as part of the project. So get in touch with that inner creative genius!

Known Problems
The base PL/0 compiler contains a few "features" that you might want to be aware of. You do not need to fix any of these.