Trefoil v3 Language Reference

Start by reading the informal description in hw6 if you haven't already.

The definition below describes the syntax and semantics of Trefoil v3 programs. Italicized words are technical terms being defined.

Many of the concepts are the same as Trefoil v2. We do not repeat them here. Instead, we describe only the changes and new features.

Syntax

The syntax has four levels: characters, tokens, parenthesized symbol trees, and abstract syntax trees (ASTs). Only the AST level changes compared to Trefoil v2.

Abstract Syntax Tree level

A program consists of a sequence of bindings.

A binding is one of the following:

  • Any of the binding forms available in Trefoil v2.
  • Function binding: a node with head define with exactly two arguments.
    • The first argument is itself a non-empty node containing only symbols. The first symbol in this node is the function's name. The other symbols are the function's parameter names.
    • If any symbol appears more than once in the parameter names, signal an abstract syntax error.
    • The second argument is an arbitrary expression representing the function's body. (Not necessarily a node!)
    • Example: (define (f x y) (* y (+ x 2)))
      • defines a function named f with parameters x and y and the given body expression

An expression is one of the following:

  • Any of the expression forms available in Trefoil v2, except for let, which is updated below.
  • We change the syntax of let as follows.
  • Let expression: a node with head let and exactly two arguments
    • The first argument is a node with any number of letdefs as children (including zero).
      • A letdef is a node with exactly two children.
        • The first child is a symbol representing the variable name being defined.
        • The second child is an expression representing the variable's definition.
      • If any variable name is defined more than once, signal an abstract syntax error.
    • The second argument to the top-level let is an expression representing the "body" of the let expression (the scope in which the variable definition is available).
    • Example: (let ((x 3) (y 4)) (+ x (+ y 1)))
  • Cond expression: a node with head cond and any number of arguments, each of which is a cond clause.
    • A cond clause is a node with exactly two children, both of which are expressions.
    • Example: The body of this function is a cond expression with two clauses.
      (define (sum l)
        (cond
          ((nil? l) 0)
          (true (+ (car l) (sum (cdr l))))))
      
  • Function call expression: a node with a head that is not any of the keywords used as the head of any PST node anywhere in Trefoil v2 or v3, and any number of arguments, each of which is an expression.
    • Example: (f 0 true)
      • Calls the function named f with the two arguments 0 and true

Semantics

The meaning of a Trefoil v3 program is the same as in Trefoil v2. We only need to describe the semantics of the new binding and expression forms. We will also adjust the definition of the dynamic environment.

The dynamic environment

We change the definition of a dynamic environment to map strings to entries. An entry is either:

  • a variable entry, which is just a value
  • a function entry, which is a function binding and a dynamic environment (called the "defining environment")

The two operations on dynamic environments are lookup and extension. The lookup operation looks up the name in the map and returns the corresponding entry. The extension operation takes an old dynamic environment, a name, and an entry, and returns a new dynamic environment that is the same as the old except that the given name now maps to the given entry.

Semantics of bindings

We update the semantics to take into account that the dynamic environment now maps to entries, not just values.

  • For a variable binding, wrap the value in a variable entry before inserting it in the dynamic environment.

We extend the semantics for the new kinds of bindings.

  • Consider a function binding (define (f params) e), where e stands for any expression and params stands for a possibly empty sequence of symbols representing the parameter names. Call the current dynamic environment env. Return the env extended with f maps to the function entry ((define (f params) e), env). In other words, we capture the current dynamic environment env and store it together with the entire function binding for f in the entry.

Semantics of expressions

We update the semantics to take into account that the dynamic environment now maps to entries, not just values.

  • Consider a variable reference expression x where x stands for any variable name. The semantics is to perform a lookup operation for x in the current dynamic environment. If x maps to a variable entry with value v, return v. Otherwise signal an error.
    • To clarify, if x maps to a function, signal an error.
    • To clarify, if x doesn't map to anything in the current environment, signal an error.

We change the semantics of let as follows.

  • Consider a let expression (let (defs) body), where body stands for any expression and defs stands for any sequence of letdefs. Each letdef is of the form (xi ei) where xi is an arbitrary symbol and ei is an arbitrary expression. The semantics is to iterate through all the letdefs and evaluate each ei in the current dynamic environment to a value vi. Then evaluate body in a new dynamic environment constructed by extending the current dynamic environment with mappings xi -> vi for each xi and vi from the sequences above. Call the resulting value v. Return v.
    • Note that all the ei are evaluated in the old dynamic environment. So variables from earlier letdefs in the list are not in scope for any ei. (This is different from nested let expressions, where outer/earlier variables would be available.)

We extend the semantics for the new kinds of expressions.

  • Consider a cond expression (cond clauses) where clauses stands for any sequence (possibly empty) of cond clauses. Each cond clause is of the form (pi bi) where pi and bi stand for arbitrary expressions. The semantics is to iterate over the clauses in order. For each clause (pi bi), evaluate pi in the current dynamic environment to a value vi. If vi is false, continue the loop. Otherwise, if vi is anything whatsoever besides false, stop the loop and evaluate bi in the current dynamic environment to a value and return that. If the loop reaches the end of the sequence of clauses without finding any pi that evaluates to something besides false, then signal an error.
  • Consider a function call expression (f args), where f stands for any function name, and args stands for any sequence (possibly empty) of expressions to be passed as arguments to f. Call the current dynamic environment callenv. The semantics is to first lookup f in callenv. If f is not mapped to anything, or maps to anything besides a function entry, signal an error. From the function entry mapped to by f, let (define (f params) body) be f's function binding and defenv its defining environment. If args and params are sequences of different lengths, signal an error. Next, evaluate each element of args in left-to-right order in callenv. This results in a sequence of values. Call that sequence vals. Then evaluate body in a new dynamic environment constructed by extending defenv with mappings param -> val for each param and corresponding value val in the lists params and values, and with another mapping from f to its function entry (to implement recursion). Return the result of evaluating body.