CSE 505 Lecture Notes:

Implementation of Functional Languages

Sept. 30, 1994


Introduction

There are two classical implementation techniques for functional languages:
  • Landin's SECD machine
  • combinators

    S = stack, E = environment, C = code vector, D = dump. The SECD machine is a little stack-based machine with a stack S. E is the current binding environment, C contains the code, and D holds older contexts (which are restored after we're done evaluating a function).

    We could also use the pure Lisp part of a traditional Lisp interpreter.
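    For concreteness, a minimal sketch of such an evaluator in Haskell (the toy Term/Value types and the name eval are assumptions made for this sketch, not anything from the notes):

        data Term  = Lit Int
                   | Var String
                   | Lam String Term
                   | App Term Term

        data Value = IntV Int
                   | Closure String Term Env

        type Env = [(String, Value)]

        eval :: Env -> Term -> Value
        eval _   (Lit n)      = IntV n
        eval env (Var x)      = maybe (error ("unbound variable " ++ x)) id (lookup x env)
        eval env (Lam x body) = Closure x body env       -- capture the current environment
        eval env (App f a)    =
          case eval env f of
            Closure x body cenv ->
              let arg = eval env a
              in arg `seq` eval ((x, arg) : cenv) body   -- argument forced first: applicative order
            IntV _ -> error "cannot apply a number"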

    The SECD machine uses applicative order evaluation. To get normal order evaluation, pass an anonymous function rather than a value as an argument. Evaluate the function whenever the parameter value is needed. (This is the same as call-by-name in Algol-60 -- in Algol-60 this anonymous function is called a thunk.) Improve efficiency by replacing the anonymous function call by its value after it is invoked -- this gives lazy evaluation.
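    A sketch of the thunk idea written out in Haskell (illustrative only: Haskell is already lazy, so the explicit wrapping below just makes the mechanism visible; Thunk, delay, force, and ifThunk are names invented for the sketch):

        type Thunk a = () -> a        -- a parameterless function: the "anonymous function" above

        delay :: a -> Thunk a
        delay x = \_ -> x             -- suspend the computation of x

        force :: Thunk a -> a
        force t = t ()                -- run it only when the value is actually needed

        -- Call-by-name "if": only the chosen branch is ever forced.
        ifThunk :: Bool -> Thunk a -> Thunk a -> a
        ifThunk c t e = if c then force t else force e

        -- ifThunk True (delay 1) (delay (error "never evaluated"))  ==>  1

    Replacing the thunk with its value after the first force gives lazy evaluation.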

    We can use lazy evaluation on an ad hoc basis (e.g. for "if"), or for all arguments. If for all arguments, we can improve efficiency using strictness analysis.


    Strictness

    A function is strict in an argument x if x must be evaluated whenever the function's value is needed.

        plus a b       is strict in both arguments
        const k x      is strict in k, but not in x
        if x y z       is strict in x, but not in y and z

    We can do some analysis and sometimes decide if a user-defined function is strict in some of its arguments.

    Examples:

        double x                                ;; strict in x
        squid n x = if n=0 then x+1 else x-n    ;; strict in n and x
        crab n x = if n=0 then x+1 else n       ;; strict in n but not x

    If a function is strict in an argument x, it is correct to pass x by value (even with normal order evaluation semantics).
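    The same examples transcribed into Haskell (a sketch; passing undefined for an argument shows operationally which arguments are demanded):

        squid :: Int -> Int -> Int
        squid n x = if n == 0 then x + 1 else x - n   -- the test needs n; both branches need x

        crab :: Int -> Int -> Int
        crab n x = if n == 0 then x + 1 else n        -- x is only needed when n == 0

        demoStrictness :: IO ()
        demoStrictness = do
          print (crab 1 undefined)    -- prints 1: crab is not strict in x
          print (squid 0 2)           -- prints 3
          -- print (crab undefined 2) would diverge: crab is strict in n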


    Combinators

    Turner's landmark 1979 paper ("A New Implementation Technique for Applicative Languages") describes an alternative implementation strategy, using combinator graphs.

    Loosely, a combinator is a function with no free variables or constants. (See, for example, Hindley and Seldin, "Introduction to Combinators and Lambda-Calculus", for a formal treatment.)

    Schönfinkel (1924) first described combinators. They provide a way of avoiding variables altogether in the lambda calculus. (Variables cause a lot of complications in describing the rewrite rules, principally because of the need to avoid accidental collisions of variable names.)

    Basic Combinators

    The S, K, I combinators (see Turner page 34) are a basic set:

        S f g x = f x (g x)
        K x y   = x
        I x     = x

    (In fact we only need S and K, since S K K = I.)

    We rewrite functions such as f x = e using the abstraction operation f = [x] e.
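    The basic combinators written as ordinary Haskell functions (lower-case names, since capitalized names are reserved for constructors):

        s :: (a -> b -> c) -> (a -> b) -> a -> c
        s f g x = f x (g x)

        k :: a -> b -> a
        k x y = x

        i :: a -> a
        i x = x

        -- S K K behaves like I:  s k k x  =  k x (k x)  =  x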

    Extensionality condition:

        ([x] E) x = E

    Notice that [x] E is similar to (lambda (x) E), but [x] E is a textual, compile-time operation.

    Rules for abstracting x:

        [x] (E1 E2) = S ([x] E1) ([x] E2)
        [x] x       = I
        [x] y       = K y,   where y is a constant or a variable not equal to x
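    The three rules transcribed directly into Haskell, over a small expression type invented for this sketch (the names Exp, Var, Con, App, and abstr are assumptions, not from the notes):

        data Exp = Var String        -- variables
                 | Con String        -- combinators and other constants
                 | App Exp Exp       -- application
                 deriving Show

        -- [x] E as a purely textual, compile-time transformation
        abstr :: String -> Exp -> Exp
        abstr x (App e1 e2)      = App (App (Con "S") (abstr x e1)) (abstr x e2)
        abstr x (Var y) | x == y = Con "I"
        abstr _ e                = App (Con "K") e   -- a constant, or a variable other than x

        -- e.g. for suc x = plus 1 x:
        --   abstr "x" (App (App (Con "plus") (Con "1")) (Var "x"))
        -- yields the representation of  S (S (K plus) (K 1)) I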

    Proofs that these rules are correct:

    Take the LHS of the first rule and apply it to x:

        ([x] (E1 E2)) x = E1 E2                                      (by extensionality)

    RHS of the first rule:

        S ([x] E1) ([x] E2) x = (([x] E1) x) (([x] E2) x) = E1 E2    (extensionality, twice)

    LHS of the second rule, applied to x:

        ([x] x) x = x                                                (extensionality)

    RHS of the second rule:

        I x = x                                                      (definition of I)

    LHS of the third rule:

        ([x] y) x = y                                                (extensionality)

    RHS of the third rule:

        K y x = y                                                    (definition of K)

    Show succ example from Turner paper.
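    One worked version of the succ example (using the definition suc x = 1+x from Example 1 below, written with a curried plus):

        suc x = plus 1 x

        suc = [x] (plus 1 x)
            = S ([x] (plus 1)) ([x] x)
            = S (S ([x] plus) ([x] 1)) ([x] x)
            = S (S (K plus) (K 1)) I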

    Add the additional combinators B and C to get more compact graphs; show the succ example with the new combinators.
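    A sketch of the standard definitions of B and C as Haskell functions (they are Haskell's (.) and flip), together with the optimization rules -- used implicitly in the derivation below -- that introduce them:

        b :: (a -> b) -> (c -> a) -> c -> b
        b f g x = f (g x)            -- B: composition

        c :: (a -> b -> r) -> b -> a -> r
        c f g x = f x g              -- C: swap the last two arguments

        -- Optimization rules applied after abstraction:
        --   S (K e1) (K e2)  =>  K (e1 e2)
        --   S (K e1) I       =>  e1
        --   S (K e1) e2      =>  B e1 e2
        --   S e1 (K e2)      =>  C e1 e2
        -- With these, suc = S (S (K plus) (K 1)) I simplifies to just (plus 1).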

    Another example:

    avg x y = (x+y)/2

    Replace + and / by curried versions:

        avg x y = divide (plus x y) 2

    Abstract y (treating x as a constant):

        avg x = [y] (divide (plus x y) 2)
              = S ([y] (divide (plus x y))) ([y] 2)
              = S (S ([y] divide) ([y] (plus x y))) ([y] 2)
              = S (S ([y] divide) ([y] (plus x y))) (K 2)
              = C (S ([y] divide) ([y] (plus x y))) 2
              = C (S (K divide) ([y] (plus x y))) 2
              = C (S (K divide) (S ([y] (plus x)) ([y] y))) 2
              = C (B divide (S ([y] (plus x)) ([y] y))) 2
              = C (B divide (S (S ([y] plus) ([y] x)) ([y] y))) 2
              = C (B divide (S (S (K plus) (K x)) I)) 2
              = C (B divide (S (K (plus x)) I)) 2
              = C (B divide (plus x)) 2

        avg = [x] (C (B divide (plus x)) 2)

    Now abstract x:

        avg = [x] (C (B divide (plus x)) 2)
            = S ([x] (C (B divide (plus x)))) ([x] 2)
            = S ([x] (C (B divide (plus x)))) (K 2)
            = C ([x] (C (B divide (plus x)))) 2
            = C (S ([x] C) ([x] (B divide (plus x)))) 2
            = C (S (K C) ([x] (B divide (plus x)))) 2
            = C (B C ([x] (B divide (plus x)))) 2
            = C (B C (S ([x] (B divide)) ([x] (plus x)))) 2
            = C (B C (S (S ([x] B) ([x] divide)) ([x] (plus x)))) 2
            = C (B C (S (S (K B) (K divide)) ([x] (plus x)))) 2
            = C (B C (S (K (B divide)) ([x] (plus x)))) 2
            = C (B C (B (B divide) ([x] (plus x)))) 2
            = C (B C (B (B divide) (S ([x] plus) ([x] x)))) 2
            = C (B C (B (B divide) (S (K plus) I))) 2
            = C (B C (B (B divide) plus)) 2

    ugh!

    Pattern Matching Combinators

    Turner also introduces combinators for pattern matching.

    Y combinator -- finds fixed points:

        Y f = f (Y f)

    Used in local recursions: E where x = ... x ...
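    In Haskell the same idea is usually written as fix (Y itself cannot be typed directly without a wrapper); a minimal sketch:

        fix :: (a -> a) -> a
        fix f = f (fix f)            -- Y f = f (Y f)

        -- A local recursion expressed through fix instead of direct recursion:
        factorial :: Integer -> Integer
        factorial = fix (\fac n -> if n == 0 then 1 else n * fac (n - 1))

        -- factorial 5 == 120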

    Example:

        ham = 1 : my_merge ham2 (my_merge ham3 ham5)
              where
              ham2 = map (*2) ham
              ham3 = map (*3) ham
              ham5 = map (*5) ham
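    A Haskell sketch of the same program; my_merge is assumed here to be an ordered merge that drops duplicates (the notes do not give its definition):

        my_merge :: [Integer] -> [Integer] -> [Integer]
        my_merge (a:x) (b:y)
          | a < b     = a : my_merge x (b:y)
          | a > b     = b : my_merge (a:x) y
          | otherwise = a : my_merge x y
        my_merge x [] = x
        my_merge [] y = y

        ham :: [Integer]
        ham = 1 : my_merge ham2 (my_merge ham3 ham5)
          where
            ham2 = map (*2) ham
            ham3 = map (*3) ham
            ham5 = map (*5) ham

        -- take 10 ham == [1,2,3,4,5,6,8,9,10,12]

    Under graph reduction, this kind of local recursion is what the Y combinator is used for.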

    S-K reduction machine

    The S-K reduction machine is a graph rewriting machine that interprets combinator code.

    Miranda uses normal order evaluation -- go down the left branch of the tree until a combinator is found, apply it to its arguments, and replace that node with the result.

    Example 1

    [show suc 2 where suc x = 1+x example]
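    A possible hand-worked version (using the code suc = S (S (K plus) (K 1)) I obtained from the abstraction rules above; at each step the leftmost combinator that has enough arguments is applied and its node is overwritten with the result):

        suc 2 = S (S (K plus) (K 1)) I 2
             => (S (K plus) (K 1) 2) (I 2)       (S rule)
             => (K plus 2) (K 1 2) (I 2)         (S rule)
             => plus (K 1 2) (I 2)               (K rule)
             => plus 1 (I 2)                     (K rule: plus demands its first argument)
             => plus 1 2                         (I rule: plus demands its second argument)
             => 3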

    Example 2: Self-optimizing code

    code is a built-in function that maps characters to numbers (ASCII codes), e.g. code '0' = 48.

        makedigit n = code n - code '0'

    After the first evaluation, the expression (code '0') will be REPLACED by 48.
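    A Haskell transcription (using Data.Char.ord in place of Miranda's code builtin); the self-optimizing replacement of (code '0') by 48 happens in the combinator graph at run time and is not visible in the source:

        import Data.Char (ord)

        makedigit :: Char -> Int
        makedigit n = ord n - ord '0'

        -- makedigit '7' == 7;  ord '0' == 48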

    Example 3

        foldr op r = f
                     where
                     f [] = r
                     f (a:x) = op a (f x)

        sum = foldr (+) 0

    After the first evaluation of sum, it will be rewritten to the equivalent of:

        sum [] = 0
        sum (a:x) = a + sum x

    Later developments

    Later developments:

      • lambda lifting
      • supercombinators (combinators are abstracted from the user's program)
      • the G-machine
      • strictness analysis
      • compilation to conventional single-processor architectures
      • compilation for conventional parallel hardware
      • special-purpose hardware for parallel graph reduction

    Lambda lifting: if we have a local function definition with free variables, we can move it to the top level by adding additional arguments that are then applied to the free variables (see the sketch below). Example: f x = e, where e contains a free variable y. Define a new function f' y x = e at the top level, and replace calls to f by f' y.

    Supercombinators: combinators are abstracted from the user's program (Johnsson et al., Chalmers University). This technique is used in, e.g., one of the Haskell implementations.
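    A small Haskell illustration of lambda lifting (the names f1, f2, and g' are made up for this sketch):

        -- Before lifting: the local function g has a free variable y.
        f1 :: Int -> [Int] -> [Int]
        f1 y xs = map g xs
          where g x = x + y

        -- After lifting: g' takes y as an extra argument and is a top-level definition;
        -- calls to g become calls to (g' y).
        g' :: Int -> Int -> Int
        g' y x = x + y

        f2 :: Int -> [Int] -> [Int]
        f2 y xs = map (g' y) xs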