We have so far discussed the notion of scope (i.e., where a name binding is visible) rather informally. I would like to make your intuition more precise by describing ML's scoping rules in more detail.
Bindings in ML live in environments. Conceptually, each environment (except the top-level environment) consists of a set of name bindings and a pointer to a parent environment.
Consider the following code:
    - val x = 5;
    val x = 5 : int
    - fun f y = x + y; (* 1 *)
    val f = fn : int -> int
    - val x = 7; (* 2 *)
    val x = 7 : int
    - f 10; (* 3 *)
    val it = 15 : int
In this code, the reference to x inside f refers to the binding x = 5. Regardless of whatever other bindings are later added to the top-level environment, the body of f will always be evaluated in the environment in which f was defined.
Figs. 1-3 show how this is implemented, conceptually. In Fig. 1, we have evaluated the declaration of f (at the line marked (* 1 *) above), which includes evaluating the function value. Unlike previous diagrams of function values in memory, we have included a picture of the value, sometimes called a closure, which contains two parts: the code of the function and a pointer to the environment in which the function was defined. In this sense, f "captures" the environment in a "closure". Notice that this pointer refers to a part of the environment that includes f itself; this is necessary for recursive function definitions.

Fig. 2 shows what happens when we evaluate val x = 7 (at the line marked (* 2 *)). Another binding is added that shadows the previous binding, but only for later declarations. The closure continues to point to the older environment.
Finally, Fig. 3 shows what happens when we apply the function bound to f (at the line marked (* 3 *)). First, an activation record is created for this call. The activation has a slot for a pointer to its parent environment. Second, the environment pointer from the closure is copied into that parent slot. Third, the actual argument value is matched against the function's argument pattern, which in this case simply binds 10 to y. Finally, the function body is executed in the environment of the activation: lookup of y yields 10, and lookup of x yields 5. The expression x + y evaluates to 15, and this value is returned.
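The steps above can be modeled directly in ML. The following is a minimal sketch, not how any real implementation represents environments: the names env, lookup, closure, and apply are hypothetical, values are restricted to integers, and (unlike Fig. 1) the captured environment does not include f itself, so recursion is not modeled.

```sml
(* Environments are chains of bindings, each pointing to its parent. *)
datatype env = Top
             | Bind of string * int * env   (* name, value, parent *)

exception Unbound of string

(* Walk outward from the innermost binding until the name is found. *)
fun lookup (Top, name) = raise Unbound name
  | lookup (Bind (n, v, parent), name) =
      if n = name then v else lookup (parent, name)

(* A closure: parameter name, body (run in an activation environment),
   and the environment captured when the function was defined. *)
datatype closure = Closure of string * (env -> int) * env

(* Application: build an activation whose parent is the captured
   environment, bind the argument, then run the body there. *)
fun apply (Closure (param, body, captured), arg) =
  body (Bind (param, arg, captured))

val env1 = Bind ("x", 5, Top)                       (* val x = 5 *)
val f = Closure ("y",
                 fn e => lookup (e, "x") + lookup (e, "y"),
                 env1)                              (* fun f y = x + y *)
val env2 = Bind ("x", 7, env1)                      (* val x = 7 *)

val topX = lookup (env2, "x")                       (* top level sees 7 *)
val result = apply (f, 10)                          (* f still sees x = 5 *)
```

Here result is 15 even though the newest top-level binding of x is 7, because the activation's parent is env1, not env2.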
Note that this is only the conceptual picture of what's going on. An optimized implementation might allocate activation records on a stack (perhaps the same stack as the top-level environment), and it might copy the captured bindings into the closure instead of keeping a pointer to the original environment.
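To illustrate the second optimization, here is a hand-written sketch of a closure that carries a copy of its captured binding rather than a pointer to an environment. The names fCode, fClosure, and applyClosure are hypothetical, and a real compiler would perform this transformation internally rather than in source code.

```sml
val x = 5

(* The code of f, with its free variable x passed in explicitly. *)
fun fCode ({x = capturedX}, y) = capturedX + y

(* The closure pairs the code with a record holding a copy of x. *)
val fClosure = (fCode, {x = x})

val x = 7   (* shadows x for later declarations; the copy is unaffected *)

(* Applying a closure hands the captured record back to its code. *)
fun applyClosure ((code, captured), arg) = code (captured, arg)

val result = applyClosure (fClosure, 10)
```

Because bindings in ML are immutable, copying the value of x at definition time gives the same answer (15) as following a pointer to the original environment.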
The scoping rule used in ML is called lexical scoping, because names refer to their nearest preceding lexical (textual) definition.
The opposite scheme is called dynamic scoping. Under dynamic scoping, names from outer scopes are re-evaluated using the most recently executed definition, not the one present when the code was originally written. Under dynamic scoping, the above transcript would return 17 for the value of f 10.
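Standard ML has no dynamically scoped variables, so the behavior can only be imitated. The sketch below uses a ref cell to stand in for "the most recently executed definition of x": f reads the cell each time it is called instead of capturing a value when it is defined. This is an emulation for illustration, not true dynamic scoping, and the names resultOld and resultNew are hypothetical.

```sml
val x = ref 5

(* f consults the current contents of x at call time. *)
fun f y = !x + y

val resultOld = f 10    (* x currently holds 5 *)

val () = x := 7         (* plays the role of the later "val x = 7" *)

val resultNew = f 10    (* the same call now sees 7 *)
```

The first call yields 15 and the second yields 17, which is the answer the transcript would give under dynamic scoping.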
All sensible languages use lexical scoping. Dynamic scoping is mostly of historical interest: early implementations of Lisp used dynamic scoping, which John McCarthy (the inventor of Lisp) simply considered a bug. In languages that use dynamic scoping, functions are difficult to use and do not serve as well-behaved abstractions: it is possible to make a function misbehave in ways that its writer never anticipated, simply by accidentally redefining some name that the function uses.
So far, we have examined four contexts where name bindings may take place:
case expressions: bindings are visible only inside the body of the rule, and may be shadowed by nested binding constructs inside the body.

These all follow rules similar to function arguments. For example, when one rule of a case expression is evaluated, a fresh environment is created with a parent pointer to the textually enclosing scope.
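A small example of the case rule, with hypothetical names (describe, r1, r2, r3) chosen only for illustration:

```sml
val x = 100

fun describe xs =
  case xs of
    nil    => x        (* no pattern binding here: the top-level x *)
  | x :: _ => x + 1    (* x bound by the pattern, in this rule only *)

val r1 = describe []          (* uses the outer x *)
val r2 = describe [1, 2, 3]   (* uses the head of the list *)
val r3 = x                    (* the outer x is unchanged *)
```

Here r1 is 100, r2 is 2, and r3 is 100: the pattern variable x shadows the outer x only inside the body of the second rule.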
You may find it useful to figure out the scope of various names in the following code.
    val a = 1;
    val b = 3;
    val f = fn x => x + a + b;
    val a = 2;
    fun g (foo, bar) =
      let
        val x = f a;
        val y = fn foo => foo + x + let val n = 4 in n * n end;
        val x = y bar;
      in
        case foo of
          nil => 0 - x
        | x::_ => x
      end;
Let-expressions are actually roughly equivalent to function applications --- witness the following equivalence:
    - let val x = 5 in x + x end;
    val it = 10 : int
    - (fn x => x + x) 5;
    val it = 10 : int
In both cases, we bind a value to a name, then evaluate an expression in the environment produced by that name binding.
Let-expressions with more than one binding can be rewritten as a sequence of let-expressions, which in turn can be rewritten as a sequence of function applications:
    - let val x = 5 val y = 7 in x + y end;
    val it = 12 : int
    - let val x = 5 in let val y = 7 in x + y end end;
    val it = 12 : int
    - (fn x => (fn y => x + y ) 7) 5;
    val it = 12 : int
Thought exercise: Why can't a let-expression with more than one binding simply be translated into a function taking a tuple of values? Hint: consider the let-expression
let val x = 5; val y = x + 1 in x + y end;