Evaluation of an expression may terminate with a value and may along the way engender an effect upon its environment. Our first example of an effect was the possibility of raising an exception, which we explored in detail in the preceding chapter. The next important example of an effect is a storage effect, the allocation or mutation of storage during evaluation. The introduction of storage effects has profound consequences, not all of which are desirable. Indeed, storage effects are sometimes denigrated by referring to them as side effects, by analogy with the unintended effects of some medications. While it is surely excessive to dismiss storage effects as completely undesirable, it is advantageous to minimize the use of storage effects except where clearly appropriate to the task. We will explore some of the basic techniques for using storage effects later in this chapter, but first we introduce the mechanisms for supporting mutable storage in ML.
To support mutable storage the execution model of programs is modified to include an implicit memory consisting of a finite set of mutable cells containing data items of a fixed type. A mutable cell may be thought of as a kind of container in which a data value is stored. During the course of evaluation the content of a cell may be retrieved or may be replaced by any other value of the same type. Mutation introduces a strongly temporal aspect to evaluation: we speak of the current contents of a cell as the value most recently assigned to it. This is to be contrasted with the bindings of values to variables, which never change once made and hence have a permanent quality; the binding of a variable is a uniquely-determined value that does not change during evaluation. Since cells are used by issuing "commands" to modify and retrieve their contents, programming with cells is sometimes called imperative programming.
Since cells may have their contents changed during evaluation it is imperative that we take careful account of the identity of cells. When are two cells the same? When are they different? The guiding principle is that two cells (of the same type) are distinct if there is a program that can tell them apart; otherwise they are equal. How can we tell cells apart? By doing the only things we can ever do with cells: retrieve their contents or set their contents to specified values. Given two integer cells, we can determine whether they are the same cell or not by first checking if they have distinct contents. If so, then they are distinct cells. If not, we must distinguish between two "copies" of a single cell, or two cells that happen to have the same content. To do this, bind the current contents of one cell to a variable, and set that cell's value to an integer different from the saved contents. If the other cell's value is now the newly-assigned value, then the two cells are the same, otherwise they are different.
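For instance, the discrimination test just described might be programmed as follows for integer cells (the function same_cell and its details are ours, a sketch rather than a library operation):
fun same_cell (r : int ref, s : int ref) : bool =
    if !r <> !s then
        false                           (* different contents, hence different cells *)
    else
        let
            val saved = !r              (* save the current contents *)
            val probe = saved + 1       (* any value different from saved *)
            val _ = r := probe          (* mutate one cell ... *)
            val answer = (!s = probe)   (* ... and observe whether the other changed *)
            val _ = r := saved          (* restore the original contents *)
        in
            answer
        end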
This principle of equality is called identity of indiscernibles: two things are equal if we cannot tell them apart. The test we just outlined extends to cells of other types, but is a rather roundabout way to test for cell identity. In practice we work with a slightly conservative approximation to cell identity, called reference (or pointer) equality --- two cells are equal iff they occupy the same address in memory. This test is conservative in that it may distinguish two cells that are in fact indiscernible: any two unit-valued cells are indiscernible because there is only one value of unit type, yet pointer equality would distinguish them. Accepting this minor anomaly, we use pointer equality to determine cell identity, relying on the representation of cells as references to memory. For this reason mutable cells in ML are called reference cells, or references.
Reference cells containing values of type typ are themselves values of type typ ref. They are "first-class" values in the sense that reference cells may be passed as arguments, returned as results, and even stored in other reference cells. Reference cells are created, or allocated, by the function ref of type typ -> typ ref. When applied to a value val of type typ, ref allocates a new cell, initializes its content to val, and returns a reference to the cell. By a "new cell" we mean a cell that is distinct from all other cells previously allocated; it does not share storage with any of them. The content of a cell of type typ is retrieved using the function ! of type typ ref -> typ. Applying ! to a (reference to a) cell returns the current content of that cell. The content of a cell is modified by the operation op := of type typ ref * typ -> unit; it is written using infix syntax with the reference cell as left-hand argument and the new contents as right-hand argument. When applied to a cell and a value, it replaces the content of that cell with that value, and yields the null-tuple as result. Cells may be compared for equality using the equality operation, =, which has type typ ref * typ ref -> bool.
Here are some examples:
val r = ref 0
val s = ref 0
val a = r=s
val _ = r := 3
val x = !s + !r
val t = r
val b = s=t
val c = r=t
val _ = t := 5
val y = !s + !r
val z = !t + !r
Afterwards, a is bound to false, b to false, c to true, x to 3, y to 5, and z to 10. Be sure you understand exactly why in each case!
The above examples illustrate the problem of aliasing. The variables t and r are both bound to the same cell, whereas s is bound to a different cell. We say that t and r are aliases for the same cell because the one cell is known by two different names. Aliasing is a serious source of bugs in programs since assigning a value to one destroys the contents of the other. Avoiding these kinds of problems requires careful reasoning about the possibility of two variables being bound to the same reference cell. A classic example is a program to "rotate" the contents of three cells: given reference cells a, b, and c, with initial contents x, y, and z, set their contents to y, z, and x, respectively. Here's a candidate implementation:
fun rot3 (a, b, c) =
    let
        val t = !a
    in
        a := !b; b := !c; c := t
    end
This code works fine if a, b, and c are distinct reference cells. But suppose that a and c are the same cell, so that x and z are the same value. The assignment to a destroys the content of c before it is read, and so b is set to y rather than to z as intended; afterwards the contents of a, b, and c are x, y, and x. A correct implementation must work even in the presence of aliasing. Here's a solution that works correctly in all cases:
fun rot3 (a, b, c) =
    let
        val (x, y, z) = (!a, !b, !c)
    in
        a := y; b := z; c := x
    end
Notice that we use immutable variables to temporarily hold the initial contents of the cells while their values are being updated.
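For instance, here is a small check of the aliasing case just discussed, using the corrected rot3 (the bindings are ours, purely for illustration):
val p = ref 1                (* x = 1 *)
val q = ref 2                (* y = 2 *)
val () = rot3 (p, q, p)      (* a and c are aliases for the same cell *)
val contents = (!p, !q)      (* (1, 1): b receives z, the original content of c *)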
The rot3 example also illustrates the use of the semicolon to sequence evaluation of expressions purely for their effect. The expression exp1 ; exp2 is shorthand for

let val _ = exp1 in exp2 end

The expression exp1 is evaluated only for its effect; its return value is thrown away by the wildcard binding. The value of the entire expression is the value of exp2 after evaluation of exp1 for effect. The cumulative effect of the sequential composition is the effect of evaluating exp1 followed by the effect of evaluating exp2.
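For example, the following binding (ours, purely for illustration) evaluates a print expression for its effect and then yields a value:
val five = (print "computing\n"; 2 + 3)    (* prints the message, then binds five to 5 *)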
It is a common mistake to omit the exclamation point when referring to the content of a reference, especially when that cell is bound to a variable. In more familiar languages such as C or Pascal all variables are implicitly bound to reference cells, and they are implicitly de-referenced whenever they are used, so that a variable always stands for its current contents. This is both a boon and a bane. It is obviously helpful in many common cases since it alleviates the burden of having to explicitly dereference variables whenever their content is required. However, it shifts the burden to the programmer in the case that the address, and not the content, is intended. In C one writes &x for the address of (the cell bound to) x; in Pascal one must use reference parameters to achieve a similar effect. Which is preferable is largely a matter of taste. The burden of explicit de-referencing is not nearly so onerous in ML as it might be in other languages, simply because reference cells are used only infrequently in ML programs, whereas they are the sole means of binding variables in more familiar languages.
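For instance, with a cell bound as follows (the bindings are ours), using the variable itself where its content is intended is a type error:
val r = ref 0
val ok = !r + 1        (* evaluates to 1 *)
(* val bad = r + 1 *)  (* rejected: r has type int ref, not int *)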
An alternative to explicitly de-referencing cells is to use ref patterns. A pattern of the form ref pat matches a reference cell whose content matches the pattern pat. This means that the cell's content is implicitly retrieved during pattern matching, and may be subsequently used without explicit de-referencing. For example, the second implementation of rot3 above might be written using ref patterns as follows:
fun rot3 (a, b, c) =
    let
        val (ref x, ref y, ref z) = (a, b, c)
    in
        a := y; b := z; c := x
    end
In practice it is common to use both explicit de-referencing and ref patterns, depending on the situation.
Using references it is possible to mimic the style of programming used in imperative languages such as C or C++ or Java. For example, we might define the factorial function as follows:
fun imperative_fact (n:int) =
    let
        val result = ref 1
        val i = ref 0
        fun loop () =
            if !i = n then
                ()
            else
                (i := !i + 1; result := !result * !i; loop ())
    in
        loop (); !result
    end
Notice that the function loop is essentially just a while loop; it repeatedly executes its body until the content of the cell bound to i reaches n. The tail call to loop is essentially just a goto statement, since its argument is always the null-tuple.
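For comparison, here is a sketch of the same computation written with ML's built-in while expression (the name imperative_fact' is ours):
fun imperative_fact' (n : int) =
    let
        val result = ref 1
        val i = ref 0
    in
        while !i <> n do (i := !i + 1; result := !result * !i);
        !result
    end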
It is bad style to program in this fashion. The purpose of the function imperative_fact is to compute a simple function on the natural numbers. There is nothing about its definition that suggests that state must be maintained, and so it is senseless to allocate and modify storage to compute it. The definition we gave earlier is shorter, simpler, more efficient, and hence more suitable to the task. This is not to suggest, however, that there are no good uses of references. We will now discuss some important uses of state in ML.
The first example is the use of higher-order functions to manage shared private state. This programming style is closely related to the use of objects to manage state in object-oriented programming languages. Here's an example to frame the discussion:
local
    val counter = ref 0
in
    fun tick () = (counter := !counter + 1; !counter)
    fun reset () = (counter := 0)
end
This declaration introduces two functions, tick of type unit -> int and reset of type unit -> unit. Their definitions share a private variable counter that is bound to a mutable cell containing the current value of a shared counter. The tick operation increments the counter and returns its new value, and the reset operation resets its value to zero. The types of the operations suggest that implicit state is involved. In the absence of exceptions and implicit state, there is only one useful function of type unit -> unit, namely the function that always returns its argument (and it's debatable whether this is really useful!).
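For example, a sequence of calls behaves as follows (the comments show the values returned):
tick ();   (* 1 *)
tick ();   (* 2 *)
reset ();
tick ();   (* 1 *)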
The declaration above defines two functions, tick and reset, that share a single private counter. Suppose now that we wish to have several different instances of a counter --- different pairs of functions tick and reset that share different state. We can achieve this by defining a counter generator (or constructor) as follows:
fun new_counter () =
    let
        val counter = ref 0
        fun tick () = (counter := !counter + 1; !counter)
        fun reset () = (counter := 0)
    in
        { tick = tick, reset = reset }
    end
The type of new_counter is unit -> { tick : unit->int, reset : unit->unit }. We've packaged the two operations into a record containing two functions that share private state. There is an obvious analogy with class-based object-oriented programming. The function new_counter may be thought of as a constructor for a class of counter objects. Each object has a private instance variable counter that is shared between the methods tick and reset of the object, represented as a record with two fields.
Here's how we use counters.
val c1 = new_counter ()
val c2 = new_counter ()
#tick c1 (); (* 1 *)
#tick c1 (); (* 2 *)
#tick c2 (); (* 1 *)
#reset c1 ();
#tick c1 (); (* 1 *)
#tick c2 (); (* 2 *)
Notice that c1 and c2 are distinct counters that increment and reset independently of one another.
A second important use of references is to build mutable data structures. The data structures (such as lists and trees) we've considered so far are immutable in the sense that it is impossible to change the structure of the list or tree without building a modified copy of that structure. This is both a benefit and a drawback. The principal benefit is that immutable data structures are persistent in that operations performed on them do not destroy the original structure --- in ML we can eat our cake and have it too. For example, we can simultaneously maintain a dictionary both before and after insertion of a given word. The principal drawback is that if we aren't really relying on persistence, then it is wasteful to make a copy of a structure if the original is going to be discarded anyway. What we'd like in this case is to have an "update in place" operation to build an ephemeral (opposite of persistent) data structure. To do this in ML we make use of references.
A simple example is the type of possibly circular lists, or pcls. Informally, a pcl is a finite graph in which every node has at most one neighbor, called its predecessor, in the graph. In contrast to ordinary lists the predecessor relation is not necessarily well-founded: there may be an infinite sequence of nodes arranged in descending order of predecession. Since the graph is finite, this can only happen if there is a cycle in the graph: some node has an ancestor as predecessor. How can such a structure ever come into existence? If a node's predecessor must exist before the node itself can be constructed, then the ancestor that is to serve as predecessor in the cyclic case can never be created! The "trick" is to employ backpatching: the predecessor is initialized to Nil, so that the node and its ancestors can be constructed, then it is reset to the appropriate ancestor to create the cycle. This can be achieved in ML using the following datatype declaration:
datatype 'a pcl = Nil | Cons of 'a * 'a pcl ref
The "tail" of a Cons
node is a reference cell so that we may
assign to it to implement backpatching. Here's an example:
fun hd (Cons (h, _)) = h (* auxiliary functions *)
fun tl (Cons (_, t)) = t
val ones = Cons (1, ref Nil) (* create a preliminary acyclic structure *)
val _ = (tl ones) := ones (* backpatch to form the cycle *)
Initially the variable ones is bound to the acyclic pcl with one node whose head element is 1. We then assign that very node to the predecessor (tail) of that node, resulting in a circular pcl with one node. Observe that hd ones, hd (!(tl ones)), hd (!(tl (!(tl ones)))), and so on all evaluate to 1. Notice that we must explicitly refer to the contents of the tail of each node since it is a reference cell!
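For instance, a hypothetical helper that walks n tails makes the explicit dereferencing apparent:
fun nth (p, 0) = hd p
  | nth (p, n) = nth (!(tl p), n - 1)

val elem = nth (ones, 100)    (* 1: every node on the cycle carries 1 *)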
Let us define the length of a pcl to be the number of distinct nodes occurring in it. An interesting exercise is to define a length function for pcls that makes no use of auxiliary storage (i.e., no list of previously-encountered nodes) and runs in time proportional to the number of cells in the pcl. Hint: think of the fable of the tortoise and the hare. If they run a long race on an oval track, what is sure to happen, and when? Does this suggest an algorithm?
In addition to reference cells, ML also provides mutable arrays as a primitive data structure. The type typ array is the type of arrays carrying values of type typ. The basic operations on arrays are these:
array : int * 'a -> 'a array            create an array of the given size, with each element initialized to the given value
length : 'a array -> int                return the number of elements in a given array
sub : 'a array * int -> 'a              retrieve an element; raises Subscript if an out-of-bounds access is attempted
update : 'a array * int * 'a -> unit    change the contents of a given array element; raises Subscript for an out-of-bounds access
These are just the basic operations on arrays; consult the Basis Library document for further details. Immutable arrays are also available. The type 'a vector is similar to the type 'a array, except that vectors are immutable, whereas arrays are mutable.
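For example, the following bindings (ours, purely for illustration) exercise these operations:
val arr = Array.array (3, 0)         (* a three-element array, each element 0 *)
val () = Array.update (arr, 1, 7)    (* element 1 is now 7 *)
val x = Array.sub (arr, 1)           (* 7 *)
val n = Array.length arr             (* 3 *)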
One simple use of arrays is for memoization. Here's a function to compute the nth Catalan number, which may be thought of as the number of distinct ways to parenthesize an arithmetic expression consisting of a product of n factors. It makes use of an auxiliary summation function that you can easily define for yourself. (Applying sum to f and n computes the sum f 1 + ... + f n.)
fun C 1 = 1
| C n = sum (fn k => (C k) * (C (n-k))) (n-1)
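One possible definition of sum is the following (a sketch; any definition satisfying the stated specification will do):
fun sum f 0 = 0
  | sum f n = f n + sum f (n - 1)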
This definition of C is hugely inefficient because a given computation may be repeated exponentially many times. For example, to compute C 10 we must compute C 1, C 2, ..., C 9, and the computation of C i engenders the computation of C 1, ..., C (i-1) for each 1<=i<=9. We can do better by caching previously-computed results in an array, leading to an enormous improvement in execution speed. Here's the code:
local
    val limit : int = 100
    val memopad : int option array = Array.array (limit, NONE)
in
    fun C' 1 = 1
      | C' n = sum (fn k => (C k) * (C (n-k))) (n-1)
    and C n =
        if n < limit then
            case Array.sub (memopad, n)
              of SOME r => r
               | NONE =>
                    let
                        val r = C' n
                    in
                        Array.update (memopad, n, SOME r);
                        r
                    end
        else
            C' n
end
Note carefully the structure of the solution. The function C is a memoized version of the Catalan number function. When called it consults the memopad to determine whether or not the required result has already been computed. If so, the answer is simply retrieved from the memopad, otherwise the result is computed, stored in the cache, and returned. The function C' looks superficially similar to the earlier definition of C, with the important difference that the recursive calls are to C, rather than C' itself. This ensures that sub-computations are properly cached and that the cache is consulted whenever possible.
The main weakness of this solution is that we must fix an upper bound on the size of the cache. This can be alleviated by implementing a more sophisticated cache management scheme that dynamically adjusts the size of the cache based on the calls made to it.
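One such scheme (a sketch under our own assumptions, not part of the code above; it replaces the fixed-size declaration) keeps the memopad in a reference to an array and grows the array, at least doubling its size, whenever an index falls out of bounds:
local
    val memopad : int option array ref = ref (Array.array (16, NONE))

    (* Grow the memopad so that index n is in bounds, copying the old entries. *)
    fun ensure n =
        let
            val a = !memopad
            val len = Array.length a
        in
            if n < len then ()
            else
                let
                    val b = Array.array (Int.max (n + 1, 2 * len), NONE)
                    fun copy i =
                        if i < len
                        then (Array.update (b, i, Array.sub (a, i)); copy (i + 1))
                        else ()
                in
                    copy 0;
                    memopad := b
                end
        end
in
    fun C' 1 = 1
      | C' n = sum (fn k => (C k) * (C (n - k))) (n - 1)
    and C n =
        (ensure n;
         case Array.sub (!memopad, n) of
             SOME r => r
           | NONE => let val r = C' n in Array.update (!memopad, n, SOME r); r end)
end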