It's time to return to function definitions. So far we've only considered very
simple functions (such as the reciprocal function) whose value is computed more or less
directly using the primitives of the language. You may well be wondering at this
stage how to define functions that require some form of iteration to compute. In
familiar imperative languages iteration is accomplished using while
and for
loops; in ML it is accomplished using recursion.
Informally, a function defined by recursion is one that computes the result of a call
by "calling itself". To accomplish this, the function must be given a name
by which it can refer to itself. This is achieved using a recursive value
binding. Recursive value bindings have almost the same form as ordinary,
non-recursive value bindings, except that the binding is qualified with the
adjective "rec", written val rec pat = exp.
Here's an example:
val rec factorial : int->int = fn 0 => 1 | n:int => n * factorial (n-1)
This is a recursive definition of the factorial function, which is ordinarily defined in textbooks by the recursion equations
0! = 1
n! = n*(n-1)! (n>0)
Using fun notation we may write the definition of factorial much more
clearly and concisely as follows:
fun factorial 0 = 1
| factorial (n:int) = n * factorial (n-1)
There is clearly a close correspondence between the ML notation and the mathematical notation for the factorial function.
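As a quick check (a usage sketch, not part of the text above), we may apply factorial at the top level:

```sml
fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)

val six = factorial 3      (* evaluates to 6, since 3! = 6 *)
val many = factorial 10    (* evaluates to 3628800 *)
```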
How are recursive value bindings type-checked? The answer may appear, at first
reading, to be paradoxical: assume that the function has the type specified, then
check that the definition is consistent with this assumption. In the case
of factorial we assume that factorial has type int->int, then check that its definition
fn 0 => 1 | n:int => n * factorial (n-1)
has type int->int. To do so we must check that each pattern has type int,
and that each corresponding expression has type int.
This is clearly true for the first clause of the definition. For the second,
we assume that n has type int, then check that n * factorial (n-1)
has type int. This is so because of the rules
for the primitive arithmetic operations and because of our assumption that factorial
has type int->int. (Be certain that you understand this reasoning!
It is essential for what follows.)
How are applications of recursive value bindings evaluated? The rules are almost
the same as before. We need only observe that the binding for the function may have
to be retrieved many times during evaluation (once for each recursive call). For
example, to evaluate factorial 3, we retrieve the definition of factorial,
then pattern match the argument against the pattern of each clause. Clearly 3
does not match 0, but it does match n:int, binding n to 3 in the process.
We then evaluate n * factorial (n-1) relative to this binding for n.
To do so we retrieve the binding for factorial a second time, and apply it
to 2. Once again we consider each clause in turn, failing to match 0, but
succeeding in matching n:int. This introduces a new binding for n that
shadows the previous binding so that n now evaluates to 2. We then proceed
once again to evaluate n * factorial (n-1), this time with n bound to 2.
Once again we retrieve the binding for factorial, then bind n to 1,
shadowing the two previous bindings, then evaluate n * factorial (n-1)
with this binding for n. We retrieve the binding for factorial one last
time, then apply it to 0. This time we match the pattern 0 and yield 1.
We then (in four steps) compute the result, 6, by completing the pending
multiplications.
The factorial function illustrates an important point about recursive
function definitions. Notice that the recursive call in the definition of
factorial occurs as the argument of a multiplication. This means that in
order for the multiplication to complete, we must first complete the
calculation of the recursive call to factorial. In rough outline the
computation of factorial 3 proceeds as follows:
factorial 3
3 * factorial 2
3 * 2 * factorial 1
3 * 2 * 1 * factorial 0
3 * 2 * 1 * 1
3 * 2 * 1
3 * 2
6
(The chains of multiplications are implicitly right-associated.) Notice that the size of the expression first grows (in direct proportion to the argument), then shrinks as the pending multiplications are completed. This growth in expression size corresponds directly to a growth in run-time storage required to record the state of the pending computation.
The foregoing definition of factorial should be contrasted with the
following definition:
fun fact_helper (0,r:int) = r
| fact_helper (n:int,r:int) = fact_helper (n-1,n*r)
fun factorial n:int = fact_helper (n, 1)
We define factorial using a helper function fact_helper that takes an
additional parameter, called an accumulator, that records the running partial
result of the computation. This corresponds to reducing the prefix of the
pending computations in the trace given above by "left-associating" the
multiplications. (In fact the technique is only applicable to associative
binary operations for precisely this reason.)
The important thing to observe about fact_helper is that it is tail
recursive, meaning that the recursive call is the last step of evaluation of an
application of it to an argument. The following evaluation trace illustrates the
point:
factorial 3
fact_helper (3, 1)
fact_helper (2, 3)
fact_helper (1, 6)
fact_helper (0, 6)
6
Notice that there is no growth in the size of the expression because there are no pending computations to be resumed upon completion of the recursive call. Consequently, there is no growth in the space required for an application, in contrast to the first definition given above. In this sense tail recursive definitions are analogous to loops in imperative languages: they merely iterate a computation, and do not require allocation of storage during execution. For this reason tail recursive procedures are sometimes called iterative.
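As a further illustration of the accumulator technique (a sketch not appearing in the text, using hypothetical functions sum and sum_helper), summation of the first n integers may be written tail-recursively in the same style, since addition, like multiplication, is associative:

```sml
(* sum_helper accumulates the running total in r;
   the recursive call is the last step, so it is tail recursive *)
fun sum_helper (0, r:int) = r
  | sum_helper (n:int, r:int) = sum_helper (n-1, n+r)

fun sum (n:int) = sum_helper (n, 0)

val fifty_five = sum 10   (* evaluates to 55 *)
```

As with fact_helper, the expression never grows during evaluation: each call simply replaces the previous one.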
Time and space usage are important, but what is more important is that the function compute the intended result. The key to the correctness of a recursive function is an inductive argument. The critical ingredients are these: a specification of the intended behavior of the function, and a proof by induction on the argument that the definition satisfies that specification.
These ideas may be illustrated by considering the first definition of
factorial given above. A reasonable specification for factorial is as follows:
if n>=0 then factorial n evaluates to n!
Notice that the specification expresses the assumption that the argument, n,
is non-negative, and asserts that the application of factorial to n
terminates with the expected answer.
To check that factorial satisfies this specification, we apply the principle
of mathematical induction on the argument n. Recall that this means we are to
establish the specification for the case n=0, and, assuming it to hold for n>=0,
show that it holds for n+1. The base case, n=0, is trivial: by
definition factorial n evaluates to 1, which is 0!.
Now suppose that n=m+1 for some m>=0. By the inductive
hypothesis we have that factorial m evaluates to m!, and so
by the definition factorial n evaluates to the value of n*m! =
(m+1)*m! = (m+1)! = n!, as required. This completes the proof.
That was easy. What about the second definition of factorial?
We focus on the behavior of fact_helper. A suitable specification is
if n>=0 then fact_helper (n,r) evaluates to n!*r
Once again we proceed by mathematical induction on n; you can easily check
that fact_helper satisfies this specification. It follows that the
second definition of factorial in terms of fact_helper satisfies
the specification of factorial given above, since we may take r=1 in
the specification of fact_helper.
As a matter of programming style, it is usual to conceal the definitions of
helper functions using a local declaration. In practice we would make the
following definition of the iterative version of factorial:
local
fun fact_helper (0,r:int) = r
| fact_helper (n:int,r:int) = fact_helper (n-1,n*r)
in
fun factorial (n:int) = fact_helper (n,1)
end
This way the helper function is not visible; only the function of interest is "exported" by the declaration.
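To see the effect of the local declaration (an illustrative sketch, not from the text above), only factorial is in scope after the declaration:

```sml
local
  fun fact_helper (0, r:int) = r
    | fact_helper (n:int, r:int) = fact_helper (n-1, n*r)
in
  fun factorial (n:int) = fact_helper (n, 1)
end

val six = factorial 3   (* evaluates to 6 *)
(* an attempt to write fact_helper (3, 1) here would be rejected as an
   unbound variable, since the local declaration conceals fact_helper *)
```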
Here's an example of a function defined by complete induction, the Fibonacci function, defined on integers n>=0:
(* for n>=0, fib n evaluates to the nth Fibonacci number *)
fun fib 0 = 1
| fib 1 = 1
| fib (n:int) = fib (n-1) + fib (n-2)
The recursive calls are made not only on n-1, but also on n-2,
which is why we must appeal to complete induction to justify the definition. This
definition of fib is very inefficient because it performs many redundant
computations: to compute fib n requires that we compute fib (n-1)
and fib (n-2). To compute fib (n-1) requires that we
compute fib (n-2) a second time, and fib (n-3).
Computing fib (n-2) requires computing fib (n-3) again,
and fib (n-4). As you can see, there is considerable redundancy
here. It can be shown that the running time of fib is exponential
in its argument, which is clearly awful for such a simple function.
Here's a better solution: for each n>=0 compute not only the nth Fibonacci number, but also the (n-1)st as well. (For n=0 we define the "-1"st Fibonacci number to be zero). That way we can avoid redundant recomputation, resulting in a linear-time algorithm. Here's the code:
(* for n>=0, fibb n evaluates to (a, b), where a is the nth Fibonacci number and b is the (n-1)st *)
fun fibb 0 = (1, 0)
| fibb 1 = (1, 1)
| fibb (n:int) =
let
val (a:int, b:int) = fibb (n-1)
in
(a+b, a)
end
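The pair-returning definition can also be combined with the accumulator technique from the factorial example, giving a tail-recursive variant (a sketch with hypothetical names fib_helper and fibt, not part of the text above):

```sml
(* invariant: a is the current Fibonacci number, b is the previous one;
   counting n down to 0 shifts the pair (a, b) along the sequence *)
fun fib_helper (0, a:int, b:int) = a
  | fib_helper (n:int, a:int, b:int) = fib_helper (n-1, a+b, a)

fun fibt (n:int) = fib_helper (n, 1, 0)

val f5 = fibt 5   (* evaluates to 8, the 5th Fibonacci number
                     on the indexing used in this chapter *)
```

Like fact_helper, this version runs in constant space as well as linear time, since the recursive call is the last step of evaluation.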
You might feel satisfied with this solution since it runs in time linear in n. But in fact there's a constant-time algorithm to compute the nth Fibonacci number! In other words the recurrence
F(0) = 1
F(1) = 1
F(n) = F(n-1) + F(n-2)
has a closed-form solution. (See Graham, Knuth, and Patashnik's Concrete Mathematics (Addison-Wesley 1989) for a derivation.) However, this is an unusual case. In most instances recursively-defined functions have no known closed-form solution, so that some form of iteration is inevitable.
It is often useful to define two functions simultaneously, each of which calls itself and/or the other to compute its result. Such functions are said to be mutually recursive. Here's a simple example to illustrate the point, namely testing whether a natural number is odd or even. The most obvious approach is to test whether the number is congruent to 0 mod 2, and indeed this is what one would do in practice. But to illustrate the idea of mutual recursion we instead use the following inductive characterization: 0 is even, and not odd; n>0 is even iff n-1 is odd; n>0 is odd iff n-1 is even. This may be coded up using two mutually-recursive procedures as follows:
fun even 0 = true
| even n = odd (n-1)
and odd 0 = false
| odd n = even (n-1)
Notice that even calls odd and odd calls even,
so they are not definable separately from one another. We join their definitions
using the keyword and to indicate that they are defined simultaneously by
mutual recursion. Later in these notes we will see more compelling examples of
mutually-recursive functions.
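As a quick check (an illustrative usage sketch, not from the text above), the definitions behave as expected:

```sml
fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)

val t = even 4   (* true: even 4 -> odd 3 -> even 2 -> odd 1 -> even 0 *)
val f = odd 6    (* false: the chain bottoms out at odd 0 *)
```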