We have already noted that aggregate data structures are especially easy to handle in
ML. Our first examples were tuple and record types. The list types
provide another example of an aggregate data structure in ML. Informally, the values
of type typ list
are the finite lists of values of type typ.
But what is a list? The values of type typ list
are
defined as follows:
nil
is a value of type typ list
.list
, then h::
t is a value of type typ
list
.list
.The type expression typ list
is a postfix notation for the
application of the type constructor list
to the argument typ.
Thus list
is a kind of "function" mapping types to types:
given a type typ, we may apply list
to it to get another type,
written typ list
. The forms nil
and ::
are the value constructors of type typ list
. The
nullary (no argument) constructor nil
may be thought of as the empty list.
The binary (two argument) constructor ::
constructs a non-empty list
from a value h of type typ and another value t of type typ list
;
the resulting value, h::
t, of type typ list
is pronounced "h cons t" (for historical reasons). We
say that "h is cons'd onto t", that h is the
"head" of the list, and that t is its "tail".
The definition of the values of type typ list
given above
is an example of an inductive definition. The type is said
to be recursive because this definition is "self-referential" in the
sense that the values of type typ list
are defined in terms of
(other) values of the same type. This is especially clear if we examine the types of
the value constructors for the type typ list
:
nil
:typ
list
:
op ::typ
*
typlist ->
typlist
(The notation op ::
is used to refer to the "cons"
operator as a function, rather than to use it to form a list, which requires
infix notation.) Two things are notable here:
list
,
and yields a result of type typ list
. This reflects the
"recursive" nature of the type typ list
.nil
is the empty list of type typ list
for any element type typ, and op ::
constructs a non-empty list
independently of the type of the elements of that list.A consequence of the inductive definition of the list type is that values of type typ
list
have the form
h1
::(
h2::
...::(
hn::
nil)
...)
for some n>=0. (When n is zero, this is, by convention, the
empty list, nil
.) The operator ::
is right-associative,
so we may omit the parentheses and just write
h1
::
h2::
...::
hn::
nil
.
As a further convenience this list may be abbreviated using list notation:
[
h1,
h2,
...,
hn]
This notation emphasizes the interpretation of lists as finite sequences of values, but
it obscures the fundamental inductive character of lists as being built up from nil
using the ::
operation.
How do we compute with values of list type? Since the values are defined
inductively, it is natural that functions on lists be defined recursively, using a clausal
definition that analyzes the structure of a list. Here's a definition of the
function length
that computes the number of elements of a list:
fun length nil = 0
| length (_::t) = 1 + length t
The definition is given by induction on the structure of the list argument. The
base case is the empty list, nil
. The inductive step is the non-empty
list _::t
(notice that we do not need to give a name to the head). Its
definition is given in terms of the tail of the list t
, which is
"smaller" than the list _::t
. The type of length
is 'a list -> int
; it is defined for lists of values of any type
whatsoever.
We may define other functions following a similar pattern. Here's the function to append two lists:
fun append (nil, l) = l
| append (h::t, l) = h :: append (t, l)
This function is built into ML; it is written using infix notation as exp1
@
exp2. The running time of append
is proportional to the length of the first list, as should be obvious from its definition.
Here's a function to reverse a list.
fun rev nil = nil
| rev (h::t) = rev t @ [h]
It is not tail recursive. In fact, its time complexity is O(n2), where n is the length of the argument list. This can be demonstrated by writing down a recurrence that defines the running time T(n) on a list of length n.
T(0) = O(1)
T(n+1) = T(n) + O(n)
Solving the recurrence we obtain the result T(n)=O(n2).
Can we do better? Oddly, we can take advantage of the non-associativity
of ::
to give a tail-recursive definition of rev
.
local
fun rev_helper (nil, a) = a
| rev_helper (h::t, a) = rev_helper (t, h::a)
in
fun rev l = rev_helper (l, nil)
end
The pattern is the same as before, except that by re-associating the uses of ::
we reverse the list! The helper function reverses its first argument and prepends it
to its second argument. That is, rev_helper (l, a)
evaluates to (rev
l) @ a
, where we assume here an independent definition of rev
for the
sake of the specification. Notice that rev_helper
runs in time
proportional to the length of its first argument, and hence rev
runs in time
proportional to the length of the list.
The correctness of functions defined on lists is established using the principle of structural
induction. We illustrate this by establishing that the function rev_helper
satisfies the following specification:
for every l and a of type typ
list
,rev_helper (
l,
a)
evaluates to the result of appending a to the reversal of l.
The proof is by structural induction on the list l. If l is nil
,
then rev_helper (
l,
a)
evaluates
to a, which is as required. If l is h::
t,
then by the inductive hypothesis rev_helper (
l,
a)
evaluates to the result of appending h::
a to the reversal
of t, which is easily seen to be the result of appending a to the
reversal of h::
t.
The form of this argument may be summarized as follows:
nil
.::
t.It follows that the function is correct for all lists l, by the inductive definition of the list type. This is called the principle of structural induction on lists. We will soon generalize this to other inductively-defined types.