So far (with a few exceptions) we've programmed in what is known as an explicitly typed style. This means that whenever we've introduced a variable, we've assigned it a type at its point of introduction. In particular every variable in a pattern has a type associated with it. As you've no doubt noticed, this gets a little tedious after a while, especially when you're using clausal function definitions. A particularly pleasant feature of ML is that it allows you to omit this type information whenever it can be determined from context. This process is known as type inference since the compiler is inferring the missing type information based on contextual information.
For example, there is no need to give a type to the variable s
in the
function fn s:string => s ^ "\n"
. The reason is that no
other type for s
makes sense, since s
is used as an argument to
string concatenation. Consequently, you are allowed to write just fn s => s
^ "\n"
, leaving ML to insert ":string
" for you.
When is it allowable to omit this information? It is difficult to give a
precise answer without introducing quite a lot of machinery, but we can give some hints of
when you can and when you cannot omit types. A remarkable fact about ML is that the
answer is "almost always", with the exception of a few trouble spots that we now
discuss.
The main difficulty is with the arithmetic operators, which are overloaded, by
which we mean that the same syntax is used for integer and floating point arithmetic
operations. This creates a problem for type inference because it is not possible to
unambiguously reconstruct type information for a function such as fn n => n+n
because
there is no way to tell whether the addition operation is integer or floating point
addition. We could equally well regard this expression as abbreviating fn
n:int => n+n
, with type int->int
, or fn n:real => n+n
,
with type real->real
. In some cases the surrounding context
determines which is meant. For example, in the expression (fn n => n+n)(0)
the
only sensible interpretation is to regard the parameter n
to have type int
.
A related source of difficulty is the (infrequently used) "sharp"
notation for records. Absent information from the context, the type of the function fn
r => #name(r)
cannot be determined: all that is known of the type of r
is that it has a name
field; neither the type of that field nor the labels
and types of the remaining fields are determined. Therefore this function will be
rejected as ambiguous because there is not enough information to determine the domain type
of the function.
These examples illustrate situations where ambiguity leads to difficulties. But you shouldn't conclude from this that type inference must fail unless the missing type information can be uniquely determined! In many (indeed, most) cases there is no unique way to infer omitted type information, but there is nevertheless a best way. (The examples in the preceding paragraph merely serve to point out that sometimes there is no best way, either. But these are the exceptions, rather than the rule.)
The prototypical example is the identity function, fn x=>x
. The
body of the function places no constraints on the type of x
, since it merely
returns x
as the result without performing any computation on it. You
might suspect that this expression has to be rejected since its type is ambiguous, but
this would be unfortunate since the expression makes perfectly good sense for any
choice of the type of x
. This is in sharp contrast to examples such as
the function fn x=>x+1
, for which only two choices for the type
of x
are possible (namely, int
and real
), with no
way to choose between them. The choice of int
or real
affects the behavior of the function: in one case it performs an integer addition, in the
other a floating point addition. In the case of the identity function, however, we
may choose any type at all for x
, without affecting the execution behavior of
the function --- the function is said to be polymorphic because its execution
behavior is uniform in the type of x
. Therefore the identity function
has infinitely many types, one for each choice of the type of the parameter x
.
Choosing the type of x
to be typ, the type of the identity
function is typ->
typ. In other words every type
for the identity function has the form typ->
typ, where typ
is the type of the argument.
Clearly there is a pattern here, which is captured by the notion of a type scheme.
A type scheme is a type expression involving one or more type variables
standing for an unknown, but arbitrary type expression. Type variables are written 'a
("alpha"), 'b
("beta"), 'c
("gamma"), etc. An instance of a type scheme is
obtained by replacing each of the type variables occurring in it with a type scheme, with
the same type scheme replacing each occurrence of a given type variable. For
example, the type scheme 'a->'a
has instances int->int
, string->string
,
(int*int)->(int*int)
, and ('b->'b)->('b->'b)
, among
infinitely many others. It does not have the type int->string
as instance, since we are constrained to replace all occurrences of a type variable by the
same type scheme. However, the type scheme 'a->'b
has both int->int
and int->string
as instances since there are different type variables
occurring in the domain and range positions.
Type schemes are used to capture the polymorphic behavior of functions. For
example, we may write fn x:'a=>x
for the polymorphic identity function of
type 'a->'a
, meaning that the behavior of the identity function is
independent of the type of x
, an hence may be chosen arbitrarily.
Similarly, the behavior of the function fn (x,y)=>x+1
is independent of
the type of y
, but constrains the type of x
to be int
.
This may be expressed using type schemes by writing this function in the explicitly-typed
form fn (x:int,y:'a)=>x+1
with type int*'a->int
. In
each of these cases we needed only one type variable to express the polymorphic behavior
of a function, but usually we need more than one. For example, the function fn
(x,y) = x
constrains neither the type of x
nor the type of y
.
Consequently we may choose their types freely and independently of one another.
This may be expressed by writing this function in the form fn
(x:'a,y:'b)=>x
with type 'a*'b->'a
. Notice that while it
is correct to assign the type 'a*'a->'a
to this function, doing so would
be overly restrictive since the types of the two parameters need not be the same.
Notice as well that we could not assign the type 'a*'b->'c
to
this function because the type of the result must be the same as the type of the first
parameter since the function returns its first parameter when invoked. The type
scheme precisely captures the constraints that must be satisfied for the function to be
type correct. It is said to be the most general or principal type
scheme for the function.
It is a remarkable fact about ML that every expression (with the exception of
those pesky examples involving arithmetic primitives or record selection operations) has
a principal type scheme. That is, there is always (well, with very few
exceptions) a best or most general way to infer types for expressions
that maximizes generality, and hence maximizes flexibility in the use of
the expression. Every expression "seeks its own depth" in the sense that
an occurrence of that expression is assigned a type that is an instance of its principal
type scheme determined by the context of use. For example, if we write (fn
x=>x)(0)
, then the context forces the type of the identity function to be int->int
,
and if we write (fn x=>x)(fn x=>x)(0)
, the context of use selects the
instance (int->int)->(int->int)
of the principal type scheme for the
identity at the first occurrence, and the instance int->int
for the
second.
How is this achieved? Type inference is a process of constraint satisfaction. First, the expression determines a set of equations governing the relationships among the types of its subexpressions. For example, if a function is applied to an argument, then a constraint equating the domain type of the function with the type of the argument is generated. Second, the constraints are solved using a process similar to Gaussian elimination, called unification. The equations can be classified by their solution sets as follows:
The free type variables in the solution to the system of equations may be thought of as determining the "degrees of freedom" or "range of polymorphism" of the type of an expression --- the constraints are solvable for any choice of types to substitute for these free type variables.
The characterization of type inference as a constraint satisfaction procedure suggests an explanation for the notorious obscurity of type checking errors in ML. If a program is not type correct, then the system of constraints associated with it will not have a solution. The type inference procedure considers the constraints in some order, and at some point discovers an inconsistency. It is fundamentally impossible to attribute this inconsistency to any one feature of the program: all that is known is that the constraint set as a whole is unsatisfiable. The checker usually reports the first unsatisfiable equation it encounters, but this may or may not correspond to the "reason" (in the mind of the programmer) for the type error. The usual method for finding the error is to insert sufficient type information to narrow down the source of the inconsistency until the source of the difficulty is uncovered.
There is an important interaction between polymorphic expressions and value bindings
that may be illustrated by the following example. Suppose that we wish to bind the
identity function to a variable I
so that we may refer to it by name.
We've previously observed that the identity function is polymorphic, with principal type
scheme 'a->'a
. This may be captured by ascribing this type scheme to
the variable I
at the val
binding. That is, we may write
val I : 'a->'a = fn x=>x
to ascribe the type scheme 'a->'a
to the variable I
.
(We may also write
fun I(x:'a):'a = x
for an equivalent binding of I
.) Having done this, each use of I
determines a distinct instance of the ascribed type scheme 'a->'a
.
That is, both I 0
and I I 0
are well-formed expressions,
the first assigning the type int->int
to I
, the second
assigning the types (int->int)->(int->int)
and int->int
to the two occurrences of I
. Thus the variable I
behaves
precisely the same as its definition, fn x=>x
, in any expression where it
is used.
As a convenience ML also provides a form of type inference on value bindings that
eliminates the need to ascribe a type scheme to the variable when it is bound. If no
type is ascribed to a variable introduced by a val
binding, then it is
implicitly ascribed the principal type scheme of the right-hand side. For example,
we may write
val I = fn x=>x
or
fun I(x) = x
as a binding for the variable . The type checker implicitly assigns the principal
type scheme, 'a->'a
, of the binding to the variable I
.
In practice we often allow the type checker to infer the principal type of a variable, but
it is often good form to explicitly indicate the intended type as a consistency check and
for documentation purposes.
We finish this section with a technical issue that arises from time to time. As
we remarked above, the treatment of val
bindings ensures that a bound
variable has precisely the same types as its binding. In other words the type
checker behaves as though all uses of the bound variable are implicitly replaced by its
binding before type checking. Since this may involve replication of the binding, the
meaning of a program is not necessarily preserved by this transformation. (Think,
for example, of any expression that opens a window on your screen: if you replicate the
expression and evaluate it twice, it will open two windows. This is not the same as
evaluating it only once, which results in one window.) To ensure semantic
consistency, variables introduced by a val
binding are allowed to be
polymorphic only if the right-hand side is a value. (This is called the value
restriction on polymorphic declarations.) For fun
bindings this
restriction is always met since the right-hand side is implicitly a lambda, which is a
value. However, it might be thought that the following declaration introduces a
polymorphic variable of type 'a -> 'a
, but in fact it is rejected by the
compiler:
val J = I I
The reason is that the right-hand side is not a value; it requires computation to
determine its value. It is therefore ruled out as inadmissible for polymorphism; the
variable J
may not be used polymorphically in the remainder of the program.
In this case the difficulty may be avoided by writing instead
fun J x = I I x
because now the binding of J
is a lambda, which is a value. In
some rare circumstances this is not possible, and some polymorphism is lost. For
example, the following declaration of a value of list type,
val l = nil @ nil,
does not introduce an identifier with a polymorphic type, even though the almost equivalent declaration
val l = nil
does do so. Since the right-hand side is a list, we cannot apply the
"trick" of defining l
to be a function; we are stuck with a loss of
polymorphism in this case. This particular example is not very impressive since it's
hard to imagine using the former, rather than the latter, declaration in a practical
situation, but occasionally something similar does arise, with an attendant loss of
polymorphism.
Why this limitation? Later on, when we study mutable storage, we'll see that some restriction on polymorphism is essential if the language is to be type safe. The value restriction is an easily-remembered sufficient condition for soundness, but as the examples above illustrate, it is by no means necessary. The designers of ML were faced with a choice of simplicity vs flexibility; in this case they opted for simplicity at the expense of some expressiveness in the language.