** 11. Type Inference **

----------------------------------------------------------------------------

11.1.

Unlike Core ML, real ML's core language (inside of modules, aka
structures/functors) allows type declarations to be omitted from value and
function declarations, and ML's type inferencer will reconstruct them.

Type inference goes over a single declaration at a time (considering a
collection of mutually recursive functions linked by "and" as a single
declaration), given the types of all previously declared things in scope.
It starts by assigning each variable & expression a fresh, distinct type
variable.  It then traverses each expression in the body of the
declaration, in a linear pass, in any order.  For each expression, it
generates some number of equality constraints over the type variables of
its subexpressions.  When it's done, it solves the constraints.  If the
constraint system has no solution, then there's a type error somewhere in
the declaration (after all, type checking is just a form of internal
consistency checking, and a type error is just an inconsistency, i.e., an
unsatisfiable system of constraints).  Otherwise the declaration is
type-correct.

Example: the original source:

        fun f x y = if x < x+x then y else (fn z => z)

The source with a fresh type variable annotated on each variable and
expression (including the identifier being declared):

        fun (f:'a) (x:'b) (y:'c) =
          (if ((x:'d) < (((x:'e) + (x:'f)):'g)):'h
           then y:'i
           else (fn (z:'j) => (z:'k)):'l
          ):'m

The environment in which this source is type-inferenced:

        (op +): int*int->int
        (op <): int*int->bool

The constraints that are accumulated, if we traverse the declaration in
bottom-up, depth-first order:

        'a = 'b -> 'c -> 'm
        'd = 'b
        'e = 'b
        'e = int
        'f = 'b
        'f = int
        'g = int
        'd = int
        'h = bool
        'i = 'c
        'k = 'j
        'l = 'j -> 'k
        'm = 'i
        'm = 'l

One solution to the constraints:

        'a = int -> ('j -> 'j) -> ('j -> 'j)
        'b = 'd = 'e = 'f = int
        'c = 'i = 'l = 'm = ('j -> 'j)
        'g = int
        'h = bool
        'j = 'k

Note that 'j is left unconstrained; under-constrained solutions yield
polymorphic types, i.e., f is a function of polymorphic type.

The end result of this inference is that a new variable f has been
declared, of type 'a.  ML prints this out, after renaming the type
variables to the earliest available letters of the alphabet and inserting
parens only where necessary:

        val f: int -> ('a->'a) -> 'a -> 'a

----------------------------------------------------------------------------

11.2.

There are other solutions to these same constraints; e.g., we could set 'j
to int:

        'a = int -> (int -> int) -> (int -> int)
        'b = 'd = 'e = 'f = int
        'c = 'i = 'l = 'm = (int -> int)
        'g = int
        'h = bool
        'j = 'k = int

This solution would lead to the type of f being inferred as:

        val f: int -> (int->int) -> int -> int

But we don't want this solution, since it's an unnecessarily restrictive
type.  The first type for f allows f to be called with a second argument of
type (bool->bool), but the second type doesn't allow it.  Since we want to
be able to use code in as many situations as possible, we prefer the first,
more polymorphic type.

An important property of ML type inference is that either the system of
constraints has no solution (i.e., there's a type error), or there exists a
*unique most-general solution*, called the *principal type*.  This means
that there is always a single, best type to infer.  All we need is an
algorithm that solves the system of type constraints and finds the
principal solution whenever one exists.  There is such an algorithm.
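Before looking at that algorithm, here is what the difference between the
two solutions looks like in practice.  This is a sketch of an interactive
session (SML/NJ-style prompts and output are assumed; other implementations
print types in slightly different formats):

    - fun f x y = if x < x+x then y else (fn z => z);
    val f = fn : int -> ('a -> 'a) -> 'a -> 'a

    - f 3 not true;
    val it = false : bool

    - f 3 (fn n => n+1) 7;
    val it = 8 : int

The first call instantiates 'a to bool, the second to int.  Under the
principal type both applications are accepted; under the restrictive type
int -> (int->int) -> int -> int, the application at bool would be rejected.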
The standard algorithm uses *unification* to process type-equality
constraints.  Given an equality constraint between type expressions,
tau1 = tau2, unification traverses the structure of the two types,
verifying that they have the same top-level structure and recursively
unifying each pair of corresponding subtrees.  If one of the (sub)trees is
a variable, then we equate that variable with the other (sub)tree.  All
future references to that type variable are replaced with the equated tree;
we thus maintain a *substitution* mapping variables to the trees with which
they've been unified, and unification of two trees takes a substitution as
an argument and produces a possibly extended substitution.  E.g.:

        unify('x = int, [])
          ==> ['x := int]

        unify('x = 'y, ['x := int])
          ==> unify(int = 'y, ['x := int])
          ==> ['x := int, 'y := int]

        unify('x = 'y, ['x := int, 'y := bool])
          ==> unify(int = bool, [...])
          ==> failure

        unify('x = 'y, ['x := int, 'y := 'a->'b])
          ==> unify(int = 'a->'b, [...])
          ==> failure

        unify(('a->'b, 'c) = ('d, 'a), [])
          ==> ['d := 'a->'b, 'c := 'a]

One last tricky case:

        unify('a->'b = 'a, [])
          ==> failure

This fails because we can't have a type variable mapped to a type
expression containing itself, since that would be a kind of recursive type
without an explicit rec expression.  This test is called the "occurs
check".  Prolog, which uses unification in its basic execution model, has a
similar issue.

We can impose some structural constraints on substitutions to make them
easier to deal with.  In particular, we don't want the rhs of any
replacement to include any type variables that are themselves replaced with
something, i.e., we want

        dom(subst) intersect FV(range(subst)) = {}

If this holds, we can blindly apply substitutions to type expressions and
know that we'll get fully substituted types when we're done.  To make it
hold, we just apply the current substitution to any type expression before
putting it in the substitution.  We also check that the resulting type
expression doesn't contain a reference to the type variable we're defining,
i.e., we do a final occurs check.

After all the unifications are processed, we just apply the final
substitution to the type of the variable or function being declared, to
construct its unique, most general type.

----------------------------------------------------------------------------

11.3.

The above example produced a polymorphic function type, but it didn't use
any polymorphic functions or values.  Whenever an identifier of polymorphic
type is referenced, we generate fresh type variables for all its type
parameters, representing a fresh instantiation of the polymorphic value,
and then accumulate type constraints normally on the fresh type.  E.g.,
nil and cons have the following polymorphic types:

        nil:  'a list                    (* [] *)
        cons: 'a * 'a list -> 'a list    (* op:: *)

Whenever we reference nil or cons, we first generate a fresh type variable
and replace 'a with it.  Then we use the resulting substituted type as the
type of that expression.
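To make the unification and instantiation steps concrete, here is a
minimal, self-contained sketch in SML.  It is not the code of any actual
compiler; the datatype, the association-list representation of
substitutions, and all of the names (ty, subst, resolve, unify, bind,
instantiate, ...) are made up for illustration:

    (* Type expressions: variables, base types, arrows, and lists. *)
    datatype ty = TVar of string          (* 'a, 'b, ...    *)
                | TCon of string          (* int, bool, ... *)
                | TArrow of ty * ty       (* tau1 -> tau2   *)
                | TList of ty             (* tau list       *)

    (* A substitution: an association list from variable names to types. *)
    type subst = (string * ty) list

    (* Chase a variable through the substitution until it is either
       unbound or bound to a non-variable type. *)
    fun resolve (s: subst) (t as TVar v) =
          (case List.find (fn (v', _) => v = v') s of
               SOME (_, t') => resolve s t'
             | NONE => t)
      | resolve _ t = t

    (* Occurs check: does variable v appear anywhere in t, under s? *)
    fun occurs s v t =
        case resolve s t of
            TVar v'       => v = v'
          | TCon _        => false
          | TArrow (a, b) => occurs s v a orelse occurs s v b
          | TList a       => occurs s v a

    exception TypeError of string

    (* Unify two types, extending substitution s; an unsatisfiable
       constraint raises TypeError. *)
    fun unify (t1, t2, s) =
        case (resolve s t1, resolve s t2) of
            (TVar v, t) => bind (v, t, s)
          | (t, TVar v) => bind (v, t, s)
          | (TCon c1, TCon c2) =>
              if c1 = c2 then s else raise TypeError (c1 ^ " <> " ^ c2)
          | (TArrow (a1, b1), TArrow (a2, b2)) =>
              unify (b1, b2, unify (a1, a2, s))
          | (TList a1, TList a2) => unify (a1, a2, s)
          | _ => raise TypeError "different type constructors"

    and bind (v, t, s) =
        if t = TVar v then s                    (* 'a = 'a: nothing to do *)
        else if occurs s v t then raise TypeError "occurs check failed"
        else (v, t) :: s

    (* Fresh instantiation of a polymorphic type: replace each quantified
       variable with a brand-new type variable. *)
    val counter = ref 0
    fun freshVar () =
        (counter := !counter + 1; TVar ("t" ^ Int.toString (!counter)))

    fun instantiate (quantified, t) =
        let val renaming = map (fn v => (v, freshVar ())) quantified
            fun go (TVar v) =
                  (case List.find (fn (v', _) => v = v') renaming of
                       SOME (_, fresh) => fresh
                     | NONE => TVar v)
              | go (TCon c) = TCon c
              | go (TArrow (a, b)) = TArrow (go a, go b)
              | go (TList a) = TList (go a)
        in go t end

One design difference from the description above: rather than eagerly
keeping dom(subst) disjoint from FV(range(subst)), this sketch records
bindings as they arrive and lets resolve chase chains of bindings lazily;
either discipline works.  For example,
unify (TVar "x", TArrow (TVar "x", TCon "int"), []) raises TypeError
because of the occurs check, and instantiate (["a"], TList (TVar "a"))
returns a copy of nil's type with 'a replaced by a fresh variable.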
Example: the original source:

        fun f x = cons (cons (x, nil), nil)

The source with a fresh type variable annotated on each variable and
expression (including the identifier being declared):

        fun (f:'a) (x:'b) =
          ((cons:'c) (((cons:'d) (x:'e, nil:'f)):'g, nil:'h)):'i

The constraints that are accumulated, if we traverse the declaration in
bottom-up, depth-first order:

        'a = 'b -> 'i
        'c = 't * 't list -> 't list
        'd = 'u * 'u list -> 'u list
        'e = 'b
        'f = 'v list
        'e = 'u
        'f = 'u list
        'g = 'u list
        'h = 'w list
        'g = 't
        'h = 't list
        'i = 't list

The principal solution to the constraints:

        'a = ('b -> 'b list list)
        'c = ('b list * 'b list list -> 'b list list)
        'd = ('b * 'b list -> 'b list)
        'e = 'u = 'v = 'b
        'f = 'g = 't = 'w = 'b list
        'h = 'i = 'b list list

ML prints out:

        val f: 'a -> 'a list list

Each cons and nil reference ended up with its own distinct type ('c vs. 'd,
'f vs. 'h), as references to polymorphic functions and values may.

----------------------------------------------------------------------------

11.4.

Finally, we need to support recursion.  To do this, we need to add the
types of all the mutually recursive identifiers being defined to the type
environment before we do type inference over their bodies.  This is easy.
E.g., given the source:

        val rec f = fn ...
        and     g = fn ...

we introduce fresh type annotations as follows:

        val rec (f:'a) = fn ...
        and     (g:'b) = fn ...

Then we analyze the bodies of these functions in the type environment where
f:'a and g:'b.  We'll naturally accumulate constraints relating to
recursive calls to f and g, and then solve all the constraints when we're
done.  Voila!

One subtlety is that if f & g are polymorphic, we won't be generating fresh
type variables for the recursive calls, i.e., the recursive calls will be
to the same instantiation of f & g, not to some fresh instantiation.  This
is called monomorphic recursion, in contrast to polymorphic recursion,
which would allow recursive calls to instantiate the type variables
freshly.  ML's type inference only supports monomorphic recursion; type
inference in the presence of polymorphic recursion turns out to be
undecidable!

----------------------------------------------------------------------------

11.5.

When we're done with type inference for a declaration, we treat any
remaining unconstrained type variables in the declared value's type as
being polymorphic.  This has the effect of putting forall quantifiers at
the outermost level of the declared type.  E.g., in the first example
above, we inferred:

        val f: int -> ('a->'a) -> 'a -> 'a

which in Core ML (with explicit quantification) would be:

        val f: forall a. int -> (a->a) -> a -> a

This type is different from, e.g.,

        val f2: int -> (forall a. (a->a) -> a -> a)

which is itself different from:

        val f3: int -> (forall a. a->a) -> forall a. a -> forall a. a

ML type inference only infers the first kind of type, where quantifiers are
introduced at the outermost level.  And in fact, quantifiers are only
introduced for let bindings (either the implicit top-level let bindings
we've been using, or explicit let...in...end expressions).  Other
identifiers, in particular the formal parameters of functions and
identifiers bound in patterns, are not themselves polymorphic (if they have
a type variable as their type, then this type variable is bound by some
enclosing polymorphic let binding).  This restricted form of polymorphism
introduced by ML's type inference is called "let polymorphism".
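Two small SML examples make these last two restrictions concrete.  The
rejected programs are shown inside comments (SML comments nest), so the
snippet as a whole still compiles; exact error messages vary by
implementation:

    (* Let polymorphism: a let-bound g is generalized, so it can be used
       at two different types within the body of the let... *)
    val ok = let val g = fn z => z in (g 1, g true) end
        (* ok : int * bool *)

    (* ...but a lambda-bound g is monomorphic, so this is rejected: the
       uses g 1 and g true generate contradictory constraints (the
       argument type of g would have to equal both int and bool).

       fn g => (g 1, g true)            (* type error *)
    *)

    (* Monomorphic recursion: the recursive call must use the same
       instantiation as the enclosing definition, so this forces
       'a = 'a list and fails the occurs check.

       fun depth x = 1 + depth [x]      (* type error *)
    *)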