** 11. Type Inference **

----------------------------------------------------------------------------

11.1.

Unlike Core ML, real ML's core language (inside of modules, aka
structures/functors) allows type declarations to be omitted from value and
function declarations, and ML's type inferencer will reconstruct them.

Type inference goes over a single declaration at a time (considering a
collection of mutually recursive functions linked by "and" as a single
declaration), given the types of all previously declared things in scope.
It starts by assigning each variable & expression a fresh, distinct type
variable.  It then traverses each expression in the body of the
declaration, in a linear pass, in any order.  For each expression, it
generates some number of equality constraints over the type variables of
its subexpressions.  When it's done, it solves the constraints.  If the
constraint system has no solution, then there's a type error somewhere in
the declaration (after all, type checking is just a form of internal
consistency checking, and a type error is just an inconsistency, i.e., an
unsatisfiable system of constraints).  Otherwise the declaration is
type-correct.

Example: the original source:

        fun f x y = if x < x+x then y else (fn z => z)

The source with a fresh type variable annotated on each variable and
expression (including the identifier being declared):

        fun (f:'a) (x:'b) (y:'c) =
          (if ((x:'d) < (((x:'e) + (x:'f)):'g)):'h
           then y:'i
           else (fn (z:'j) => (z:'k)):'l
          ):'m

The environment in which this source is type-inferenced:

        (op +): int*int->int
        (op <): int*int->bool

The constraints that are accumulated, if we traverse the declaration in
bottom-up, depth-first order:

        'a = 'b -> 'c -> 'm
        'd = 'b
        'e = 'b
        'e = int
        'f = 'b
        'f = int
        'g = int
        'd = int
        'h = bool
        'i = 'c
        'k = 'j
        'l = 'j -> 'k
        'm = 'i
        'm = 'l

One solution to the constraints:

        'a = int -> ('j -> 'j) -> ('j -> 'j)
        'b = 'd = 'e = 'f = int
        'c = 'i = 'l = 'm = ('j -> 'j)
        'g = int
        'h = bool
        'j = 'k

Note that 'j is left unconstrained; under-constrained solutions yield
polymorphic types, i.e., f is a function of polymorphic type.

The end result of this inference is that a new variable f has been
declared, of type 'a.  ML prints this out, after renaming the type
variables to the earliest available letters of the alphabet and inserting
parens only where necessary:

        val f: int -> ('a->'a) -> 'a -> 'a

----------------------------------------------------------------------------

11.2.

There are other solutions to these same constraints; e.g., we could set 'j
to int:

        'a = int -> (int -> int) -> (int -> int)
        'b = 'd = 'e = 'f = int
        'c = 'i = 'l = 'm = (int -> int)
        'g = int
        'h = bool
        'j = 'k = int

This solution would lead to the type of f being inferred as:

        val f: int -> (int->int) -> int -> int

But we don't want this solution, since it's an unnecessarily restrictive
type.  The first type for f allows f to be called with a second argument of
type (bool->bool), but the second type doesn't allow it.  Since we want to
be able to use code in as many situations as possible, we prefer the first,
more polymorphic type.

An important property of ML type inference is that either the system of
constraints has no solution (i.e., there's a type error), or there exists a
*unique most-general solution*, called the *principal type*.  This means
that there is always a single, best type to infer.  All we need is an
algorithm that solves the system of type constraints and finds the
principal solution whenever one exists.  There is such an algorithm.
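Before looking at that algorithm, here is what the difference between the
two solutions looks like in practice.  This is a sketch of an interactive
session (SML/NJ-style prompts and output are assumed; other implementations
print types in slightly different formats):

    - fun f x y = if x < x+x then y else (fn z => z);
    val f = fn : int -> ('a -> 'a) -> 'a -> 'a

    - f 3 not true;
    val it = false : bool

    - f 3 (fn n => n+1) 7;
    val it = 8 : int

The first call instantiates 'a to bool, the second to int.  Under the
principal type both applications are accepted; under the restrictive type
int -> (int->int) -> int -> int, the application at bool would be rejected.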
The standard algorithm uses *unification* to process type-equality
constraints.  Given an equality constraint between type expressions,
tau1 = tau2, unification traverses the structure of the two types,
verifying that they have the same top-level structure and recursively
unifying each pair of corresponding subtrees.  If one of the (sub)trees is
a variable, then we equate that variable with the other (sub)tree.  All
future references to that type variable are replaced with the equated tree;
we thus maintain a *substitution* mapping variables to the trees with which
they've been unified, and unification of two trees takes a substitution as
an argument and produces a possibly extended substitution.  E.g.:

        unify('x = int, [])
          ==> ['x := int]

        unify('x = 'y, ['x := int])
          ==> unify(int = 'y, ['x := int])
          ==> ['x := int, 'y := int]

        unify('x = 'y, ['x := int, 'y := bool])
          ==> unify(int = bool, [...])
          ==> failure

        unify('x = 'y, ['x := int, 'y := 'a->'b])
          ==> unify(int = 'a->'b, [...])
          ==> failure

        unify(('a->'b, 'c) = ('d, 'a), [])
          ==> ['d := 'a->'b, 'c := 'a]

One last tricky case:

        unify('a->'b = 'a, [])
          ==> failure

This fails because we can't have a type variable mapped to a type
expression containing itself, since that would be a kind of recursive type
without an explicit rec expression.  This test is called the "occurs
check".  Prolog, which uses unification in its basic execution model, has a
similar issue.

We can impose some structural constraints on substitutions to make them
easier to deal with.  In particular, we don't want the rhs of any
replacement to include any type variables that are themselves replaced with
something, i.e., we want

        dom(subst) intersect FV(range(subst)) = {}

If this holds, we can blindly apply substitutions to type expressions and
know that we'll get fully substituted types when we're done.  To make it
hold, we just apply the current substitution to any type expression before
putting it in the substitution.  We also check that the resulting type
expression doesn't contain a reference to the type variable we're defining,
i.e., we do a final occurs check.

After all the unifications are processed, we just apply the final
substitution to the type of the variable or function being declared, to
construct its unique, most general type.

----------------------------------------------------------------------------

11.3.

The above example produced a polymorphic function type, but it didn't use
any polymorphic functions or values.  Whenever an identifier of polymorphic
type is referenced, we generate fresh type variables for all its type
parameters, representing a fresh instantiation of the polymorphic value,
and then accumulate type constraints normally on the fresh type.  E.g.,
nil and cons have the following polymorphic types:

        nil:  'a list                    (* [] *)
        cons: 'a * 'a list -> 'a list    (* op:: *)

Whenever we reference nil or cons, we first generate a fresh type variable
and replace 'a with it.  Then we use the resulting substituted type as the
type of that expression.
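To make the unification and instantiation steps concrete, here is a
minimal, self-contained sketch in SML.  It is not the code of any actual
compiler; the datatype, the association-list representation of
substitutions, and all of the names (ty, subst, resolve, unify, bind,
instantiate, ...) are made up for illustration:

    (* Type expressions: variables, base types, arrows, and lists. *)
    datatype ty = TVar of string          (* 'a, 'b, ...    *)
                | TCon of string          (* int, bool, ... *)
                | TArrow of ty * ty       (* tau1 -> tau2   *)
                | TList of ty             (* tau list       *)

    (* A substitution: an association list from variable names to types. *)
    type subst = (string * ty) list

    (* Chase a variable through the substitution until it is either
       unbound or bound to a non-variable type. *)
    fun resolve (s: subst) (t as TVar v) =
          (case List.find (fn (v', _) => v = v') s of
               SOME (_, t') => resolve s t'
             | NONE => t)
      | resolve _ t = t

    (* Occurs check: does variable v appear anywhere in t, under s? *)
    fun occurs s v t =
        case resolve s t of
            TVar v'       => v = v'
          | TCon _        => false
          | TArrow (a, b) => occurs s v a orelse occurs s v b
          | TList a       => occurs s v a

    exception TypeError of string

    (* Unify two types, extending substitution s; an unsatisfiable
       constraint raises TypeError. *)
    fun unify (t1, t2, s) =
        case (resolve s t1, resolve s t2) of
            (TVar v, t) => bind (v, t, s)
          | (t, TVar v) => bind (v, t, s)
          | (TCon c1, TCon c2) =>
              if c1 = c2 then s else raise TypeError (c1 ^ " <> " ^ c2)
          | (TArrow (a1, b1), TArrow (a2, b2)) =>
              unify (b1, b2, unify (a1, a2, s))
          | (TList a1, TList a2) => unify (a1, a2, s)
          | _ => raise TypeError "different type constructors"

    and bind (v, t, s) =
        if t = TVar v then s                    (* 'a = 'a: nothing to do *)
        else if occurs s v t then raise TypeError "occurs check failed"
        else (v, t) :: s

    (* Fresh instantiation of a polymorphic type: replace each quantified
       variable with a brand-new type variable. *)
    val counter = ref 0
    fun freshVar () =
        (counter := !counter + 1; TVar ("t" ^ Int.toString (!counter)))

    fun instantiate (quantified, t) =
        let val renaming = map (fn v => (v, freshVar ())) quantified
            fun go (TVar v) =
                  (case List.find (fn (v', _) => v = v') renaming of
                       SOME (_, fresh) => fresh
                     | NONE => TVar v)
              | go (TCon c) = TCon c
              | go (TArrow (a, b)) = TArrow (go a, go b)
              | go (TList a) = TList (go a)
        in go t end

One design difference from the description above: rather than eagerly
keeping dom(subst) disjoint from FV(range(subst)), this sketch records
bindings as they arrive and lets resolve chase chains of bindings lazily;
either discipline works.  For example,
unify (TVar "x", TArrow (TVar "x", TCon "int"), []) raises TypeError
because of the occurs check, and instantiate (["a"], TList (TVar "a"))
returns a copy of nil's type with 'a replaced by a fresh variable.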
Example: the original source:

        fun f x = cons (cons (x, nil), nil)

The source with a fresh type variable annotated on each variable and
expression (including the identifier being declared):

        fun (f:'a) (x:'b) =
          ((cons:'c) (((cons:'d) (x:'e, nil:'f)):'g, nil:'h)):'i

The constraints that are accumulated, if we traverse the declaration in
bottom-up, depth-first order:

        'a = 'b -> 'i
        'c = 't * 't list -> 't list
        'd = 'u * 'u list -> 'u list
        'e = 'b
        'f = 'v list
        'e = 'u
        'f = 'u list
        'g = 'u list
        'h = 'w list
        'g = 't
        'h = 't list
        'i = 't list

The principal solution to the constraints:

        'a = ('b -> 'b list list)
        'c = ('b list * 'b list list -> 'b list list)
        'd = ('b * 'b list -> 'b list)
        'e = 'u = 'v = 'b
        'f = 'g = 't = 'w = 'b list
        'h = 'i = 'b list list

ML prints out:

        val f: 'a -> 'a list list

Each cons and nil reference ended up with its own distinct type ('c vs. 'd,
'f vs. 'h), as references to polymorphic functions and values may.

----------------------------------------------------------------------------

11.4.

Finally, we need to support recursion.  To do this, we need to add the
types of all the mutually recursive identifiers being defined to the type
environment before we do type inference over their bodies.  This is easy.
E.g., given the source:

        val rec f = fn ...
        and     g = fn ...

we introduce fresh type annotations as follows:

        val rec (f:'a) = fn ...
        and     (g:'b) = fn ...

Then we analyze the bodies of these functions in the type environment where
f:'a and g:'b.  We'll naturally accumulate constraints relating to
recursive calls to f and g, and then solve all the constraints when we're
done.  Voila!

One subtlety is that if f & g are polymorphic, we won't be generating fresh
type variables for the recursive calls, i.e., the recursive calls will be
to the same instantiation of f & g, not to some fresh instantiation.  This
is called monomorphic recursion, in contrast to polymorphic recursion,
which would allow recursive calls to instantiate the type variables
freshly.  ML's type inference only supports monomorphic recursion; type
inference in the presence of polymorphic recursion turns out to be
undecidable!

----------------------------------------------------------------------------

11.5.

When we're done with type inference for a declaration, we treat any
remaining unconstrained type variables in the declared value's type as
being polymorphic.  This has the effect of putting forall quantifiers at
the outermost level of the declared type.  E.g., in the first example
above, we inferred:

        val f: int -> ('a->'a) -> 'a -> 'a

which in Core ML (with explicit quantification) would be:

        val f: forall a. int -> (a->a) -> a -> a

This type is different from, e.g.,

        val f2: int -> (forall a. (a->a) -> a -> a)

which is itself different from:

        val f3: int -> (forall a. a->a) -> forall a. a -> forall a. a

ML type inference only infers the first kind of type, where quantifiers are
introduced at the outermost level.  And in fact, quantifiers are only
introduced for let bindings (either the implicit top-level let bindings
we've been using, or explicit let...in...end expressions).  Other
identifiers, in particular the formal parameters of functions and
identifiers bound in patterns, are not themselves polymorphic (if they have
a type variable as their type, then this type variable is bound by some
enclosing polymorphic let binding).  This restricted form of polymorphism
introduced by ML's type inference is called "let polymorphism".
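Two small SML examples make these last two restrictions concrete.  The
rejected programs are shown inside comments (SML comments nest), so the
snippet as a whole still compiles; exact error messages vary by
implementation:

    (* Let polymorphism: a let-bound g is generalized, so it can be used
       at two different types within the body of the let... *)
    val ok = let val g = fn z => z in (g 1, g true) end
        (* ok : int * bool *)

    (* ...but a lambda-bound g is monomorphic, so this is rejected: the
       uses g 1 and g true generate contradictory constraints (the
       argument type of g would have to equal both int and bool).

       fn g => (g 1, g true)            (* type error *)
    *)

    (* Monomorphic recursion: the recursive call must use the same
       instantiation as the enclosing definition, so this forces
       'a = 'a list and fails the occurs check.

       fun depth x = 1 + depth [x]      (* type error *)
    *)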