** 7. Type System Design (as in ML) **

7.1.

In design of typed programming languages, organize thinking around
what the types are.  Have a grammar of type expressions, and a grammar
of value expressions.

  1) pick a type to include in type expressions
  2) add corresponding value expressions of that type
  3) repeat as desired

We'll study ML's design by developing an orthogonal core language,
Core ML, that will include sufficient features so that real ML can be
treated as a user-oriented syntactic sugar for combinations of Core ML
constructs.  This will reveal how orthogonal core language design
works, and what the primitives are underlying real ML constructs.

----------------------------------------------------------------------------

7.2.

Start with some base types and expressions over those types:

tau ::= int | bool
e   ::= intconst | true | false
      | e + e | e < e | e = e | ... 
      | if e then e else e

----------------------------------------------------------------------------

7.3.

Then add functions, at the type and value level:

tau ::= ... | tau -> tau
e   ::= ... | lambda id:tau. e | e e | id

(This is essentially the simply typed lambda calculus, enriched with
base types & expressions.  The pure simply typed lambda calculus is
just this language without the base types & expressions; no loss of
computational power!  The pure untyped lambda calculus is just this
expression language, but without the types; much more computational
power!)

The type of (lambda id:tau. e) is tau->tau', where tau' is the type of
e, computed assuming that id has type tau.  (e1 e2) makes sense if e1
has some arrow type, tau1->tau2, and e2 has type tau1; the type of the
result of the application is tau2.  (We'll formalize these kinds of
typing rules later.)

(Real ML writes "lambda id:tau. e" as "fn(id:tau) => e", or, with type
inference, "fn id => e".)

The lambda form is an "introduction" form, since it introduces a value
of arrow type, while the application is an "elimination" form, since
it takes a value of arrow type and removes it from the type of the
expression.

Functions are first-class expressions, naturally, in this language.
Lexical nesting is natural, too.

This language requires argument types to be specified.  Real ML allows
argument (& result) types to be specified, or omitted.  Where
omitted, ML has type inference to put them back in, as a user
convenience.

Lambda is a primitive mechanism for binding names to values, by way of
its formal parameter identifier.  As a convenience, we can introduce
various let forms, e.g.

e   ::= ... | let id:tau = e in e

(Real ML supports multiple bindings per let, using let* semantics.
Real ML also requires the val keyword, to allow the fun sugar
described below.)

Let turns out to be syntactic sugar for lambda, i.e.

    let id:tau = e1 in e2 end  ==>  (lambda id:tau. e2) (e1)

(When we talk about polymorphic type inference, we'll see a reason for
let to be treated differently than a lambda.)

(Real ML also provides syntactic sugar (a la Scheme's two forms of define) to
define functions "directly" as opposed to via explicit lambdas.  E.g.

    let
      fun foo(x:tau) = e1
    in e2 end

is sugar for

    let
      val foo = (fn(x:tau) => e1)
    in e2 end
.)

(Real ML also provides top-level declarations, e.g.
    > val x = 3;
    ...
    > fun id x = x;
    ...

This is just sugar for

    > let val x = 3 in
      ...
      let fun id x = x in
      ...

i.e., the body of the let is all the remaining top-level
declarations.  Note that later bindings only *shadow* earlier
bindings, they don't change them, unlike Scheme's define.  Any
existing references to the old value aren't affected.  This may sound
nice, until you want to redefine a function, e.g. to fix a bug:

    fun square x = x + x;  (* oops *)
    fun map_square lst = map square lst;  (* this binds to the
					     previous square definition! *)
    (* now we try to fix the square definition *)
    fun square x = x * x;  (* fixed *)
    (* but this only introduces a new, distinct function that shadows
the old binding; map_square still references the old, broken square.
we have to redefine every function that refers to square, and all of
their callers, transitively.  so don't do it this way.  instead, put
functions into a file, edit the file to fix bugs, and reread the whole
file after any edit. *)
.)

Applying orthogonality, we might similarly want to use let in types:

t   ::= ... | let id = tau in tau | id

More reasonably, since types can occur in expressions (e.g. lambdas,
or type declarations) we want to allow types to be given names in
expression-let as well as type-let, e.g.:

e   ::= ... | let val id:tau = e in e | let type id = tau in e

In general, anything "interesting" should be able to be given a name
and used by name.  Currently, values and types are the interesting
things, and we can name both of them now.  All name bindings should
have a well-defined scope, too; let has a body expression (or type)
in which the binding is in scope.

----------------------------------------------------------------------------

7.4.

Then add tuple (aka product) types, and introduction and elimination forms:

tau ::= ... | tau * ... * tau
e   ::= ... | (e, ..., e) | pi_i e

Since the pi_i's are often a pain to write, we can introduce a notion
of pattern-matching, where any binding occurrence of an identifier
(function formal parameter, let-bound variable) can be replaced with a
pattern instead.  So let's add pattern expressions to our language,
and uniformly use patterns where name-binding might happen:

e   ::= ... | lambda p. e | let p = e in e
p   ::= id:tau | _:tau

(_ is the wildcard pattern, that doesn't bother to bind a name.)

Then we can extend patterns to have patterns for our composite
data types:

p   ::= ... | (p, ..., p)

(Consistently, a type's pattern form looks like its introduction form,
since it's a kind of reverse-introduction.)

Note that patterns can be nested arbitrarily, as long as the pattern
structure follows the type structure of the value being
pattern-matched.

Now we can use patterns to implicitly eliminate/deconstruct/index into
composite data.  E.g.

    val (x,_,z) = some_triple;

is equivalent to

    val x = #1 some_triple;
    val _ = #2 some_triple;
    val z = #3 some_triple;

where #i is ML's syntax for the pi_i operators.)

A multi-argument function is handled simply as a function that takes a
tuple as its argument, e.g.

    fun plus args = (#1 args) + (#2 args);

or, with pattern-matching:

    fun plus(x,y) = x + y;

With this tuple-based approach, a function can return multiple results
just as easily as taking multiple arguments:

    fun quadratic_roots(a,b,c) =
            let val r = sqrt(b*b - 4.0*a*c)
            in ((-b + r)/2.0*a, (-b - r)/2.0*a) end;

Given this way to program multi-argument functions, real functional
languages replace the expression forms over base types (e.g. e+e) with
first-class functions (e.g. +) that operate over (tuples of) base
types, thereby simplifying the value expression sublanguage.

Orthogonality argues that patterns, as interesting things, should be
namable, too.  How would one do this?

----------------------------------------------------------------------------

7.5.

Another way for functions to take multiple arguments is to take a
single argument and return a function that takes the next argument,
which returns a function to take the next argument, etc.  Taking
arguments one at a time is known as "currying".  E.g.

    val curried_plus = (fn x => (fn y => x + y));

ML provides syntactic sugar for writing functions in curried style,
just by providing a series of formal parameter (patterns) separated by
spaces:

    fun curried_plus x y = x + y;

This is particularly convenient to enable a function to be invoked on
only a subset of its arguments, without having to wrap things up
explicitly with lambdas, e.g.:

    val inc = curried_plus 1;
    val dec = curried_plus ~1;

Functionals like map & foldl & foldr (foldl/r are ML's names for
reduce-left/-right) are defined in curried form, so they can be called
on just part of their arguments, e.g.:

    val square_list = map (fn x => x * x);
    val prod        = foldl (op *) 1;
    val dot_product = (foldl (op +) 0) o (map2 (fn (x,y) => (x * y)));

("o" is ML's compose operator, which is an infix operator.)

("op +"	is the + function as a value; we can't just write + because
then it looks like an infix operator.  Because ML has infix & prefix
operators in the syntax, we have to "quote" the infix operator with
the op keyword to disable the infixness.  Scheme syntax avoids these
issues.)

Now we're programming at the level of functions, and the various data
values passing around become implicit.  (ML's types will help make sure
that these expressions make sense.)  We can always write the lambdas
explicitly, but it's convenient that we don't have to.

(IMO, currying is a nice convenience, and a nice example of the
benefits of first-class, nested functions, but no more.)

----------------------------------------------------------------------------

7.6.

We might also add record types, along with introduction and
elimination expression forms and record patterns:

tau ::= ... | {id:tau, ... id:tau}
e   ::= ... | {id=e, ..., id=e} | #id e
p   ::= ... | {id=p, ..., id=p}

Records are a generalization of tuples: an n-tuple is just a record
whose labels are the integers 1..n.  So tuple types, expressions, and
patterns can be treated as syntactic sugar for corresponding record
constructs.

[Key idea: keep the core language small by defining as many things as
possible via syntactic sugar (if we need them at all).  The surface
language is separate from the internal, core language.]

----------------------------------------------------------------------------

7.7.

We then add (tagged, disjoint) union (aka sum) types.  Union types are
a simple primitive mechanism modeling things like dynamic typing in
Scheme (something is an integer or a string), variations in
implementation (a list is either nil or a cons cell), or inheritance
hierarchies (an account is a checking account, a savings account, a
money market account, ...).

One simple form to introduce unions resembles tuples:

tau ::= ... | tau + ... + tau
e   ::= ... | in_i e | is_i e | pi_i e
p   ::= ... | in_i p

In this design, we give each tau in the sum a distinct integer (its
position), introduce values in a particular position using the in_i
form, test positions using the is_i form, and extract values in a
particular position using the pi_i form.  E.g.

    type t = string + int;  (* not real ML! *)

    val v1:t = in_1 "hi";
    val v2:t = in_2 7;

    is_1 v1  =>  true
    is_1 v2  =>  false
    is_2 v1  =>  false
    is_2 v2  =>  true

    val s:string = pi_1 v1;
    val i:int = pi_2 v2;

    val (in_1 s) = v1;
    val (in_2 i) = v2;

If you don't like integer positions (and who does?), you can adopt a
more record-like form, which uses identifiers for the positions:

tau ::= ... | [id:tau, ..., id:tau]
e   ::= ... | [id=e] | ?id e | %id e
p   ::= ... | [id=p]

Then:

    type t = [name:string, count:int];  (* not real ML! *)

    val v1:t = [name="hi"];
    val v2:t = [count=7];

    ?name v1  =>  true
    ?name v2  =>  false
    ?count v1  =>  false
    ?count v2  =>  true

    val s:string = %name v1;
    val i:int = %count v2;

    val [name=s] = v1;
    val [count=i] = v2;

These two models have several problems:

*) how to do type inference of full type, given only one id?  not
local.

*) what happens if do %count v1?  the projection functions can
fail!

*) what happens if do val [count=i] = v1?  the pattern-matching can
fail!

The first problem we'll confront later when we talk about type
inference.

The second problem can be fixed if we bundle together the testing
expressions (the ?id forms) with the projection expressions into a
single construct, case.  Instead of the ?id and %id forms, we use:

e   ::= ... | case e of p1 => e1 "|" ... "|" pN => eN

Each branch of the case does a pattern-match of the result of
evaluating the first e.  The first pi that matches, evaluates its
argument ei (with any binding from pi visible) and returns it as the
result of the case.  (We'll formalize this semantics later.)
Now we can't have projections that fail, because projection is
automatic only when the test succeeds.  E.g.

    case expr of
      [name=s] => (print_string s; 0)
    | [id=i] => (print_int i; 1)

("e1;e2" is an expression form that evaluates e1, throws away its
result, then evaluates and returns e2.  print_string and print_int
have result type unit, which is ML's void-like type, with a single
value, the 0-ary tuple, written "()".)

A series of patterns is "exhaustive" if it covers all possible cases
of the tested value.  A pattern is "redundant" if the values it could
match would have already been matched by earlier patterns; a series of
patterns is irredundant if none of its cases is redundant.  We can
check these two desirable properties of a case statement statically,
as part of typechecking.  ML signals a "match-not-exhaustive" warning if
we give it a non-exhaustive case.

A "refutable" pattern is one that can fail to match values of its type.
Patterns on union cases are refutable, while all previous patterns
were "irrefutable" (matching always succeeded).

Building on the idea of refutable patterns, we can add pattern
matching on other types e.g. integer & bool constants, treating int &
bool as unions over all their possible constant values.  E.g.:

    fun simple_num x =
      case x of
        0 => "zero"
      | 1 => "one"
      | _ => "many"

Functions that do casing on their arguments in this way are very
common, so the fun syntactic sugar is further extended to support a
kind of case analysis over arguments, by writing a single function
declaration as a series of function cases.  The above function can be
rewritten as follows:

    fun simple_num 0 = "zero"
      | simple_num 1 = "one"
      | simple_num _ = "many"

Of course, due to orthogonality, this pattern matching applies to
arguments of union type, to tuples of arguments containing union types
or int/bool types, and to curried functions.

(Haskell further generalizes pattern-matching to allow predicate
expressions, too.  Object-oriented languages do a kind of
pattern-matching on run-time classes as part of method dispatching.
Grand Unified Dispatching is a dispatching model developed at UW that
generalizes and unifies all these previous models, while retaining the
ability to do static typechecking of exhaustiveness and irredundancy
(and non-ambiguity, which can happen in OO dispatching), and efficient
compilation of the important special cases of dispatching.)

----------------------------------------------------------------------------