** 18. Structural Subtyping **

----------------------------------------------------------------------------

18.1. We can define subtyping rules over predefined types, called
*structural subtyping* rules.  Some possible rules for Core ML + subtyping:

Recall the kinds of type expressions we have:

    tau ::= id | int | bool | tau -> tau
          | {id:tau, ..., id:tau} | [id:tau, ..., id:tau]
          | rec id = tau | forall id. tau | ref tau

No subtyping over base types.  If we had richer numeric types, we could
define, e.g.

    int <= real
    float <= real
    real <= complex
    complex <= number

For record types: one record is a subtype of another if it has at least all
the fields of the supertype, whose types are "pointwise" subtypes (since
this implies that the subtype can be used wherever the supertype can safely
be used):

        tau1 <= tau1'  ...  tauN <= tauN'
    ------------------------------------------------
    {id1:tau1, ..., idN:tauN, ..., id(N+M):tau(N+M)}
        <= {id1:tau1', ..., idN:tauN'}

(Remember that the order of fields in a record doesn't matter.)

If tuples are just sugared records, then longer tuples are subtypes of
shorter tuples, if the common components are pointwise subtypes.

For union types: one union is a subtype of another if it has no more than
the tags of the supertype, with component types that are "pointwise"
subtypes (since this means that code that handles a particular set of
possible tags will always work when given something that's one of those
tags):

        tau1 <= tau1'  ...  tauN <= tauN'
    ------------------------------------------------
    [id1:tau1, ..., idN:tauN]
        <= [id1:tau1', ..., idN:tauN', ..., id(N+M):tau(N+M)']

For functions, we might think a function type is a subtype if its argument
& result types are subtypes:

    tau1 <= tau1'    tau2 <= tau2'
    ------------------------------
    (tau1->tau2) <= (tau1'->tau2')

But we'd be wrong!  This rule would allow the following:

    type t1 = {x:int}
    type t2 = {x:int, y:bool}    (* t2 <= t1 *)
    val f1:t1->t1 = lambda {x=x1}:t1. {x=x1 + 1}
    val f2:t2->t2 = lambda {x=x1,y=y1}:t2.
                        {x=(x1+1), y=(not y1)}
    val f:t1->t1 = f2    (* type-correct, if typeof(f2) <= typeof(f) *)
    f {x=0}              (* type-correct, but crashes! *)

A function of arrow type S that's a subtype of arrow type T has to be able
to handle any argument that can be passed to a function of type T.  This
means that S's argument type must be at least as general as T's argument
type, i.e. a *supertype*, not a subtype.  In addition, a function of type S
had better not return any value that a caller of type T isn't expecting.
This means that S's result type should be no more general than T's result
type, i.e. a *subtype*.  Put together, we see that argument types and
result types work differently in the function subtyping rule:

    tau1 >= tau1'    tau2 <= tau2'
    ------------------------------
    (tau1->tau2) <= (tau1'->tau2')

This is the infamous "contravariant function subtyping rule", in contrast
to the first (buggy) rule above, the "covariant" rule.  (More precisely, we
say that function subtyping is contravariant in its argument type and
covariant in its result type.  Similarly, record and union subtyping are
covariant in their component types.)  Contravariance has nasty consequences
for lots of uses of subtyping in OO languages, both theoretical and
practical.  But it's an unavoidable truth, like Heisenberg's Uncertainty
Principle and the ban on faster-than-light travel.

For a type constructor that has component types, we refer to component
types using a covariant subtyping rule as "positive" positions, and those
using a contravariant subtyping rule as "negative" positions.  All
positions we've seen so far are positive positions, except the argument
position of an -> constructor, which is a negative position.  If a
component type is itself a type constructor with its own components, then
we can compute the "sign" of any position by "multiplying" the signs of the
enclosing positions.  E.g.
in

    (t1->t2) -> (t3->t4)

t1->t2 is in a negative position, and t1 is in a negative position within
that, so t1 is in a positive position overall (two negatives make a
positive).  And in fact it is true that

    (s1->t2) -> (t3->t4)

is a subtype of the type above iff s1 <= t1.  t2 and t3 are in negative
positions (and so determine subtyping contravariantly), while t4 is in a
positive position (and determines subtyping covariantly).

If we have a single type that appears in both positive and negative
positions, then there is no way it can change and still lead to a subtype.
E.g. in

    t->t

t is in both a positive and a negative position, so there's no s != t that
produces a subtype of this type (technically, such an s may exist, but it
must be both a subtype and a supertype of t).  We call such types
invariant (neither covariant nor contravariant).

[Subtyping over recursive types, polymorphic types, and references later.]

----------------------------------------------------------------------------

18.2. One consequence of contravariance leads to restrictions on method
overriding.  We can only override one method with another if the overriding
method can be viewed as a subtype of the overridden one.  This is because
the overriding method is going to be substituted for the overridden one,
possibly without the caller's knowledge.  As a result, by contravariance,
the overriding method's argument types must be equal to or *supertypes* of
(i.e., at least as general as) the overridden method's argument types, and
the overriding method's result type must be equal to or a *subtype* of
(i.e., no more general than) the overridden method's result type.

Making result types more specific is often a natural thing to want to do,
e.g. in a copy method:

    class Point {
        int x;
        ...
        Point copy() { return new Point(x); }
    }
    class ColorPoint extends Point {
        color c;
        ...
        ColorPoint copy() { return new ColorPoint(x, c); }
    }

But making argument types more general is rarely useful.
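[Aside: the record, union, and function subtyping rules from 18.1 -- which
are what drive these overriding restrictions -- are mechanical enough to
transcribe into code.  Below is a minimal Python sketch; the tuple encoding
of types and all names are illustrative, not from any real checker.  Note
how it rejects using f2 where f is expected.]

```python
# Types are encoded as illustrative tuples:
#   base types as strings, ("record", {field: type}),
#   ("union", {tag: type}), and ("fun", arg_type, result_type).

def is_subtype(t1, t2):
    """Is t1 <= t2 under the structural rules in these notes?"""
    if t1 == t2:                                  # reflexivity; covers base types
        return True
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        if t1[0] == t2[0] == "record":
            # t1 needs at least t2's fields, with pointwise-subtype types
            return all(f in t1[1] and is_subtype(t1[1][f], ty)
                       for f, ty in t2[1].items())
        if t1[0] == t2[0] == "union":
            # t1 may have no more tags than t2, with pointwise-subtype types
            return all(tag in t2[1] and is_subtype(ty, t2[1][tag])
                       for tag, ty in t1[1].items())
        if t1[0] == t2[0] == "fun":
            # contravariant in the argument, covariant in the result
            return is_subtype(t2[1], t1[1]) and is_subtype(t1[2], t2[2])
    return False

T1 = ("record", {"x": "int"})                # type t1 = {x:int}
T2 = ("record", {"x": "int", "y": "bool"})   # type t2 = {x:int, y:bool}

print(is_subtype(T2, T1))                            # True: t2 <= t1
print(is_subtype(("fun", T1, T2), ("fun", T2, T1)))  # True: t1->t2 <= t2->t1
print(is_subtype(("fun", T2, T2), ("fun", T1, T1)))  # False: f2 can't be used as f
```

The last line is exactly the crashing example: typeof(f2) = t2->t2 is not a
subtype of typeof(f) = t1->t1, because the argument comparison runs the
other way.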
Much more useful would be to make argument types more specific, e.g. in an
equality method:

    class Point {
        ...
        bool equals(Point arg) { return this.x == arg.x; }
    }
    class ColorPoint extends Point {
        color c;
        ...
        bool equals(ColorPoint arg) {    // covariant argument overriding
            return this.x == arg.x && this.c == arg.c;
        }
    }

Unfortunately, this isn't safe to do.  E.g.:

    Point p1 = new Point(3);
    Point p2 = new ColorPoint(3, blue);
    p1.equals(p2);    --> invokes Point::equals, returns true
    p2.equals(p1);    --> invokes ColorPoint::equals, then crashes

So, in a language like Java, we must write:

    class ColorPoint extends Point {
        color c;
        ...
        bool equals(Point arg) {
            if (arg instanceof ColorPoint) {
                return this.x == arg.x && this.c == ((ColorPoint)arg).c;
            } else {
                ... do whatever should be done ...
            }
        }
    }

[Note the clumsy instanceof + cast, where a case would be much nicer.]
There are now two kinds of dispatching in the language, one done by method
overriding and dynamic dispatching (and extensible to new subclasses), and
another done manually through instanceof + cast (and non-extensible w/o
rewriting existing code).  Yuck.

----------------------------------------------------------------------------

18.3. When are two recursive types in a subtype relation?  Or when is a
recursive type in a subtype relation with some other (recursive or
non-recursive) type?  This is a difficult question.  In general, we should
look at the infinite expansion of the recursive type(s), and place the
types in a subtype relation whenever the infinite expansions are in a
subtype relation.

Here is one possible rule:

    Delta, (id <= tau2) |- tau1 <= tau2
    -----------------------------------
    Delta |- (rec id = tau1) <= tau2

Since type expressions contain type identifiers (id), we will need to
maintain an environment (Delta) of subtypings that we can assume over these
identifiers.
For the recursive type, we can assume that id (which stands for the
recursive type itself) is a subtype of tau2; the proof that tau1 <= tau2
under this assumption then validates the assumption.

We have a similar rule for a recursive type as the candidate supertype:

    Delta, (tau1 <= id) |- tau1 <= tau2
    -----------------------------------
    Delta |- tau1 <= (rec id = tau2)

To make use of subtyping assumptions, we need a rule to do a lookup in the
assumption set:

    (tau1 <= tau2) in Delta
    -----------------------
    Delta |- tau1 <= tau2

These rules look plausible to me, they appear to be decidable (since we try
to solve a smaller problem each time around), and they may in fact be
correct, but don't quote me on this.  I'm not sure they're complete (in
that they detect all possible subtyping relationships between types).

Note that if in the type "rec id = tau" id appears in both positive and
negative positions, then there won't be any non-trivial subtyping relations
over this type.

----------------------------------------------------------------------------

18.4. One way to think about this style of OO programming is to model an
object as a record.  An instance variable just becomes a field of the
record.  A method becomes a function value stored in a field of the record.
To allow a method to refer to the record containing the method, we make the
record a recursive value.  E.g.:

    type Point = rec Point =
        {x:int,
         copy:()->Point,
         equals:Point->bool}

[Note the recursive type!]

    val p1:Point = rec self:Point =
        {x=3,
         copy=(lambda(). {x=#x self, copy=#copy self, equals=#equals self}),
         equals=(lambda(arg:Point). (#x self) = (#x arg))}

[Note the recursive value!]

[Note how new objects are constructed within the copy method!  While this
works as written, it may not be the right way of encoding constructors for
a fuller design.]

    type ColorPoint = rec ColorPoint =
        {x:int,
         c:color,
         copy:()->ColorPoint,
         equals:Point->bool}

[Question for the reader: is ColorPoint <= Point?]
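[Aside: for comparison, here is a hypothetical Python transcription of the
Point encoding.  The object is a dict, methods are closures, and the
recursive value arises from the closures capturing `self`.  The
constructor-style make_point is my own packaging; the ML copy above instead
rebuilds the record inline.]

```python
# An "object" as a record: a dict whose method fields are closures that
# capture the very dict they live in (the recursive value).

def make_point(x):
    self = {"x": x}
    self["copy"] = lambda: make_point(self["x"])
    self["equals"] = lambda arg: self["x"] == arg["x"]
    return self

p1 = make_point(3)
p2 = p1["copy"]()
print(p2["equals"](p1))   # True
```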
    val cp2:ColorPoint = rec self:ColorPoint =
        {x=3,
         c=blue,
         copy=(lambda(). {x=#x self, c=#c self,
                          copy=#copy self, equals=#equals self}),
         equals=(lambda(arg:Point). ...)}

    val p2:Point = cp2

    (#equals p2) p1    (* p2.equals(p1) *)

There's no explicit inheritance here; I have to retype the parts from type
Point within the declaration of type ColorPoint, and I have to do something
to share functions that I would otherwise have inherited from one class to
another.  Class declarations can be viewed as syntactic sugar on top of
this objects-as-records core, to make it convenient for programmers to
write certain common kinds of patterns.  There are ways of writing down
explicit class objects and constructor functions, but things get pretty
complicated.

Note that method overriding rules just fall out of the subtyping rules for
records and functions.  Also note that we don't need inheritance in order
to have subtyping, since subtyping is solely a function of types, not of
implementations.

----------------------------------------------------------------------------

18.5. When is "forall id1. tau1" a subtype of "forall id2. tau2"?  This
turns out to simply be covariant in the body type (after renaming to make
the type parameters the same), i.e.

    tau1 <= tau2[id2 := id1]
    -----------------------------------------
    (forall id1. tau1) <= (forall id2. tau2)

For example, a polymorphic stack is a supertype of a polymorphic
double-ended stack:

    type stack = forall elem_t. rec stack =
        {push: elem_t -> stack,
         pop:  unit -> (elem_t * stack)}

    type double_ended_stack = forall t. rec des =
        {push: t -> des,
         pop:  unit -> (t * des),
         pushBottom: t -> des,
         popBottom:  unit -> (t * des)}

    double_ended_stack <= stack

In other words, it is OK to substitute a polymorphic double-ended stack
wherever a polymorphic stack can appear.  Say we did.  What we can do to a
polymorphic stack is instantiate it to some real type (which works fine on
the d-e-s too), and then do push & pop operations on the stack (which works
fine on the d-e-s too).
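[Aside: that substitutability can be seen at the value level too.  Below is
a Python sketch with dicts of closures standing in for the record types
(all names illustrative): a client that only pushes and pops runs unchanged
on the double-ended version, which is what double_ended_stack <= stack
promises.]

```python
# Functional stacks as records of closures; push/pop return new stacks.

def make_stack(items=()):
    items = list(items)
    return {"push": lambda v: make_stack(items + [v]),
            "pop":  lambda: (items[-1], make_stack(items[:-1]))}

def make_des(items=()):
    items = list(items)
    return {"push": lambda v: make_des(items + [v]),
            "pop":  lambda: (items[-1], make_des(items[:-1])),
            "pushBottom": lambda v: make_des([v] + items),
            "popBottom":  lambda: (items[0], make_des(items[1:]))}

def client(stack):     # written against the plain stack interface only
    top, rest = stack["push"](1)["push"](2)["pop"]()
    return top

print(client(make_stack()))   # 2
print(client(make_des()))     # 2 -- a d-e-s works wherever a stack does
```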
What about instances of such a type?  E.g. is stack[S] <= stack[T] if
S <= T?  Or if T <= S?  I.e., are polymorphic types covariant in their type
parameters, or contravariant, or invariant?

The answer depends on the definition of the polymorphic type, and whether
occurrences of the type parameter appear all in positive positions (leading
to covariance in the type parameter), all in negative positions (leading to
contravariance), or a mix (leading to invariance).  E.g., a polymorphic
list:

    type list = forall t. rec l = [Nil:unit, Pair:{hd:t, tl:l}]

When is list[S] a subtype of list[T]?  In this type definition, t appears
only in a positive position, so the list is covariant in its type parameter
(list[S] <= list[T] iff S <= T).

Another example: the polymorphic stack from above:

    type stack = forall elem_t. rec stack =
        {push: elem_t -> stack,
         pop:  unit -> (elem_t * stack)}

Here, elem_t appears in both positive and negative positions, so the
polymorphic type is invariant in its type parameter.  I.e., one can't use a
stack of ints in place of a stack of numbers (int <= num).  What would go
wrong if one did?

    val si:stack[int] := ...
    val sn:stack[num] := si    (* allow this, to see what can go wrong *)
    sn.push(3.4)
    val i:int := si.pop()      (* i:int, but i=3.4! *)
    (* now if we do something to i that assumes it's an int, we can crash *)

In general, a parameterized type will be covariant in type variables used
only for read-only properties, but invariant in type variables used for
read-write properties.  The type of an instance variable will appear in a
positive position in the get accessor functions (things like pop and fetch,
and the implicit read operations on instance variables of a record) and in
a negative position in the set accessor functions (things like push and
store, and the implicit assign operations on instance variables of a
mutable record).

----------------------------------------------------------------------------

18.6. When is "ref tau1" a subtype of "ref tau2"?
If ref tau1 is a subtype of ref tau2, then a value of type ref tau1 can be
used wherever a ref tau2 is expected.  So what can be done with a value of
type ref tau2?  Its contents can be fetched, and its contents can be
updated.  We can think about ref as a polymorphic type, of objects with two
operations, fetching and storing:

    type ref = forall tau.
        {fetch:  unit -> tau,
         update: tau -> unit}

"!r" is encoded as "r.fetch()" and "r := v" is encoded as "r.update(v)".
With this encoding, we can rephrase our question about refs as a question
about two different instances of the ref type, i.e.,

    ref tau1 <= ref tau2

is encoded as

    ref[tau1] <= ref[tau2]

which after substitution of the type parameter gives

    {fetch: unit -> tau1, update: tau1 -> unit}
        <= {fetch: unit -> tau2, update: tau2 -> unit}

Now we just apply regular record and function subtyping rules to determine
constraints on tau1 & tau2.  We see that the tau's appear in both positive
and negative positions in each record type, implying that there is *no*
non-trivial subtype relation over the tau's that leads to a subtype
relation over the enclosing record types.  This means that "ref tau1" is a
subtype of "ref tau2" only when tau1 = tau2.
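[Aside: the same conclusion can be seen operationally.  Below is a Python
sketch of the fetch/update encoding (illustrative; Python won't stop the
unsound aliasing, which is exactly the point): treating a "ref int" as a
"ref num" lets a caller store 3.4 through the alias, after which fetching
through the original reference no longer yields an int.]

```python
# A "ref" as a record of fetch/update closures over a mutable cell.

def make_ref(v):
    cell = [v]
    return {"fetch":  lambda: cell[0],
            "update": lambda x: cell.__setitem__(0, x)}

ri = make_ref(3)    # morally a "ref int"
rn = ri             # pretend ref int <= ref num: alias it as a "ref num"
rn["update"](3.4)   # perfectly fine for a ref num...
i = ri["fetch"]()   # ...but now the "ref int" yields 3.4
print(i, isinstance(i, int))   # 3.4 False
```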