OK, so last time we decided enough of this nonsense of encoding every
language feature in weird lambda expressions: we're just gonna add stuff
directly! Doing that weird lambda stuff was good for us, though: it built
up PL muscle. Now we want better machinery so we can move faster, like
going from running to cycling.

Our goal is to add stuff and build up to something like OCaml!

But then we got a bit nervous. After all, we did a bunch of work to get
type safety and we don't want to break that...

So what happened? Well, we saw how to add let. Then we saw how to add
booleans and conditionals. And a general pattern emerged.

To add a language feature we have to add:
  (1) syntax and types
  (2) semantics
  (3) typing rules
  (4) cases for progress and preservation

Now we'll start adding data structures.

Pairs:

  Syntax and Types: [slide]

  Semantics: [slide]

  Typing Rules: [slide]

  Soundness: What did this mean again?

Again, one thing that we don't want to happen when we start extending our
lambda calculus is to invalidate all that hard work we did to prove that
our type system was good, i.e. the type safety proof.

What did that say again? Oh yes:

  "Well typed terms do not get stuck."

or, more formally:

  If * |- e : T and e -*-> e', then either e' is a value or there exists
  e'' such that e' -> e''.

And how did we prove it? With Progress and Preservation:

  Progress:
    If * |- e : T, then either
      (1) e is a value or
      (2) there exists e' such that e -> e'.

  Preservation:
    If * |- e : T and e -> e', then * |- e' : T.
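
To make the four-part recipe concrete for pairs, here is a minimal OCaml
sketch. It is not the code from the slides: the names (ty, exp, is_value,
type_of, step), the choice of booleans as the only base type, and
left-to-right evaluation of pair components are all assumptions made for
illustration. Only closed terms are handled, so there is no typing context.

```ocaml
(* (1) Syntax and types: booleans plus pairs and the two projections. *)
type ty = TBool | TPair of ty * ty

type exp =
  | Bool of bool
  | Pair of exp * exp
  | Fst of exp
  | Snd of exp

(* A pair is a value only when both components are values. *)
let rec is_value = function
  | Bool _ -> true
  | Pair (e1, e2) -> is_value e1 && is_value e2
  | Fst _ | Snd _ -> false

(* (3) Typing rules for closed terms; None means "not typable". *)
let rec type_of = function
  | Bool _ -> Some TBool
  | Pair (e1, e2) ->
    (match type_of e1, type_of e2 with
     | Some t1, Some t2 -> Some (TPair (t1, t2))
     | _ -> None)
  | Fst e ->
    (match type_of e with Some (TPair (t1, _)) -> Some t1 | _ -> None)
  | Snd e ->
    (match type_of e with Some (TPair (_, t2)) -> Some t2 | _ -> None)

(* (2) Semantics: one small step; None means no rule applies. *)
let rec step = function
  | Bool _ -> None
  | Pair (e1, e2) when not (is_value e1) ->
    Option.map (fun e1' -> Pair (e1', e2)) (step e1)
  | Pair (e1, e2) ->
    Option.map (fun e2' -> Pair (e1, e2')) (step e2)
  | Fst (Pair (v1, v2)) when is_value v1 && is_value v2 -> Some v1
  | Snd (Pair (v1, v2)) when is_value v1 && is_value v2 -> Some v2
  | Fst e -> Option.map (fun e' -> Fst e') (step e)
  | Snd e -> Option.map (fun e' -> Snd e') (step e)
```

In this sketch Fst (Bool true) is stuck: it is not a value and step returns
None. type_of also rejects it, which is exactly the situation type safety
says cannot arise for a well typed term. Part (4), the new proof cases,
still has to be done on paper.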

At a high level Progress works by:

  induction on * |- e : T

  base cases, either:
    (A) value (done)
    (B) not typable in empty context (contradiction, done)

  inductive cases:
    - inversion on typing judgement to get types for subexpressions
    - IH + subexpression types gives that they are values or can step
    - if subexpression steps, big expression steps
    - NOTE: use canonical forms lemma to get shape of well typed values

At a high level Preservation works by:

  induction on * |- e : T

  base cases all contradictions, either:
    (A) not typable in empty context (bogus)
    (B) cannot step (bogus)

  inductive cases:
    - inversion on typing judgement to get types for subexpressions
    - case analysis on step + inversion to get subexpression step
    - IH + subexp type + subexp step gives subexp still well typed
    - stitch back together to show big expr still well typed
    - NOTE: use substitution lemma to handle call case of app

OK, now we can sketch the proof update for pairs.

You can see how this generalizes to n-tuples, yes?

What about records? Well, they're a lot like tuples, just with named
positions and projectors:

  [slide]

Should we relax the typing rule? What if fields are reordered? Does it
matter?
  - not for type safety
  - makes implementation trickier though
  - can't just use fixed offset

OK, so now we have let, if, pairs (tuples), and records. What's left?

What about those datatypes? Stuff like:

  type t = A | B of int | C of int * t

Well, we can do the first two cases with tagged variants (discriminated
unions). We'll punt on recursive types and type constructors till a later
lecture. We'll also talk about whether types should have names just a bit
later.

For now, we're gonna see how far we can get with "anonymous sum types".

Sums:

  Syntax and Types: [slide]

  Semantics: [slide]

  Typing Rules: [slide]

  Soundness: sketch

Why do we want sums? What are they good for?

Sums in C and Java...
  - in Java, easy to add new variant      : new subclass
  - in Java, hard to add new operation    : touch all previous subclasses
  - in OCaml, easy to add new operation   : new function
  - in OCaml, hard to add new variant     : touch all previous matches

Do we need pairs and sums?

Pairs and sums are logical duals!

Base types, primitives.
  - just add them
  - implementor's obligation not to screw up

So far everything preserves termination!

Can't type check Y:

  \f. (\x. f (\y. (x x) y)) (\x. f (\y. (x x) y))

Why not?

So we're not Turing complete...

Could do some sort of "let rec" thing... but we'll do something even nicer.

OK, so let's bake in something like Y...
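
As a rough preview of what "baking in" recursion can look like, here is a
small sketch in OCaml itself. OCaml's default type checker also rejects the
self-application x x that Y relies on, but recursion is available as the
built-in let rec. The helper name fix and the factorial example are
assumptions for illustration, not necessarily the construct the lecture
adds next.

```ocaml
(* Hypothetical helper: a fixpoint operator built from OCaml's primitive
   recursion (let rec) instead of self-application as in Y. *)
let rec fix f x = f (fix f) x

(* Factorial written without naming itself; fix supplies [self]. *)
let fact = fix (fun self n -> if n = 0 then 1 else n * self (n - 1))
```

Here fix gets the type (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b, so the
recursion comes from a typed primitive and not from an untypable lambda
term.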