Lecture 1: intro
Today’s plan
- class
- history
- overview
- workflow
- discussion: safe kernel extensions
Class structure
- background & introduction
- goals
- precisely specify system behavior
- find programming abstractions suitable for reasoning
- focus: understanding spec/code/proof (in addition to ideas in papers)
- schedule
- weeks 1-3: crash course on verification tools
- weeks 4-10: paper discussion
- lecture/discussion
- design questions/hands-on assignments
- lead discussions
- sign up due tomorrow
- assignments
- hands-on assignments
- paper questions due before lecture
- projects
- proposal due in week 3
- checkpoint due in week 7
- paper due & demo in finals week
- grading
- 20% participation + 20% assignments + 60% projects
- no late days
History
- some examples of efforts
- compilers
- 1967: Correctness of a compiler for arithmetical expressions, McCarthy & Painter
- 1972: Proving compiler correctness in a mechanized logic, Milner & Weyhrauch
- 2009: CompCert
- OS kernels
- 197x-1980: UCLA Unix security kernel - finished 90%+ of the spec but less than 20% of the proof
- 198x: Kit
- 2009: seL4
- what does verification provide
- a mechanical proof that the impl “meets” the spec
- assume a correct proof checker
- cost
- verification effort, run-time performance, compatibility, learning curve
- seL4: “about 25–30 person years, to do this again it would be about 10 person years”
- recent advances in SMT solving
- still require substantial human effort
- Ironclad: 3 person-years for 6500 lines of implementation code
- questions
- does verification actually improve end-to-end correctness?
- is verified code just too simple and unlikely to have bugs anyway?
- whether/when/how to apply verification techniques to building systems?
Overview
- example: develop a little-endian serializer/deserializer for 16-bit integers
- encode: n → bytes
- decode: bytes → n
- what’s the specification?
- spec v0: forall 16-bit integer n, decode(encode(n)) == n
- what bugs cannot be captured?
- is this good enough?
- verification: check if a given implementation meets the spec
- code: le16.c
- exhaustive testing: check the condition for n from 0 to 65535
- apply rewriting rules: show that LLVM -O2 rewrites the claim to the constant true
- search for proofs
- in the input space: check a claim is true for every input; easier to automate; boundedness
- in the “rewriting” space - rewrite the claim to true; harder to automate
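The exhaustive-testing approach above can be sketched in a few lines of Python (the lecture's actual code is le16.c, in C; the names `encode` and `decode` here are illustrative, not that code):

```python
def encode(n):
    """Serialize a 16-bit integer into two little-endian bytes."""
    assert 0 <= n <= 0xFFFF
    return bytes([n & 0xFF, (n >> 8) & 0xFF])

def decode(b):
    """Deserialize two little-endian bytes back into a 16-bit integer."""
    return b[0] | (b[1] << 8)

# Exhaustive testing of spec v0: decode(encode(n)) == n for every n.
# This is a complete proof only because the input space (2^16 values)
# is small enough to enumerate; it does not generalize to wider types.
assert all(decode(encode(n)) == n for n in range(0x10000))
```

Enumerating all 65536 inputs is the simplest instance of "search in the input space": easy to automate, but bounded by the size of the domain.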
- survey - choose a tool to do this problem
- spec v1: like spec v0, but for machine code
- how to describe a more complex system
- layers & refinement: will see a lot of examples this quarter
- spec complexity
- what about a spec that pins down the little-endian byte order
- what about a spec that guarantees only memory safety
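A sketch of why spec strength matters: the round-trip spec v0 cannot distinguish byte orders, while a spec that fixes the layout can. All names below are hypothetical illustrations, assuming the 16-bit setting from the example above:

```python
def encode_le(n):  # little-endian, the intended implementation
    return bytes([n & 0xFF, (n >> 8) & 0xFF])

def decode_le(b):
    return b[0] | (b[1] << 8)

def encode_be(n):  # big-endian: wrong per the lecture's goal
    return bytes([(n >> 8) & 0xFF, n & 0xFF])

def decode_be(b):
    return (b[0] << 8) | b[1]

# Both pairs satisfy spec v0 (round-trip), so v0 misses the byte-order bug.
assert all(decode_le(encode_le(n)) == n for n in range(0x10000))
assert all(decode_be(encode_be(n)) == n for n in range(0x10000))

# A stronger spec that pins the little-endian layout rejects big-endian.
def spec_layout(encode):
    return all(encode(n) == bytes([n & 0xFF, (n >> 8) & 0xFF])
               for n in range(0x10000))

assert spec_layout(encode_le)
assert not spec_layout(encode_be)
```

The trade-off runs in both directions: a layout spec over-constrains implementations that only need round-tripping, while a memory-safety-only spec under-constrains them; picking the right point on that spectrum is part of writing a spec.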
Workflow
- once verified, how to turn code into executables?
- performance & TCB
- example projects?
- interpretation - rarely used due to performance issues
- Z3py → Python
- Rosette → Racket
- Coq/Gallina → Eval compute in …
- extraction
- Coq/Gallina to OCaml/Haskell
- Dafny → C# → MSIL
- DSL in Coq → OCaml/Haskell
- DSL in Coq → asm
- DSL in Rosette → C/C++
- parsing
- C → clightgen → Coq
- C → c-parser → Simpl in Isabelle/HOL
- verified compiler
- C → CompCert → asm
- BPF → CompCert → asm
- translation validation
- proof-carrying code
Discussion: safe kernel extensions
- problem: isolation - running untrusted code in the kernel
- approach: don’t do it - just run code in user space
- performance issues
- kernel as the security mediator
- other approaches: SFI, type-safe language, hypervisor, proof-carrying code
- approach: verification
- what to verify: MinVisor, ARMor, RockSalt, Jitk
- what’s the TCB in each case