Lecture 7: Verification: Symbolic Execution
Week 7 | May 11 – May 17
The state of a running program is a map from variables to values. If the values are Z3 expressions instead of concrete numbers, the state becomes symbolic. Branching forks the execution tree, and each fork records its guard in a path condition. The question "can this assertion fail?" becomes one Z3 query per leaf of the tree. That is symbolic execution. CBMC and KLEE work this way. Practice builds the engine in Python and runs it on four programs: an absolute-value function with an INT_MIN overflow, a chain of assignments where naive encoding blows up, a loop checked by BMC, and the Zune Y2K bug. Theory defines symbolic state, path conditions, and the assert, assume, and havoc primitives, and connects symbolic execution to the strongest postcondition. Studio is pair-trace exercises and a second look at bvudiv2. Next week is the backward direction.
What We Cover
Practice opens with the verification-tool spectrum, then builds the engine. The first demo is a four-line my_abs(x) that claims my_abs(x) >= 0 for every 32-bit integer. The engine produces two paths and finds a counterexample at x = INT_MIN, where -x wraps around in two's complement and yields INT_MIN again. The cascade demo encodes the same chain of assignments two ways: naive substitution produces a formula with over 130,000 nodes at depth 16; the principled encoding produces a formula of size one. Practice closes with the Zune Y2K bug. BMC at depth 3 hands back days = 366, is_leap = 1, the input that froze the Zune fleet on Dec 31, 2008.
Theory defines symbolic state, path condition, and execution tree, then introduces the three IVL primitives assert, assume, and havoc with SE rules for each. Symbolic execution on a loop-cut program computes the strongest postcondition; the formula our engine emits is that SP. Loop unrolling formalizes BMC. The lecture closes on the SP/WP duality: two ways to ask Z3 the same question, one starting from the precondition and one from the postcondition.
Studio is guided practice. Two pair-trace exercises on a maximum-of-two function and a hidden-branch program give students practice running symbolic execution by hand. The third exercise revisits Week 1's bvudiv2: two Z3 queries decide all 2^32 (about four billion) 32-bit inputs, one query per equivalence class. The session closes with a preview of Reading Reflection 4.
Practice: Building an SE engine and finding bugs
A small absolute-value function with an INT_MIN overflow, a chain of assignments where naive encoding blows up, a loop checked by bounded model checking, and the Zune Y2K bug.
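The blow-up in the assignment chain can be seen without a solver. The sketch below assumes a chain of the form x_{i+1} = x_i + x_i, which is a guess at the demo's shape, and represents expressions as nested tuples. It compares substituting each definition into the next against keeping one small equation per assignment with a fresh variable on the left. The second form is one common remedy and may differ from the lecture's principled encoding.

```python
def tree_size(expr):
    """Count nodes when the expression is written out as a tree."""
    if isinstance(expr, str):
        return 1  # a variable is a single leaf
    _, left, right = expr
    return 1 + tree_size(left) + tree_size(right)

def chain_by_substitution(depth):
    """Inline every assignment: x_depth expressed purely in terms of x0."""
    expr = "x0"
    for _ in range(depth):
        expr = ("+", expr, expr)  # both operands are full copies of the old expr
    return expr

def chain_with_fresh_variables(depth):
    """One equation per assignment; earlier results are referred to by name."""
    return [("==", f"x{i + 1}", ("+", f"x{i}", f"x{i}")) for i in range(depth)]

naive_size = tree_size(chain_by_substitution(16))
equation_sizes = [tree_size(eq) for eq in chain_with_fresh_variables(16)]

print(naive_size)           # size of the single substituted formula
print(max(equation_sizes))  # size of the largest per-assignment equation
print(sum(equation_sizes))  # total size across all 16 equations
```

The substituted formula doubles at every step, giving 2^17 - 1 = 131,071 nodes at depth 16, while each fresh-variable equation stays at a constant five nodes.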
Theory: Path conditions and the strongest postcondition
Practice's engine emits one Z3 query per leaf of the execution tree. The formula at each leaf is called the strongest postcondition. The formal rules for assert, assume, and havoc are how the engine builds it.
Studio: Hand-tracing SE and revisiting bvudiv2
Hand-trace symbolic execution on three short programs, then count inputs per equivalence class. A stretch exercise asks you to construct your own one-in-four-billion bug.