Studio
Hand-trace symbolic execution on three short programs, then count inputs per equivalence class. A stretch exercise asks you to construct your own one-in-four-billion bug.
How Studio works tonight
Practice walked you through symbolic execution on swap and my_abs. Studio is where you do the trace yourself, predict the engine's output, then run the engine and compare.
Pair up with a neighbor. Open lectures/l07/03-studio/exercises.py in your editor. The file has three small mini-IMP functions. For each one, work out on paper:
- the paths through the function
- the path condition at each leaf
- the symbolic state at each leaf
- whether each assertion can fail, and on what input
Predict before you run. The driver at the bottom of exercises.py runs one exercise at a time so you can stage the reveal.
The files live in the course code repo next to the Practice demos. Clone and run them locally with python3 exercises.py (requires z3-solver; see Setup).
Plan to get through Parts 1 and 2. Part 3 is stretch territory.
Part 1: Hand-trace three programs
File: lectures/l07/03-studio/exercises.py.
Exercise 1a: warmup
def warmup(x, y):
    if x > y:
        m = x
    else:
        m = y
    assert m >= x
    assert m >= y
    return m
Two paths through the function. Trace each. Does either assertion ever fail?
Expected engine output
2 paths:
Path 0: PC = x > y ∧ m__1 >= x ∧ m__1 >= y return = m__1
Path 1: PC = ¬(x > y) ∧ m__2 >= x ∧ m__2 >= y return = m__2
All assertions hold on all paths.
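As a concrete cross-check of the "all assertions hold" verdict, the sketch below runs warmup on every input pair in a small range. This is not how the engine works (the engine asks Z3 about whole classes of inputs); it is a brute-force sanity check. The 8-bit range `SMALL` is an assumption chosen only to keep the loop short.

```python
from itertools import product

def warmup(x, y):
    # Same body as Exercise 1a; a violated assertion raises AssertionError.
    if x > y:
        m = x
    else:
        m = y
    assert m >= x
    assert m >= y
    return m

# Assumption: an 8-bit signed range stands in for the 32-bit one so the
# exhaustive loop stays small (256 * 256 = 65,536 input pairs).
SMALL = range(-128, 128)

failures = []
for x, y in product(SMALL, repeat=2):
    try:
        warmup(x, y)
    except AssertionError:
        failures.append((x, y))

print(len(failures))  # 0
```

warmup does no arithmetic, only comparisons and copies, so nothing changes when the range grows to 32 bits.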
Why both paths are safe
On Path 0, the PC says `x > y` and the state has `m = x`. The two assertions evaluate to `x >= x` (always true) and `x >= y` (follows from `x > y`). On Path 1, the PC says `¬(x > y)`, equivalently `x ≤ y`, and the state has `m = y`. The two assertions evaluate to `y >= x` (follows from `x ≤ y`) and `y >= y` (always true). Four Z3 queries total, all UNSAT.
Exercise 1b: hidden_branch
def hidden_branch(x):
    if x > 0:
        if x < 100:
            r = x * 2
        else:
            r = x
    else:
        r = -x
    assert r >= 0
    return r
Three paths through the function. One has a bug. Which path, and on what input?
Expected engine output
3 paths:
Path 0: PC = x > 0 ∧ x < 100 return = x * 2 safe
Path 1: PC = x > 0 ∧ ¬(x < 100) return = x safe
Path 2: PC = ¬(x > 0) return = -x BUG
Path 2, line 10: assertion CAN fail.
Counterexample: x = -2147483648
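The reported counterexample can be replayed with ordinary Python integers, as long as each arithmetic result is wrapped to 32 bits by hand. The sketch below is a concrete check, not the engine's code. `wrap32` and `hidden_branch_32` are hypothetical helper names introduced for illustration, and the final assertion is left out so the returned value can be inspected.

```python
INT_MIN = -2**31

def wrap32(n):
    """Reduce an unbounded Python int to a signed 32-bit two's-complement value."""
    return (n + 2**31) % 2**32 - 2**31

def hidden_branch_32(x):
    """Exercise 1b with every arithmetic result wrapped to 32 bits (hypothetical helper)."""
    if x > 0:
        if x < 100:
            r = wrap32(x * 2)
        else:
            r = x
    else:
        r = wrap32(-x)
    return r  # the original asserts r >= 0 at this point

print(hidden_branch_32(-2147483648))  # -2147483648
print(hidden_branch_32(-5))           # 5
```

Plain Python integers never overflow, so without `wrap32` the function would return 2147483648 on this input and the bug would stay hidden.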
Why the bug only lives on Path 2
This is the `my_abs` bug from Practice. Path 2's PC is `x ≤ 0`, which includes `x = INT_MIN`. The path computes `r = -x`, and `-INT_MIN` overflows back to `INT_MIN` in 32-bit two's complement, so `r < 0` and `assert r >= 0` fails. Paths 0 and 1 have `x > 0` in their PCs, ruling out negatives. The boundary value 100 plays no role in the bug.
Exercise 1c: unreachable
def unreachable(x):
    if x > 100:
        if x < 50:
            assert False
        y = x - 100
    else:
        y = 100 - x
    assert y >= 0
    return y
Two assertions. One is on a path that can never be reached. The other has a subtle bug. Find both.
Expected engine output
3 paths:
Path 0: PC = x > 100 ∧ x < 50 ∧ False ∧ ...
(PC contains False → infeasible)
Path 1: PC = x > 100 ∧ ¬(x < 50) return = x - 100 safe
Path 2: PC = ¬(x > 100) return = 100 - x BUG
Path 2, line 9: assertion CAN fail.
Counterexample: x = -2147483648
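The same kind of concrete replay works for this counterexample. The sketch below evaluates Path 2's subtraction with 32-bit wraparound and then scans the low end of Path 2's class to see which inputs produce a negative result. `wrap32` and `path2_y` are hypothetical helper names, not part of exercises.py.

```python
INT_MIN = -2**31

def wrap32(n):
    """Reduce an unbounded Python int to a signed 32-bit two's-complement value."""
    return (n + 2**31) % 2**32 - 2**31

def path2_y(x):
    """Path 2 of Exercise 1c: y = 100 - x, wrapped to 32 bits (hypothetical helper)."""
    return wrap32(100 - x)

print(path2_y(INT_MIN))  # -2147483548

# Only the 1000 smallest inputs are scanned. For any larger x that is
# still <= 100, the value 100 - x lies between 0 and INT_MAX, so it
# neither wraps nor goes negative.
failing = [x for x in range(INT_MIN, INT_MIN + 1000) if path2_y(x) < 0]
print(len(failing))           # 101
print(failing[-1] - INT_MIN)  # 100
```

The scan shows the violating inputs form a contiguous band starting at INT_MIN, so a solver is free to return any of them as its witness.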
The unreachable assertion
Path 0's PC requires `x > 100 ∧ x < 50`, which is unsatisfiable. The engine still walks the path symbolically and records the `assert False`, but when `check_assertions` asks Z3 "is `pc ∧ ¬False` satisfiable?" the answer is UNSAT because `pc` itself is UNSAT. The assertion never fires. Infeasible paths are silent.
The real bug
Path 2 computes `y = 100 - x` under the PC `x ≤ 100`. When `x = INT_MIN`, `100 - INT_MIN` overflows in 32-bit arithmetic: the true value `100 + 2^31` wraps to `-(2^31) + 100`, which is negative, so the outer assertion `y >= 0` fails. INT_MIN is not the only witness: every `x` from `INT_MIN` through `INT_MIN + 100` (101 inputs) pushes `100 - x` past `INT_MAX`, and any of them is a valid counterexample. Same INT_MIN-overflow pattern as `my_abs`, hidden inside an innocent-looking subtraction.
Part 2: Equivalence classes
Take hidden_branch from Exercise 1b. The engine produced three paths and asked Z3 one question per path's assertion. Three queries total decided whether the assertion can fail on any input.
Each path's PC defines an equivalence class of inputs: all 32-bit x values satisfying that PC follow the same path through the function. The Z3 query on each path asks "does any input in this class violate the assertion?" UNSAT proves the assertion holds across the whole class; SAT plus a model exhibits one violating input.
How many 32-bit x values fall into each class?
Count the inputs in each class
- Path 0 (`0 < x ∧ x < 100`): the 99 values `x ∈ {1, 2, …, 99}`.
- Path 1 (`x ≥ 100`): the `2^31 − 100` values `x ∈ {100, …, 2^31 − 1}`.
- Path 2 (`x ≤ 0`): the `2^31 + 1` values `x ∈ {−2^31, …, 0}`.
- Sum: `99 + (2^31 − 100) + (2^31 + 1) = 2^32`, the entire 32-bit space.
Three Z3 queries decided correctness for all 2^32 (about 4.3 billion) inputs. Two returned UNSAT: one proved the assertion holds on the 99 inputs of Path 0, the other on the 2^31 − 100 inputs of Path 1. The third returned SAT with witness x = INT_MIN, showing the assertion fails somewhere in Path 2's class.
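The class sizes can be double-checked with a few lines of arithmetic. Each class is a contiguous range of integers, so its size is the upper bound minus the lower bound plus one. The dictionary layout and names below are illustrative only.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

# Inclusive (low, high) bounds of each path's equivalence class.
classes = {
    "Path 0 (0 < x < 100)": (1, 99),
    "Path 1 (x >= 100)": (100, INT_MAX),
    "Path 2 (x <= 0)": (INT_MIN, 0),
}

# Size of a contiguous inclusive range is high - low + 1.
sizes = {name: hi - lo + 1 for name, (lo, hi) in classes.items()}

for name, size in sizes.items():
    print(f"{name}: {size}")

# The three classes partition the whole 32-bit input space.
print(sum(sizes.values()) == 2**32)  # True
```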
Compare with random fuzzing. A uniformly random 32-bit input hits any one specific value, such as INT_MIN, with probability 1 in 2^32 (about 1 in 4.3 billion). To cover even the negative half-line exhaustively, fuzzing needs on the order of 2^31 trials. Symbolic execution covers it with one query.
Symbolic execution decides an equivalence class per query rather than an input per trial. That is why these tools find bugs random testing does not.
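To put a number on the fuzzing comparison, the sketch below computes the chance that a run of random trials hits one fixed 32-bit value at least once. It assumes independent, uniformly distributed inputs, which is a simplification; real fuzzers bias their inputs in various ways. `hit_probability` is a hypothetical helper name.

```python
import math

SPACE = 2**32  # number of distinct 32-bit inputs

def hit_probability(trials):
    """Chance that at least one of `trials` uniform draws equals one fixed value."""
    # Equals 1 - (1 - 1/SPACE)**trials, computed via log1p/expm1 so the
    # tiny per-trial probability is not lost to floating-point rounding.
    return -math.expm1(trials * math.log1p(-1 / SPACE))

for trials in (10**6, 2**31, 2**32):
    print(f"{trials:>12} trials: {hit_probability(trials):.4f}")
```

Under these assumptions, a million trials find the single bad input about 0.02% of the time, and even 2^32 trials find it only about 63% of the time, because random draws repeat values.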
Part 3 (stretch): your own SE target
Open a new mini-IMP function in a fresh file. Write a program whose assertion can fail on a specific input you have in mind. Run check_assertions and verify the engine finds the input you intended.
Hint: pick a witness first, then design the trap
Decide on the input you want to expose, then write a program whose assertion fails on exactly that input.
- INT_MIN traps: arrange `-x` or `100 - x` or similar, where signed-integer overflow at the boundary kicks in.
- Single-value asserts: `assert x != 42` fires only at `x = 42`.
- Off-by-one boundary: a comparison that uses `<` where it should use `≤`, then an assert that flips at the boundary value.
The point is to predict before running. Pick the witness, write a program you believe singles it out, then let the engine confirm.
No solution file. The exercise is open-ended on purpose.
What you should be able to do now
By the end of Studio, the engine should feel mechanical:
- Walk a mini-IMP function by hand, listing each path with its PC and symbolic state.
- Predict which path a bug lives on from reading the PC at each leaf.
- Read the engine's output and reconcile it with your hand trace.
- Run check_assertions on a function you wrote and read the counterexample.
Start thinking about your project
The project milestone is due Friday May 29: a working prototype and a one-page doc, 50 points. The final is the following Friday, June 5.
This is a good week to pick a direction:
- Skim the project page for the timeline, kinds of projects, and the "beats naive" rule.
- Pick a problem you care about. Verification, synthesis, analysis, or bring-your-own all qualify. The only rule is that a solver does real work in the solution.
- Find a partner if you want one. Pairs share a grade and have the same scope as solo work. A partner is there to think with. If you are looking, mention it during break tonight or post in Ed.
Office hours are open for project conversations. Pivoting is much easier now than at the milestone.
Reading reflection
The individual write-up for Reading Reflection 4 is due Friday at 17:00. The reading theme is the relationship between AI tools and formal methods. The small-group discussion is in next week's Studio.