  (Week 7)

Studio

Hand-trace symbolic execution on three short programs, then count inputs per equivalence class. A stretch exercise asks you to construct your own one-in-four-billion bug.

How Studio works tonight

Practice walked you through symbolic execution on swap and my_abs. Studio is where you do the trace yourself, predict the engine's output, then run the engine and compare.

Pair up with a neighbor. Open lectures/l07/03-studio/exercises.py in your editor. The file has three small mini-IMP functions. For each one, work out on paper:

Predict before you run. The driver at the bottom of exercises.py runs one exercise at a time so you can stage the reveal.

The files live in the course code repo next to the Practice demos. Clone and run them locally with python3 exercises.py (requires z3-solver; see Setup).

Plan to get through Parts 1 and 2. Part 3 is stretch territory.

Part 1: Hand-trace three programs

File: lectures/l07/03-studio/exercises.py.

Exercise 1a: warmup

def warmup(x, y):
    if x > y:
        m = x
    else:
        m = y
    assert m >= x
    assert m >= y
    return m

Two paths through the function. Trace each. Does either assertion ever fail?

Expected engine output
2 paths:
  Path 0:  PC = x > y ∧ m__1 >= x ∧ m__1 >= y       return = m__1
  Path 1:  PC = ¬(x > y) ∧ m__2 >= x ∧ m__2 >= y    return = m__2

  All assertions hold on all paths.
Why both paths are safe

On Path 0, the PC says `x > y` and the state has `m = x`. The two assertions evaluate to `x >= x` (always true) and `x >= y` (follows from `x > y`). On Path 1, the PC says `¬(x > y)`, equivalently `x ≤ y`, and the state has `m = y`. The two assertions evaluate to `y >= x` (follows from `x ≤ y`) and `y >= y` (always true). That is four Z3 queries in total, one per assertion per path, and all four return UNSAT.
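The argument above can be sanity-checked by brute force on a smaller domain. The sketch below is not the course engine: it assumes an 8-bit signed range as a stand-in for 32 bits so that every `(x, y)` pair can be enumerated, and `warmup_path` is a hypothetical helper that reports which path an input takes and whether each assertion holds.

```python
from itertools import product

BITS = 8  # assumption: a small width so exhaustive enumeration is cheap
LO, HI = -(2 ** (BITS - 1)), 2 ** (BITS - 1) - 1


def warmup_path(x, y):
    """Return (path index, assertion results) for one concrete input."""
    if x > y:
        path, m = 0, x
    else:
        path, m = 1, y
    return path, (m >= x, m >= y)


counts = {0: 0, 1: 0}   # number of inputs that follow each path
failures = []           # inputs on which either assertion is false

for x, y in product(range(LO, HI + 1), repeat=2):
    path, results = warmup_path(x, y)
    counts[path] += 1
    if not all(results):
        failures.append((x, y))

print("inputs per path:", counts)
print("failing inputs:", failures)
```

Enumeration only works because the domain is tiny. The engine reaches the same conclusion for the full 32-bit domain with four solver queries.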

Exercise 1b: hidden_branch

def hidden_branch(x):
    if x > 0:
        if x < 100:
            r = x * 2
        else:
            r = x
    else:
        r = -x
    assert r >= 0
    return r

Three paths through the function. One has a bug. Which path, and on what input?

Expected engine output
3 paths:
  Path 0:  PC = x > 0 ∧ x < 100              return = x * 2    safe
  Path 1:  PC = x > 0 ∧ ¬(x < 100)           return = x        safe
  Path 2:  PC = ¬(x > 0)                     return = -x       BUG

  Path 2, line 10: assertion CAN fail.
    Counterexample: x = -2147483648
Why the bug only lives on Path 2

This is the `my_abs` bug from Practice. Path 2's PC is `x ≤ 0`, which includes `x = INT_MIN`. The path computes `r = -x`, and `-INT_MIN` overflows back to `INT_MIN` in 32-bit two's complement, so `r < 0` and `assert r >= 0` fails. Paths 0 and 1 have `x > 0` in their PCs, ruling out negatives. The boundary value 100 plays no role in the bug.
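Plain Python integers never overflow, so running `hidden_branch` directly in Python will not show the bug. A minimal sketch of the 32-bit behavior, assuming a hypothetical helper `wrap32` that reduces each arithmetic result to signed two's complement (the engine models this with bitvectors instead):

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def wrap32(n):
    """Reduce a Python int to a signed 32-bit two's-complement value."""
    return (n + 2**31) % 2**32 - 2**31


def hidden_branch32(x):
    """hidden_branch with 32-bit arithmetic; returns r without asserting."""
    if x > 0:
        if x < 100:
            r = wrap32(x * 2)
        else:
            r = x
    else:
        r = wrap32(-x)  # -INT_MIN wraps back to INT_MIN
    return r


# Probe the boundaries of each path's input class.
for x in (INT_MIN, INT_MIN + 1, -1, 0, 1, 99, 100, INT_MAX):
    r = hidden_branch32(x)
    print(f"x = {x:>11}  r = {r:>11}  r >= 0: {r >= 0}")
```

Only the `INT_MIN` row reports a negative `r`. Its neighbor `INT_MIN + 1` negates cleanly to `INT_MAX`.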

Exercise 1c: unreachable

def unreachable(x):
    if x > 100:
        if x < 50:
            assert False
        y = x - 100
    else:
        y = 100 - x
    assert y >= 0
    return y

Two assertions. One is on a path that can never be reached. The other has a subtle bug. Find both.

Expected engine output
3 paths:
  Path 0:  PC = x > 100 ∧ x < 50 ∧ False ∧ ...
                                 (PC contains False → infeasible)
  Path 1:  PC = x > 100 ∧ ¬(x < 50)          return = x - 100    safe
  Path 2:  PC = ¬(x > 100)                   return = 100 - x    BUG

  Path 2, line 9: assertion CAN fail.
    Counterexample: x = -2147483648
The unreachable assertion

Path 0's PC requires `x > 100 ∧ x < 50`, which is unsatisfiable. The engine still walks the path symbolically and records the `assert False`, but when `check_assertions` asks Z3 "is `pc ∧ ¬False` satisfiable?" the answer is UNSAT because `pc` itself is UNSAT. The assertion never fires. Infeasible paths are silent.
The real bug

Path 2 computes `y = 100 - x` under the PC `x ≤ 100`. When `x = INT_MIN`, `100 - INT_MIN` overflows in 32-bit arithmetic: the true value `100 + 2^31` wraps to `-(2^31) + 100`, which is negative. The outer assertion `y >= 0` fails on the 101 inputs from `INT_MIN` through `INT_MIN + 100`, where `100 - x` exceeds `INT_MAX`; `x = INT_MIN` is the witness reported above. Same INT_MIN-overflow pattern as `my_abs`, hidden inside an innocent-looking subtraction.
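The wraparound in Path 2 can be checked concretely. This is a sketch under the same assumption as before: `wrap32` is a hypothetical helper that emulates signed 32-bit arithmetic and is not part of the course engine. It scans a window just above `INT_MIN` and lists every input whose `y` comes out negative.

```python
INT_MIN = -2**31


def wrap32(n):
    """Reduce a Python int to a signed 32-bit two's-complement value."""
    return (n + 2**31) % 2**32 - 2**31


def path2_y(x):
    """Path 2 of unreachable: y = 100 - x, wrapped to 32 bits."""
    return wrap32(100 - x)


# Overflow needs 100 - x > 2**31 - 1, so it can only occur close to INT_MIN.
# A window of 1000 values above INT_MIN is more than wide enough to see it.
failing = [x for x in range(INT_MIN, INT_MIN + 1000) if path2_y(x) < 0]

print("y at INT_MIN:", path2_y(INT_MIN))
print("failing inputs:", len(failing))
print("first:", failing[0], "last:", failing[-1])
```

The solver is free to return any input in the failing range as its model. `x = INT_MIN` is simply one valid witness.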

Part 2: Equivalence classes

Take hidden_branch from Exercise 1b. The engine produced three paths and asked Z3 one question per path's assertion. Three queries total decided whether the assertion can fail on any input.

Each path's PC defines an equivalence class of inputs: all 32-bit x values satisfying that PC follow the same path through the function. The Z3 query on each path asks "does any input in this class violate the assertion?" UNSAT proves the assertion holds across the whole class; SAT plus a model exhibits one violating input.

How many 32-bit x values fall into each class?

Count the inputs in each class

- Path 0 (`0 < x ∧ x < 100`): the 99 values `x ∈ {1, 2, …, 99}`.
- Path 1 (`x ≥ 100`): the `2^31 − 100` values `x ∈ {100, …, 2^31 − 1}`.
- Path 2 (`x ≤ 0`): the `2^31 + 1` values `x ∈ {−2^31, …, 0}`.
- Sum: `99 + (2^31 − 100) + (2^31 + 1) = 2^32`, the entire 32-bit space.
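The counts above can be reproduced with a few lines of arithmetic. In this sketch each class is written as a closed range `[lo, hi]` of signed 32-bit values; the dictionary labels are illustrative only.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

# Each path condition selects one contiguous range of x values.
classes = {
    "Path 0 (0 < x < 100)": (1, 99),
    "Path 1 (x >= 100)": (100, INT_MAX),
    "Path 2 (x <= 0)": (INT_MIN, 0),
}

# Size of a closed integer range [lo, hi] is hi - lo + 1.
sizes = {name: hi - lo + 1 for name, (lo, hi) in classes.items()}

for name, size in sizes.items():
    print(f"{name:<22} {size:>13,}")

total = sum(sizes.values())
print("covers the whole 32-bit space:", total == 2**32)
```

The ranges are disjoint and their sizes sum to `2**32`, so every 32-bit input belongs to exactly one class.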

Three Z3 queries decided correctness for all 2^32 ≈ 4.3 billion inputs. Two returned UNSAT: one proved the assertion across Path 0's 99 inputs, the other across Path 1's 2^31 − 100 inputs. The third returned SAT with witness x = INT_MIN, showing the assertion fails somewhere in Path 2's class.

Compare with random fuzzing. A uniformly random 32-bit input hits any specific value with probability 1 in 2^32, about 1 in 4.3 billion, so a fuzzer expects on the order of 2^32 trials before it draws INT_MIN. Even enumerating only the x ≤ 0 class exhaustively takes 2^31 + 1 trials. Symbolic execution covers that class with one query.
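To put numbers on the fuzzing comparison, the sketch below computes the chance that a fuzzer drawing independent, uniformly random 32-bit inputs hits one fixed value such as `INT_MIN` at least once. The uniform-sampling model and the function name `p_hit` are assumptions for illustration; real fuzzers often bias toward boundary values.

```python
import math

SPACE = 2**32  # number of distinct 32-bit inputs


def p_hit(trials):
    """Probability that at least one of `trials` uniform draws equals a fixed value."""
    # 1 - (1 - 1/SPACE)**trials, computed stably for very small probabilities.
    return -math.expm1(trials * math.log1p(-1 / SPACE))


for trials in (10**6, 10**9, SPACE):
    print(f"{trials:>13,} trials -> P(hit) = {p_hit(trials):.4f}")
```

Even after `2**32` random trials the hit probability is only about `1 - 1/e`, roughly 63%, because random draws repeat values.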

Symbolic execution decides an equivalence class per query rather than an input per trial. That is why these tools find bugs random testing does not.

Part 3 (stretch): your own SE target

Open a new mini-IMP function in a fresh file. Write a program whose assertion can fail on a specific input you have in mind. Run check_assertions and verify the engine finds the input you intended.

Hint: pick a witness first, then design the trap

Decide on the input you want to expose, then write a program whose assertion fails on exactly that input.

- INT_MIN traps: arrange `-x` or `100 - x` or similar, where signed-integer overflow at the boundary kicks in.
- Single-value asserts: `assert x != 42` fires only at `x = 42`.
- Off-by-one boundary: a comparison that uses `<` where it should use `≤`, then an assert that flips at the boundary value.

The point is to predict before running. Pick the witness, write a program you believe singles it out, then let the engine confirm.

No solution file. The exercise is open-ended on purpose.

What you should be able to do now

By the end of Studio, the engine should feel mechanical:

Start thinking about your project

The project milestone is due Friday May 29: a working prototype and a one-page doc, 50 points. The final is the following Friday, June 5.

This is a good week to pick a direction:

Office hours are open for project conversations. Pivoting is much easier now than at the milestone.

Reading reflection

The individual write-up for Reading Reflection 4 is due Friday at 17:00. The reading theme is the relationship between AI tools and formal methods. The small-group discussion is in next week's Studio.