Practice
Four theory solvers, four engineering problems. Picking the right theory is a reduction choice.
Where We Left Off
Last week we opened one theory solver all the way up. The theory of equality and uninterpreted functions had four axioms, one decision procedure (congruence closure), and a worked algorithm you traced by hand. Z3 has more theory solvers than that.
Today we meet four more: linear real arithmetic (LRA), linear integer arithmetic (LIA), bitvectors (BV), and arrays. Each one decides a different kind of problem. Each one is the right reach for some engineering situation and the wrong reach for others.
The thread across the four demos is a question you already have a catch phrase for. If you get the reduction wrong, the solver gives you a correct answer to the wrong question. Today that catch phrase gets four different meanings: four distinct ways the wrong-theory choice goes wrong.
LRA: Resource Allocation and Optimization
The problem
A campus dining hall builds a grain bowl from three ingredients: tofu, beans, and rice. Nutrition and cost per kilogram cooked, using rough USDA averages:
| Ingredient | Cost per kg | Protein per kg | Carbs per kg |
|---|---|---|---|
| Tofu | $3.00 | 170g | 20g |
| Beans | $2.50 | 90g | 240g |
| Rice | $1.50 | 30g | 280g |
For every kilogram of bowl, hit the dietary targets: at least 80g of protein and at least 200g of carbohydrates. (A 500g bowl at these ratios has 40g of protein and 100g of carbs, a solid meal for an adult.) What's the cheapest blend that meets both floors?
The linear program
Before writing any code, state the problem formally.
Let t, b, and r be kilograms of tofu, beans, and rice per kilogram of output bowl. All three are continuous quantities. A bowl with 0.28 kg of tofu is a meaningful portion, not a rounding error.
Objective. Minimize cost per kilogram of bowl:

cost = 3.00 t + 2.50 b + 1.50 r

Constraints. Non-negativity, the blend-sum, and the two nutrition floors:

t >= 0,  b >= 0,  r >= 0
t + b + r = 1
170 t + 90 b + 30 r >= 80      (protein)
20 t + 240 b + 280 r >= 200    (carbs)
Three unknowns, a linear objective, a few linear inequalities. This is a linear program. Our job is to hand it to a solver.
The encoding
Translating the LP to Z3 is nearly mechanical.
from z3 import Real, Optimize
t = Real('t')
b = Real('b')
r = Real('r')
opt = Optimize()
opt.add(t >= 0, b >= 0, r >= 0)
opt.add(t + b + r == 1) # 1 kg of bowl
opt.add(170*t + 90*b + 30*r >= 80) # protein floor
opt.add( 20*t + 240*b + 280*r >= 200) # carbs floor
cost = opt.minimize(3.00*t + 2.50*b + 1.50*r)
opt.check()
Three things are new. The variables are real-valued:
Real('t') declares a variable that ranges over the rationals
(Z3 uses rationals internally; for linear arithmetic the
difference between rationals and reals does not matter). The
solver we create is Optimize, not Solver: it answers
feasibility and finds an optimum. And opt.minimize(...)
registers a linear objective and returns a handle we query for
the optimal value.
Everything else is familiar. opt.add(...) adds constraints
exactly as Solver does. opt.check() runs the solver. This
time it does more than sat/unsat: it also finds the point that
minimizes the objective.
The result
m = opt.model()
print(m[t]) # 7/25
print(m[b]) # 9/50
print(m[r]) # 27/50
print(opt.lower(cost)) # 21/10
Per kilogram of bowl: 7/25 = 0.28 kg tofu, 9/50 = 0.18 kg beans, 27/50 = 0.54 kg rice. Minimum cost 21/10 = $2.10. Notice Z3 reported the answer as an exact rational, not a float. That is a feature of LRA: the solver reasons over rationals and returns exact answers.
m[t], m[b], and m[r] are the values at the minimum;
opt.lower(cost) is the best lower bound Z3 could prove for the
objective (for our problem, equal to the achieved value).
Where the numbers come from
Why those particular fractions? The blend-sum constraint lets us eliminate one variable. Substituting r = 1 - t - b into the two nutrition floors collapses the problem from three unknowns to two:

140 t + 60 b >= 50     (protein)
260 t + 40 b <= 80     (carbs)

Plus t >= 0, b >= 0, and t + b <= 1 (so that r is non-negative). Five inequalities in two unknowns cut out a polygon in the (t, b) plane. Every point inside that polygon is a feasible bowl, with r determined by the blend-sum.
The cost, written in t and b alone:

cost = 1.50 + 1.50 t + b
Cost grows with both t and b, so cheaper bowls lie toward the lower-left of the plot. The dark-gray dashed line in the figure is the isocost contour through the optimum: every blend on that line costs exactly \$2.10. The optimum sits at the vertex where this contour first touches the polygon coming from outside.
Rice has no axis of its own in this plot. It is the slack r = 1 - t - b, determined by the other two variables. The green blend-cap line is the rice-zero contour (everything on it has t + b = 1, so r = 0), and rice grows as you move perpendicular to the blend cap toward the origin. The lighter dotted line in the figure is an iso-rice contour. Our optimum has r = 27/50, so the contour passes just past the optimum on the blend-cap side.
Solving the 2×2 system at the binding vertex,

140 t + 60 b = 50
260 t + 40 b = 80

gives t = 7/25, b = 9/50, and r = 1 - 7/25 - 9/50 = 27/50. That's the answer Z3 reported.
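As a sanity check, that 2×2 system can be solved directly in pure Python with exact rationals (no Z3 needed; the coefficients are the ones derived above):

```python
from fractions import Fraction as F

# Binding constraints after eliminating r = 1 - t - b:
#   140 t + 60 b = 50   (protein floor, tight)
#   260 t + 40 b = 80   (carbs floor, tight)
# Solve the 2x2 system by Cramer's rule.
a11, a12, c1 = F(140), F(60), F(50)
a21, a22, c2 = F(260), F(40), F(80)
det = a11 * a22 - a12 * a21
t = (c1 * a22 - a12 * c2) / det
b = (a11 * c2 - c1 * a21) / det
r = 1 - t - b
cost = 3 * t + F(5, 2) * b + F(3, 2) * r
print(t, b, r, cost)   # 7/25 9/50 27/50 21/10
```

The exact fractions match Z3's model, which is reassuring: both the solver and the hand derivation land on the same vertex.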
The optimum of a linear objective over a feasible polytope always sits at a vertex. A solver doesn't search the interior. It walks from vertex to vertex, looking for a lower cost, until no adjacent vertex improves. That algorithm is called simplex, and it is the Theory phase.
What if we used integers?
The dining hall scales ingredients by the gram, so integers seem
natural. Swap Real for Int and run the same problem:
from z3 import Int, Optimize
t = Int('t')
b = Int('b')
r = Int('r')
opt = Optimize()
opt.add(t >= 0, b >= 0, r >= 0)
opt.add(t + b + r == 1)
opt.add(170*t + 90*b + 30*r >= 80)
opt.add( 20*t + 240*b + 280*r >= 200)
cost = opt.minimize(3.00*t + 2.50*b + 1.50*r)
opt.check()
m = opt.model()
print(m[t], m[b], m[r]) # 0 1 0
print(opt.lower(cost)) # 5/2
Z3 returns sat. But look at the answer: t = 0, b = 1, r = 0. Pure beans, at $2.50 per kilogram.
What happened? The constraint t + b + r == 1 with t, b, r >= 0 over integers forces exactly one of them to be 1 and the others to be 0. The solver checked each option: pure tofu fails the carbs floor; pure rice fails the protein floor; pure beans satisfies both. Beans it is.
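That three-way case split can be reproduced by hand in plain Python, no solver required:

```python
# (cost, protein, carbs) per kg for each pure-ingredient bowl
options = {'tofu':  (3.00, 170, 20),
           'beans': (2.50, 90, 240),
           'rice':  (1.50, 30, 280)}

# Keep the options that meet both nutrition floors.
feasible = {name for name, (c, p, cb) in options.items()
            if p >= 80 and cb >= 200}
print(feasible)   # {'beans'}: tofu fails carbs, rice fails protein
```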
Pure beans is a correct answer, but to a different question. Our LRA blend costs $2.10 per kilogram. The integer "solution" costs $2.50, a 19% penalty. Worse, the solver reported sat silently, without any hint that the blend you actually wanted was outside its search space.
Wrong theory, wrong question
You meant to ask about ingredient ratios: continuous
fractions of tofu, beans, and rice per kilogram of output.
Writing Int changed the question to "which single
ingredient is cheapest while hitting both floors?" The answer
to that question is pure beans. It is even feasible. It is
also not what you meant, and the solver has no way to tell
you that.
You reach for LRA when the quantities in your problem are genuinely continuous. Bulk mixing ratios, timing budgets in seconds, fluid volumes, currency amounts at arbitrary precision: LRA is the right theory. Rounding to integers changes the problem, sometimes quietly.
How it scales
Three ingredients is still a toy. Real LPs have hundreds or thousands of variables. Here is the same blending problem with the ingredient list extended. Each new ingredient cycles through the tofu, beans, and rice profiles, and prices rise with the index. Timings:
| Ingredients | Solver time |
|---|---|
| 3 | 0.0005s |
| 10 | 0.0008s |
| 100 | 0.0060s |
| 500 | 0.0572s |
Growth is mild. Linear programming is in P: there is a polynomial-time decision procedure, and practical LP solvers are fast on realistic inputs. LRA is the one theory today that sits in P. The other three are NP-complete.
What we take away
LRA is the solver you reach for when the problem is linear in continuous quantities and you care about feasibility or an optimum. The whole category falls under this one theory: resource allocation, scheduling with fractional times, mixture and blending problems, electrical and mechanical equilibrium, finance under simplified assumptions.
After the break, we look inside. The decision procedure is called simplex, and it walks the corners of the feasible polytope. The optimum we just saw was one of those corners: the unique vertex where both nutrition constraints bind at once.
LIA: Integer Constraints and Why They're Harder
The problem
A compiler considers a nested loop that might benefit from reordering. Reordering inner and outer loops can improve cache locality, enable vectorization, or expose parallelism. The legality question is always the same: do any two distinct iterations access the same memory address in a conflicting way?
Concretely:
for i in range(N):
for j in range(N):
A[3*i + 1] = f(A[5*j + 3])
Each iteration writes to address 3*i + 1 and reads from
address 5*j + 3. The write and read strides do not match, and
the offsets shift them past each other. The reorder is legal
iff no write-read pair hits the same address across distinct
iterations.
The dependence query
Ask the solver directly: does there exist (i, j, i', j') with each index in [0, N) and (i, j) != (i', j') such that 3i + 1 = 5j' + 3?
Decision variables. Four integer indices, one per iteration coordinate: i and j for the writing iteration, i' and j' for the reading one.
Constraints. Iteration bounds, the write-read conflict equation, and distinctness:

0 <= i, j, i', j' < N
3i + 1 = 5j' + 3
(i, j) != (i', j')
If the system is satisfiable, a conflicting iteration pair exists and the reorder is unsafe. If it is unsatisfiable, no such pair exists and the reorder is legal.
The encoding at N = 4
from z3 import Int, Solver, Or
i = Int('i')
j = Int('j')
ip = Int('ip')
jp = Int('jp')
N = 4
s = Solver()
s.add(0 <= i, i < N)
s.add(0 <= j, j < N)
s.add(0 <= ip, ip < N)
s.add(0 <= jp, jp < N)
s.add(3*i + 1 == 5*jp + 3)
s.add(Or(i != ip, j != jp))
print(s.check()) # unsat
Z3 returns unsat. At N = 4 the reorder is legal. The
interesting question is why.
Where the integer solutions live
Rearrange the conflict equation: 3i - 5j' = 2. This is a linear Diophantine equation, a linear equation we are asking to solve over the integers. The first question is whether any integer solution exists at all, ignoring the iteration bounds for a moment. The answer is yes. The greatest common divisor of the coefficients on the left side is gcd(3, 5) = 1, and 1 divides the right-hand side 2, so integer solutions are possible. (If the GCD did not divide the right-hand side, no integer solution could exist.)
Once one integer solution exists, a whole family exists, evenly spaced along the solution line. For our equation, the family is

i = 4 + 5k,  j' = 2 + 3k,  for every integer k.

Enumerating a few values: k = -1 gives (-1, -1), k = 0 gives (4, 2), k = 1 gives (9, 5). Integer solutions exist; they are just spread out.
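A few lines of plain Python confirm the family mechanically, using the parametrization derived above:

```python
# Solutions of 3*i + 1 == 5*jp + 3 along the line i = 4 + 5k, jp = 2 + 3k
family = [(4 + 5 * k, 2 + 3 * k) for k in range(-2, 3)]
for i, jp in family:
    assert 3 * i + 1 == 5 * jp + 3   # every member solves the conflict equation

def in_box(i, jp, N):
    """Is the point inside the half-open iteration box [0, N)^2?"""
    return 0 <= i < N and 0 <= jp < N

print([p for p in family if in_box(*p, 4)])   # []        -> no conflict at N = 4
print([p for p in family if in_box(*p, 5)])   # [(4, 2)]  -> conflict at N = 5
```

Consecutive family members are 5 apart in i, so a box of width 4 can easily miss them all.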
The compiler's real question is not "do integer solutions exist anywhere?" but "do any integer solutions exist inside the iteration box 0 <= i, j' < N?" This is where things get harder.
The figure makes the sensitivity to N concrete. The Diophantine line 3i - 5j' = 2 crosses the box, and over the reals it provides an infinite supply of solutions inside the box. None of those real solutions are integer points. The nearest integer solution, (i, j') = (4, 2), sits on the boundary of the N = 4 box and outside of it. Only when we enlarge the box to N = 5 does the first integer solution fit inside.
At N = 4, no conflict exists in the integer-lattice version of the problem. Z3 proves this by real integer-programming work, not by a one-line arithmetic trick: it has to reason about the coefficients, the GCD, and the bound box together.
What if we used LRA?
Integer-constrained variables are harder to reason about than
real-valued ones. One tempting shortcut is to relax the
question to LRA (an LP relaxation) and hope the relaxed answer
agrees. Swap Int for Real and run the same system at
N = 4:
from z3 import Real, Solver, Or
i = Real('i')
j = Real('j')
ip = Real('ip')
jp = Real('jp')
# ... same constraints as before ...
s.add(3*i + 1 == 5*jp + 3)
s.add(Or(i != ip, j != jp))
print(s.check()) # sat
print(s.model())
# witness: i = 24/11, jp = 10/11
LRA returns sat. The line 3i + 1 = 5j' + 3 crosses the box over the reals, and LRA picks a fractional point on that line. The witness (i, j') = (24/11, 10/11) does satisfy the conflict equation, and it does lie in [0, 4) × [0, 4) as a region of the real plane. It just does not correspond to any actual iteration of the loop. Loops iterate at integer points.
Wrong theory, wrong question
A compiler that trusted the LP relaxation would refuse to reorder this loop, citing a phantom dependence at (i, j') = (24/11, 10/11) that no real iteration could ever exhibit. Correct answer to the wrong question, paid for with a lost optimization.
This is the honest version of the LRA-vs-LIA tradeoff. LRA is decidable in polynomial time and often gives the answer you want. But when the answer hinges on which integer points are in your box, the LP relaxation cannot tell.
When conflicts do appear
The answer changes with one more iteration. Run the same encoding at N = 5:
# Same constraints, N = 5 instead of 4
s.check() # sat
print(s.model())
# witness: i = 4, j = 3, ip = 0, jp = 2
The integer solution (i, j') = (4, 2) now fits the box. Z3 returns sat and hands back a concrete iteration pair. At i = 4 the writing iteration touches address 3*4 + 1 = 13. At j' = 2 a reading iteration touches address 5*2 + 3 = 13. Same address, distinct iterations, and the compiler must respect their order.
The answer flipped from unsat to sat just by making the
iteration range one larger. Nothing about the equation
changed; what changed is which lattice points fit the box.
This sensitivity to the iteration bound is exactly what makes
LIA harder than LRA. The decision procedure has to reason
about the integer lattice and the box together, not just the
feasibility of a continuous region.
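At these bounds the whole dependence query is small enough to brute-force in plain Python, which confirms both solver answers:

```python
from itertools import product

def conflicts(N):
    """All write-read conflicts between distinct iterations in [0, N)^2."""
    return [(i, j, ip, jp)
            for i, j, ip, jp in product(range(N), repeat=4)
            if 3 * i + 1 == 5 * jp + 3 and (i, j) != (ip, jp)]

print(len(conflicts(4)))   # 0: matches unsat at N = 4
print(conflicts(5)[0])     # a concrete witness at N = 5, e.g. (4, 0, 0, 2)
```

Brute force works at N = 5 because the box has 625 candidate tuples; the point of LIA is that the solver's answer does not degrade as N grows.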
What we take away
LIA is the solver you reach for when the problem is linear in integer quantities. Compiler loop analysis, scheduling with integer counts, integer-weighted assignment and packing: all land in this theory. Integer quantities are not a cosmetic detail. A real-valued relaxation can quietly change the question and return a phantom answer that does not correspond to any real iteration.
The cost is complexity. LIA is NP-complete. The decision procedure can no longer walk a convex polytope's corners directly, because the answer might sit between corners at an integer point that is not a vertex of the polytope. After the break, we look inside. The algorithm is called branch-and-bound, and it layers integer refinement on top of simplex.
BV: Bit-Level Semantics and Production Correctness
The problem
Binary search is simple. Lower bound, upper bound, midpoint, compare, recurse. It is the textbook algorithm every introductory course teaches and every standard library ships. And for nine years Java's standard library shipped a version of it that was wrong, because of one line of code that assumed mathematical integer semantics when the machine actually provides finite-width ones.
Josh Bloch, in a 2006 Google blog post titled "Extra, Extra: Read All About It: Nearly All Binary Searches and Mergesorts Are Broken," documented that the midpoint calculation (low + high) / 2 overflows when the array is large enough for low + high to exceed the maximum 32-bit signed integer, 2^31 - 1.
Our L04 BV demo uses this same scenario. The underlying question is simple: is this midpoint calculation correct? The correct answer depends entirely on which theory of integer arithmetic you mean.
Bits wrap
Before the main event, a one-line reminder of how bit-level
arithmetic differs from the mathematical kind. Take an 8-bit
unsigned slot, a variable of type unsigned char in C, and
compute 200 + 100.
Mathematically, 200 + 100 = 300. On an 8-bit unsigned slot the answer is 44. The nine-bit sum only has eight bits of storage, so the leading 1 is discarded and the stored result is 300 - 256 = 44. Z3's bitvector arithmetic behaves the same way as the hardware:
from z3 import BitVecVal, simplify
a = BitVecVal(200, 8)
b = BitVecVal(100, 8)
print(simplify(a + b)) # 44
Modular wraparound is the defining feature of the bitvector theory. It is also the source of every binary-search-style bug that tries to treat machine integers as mathematical ones.
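The same wraparound can be reproduced without Z3 by masking to 8 bits, which is exactly what the hardware does:

```python
MASK8 = 0xFF                 # keep only the low 8 bits

a, b = 200, 100
print((a + b) & MASK8)       # 44: the ninth bit of 300 is discarded
print((a + b) % 256)         # 44: equivalently, arithmetic mod 2**8
```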
The property we want to verify
Return to binary search. The midpoint should satisfy lo <= mid <= hi whenever lo <= hi. If that invariant ever fails, the algorithm can index outside the array.
Encode the property as a counterexample search: ask the solver for any pair (lo, hi) in the valid input range 0 <= lo <= hi < 2^32 where mid = (lo + hi) / 2 falls outside [lo, hi]. unsat means no counterexample exists and the property holds; sat means the model is a concrete bug witness.
What Int says
Try Int first. That is, pretend machine integers are
mathematical integers.
from z3 import Int, Solver, And, Not
W = 32 # 32-bit input range
lo = Int('lo')
hi = Int('hi')
s = Solver()
s.add(0 <= lo, lo <= hi, hi < 2**W)
mid = (lo + hi) / 2
s.add(Not(And(lo <= mid, mid <= hi)))
print(s.check()) # unsat
Int returns unsat. In mathematical integer arithmetic the
midpoint is always between its endpoints. That is a true fact
about mathematical integers.
Wrong theory, wrong question
Int answered the question "is the midpoint formula correct for mathematical integers?" The actual binary search runs on 32-bit machine integers, where lo + hi can wrap around to something below lo. Int silently confirms a property that the real code does not satisfy.
What BV says
Switch to BitVec for the right semantics. Index arithmetic on
real hardware is unsigned 32-bit, so the comparison uses
ULE (unsigned less-or-equal) and the division uses UDiv
(unsigned division):
from z3 import BitVec, Solver, And, Not, ULE, UDiv
lo = BitVec('lo', 32)
hi = BitVec('hi', 32)
s = Solver()
s.add(ULE(lo, hi))
mid = UDiv(lo + hi, 2)
s.add(Not(And(ULE(lo, mid), ULE(mid, hi))))
print(s.check()) # sat
print(s.model())
BV returns sat with a concrete witness. The exact model varies from run to run, but every witness has the same shape. One representative pair (chosen here for readability, not necessarily the model your run prints):

| Field | Value |
|---|---|
| lo | 1 |
| hi | 4294967295 (2^32 - 1) |
| lo + hi mathematically | 4294967296 (2^32) |
| lo + hi at 32 bits | 0 |
| computed mid | 0 |

The mathematical sum overflows 32 bits. The high bit is dropped. The computed midpoint is below the lower bound lo. A binary search using this midpoint on an array large enough for lo + hi to overflow 32 bits can land anywhere outside [lo, hi]. The JDK shipped this bug; Python and C standard libraries have shipped variants of it. BV catches it because bit-level semantics are what the problem is about.
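The failure is easy to reproduce outside the solver by emulating unsigned 32-bit arithmetic with a mask (plain Python; the concrete endpoints below are just one choice of overflowing pair):

```python
M32 = (1 << 32) - 1                # unsigned 32-bit mask

def buggy_mid(lo, hi):
    # (lo + hi) / 2 with 32-bit wraparound, as the hardware computes it
    return ((lo + hi) & M32) // 2

def fixed_mid(lo, hi):
    # Bloch's fix: lo + (hi - lo) / 2, still in 32-bit arithmetic
    return (lo + ((hi - lo) & M32) // 2) & M32

lo, hi = 1, M32                    # the sum overflows 32 bits
print(buggy_mid(lo, hi))           # 0: below lo, invariant broken
print(fixed_mid(lo, hi))           # 2147483648: inside [lo, hi]
```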
The fix
Bloch's fix is

mid = lo + (hi - lo) / 2

The subtraction stays within 32 bits whenever lo <= hi, the division by two shrinks it further, and adding the result back to lo gives a value inside [lo, hi] regardless of how large the endpoints are. BV verifies this:
mid = lo + UDiv(hi - lo, 2)
s = Solver()
s.add(ULE(lo, hi))
s.add(Not(And(ULE(lo, mid), ULE(mid, hi))))
print(s.check()) # unsat
unsat now means: for every pair (lo, hi) in the 32-bit unsigned range with lo <= hi, the fixed midpoint is between them. The fix is verified, not just tested.
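At 8 bits the same claim is small enough to check exhaustively in plain Python, a nice complement to the solver's 32-bit proof:

```python
M8 = (1 << 8) - 1                  # unsigned 8-bit world

def fixed_mid(lo, hi):
    # Bloch's fix in 8-bit arithmetic: lo + (hi - lo) / 2
    return (lo + ((hi - lo) & M8) // 2) & M8

# Every ordered pair in the 8-bit range keeps lo <= mid <= hi.
ok = all(lo <= fixed_mid(lo, hi) <= hi
         for lo in range(256) for hi in range(lo, 256))
print(ok)   # True
```

Exhaustive checking stops scaling around 16 bits; the BV solver's bit-blasting is what makes the 32-bit claim tractable.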
What we take away
BV is the solver you reach for whenever correctness depends on the actual bit-level behavior of machine integers. Overflow checks, shift and mask tricks, crypto masking, fixed-point arithmetic, hardware verification, compiler correctness: all live here. An integer-level analysis cannot see overflow; it cannot see a sign extension; it cannot see a shift past the word boundary.
The cost is complexity. BV is NP-complete. The decision procedure works by unrolling every bitvector variable into a vector of booleans and every operator into a boolean circuit. A 32-bit addition becomes a 32-full-adder ripple-carry circuit; the whole property becomes a CNF formula that the SAT engine from Week 2 can attack. Theory phase shows how bit-blasting works in detail, using the same CDCL machinery we already have.
Arrays: Immutable Maps, Mutable Models
The problem
When do two sequences of stores to an index-to-value map produce the same final map? This is not a question about mutation. The theory of arrays has no mutation. The question is whether two array values, each a total function from indices to values, agree at every index.
This is the theory verification tools use to model mutable state. A heap snapshot is an array value. A sequence of memory writes becomes a sequence of array values, each a slightly-different snapshot. The engineer models evolving state as a walk through snapshots; the theory reasons about the snapshots themselves.
Arrays in Z3: select and store
The theory has two operations. Select(a, i) reads the value
at index i of array a. Store(a, i, v) returns a new
array value (a completely separate function) that agrees
with a everywhere except at index i, where it takes the
value v. The original array a is unchanged and still
there. Stores are functional, not in-place.
from z3 import Array, IntSort, Store, Select, simplify
a = Array('a', IntSort(), IntSort())
# Build a new array value by storing 4 at index 3
a1 = Store(a, 3, 4)
print(simplify(Select(a1, 3))) # 4
print(simplify(Select(a1, 5))) # a[5]
Two reads, two outcomes. Reading at index 3 returns the value
we stored there. Reading at index 5, a different index,
returns a[5]: a symbolic read into the original array a,
untouched by the store. Z3 applied the read-over-write
axiom to simplify each lookup without any manual hints.
The same axiom, stated once and for all, is

select(store(a, i, v), j) = v             if i = j
select(store(a, i, v), j) = select(a, j)  otherwise

One equation. That, plus the fact that store produces a new array value rather than modifying the old one, is the whole theory. Everything else we ask the solver reduces to applying that equation.
You can think of the array as a function from indices to
values, and store as returning a slightly-different function.
The transcript framing from one prior offering:
"arrays are just the theory of functions in disguise."
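That framing can be made literal in a few lines of plain Python: an array is a function, and store returns a new function that shadows one index. This is a model of the theory for intuition, not how Z3 implements it:

```python
def const_array(v):
    """The array that maps every index to v."""
    return lambda i: v

def store(a, i, v):
    """A new array agreeing with a everywhere except at index i."""
    return lambda j: v if j == i else a(j)

a  = const_array(0)
a1 = store(a, 3, 4)
print(a1(3))   # 4: read-over-write at the stored index
print(a1(5))   # 0: falls through to the original array
print(a(3))    # 0: the original array is untouched
```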
What the theory does not give you
Two things the theory deliberately omits. Students coming from Python or C tend to assume they are there; they are not.
No length. Array(name, I, V) is a total function from
every value of the index sort I to a value of the value sort
V. Select(a, -1), Select(a, 10**100), and Select(a, 0)
are all valid expressions, and each returns some value of type
V. The theory does not know about array sizes, out-of-range
indices, or buffer overflows. If your problem is about a
10-slot buffer, declare the array as usual and add constraints
that every index you read or store at lies in [0, 10). The
solver will reason under those constraints. Outside them, the
array is still defined; you just are not asking about it.
No mutation. Every store produces a new array value; the
old one is unchanged and still there. There is no "mutating
the array" at the theory level. If you want to model a store
that changes over time, represent each point in time as a
separate array variable. A sequence of five writes to a heap
becomes five distinct array variables, each a snapshot of the
heap after one more write. The engineer writes the walk; the
theory reasons about the snapshots.
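The snapshot discipline looks like this in the same function-as-array model (the addresses and values are made up for illustration):

```python
def store(a, i, v):
    """A new array agreeing with a everywhere except at index i."""
    return lambda j: v if j == i else a(j)

heap0 = lambda i: 0                       # initial heap: all zeros
writes = [(3, 10), (5, 20), (3, 99)]      # (address, value) over time

# Each write produces a fresh snapshot; none of them mutate anything.
snapshots = [heap0]
for addr, val in writes:
    snapshots.append(store(snapshots[-1], addr, val))

print([s(3) for s in snapshots])   # [0, 10, 10, 99]
print(snapshots[1](3))             # 10: the old snapshot still reads 10
```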
When two stores commute
Two store operations commute when applying them in either order produces the same array value. Ask Z3 by searching for an index where the two resulting arrays disagree:
from z3 import Array, IntSort, Int, Store, Select, Solver
x = Array('x', IntSort(), IntSort())
# Order A: store 10 at index 3, then 20 at index 5
a1 = Store(x, 3, 10)
a2 = Store(a1, 5, 20)
# Order B: same two stores, reverse order
b1 = Store(x, 5, 20)
b2 = Store(b1, 3, 10)
# Find any index where the two arrays disagree
i = Int('i')
s = Solver()
s.add(Select(a2, i) != Select(b2, i))
print(s.check()) # unsat
unsat. No index distinguishes a2 from b2. The two stores
target different indices, and neither depends on the other's
result, so both orderings build the same array value.
The verification pattern carries over from earlier theories: we
asked for a counterexample (an index where the arrays
disagree) and got unsat, so the property "the arrays agree
everywhere" holds for all indices.
When two stores collide
Change one index so both stores target the same slot:
x = Array('x', IntSort(), IntSort())
# Order A: store 10 at 3, then 99 at 3
a1 = Store(x, 3, 10)
a2 = Store(a1, 3, 99)
# Order B: store 99 at 3, then 10 at 3
b1 = Store(x, 3, 99)
b2 = Store(b1, 3, 10)
i = Int('i')
s = Solver()
s.add(Select(a2, i) != Select(b2, i))
print(s.check()) # sat
print(s.model())
# witness i = 3
# a[3] = 99, b[3] = 10
sat, and Z3 hands us a concrete disagreement: index 3. In order A the final array value reads 99 at index 3; in order B it reads 10. Both sequences apply two stores at the same index; the last store always wins, and the two orders end with different last values, so the final arrays differ.
Why this matters for engineering
Compilers reordering memory writes, databases serializing transactions, filesystems reordering disk writes, merge algorithms combining concurrent edits: all of these systems model mutable state using arrays, and all of them ask the same theory question. Do these two sequences of stores produce the same final array value? The theory gives the exact condition, decidable by the solver in a handful of axiom applications.
What we take away
Arrays is the theory you reach for when you want to reason about indexed lookup tables: total functions from an index sort to a value sort. If you are modeling mutable state, the theory gives you the values, and you are responsible for writing the walk through them. Memory in a verifier, heap in a model checker, records in a database, cells in a filesystem: everything in that list is an engineering application that represents state as a sequence of array values, not as one mutable array.
The alternative would be to bit-blast the whole store into one
large BitVec with explicit offset arithmetic. That is
technically correct and practically impossible. A heap with
2^32 addressable bytes becomes a 4-gigabyte boolean
formula. The theory of arrays exists so we never have to write
that formula.
After the break, Theory phase pulls back the curtain on how the array solver actually works. The answer has a surprise in it, and the surprise is good news: we have already built most of the machinery it needs.
Four Problems, Four Theories
Four engineering problems, four theories, four distinct ways the wrong choice would have gone wrong.
Cafeteria blending wanted LRA because the quantities were
real-valued ratios; Int collapsed the problem to a choice of
one whole ingredient and quietly reported the wrong optimum. Compiler loop
reordering wanted LIA because iteration spaces are integer
lattices; the LP relaxation invented phantom dependences at
fractional coordinates. Binary-search correctness wanted BV
because machine integers wrap; integer-level reasoning silently
proved a real bug away. Memory-update equivalence wanted arrays
because select and store already capture the right
intuition; bit-blasting memory into one enormous boolean
formula would have been correct in principle and unusable in
practice.
Picking the theory is itself a reduction decision. Whatever question your problem is actually asking, your encoding either asks the solver that question or asks it a different one.
After the break, we look inside the four solvers and see how each answers the question its theory is built for.
Demo Code
All demo files for this phase are in the
course code repo.
Clone the repo and run them locally with python3 01-lra-blending.py
etc. (requires z3-solver; see Setup).