Practice
From SAT to SMT: richer primitives for encoding, and the right abstraction for performance.
Where We Left Off
For two weeks we have emphasized SAT: boolean variables, clauses, and the CDCL algorithm that makes modern solvers fast. But our very first demo used integers and arithmetic to solve the xkcd restaurant puzzle, and those are not booleans. Z3 accepted them without complaint. This lecture starts our exploration of how theory solvers go beyond booleans and extend SAT to SMT.
Under the Hood
Here is the xkcd restaurant problem from Week 1, slightly simplified. Six menu items, integer quantities, total must be exactly $15.05:
from z3 import *
x1 = Int('x1')
x2 = Int('x2')
x3 = Int('x3')
x4 = Int('x4')
x5 = Int('x5')
x6 = Int('x6')
s = Solver()
s.add(x1 >= 0)
s.add(x2 >= 0)
s.add(x3 >= 0)
s.add(x4 >= 0)
s.add(x5 >= 0)
s.add(x6 >= 0)
s.add(215*x1 + 275*x2 + 335*x3 + 355*x4 + 420*x5 + 580*x6 == 1505)
Every Z3 solver has a method called .sexpr() that shows the
representation Z3 is actually working with. This representation
is called SMT-LIB, and it is the standard
input language for SMT solvers:
(declare-fun x1 () Int)
(declare-fun x2 () Int)
(declare-fun x3 () Int)
(declare-fun x4 () Int)
(declare-fun x5 () Int)
(declare-fun x6 () Int)
(assert (>= x1 0))
(assert (>= x2 0))
(assert (>= x3 0))
(assert (>= x4 0))
(assert (>= x5 0))
(assert (>= x6 0))
(assert (= (+ (* 215 x1)
(* 275 x2)
(* 335 x3)
(* 355 x4)
(* 420 x5)
(* 580 x6))
1505))
Notice (declare-fun x1 () Int). The variables are declared as
integers, not booleans. The constraints use arithmetic operations
like * and + and >=. None of this is SAT. This is SMT:
Satisfiability Modulo Theories. The "modulo theories" part means
Z3 is reasoning about integers using the rules of integer
arithmetic, not just boolean satisfiability.
The Python API is a thin layer over this representation. When you
write Int('x1'), Z3 creates an integer variable in its theory
of integer arithmetic. When you write s.add(x1 >= 0), Z3
generates (assert (>= x1 0)) in its internal representation.
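To see what the solver is being asked, the same constraint can be explored without Z3. The sketch below is plain Python and not part of the course demos; the names `PRICES`, `TOTAL`, and `solutions` are illustrative. Brute force is only feasible here because the quantities are tiny.

```python
from itertools import product

# Menu prices in cents, in the same order as x1..x6 above.
PRICES = [215, 275, 335, 355, 420, 580]
TOTAL = 1505

# A quantity can never exceed TOTAL // price, so the search space is finite.
ranges = [range(TOTAL // p + 1) for p in PRICES]

# Keep every tuple of non-negative quantities whose cost is exactly TOTAL.
solutions = [
    qty for qty in product(*ranges)
    if sum(p * q for p, q in zip(PRICES, qty)) == TOTAL
]

for qty in solutions:
    print(qty)
```

Every tuple printed is a model of the SMT-LIB formula above. Z3's check() looks for one such assignment rather than enumerating them all.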
Encoding with Theories: Sudoku
SAT solvers work with boolean variables and clauses. To solve a problem that is not naturally boolean, you have to encode it as one. Sometimes that encoding is painful. Sudoku is a good example.
The puzzle
A sudoku board is a 9x9 grid divided into nine 3x3 blocks. Some cells are filled in (the givens). The goal: fill every empty cell with a digit from 1 to 9 so that each row, each column, and each 3x3 block contains all nine digits exactly once.
| 5 | 3 |   |   | 7 |   |   |   |   |
| 6 |   |   | 1 | 9 | 5 |   |   |   |
|   | 9 | 8 |   |   |   |   | 6 |   |
| 8 |   |   |   | 6 |   |   |   | 3 |
| 4 |   |   | 8 |   | 3 |   |   | 1 |
| 7 |   |   |   | 2 |   |   |   | 6 |
|   | 6 |   |   |   |   | 2 | 8 |   |
|   |   |   | 4 | 1 | 9 |   |   | 5 |
|   |   |   |   | 8 |   |   | 7 | 9 |
Five rules define a valid solution:
- Givens: pre-filled cells keep their values
- Cell range: each cell holds exactly one value from 1 to 9
- Row: each row contains all nine values, no repeats
- Column: each column contains all nine values, no repeats
- Block: each 3x3 block contains all nine values, no repeats
The rules are easy to state. The question is how to tell a solver about them.
In code, we represent the board as a 2D list board[r][c]
where r is the row (0 to 8) and c is the column (0 to 8). A
given cell holds its value (1 to 9). A blank cell holds 0.
Both our SAT and SMT encodings solve this puzzle instantly. The solved board (every cell that was blank above holds a value found by the solver):
| 5 | 3 | 4 | 6 | 7 | 8 | 9 | 1 | 2 |
| 6 | 7 | 2 | 1 | 9 | 5 | 3 | 4 | 8 |
| 1 | 9 | 8 | 3 | 4 | 2 | 5 | 6 | 7 |
| 8 | 5 | 9 | 7 | 6 | 1 | 4 | 2 | 3 |
| 4 | 2 | 6 | 8 | 5 | 3 | 7 | 9 | 1 |
| 7 | 1 | 3 | 9 | 2 | 4 | 8 | 5 | 6 |
| 9 | 6 | 1 | 5 | 3 | 7 | 2 | 8 | 4 |
| 2 | 8 | 7 | 4 | 1 | 9 | 6 | 3 | 5 |
| 3 | 4 | 5 | 2 | 8 | 6 | 1 | 7 | 9 |
The SAT encoding
In pure SAT there are no integers. We need a way to represent "cell (r, c) has value v" using only booleans. The standard approach: create a boolean variable $x_{r,c,v}$ for every cell and every possible value. If $x_{r,c,v}$ is true, cell (r, c) holds value v.
# If n = 3, then size = 9; if n = 4, then size = 16, etc.
size = n * n
# Create the solver and the variables.
s = Solver()
# Create one boolean variable per row, column, and value.
# x[r][c][v] = "cell (r,c) has value v"
# v is 0-indexed internally (0 to 8); we convert at I/O
x = [[[Bool(f'x_{r}_{c}_{v}')
       for v in range(size)]
      for c in range(size)]
     for r in range(size)]
That is a $9 \times 9 \times 9$ matrix: 9 rows, 9 columns, 9 possible values. 729 boolean variables for an 81-cell puzzle. Now we encode each rule as clauses over these variables.
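A standalone SAT solver typically reads DIMACS CNF, where variables are just positive integers. As a rough sketch of how the triple (r, c, v) could be flattened into such a number (the helpers `var_id` and `decode` are hypothetical, and this numbering is one choice among many):

```python
SIZE = 9  # 9x9 board, values 0..8 internally

def var_id(r, c, v, size=SIZE):
    """Map a 0-indexed (row, column, value) triple to a variable number 1..size**3."""
    return r * size * size + c * size + v + 1

def decode(var, size=SIZE):
    """Inverse of var_id: recover (row, column, value) from a variable number."""
    var -= 1
    return var // (size * size), (var // size) % size, var % size

print(var_id(0, 0, 0))          # smallest variable number
print(var_id(8, 8, 8))          # largest variable number
print(decode(var_id(4, 7, 2)))  # round trip back to the triple
```

With Z3's Python API this step is unnecessary because Bool() names the variables for us; the numbering only matters when clauses are written out for an external solver.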
Givens. If cell (r, c) has a given value g (where
board[r][c] is 1-indexed), we assert the corresponding
boolean variable as a unit clause. The board uses values 1 to 9;
our variables are 0-indexed, so we subtract 1:
for r in range(size):
    for c in range(size):
        if board[r][c] != 0:
            s.add(x[r][c][board[r][c] - 1])
Cell range. Each cell has at least one value (a disjunction over all values for that cell) and at most one value (pairwise exclusion over all values for that cell). For each cell (r, c):
At least one: $\bigvee_{v} x_{r,c,v}$
At most one: $\bigwedge_{v_1 < v_2} (\lnot x_{r,c,v_1} \lor \lnot x_{r,c,v_2})$
# At least one value per cell
# x[r][c] is a list of 9 booleans (one per value).
# Z3's Or() accepts a list: Or([a, b, c]) means a ∨ b ∨ c.
for r in range(size):
    for c in range(size):
        s.add(Or(x[r][c]))
# At most one value per cell
for r in range(size):
    for c in range(size):
        for v1 in range(size):
            for v2 in range(v1 + 1, size):
                s.add(Or(Not(x[r][c][v1]), Not(x[r][c][v2])))
Rows. For each row r and value v, that value appears at least once in the row (a disjunction across columns) and at most once (pairwise exclusion across columns):
At least one per row: $\bigvee_{c} x_{r,c,v}$
At most one per row: $\bigwedge_{c_1 < c_2} (\lnot x_{r,c_1,v} \lor \lnot x_{r,c_2,v})$
# At least one of each value per row
for r in range(size):
    for v in range(size):
        s.add(Or([x[r][c][v] for c in range(size)]))
# At most one of each value per row
for r in range(size):
    for v in range(size):
        for c1 in range(size):
            for c2 in range(c1 + 1, size):
                s.add(Or(Not(x[r][c1][v]), Not(x[r][c2][v])))
Columns. The same structure, iterating over rows instead of columns. For each column c and value v:
At least one per column: $\bigvee_{r} x_{r,c,v}$
At most one per column: $\bigwedge_{r_1 < r_2} (\lnot x_{r_1,c,v} \lor \lnot x_{r_2,c,v})$
# At least one of each value per column
for c in range(size):
    for v in range(size):
        s.add(Or([x[r][c][v] for r in range(size)]))
# At most one of each value per column
for c in range(size):
    for v in range(size):
        for r1 in range(size):
            for r2 in range(r1 + 1, size):
                s.add(Or(Not(x[r1][c][v]), Not(x[r2][c][v])))
Blocks. The same pattern again. For each 3x3 block and each value, the value appears at least once and at most once among the nine cells of that block:
for br in range(n):
    for bc in range(n):
        cells = [(br * n + r, bc * n + c)
                 for r in range(n)
                 for c in range(n)]
        # At least one of each value per block
        for v in range(size):
            s.add(Or([x[r][c][v] for r, c in cells]))
        # At most one of each value per block
        for v in range(size):
            for i in range(len(cells)):
                for j in range(i + 1, len(cells)):
                    r1, c1 = cells[i]
                    r2, c2 = cells[j]
                    s.add(Or(Not(x[r1][c1][v]),
                             Not(x[r2][c2][v])))
Every rule follows the same pattern: a disjunction for "at least one" and pairwise exclusion for "at most one." The full SAT encoding produces 729 boolean variables and over 12,000 clauses. Some of these constraints are redundant. We include all of them to make the encoding's intent clear. Which ones could be dropped is a good exercise.
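The clause count can be checked without running a solver. The sketch below is plain Python: it writes literals as signed integers in the DIMACS style, and the helpers `exactly_one` and `var_id` are illustrative names, not part of the course code. It counts only the rule clauses; each given adds one unit clause on top.

```python
from itertools import combinations

N = 3
SIZE = N * N

def var_id(r, c, v):
    """Number the variable for 'cell (r, c) has value v' from 1 to SIZE**3."""
    return r * SIZE * SIZE + c * SIZE + v + 1

def exactly_one(group):
    """One 'at least one' clause plus pairwise 'at most one' clauses."""
    at_least_one = [tuple(group)]
    at_most_one = [(-a, -b) for a, b in combinations(group, 2)]
    return at_least_one + at_most_one

clauses = []
# Cell range: each cell holds exactly one value.
for r in range(SIZE):
    for c in range(SIZE):
        clauses += exactly_one([var_id(r, c, v) for v in range(SIZE)])
for v in range(SIZE):
    # Rows: value v appears exactly once in each row.
    for r in range(SIZE):
        clauses += exactly_one([var_id(r, c, v) for c in range(SIZE)])
    # Columns: value v appears exactly once in each column.
    for c in range(SIZE):
        clauses += exactly_one([var_id(r, c, v) for r in range(SIZE)])
    # Blocks: value v appears exactly once in each block.
    for br in range(N):
        for bc in range(N):
            clauses += exactly_one([var_id(br * N + r, bc * N + c, v)
                                    for r in range(N) for c in range(N)])

rule_clauses = len(clauses)
print(rule_clauses)
```

There are 324 groups of nine literals, each contributing 1 + 36 clauses, so the rules alone give 11,988 clauses. The 30 givens of the puzzle above bring the total to 12,018.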
Z3 does provide an AtMost constraint that would simplify the "at most one" clauses. But
AtMost is a pseudo-boolean constraint, not pure SAT. Using it
would already be stepping beyond booleans. The pairwise encoding
here is what a pure SAT solver actually receives.
The SMT encoding
With integer variables, we have one variable per cell instead of one per cell-value pair:
size = n * n
s = Solver()
# One integer variable per cell
cells = [[Int(f'c_{r}_{c}')
          for c in range(size)]
         for r in range(size)]
That is a $9 \times 9$ matrix. 81 integer variables for
an 81-cell puzzle. Two things changed. First, the variables are
integers, not booleans. Since the puzzle is about integers, and
the solver understands integers, the encoding is direct: no need
to decompose values into booleans. Second, SMT gives us
Distinct(), a built-in constraint that asserts all
its arguments take different values. This is not cheating. It
is exactly what you get by moving from SAT to SMT: the theory
provides richer primitives that match the structure of the
problem.
Distinct() takes a list of variables of the same sort and asserts they all
take different values. Now the same five rules:
Givens. A pre-filled cell is an equality constraint. No index conversion needed since the solver works with integers directly:
for r in range(size):
    for c in range(size):
        if board[r][c] != 0:
            s.add(cells[r][c] == board[r][c])
Cell range. Each cell is an integer between 1 and 9:
for r in range(size):
    for c in range(size):
        s.add(cells[r][c] >= 1)
        s.add(cells[r][c] <= size)
Rows. Each row has all different values:
for r in range(size):
    s.add(Distinct(cells[r]))
Columns. Each column has all different values:
for c in range(size):
    s.add(Distinct([cells[r][c] for r in range(size)]))
Blocks. Each 3x3 block has all different values:
for br in range(n):
    for bc in range(n):
        block = [cells[br * n + r][bc * n + c]
                 for r in range(n)
                 for c in range(n)]
        s.add(Distinct(block))
Distinct() handles both "at least one" and "at most one" in a
single constraint. The full SMT encoding uses 81 variables
and 219 constraints. To be fair, each Distinct() constraint
asks more of the solver than a single boolean clause does: the
solver needs equality reasoning machinery to handle it. But the
Python code we have to write is much shorter
and simpler, and for many problems that tradeoff is worth making.
The right choice depends on the problem and the team building
the solution.
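Why does Distinct() cover "at least one" as well as "at most one"? When a group has as many cells as there are allowed values, all-different forces every value to appear. A small exhaustive check in plain Python makes this concrete. It uses groups of 4 cells rather than 9 to keep the enumeration tiny, and `all_distinct` and `each_value_once` are illustrative helpers.

```python
from itertools import product

SIZE = 4  # a small group keeps the exhaustive check fast

def all_distinct(values):
    """Plain-Python analogue of Distinct(): no two entries are equal."""
    return len(set(values)) == len(values)

def each_value_once(values, size):
    """The SAT-style rule: every value 1..size appears exactly once."""
    return sorted(values) == list(range(1, size + 1))

# Enumerate every way to fill SIZE cells with values in 1..SIZE
# (this range restriction plays the role of the cell-range rule).
agree = all(
    all_distinct(group) == each_value_once(group, SIZE)
    for group in product(range(1, SIZE + 1), repeat=SIZE)
)
print(agree)  # True
```

The two formulations agree on every assignment, which is why the cell-range bounds plus Distinct() say the same thing as the SAT rules for a row, column, or block.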
The contrast
Both encodings solve the same puzzle and produce the same answer. The difference is in the reduction.
The SAT encoding fights a mismatch: the problem is about integers, but the solver only understands booleans. The programmer has to build the integer representation (729 boolean variables for 81 cells) and manually decompose "all different" into at-least-one and pairwise exclusion clauses (over 12,000 clauses in all).
The SMT encoding has no mismatch. The problem is about
integers, and the solver understands integers. Each cell is one
variable. "All different" is one call to Distinct(). The
encoding says what it means.
A simpler reduction is easier to get right. When the encoding is 20 lines instead of 80, there are fewer places for the kind of mistake where the solver gives you a correct answer to the wrong question.
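One cheap guard against that kind of mistake is to check the solver's output against the five rules directly, independent of either encoding. The following is a minimal sketch in plain Python; `is_valid_solution` is a hypothetical helper, and the two boards are the puzzle and solution shown earlier.

```python
GIVENS = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
SOLVED = [
    [5, 3, 4, 6, 7, 8, 9, 1, 2],
    [6, 7, 2, 1, 9, 5, 3, 4, 8],
    [1, 9, 8, 3, 4, 2, 5, 6, 7],
    [8, 5, 9, 7, 6, 1, 4, 2, 3],
    [4, 2, 6, 8, 5, 3, 7, 9, 1],
    [7, 1, 3, 9, 2, 4, 8, 5, 6],
    [9, 6, 1, 5, 3, 7, 2, 8, 4],
    [2, 8, 7, 4, 1, 9, 6, 3, 5],
    [3, 4, 5, 2, 8, 6, 1, 7, 9],
]

def is_valid_solution(givens, solved, n=3):
    """Check a filled board against the givens, row, column, and block rules."""
    size = n * n
    full = set(range(1, size + 1))
    # Givens: pre-filled cells keep their values.
    for r in range(size):
        for c in range(size):
            if givens[r][c] != 0 and givens[r][c] != solved[r][c]:
                return False
    # Rows and columns: nine cells covering all nine values means no repeats.
    for i in range(size):
        if set(solved[i]) != full:
            return False
        if {solved[r][i] for r in range(size)} != full:
            return False
    # Blocks: same test on each n-by-n block.
    for br in range(n):
        for bc in range(n):
            block = {solved[br * n + r][bc * n + c]
                     for r in range(n) for c in range(n)}
            if block != full:
                return False
    return True

print(is_valid_solution(GIVENS, SOLVED))
```

The cell-range rule is implied here: a row that covers exactly the set 1 to 9 cannot contain an out-of-range value.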
Abstraction for Performance
The sudoku comparison showed that theories can make encoding easier. Theories can also let you hide details that do not matter. When the solver does not have to reason about the bits, it can answer questions much faster, and sometimes questions it would otherwise be unable to answer at all. You get to reason at the right level of detail for your actual problem.
The problem
Consider two functions on 64-bit bitvectors:
def sq(y):
    return y * y
def sqabs(y):
    return abs(y) * abs(y)
Are they equivalent for all inputs?
To ask Z3 this question, we need to express the computations
symbolically. We declare y as a symbolic 64-bit bitvector,
then build Z3 expressions for y * y and abs(y) * abs(y).
Z3 has no built-in bitvector absolute-value operator, so we write
our own bvabs using If (on bitvectors, < is a signed comparison):
BW = 64
def bvabs(y):
    return If(y < 0, -y, y)
y = BitVec('y', BW)
sq = y * y
sqabs = bvabs(y) * bvabs(y)
Now sq and sqabs are symbolic expressions, not concrete
values. They represent the computation of y * y and
bvabs(y) * bvabs(y) for an unknown y.
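To make the bitvector semantics concrete, here is a plain-Python model of the same two expressions at 8 bits, small enough to try every input. This is a sketch for intuition only: the helper names are illustrative, and the masking with `MASK` stands in for the wraparound that Z3's bitvector operations perform.

```python
BW = 8                      # small width so we can try every input
MASK = (1 << BW) - 1

def to_signed(u):
    """Interpret an unsigned BW-bit pattern as a two's-complement integer."""
    return u - (1 << BW) if u >= 1 << (BW - 1) else u

def bvneg(u):
    """Two's-complement negation with wraparound."""
    return (-u) & MASK

def bvmul(a, b):
    """Multiplication that keeps only the low BW bits."""
    return (a * b) & MASK

def bvabs(u):
    # Mirrors If(y < 0, -y, y) with a signed comparison.
    return bvneg(u) if to_signed(u) < 0 else u

def sq(u):
    return bvmul(u, u)

def sqabs(u):
    return bvmul(bvabs(u), bvabs(u))

mismatches = [u for u in range(1 << BW) if sq(u) != sqabs(u)]
print(mismatches)  # []
```

Note the edge case: the most negative value (0x80 at 8 bits) is its own negation, so bvabs returns it unchanged, and the two functions still agree. Exhaustive testing like this is hopeless at 64 bits, which is why we ask the solver instead.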
How to ask the question
Mathematically, you might write our goal as: $\forall y.\ \mathit{sq}(y) = \mathit{sqabs}(y)$
That is, "for all y, sq(y) = sqabs(y)." You can encode this directly in Z3 as:
s.add(ForAll([y], sq == sqabs))
This version does work at small bit widths. To decide a closed
ForAll query, Z3 uses a technique called
model-based quantifier instantiation (MBQI):
it proposes a candidate model for the ground part of the
problem, checks whether the quantified body holds in that
model, and uses any counterexample as a concrete instantiation
to refine. The loop is incomplete in general, and over
bit-vector multiplication every refinement step requires a
fresh bit-blasted check, so it scales poorly in the width. As
the bit width grows, Z3 gives up:
| Bit width | Result | Time |
|---|---|---|
| 8 | sat | 0.02s |
| 16 | sat | ~4s |
| 32 | unknown | (times out) |
| 64 | unknown | (times out) |
We have seen sat and unsat before. unknown is a third
possible result from check(). It means the solver could not
decide the question. It is not saying the formula is true or
false. It is saying "I gave up." For quantified queries over
expensive theories, unknown is common. When you hit it, you
have to find a different way to ask the question.
Even more interesting: if you rewrite the query by hand using the logical moves we walk through below, Z3 handles all four bit widths comfortably (about a second or less up through 64 bits, compared to a timeout here). The rewrite is a logically equivalent transformation the solver could in principle do on its own. In practice, the manual version sidesteps Z3's quantifier heuristics and lands us in territory the solver handles efficiently.
To understand why, take a step back. So far we have only worked with formulas that have no quantifiers: the propositional formulas in Weeks 1 and 2, and now Z3 expressions over integer and bitvector variables. These all live in what is called the quantifier-free fragment of first-order logic. For the theories we use here (linear integer arithmetic, bitvectors, and uninterpreted functions), the quantifier-free fragment is decidable: there is an algorithm that always terminates with a yes-or-no answer. SAT and SMT solvers exploit this. The problem is hard (NP-complete for many theories), but often tractable in practice.
Once you add quantifiers, the picture changes dramatically. Validity in full first-order logic is undecidable (Church and Turing, 1936-37). No algorithm can correctly answer every quantified query in general. Z3 does support quantifiers, but through incomplete heuristics: it may succeed, or it may not. With bitvectors and multiplication, "may not" is the common case. We will come back to all this in Week 5.
So we need a different way to ask the question. Fortunately we have one. It takes two steps.
Step 1: quantifier negation. A standard fact about how $\forall$ and $\exists$ relate says that "for all y, P(y)" and "there is no y where ¬P(y)" are the same statement: $\forall y.\ P(y) \iff \lnot \exists y.\ \lnot P(y)$
"P holds for every input" and "no input makes P fail" are two ways of saying the same thing. This is pure quantifier logic, independent of validity or satisfiability. Notice that the right side still has a quantifier. We have not escaped quantified logic yet.
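Step 2: check for unsat with a free variable. Now we apply the duality from Week 1: a formula is valid if and only if its negation is unsatisfiable. Our goal is $\forall y.\ P(y)$, which is the same as $\lnot \exists y.\ \lnot P(y)$ by step 1. Showing that $\lnot \exists y.\ \lnot P(y)$ is valid is the same as showing its negation is unsatisfiable: $\lnot \exists y.\ \lnot P(y)$ is valid $\iff$ $\exists y.\ \lnot P(y)$ is unsatisfiable.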
We have turned "prove valid for all inputs" into "show
unsatisfiable." And now the convenient part: when you hand Z3 a
formula with a free variable and call s.check(), Z3 searches
for an assignment to that variable that makes the formula true.
Calling check() on $\lnot P(y)$ with y free is implicitly
asking: does there exist a y such that $\lnot P(y)$ is true?
This move (replacing a bound variable with a free one and letting
the solver hunt for an assignment) is a lightweight form of what
logicians call Skolemization.
Putting it all together: to check $\forall y.\ P(y)$, we hand
Z3 $\lnot P(y)$ with y free. If Z3 says unsat, there is no
y satisfying $\lnot P(y)$, so the original universal claim
holds. If Z3 says sat, it has found a counterexample.
The rewritten query is in the quantifier-free fragment. No
ForAll, no Exists, just a formula with a free variable.
It is decidable. Solvers handle it well.
s = Solver()
s.add(Not(sq == sqabs))
result = s.check()
# sat -> found a counterexample, property is violated
# unsat -> no counterexample exists, property holds for ALL inputs
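On a finite domain the same pattern can be spelled out in plain Python, with a loop standing in for the solver's search. This is only an analogy: `find_counterexample` and the two example properties are hypothetical, and a real solver does not enumerate inputs.

```python
def find_counterexample(prop, domain):
    """Return some input where prop fails, or None if prop holds on all of domain."""
    for y in domain:
        if not prop(y):
            return y      # analogue of sat: here is a witness for Not(P)
    return None           # analogue of unsat: no witness exists, P is verified

DOMAIN = range(256)  # every 8-bit value

def parity_preserved(y):
    # Squaring never changes whether a number is even or odd.
    return (y * y) % 2 == y % 2

def square_is_identity(y):
    # False for most inputs; included to show the "sat" outcome.
    return y * y == y

print(find_counterexample(parity_preserved, DOMAIN))    # None
print(find_counterexample(square_is_identity, DOMAIN))  # 2
```

The first call corresponds to unsat (the property holds everywhere), the second to sat with the counterexample as the model.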
UNSAT often means verified
To check that a property P holds for all inputs, hand Z3
Not(P) with the inputs as free variables and call
s.check(). If Z3 returns unsat, no input violates P,
so P is verified. If Z3 returns sat, the model is a
counterexample.
This is not a Z3 performance trick. It is how solver-based
verification works. When the solver returns unsat, it has
established that no input in the entire space violates the
property. In this style of verification, UNSAT is how the
solver tells you "verified." Rosette's verify, CBMC, and
bounded model checkers all work this way: they search for
counterexamples and report success when none exist.
Not(property) followed by checking for unsat is the
verification pattern we will use for the rest of the course.
Attempt 1: full bitvector semantics
With the counterexample formulation and real 64-bit multiplication:
s = Solver()
s.add(Not(sq == sqabs))
result = s.check() # unsat in ~1 second
Correct: the functions are equivalent. But it takes about a second. That is another reduction under the hood. Z3 bit-blasts the 64-bit multiply into a circuit of boolean gates and hands it to the same CDCL engine we saw in Week 2. At 64 bits the circuit is big, and even CDCL takes a while to chew through it.
Attempt 2: uninterpreted multiply, no axiom
In Attempt 1, Z3 had to reason about bitvector multiplication
all the way down to the bits. Z3 lets us hide that detail from
the solver by declaring an uninterpreted function: umul,
a function symbol with a fixed signature and no fixed meaning.
Z3 knows umul takes two 64-bit values and returns a 64-bit
value, and nothing else. It is free to pick any function that
fits.
BV = BitVecSort(BW)
# umul takes two 64-bit BVs and returns a 64-bit BV.
# Z3 knows nothing else about it.
umul = Function('umul', BV, BV, BV)
Now rebuild sq and sqabs using umul instead of the
built-in *, and ask the same counterexample question as
before:
sq = umul(y, y)
sqabs = umul(bvabs(y), bvabs(y))
s = Solver()
s.add(Not(sq == sqabs))
result = s.check() # sat in 0.01 seconds
Fast. But wrong. check() returned sat, which means Z3
found a counterexample: a value of y where sq(y) differs
from sqabs(y) in this model. We can pull the values out and
see:
m = s.model()
print(f"y = {m[y]}")
print(f"sq(y) = {m.eval(sq)}")
print(f"sqabs(y) = {m.eval(sqabs)}")
# y = 12189670989262225408
# sq(y) = 18446744073709551615
# sqabs(y) = 0
m[y] gives the value Z3 picked for our symbolic variable.
m.eval(expr) evaluates any expression in the model, including
one that uses an uninterpreted function like umul.
In this "counterexample" sq(y) and sqabs(y) really do come
out different. But we know the real functions are equivalent.
For any y, $y \cdot y$ equals $\mathrm{abs}(y) \cdot \mathrm{abs}(y)$. So
what happened?
Z3 is free to assign any function it likes to umul. We
declared umul as a two-argument function from bitvectors to
bitvectors and said nothing else about it. So Z3 was free to
pick a umul where umul(y, y) is the max 64-bit value and
umul(-y, -y) is zero. We can ask the model for umul
directly and see the function Z3 chose:
print(m[umul])
# [(12189670989262225408, 12189670989262225408) -> 18446744073709551615,
# else -> 0]
That is the entire function: a one-entry lookup table plus a
default. umul(y, y) maps to the max 64-bit value because Z3
wrote that entry. Every other input, including umul(-y, -y),
falls through to the else branch and returns 0. That is
enough to make sq(y) and sqabs(y) disagree, and the
counterexample is self-consistent. It is just not consistent
with real bitvector multiplication.
So sat here is honest and useless. Z3 answered the question
we actually asked: is there some umul that breaks the
equivalence? Yes. That is not the question we meant to ask.
If you get the reduction wrong, you get a correct answer to the wrong question.
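The spurious model can be replayed outside Z3 to see exactly where it departs from real arithmetic. The sketch below hard-codes the table Z3 printed above. The names `UMUL_TABLE` and `UMUL_ELSE` are illustrative, and `bvabs` is a plain-Python stand-in for the symbolic version.

```python
BW = 64
MASK = (1 << BW) - 1

# The model Z3 reported above: one table entry plus a default.
Y = 12189670989262225408
UMUL_TABLE = {(Y, Y): 18446744073709551615}
UMUL_ELSE = 0

def umul(a, b):
    """The function Z3 invented: look up the pair, else fall back to the default."""
    return UMUL_TABLE.get((a, b), UMUL_ELSE)

def bvabs(u):
    """Signed absolute value on a BW-bit pattern, with wraparound."""
    signed = u - (1 << BW) if u >= 1 << (BW - 1) else u
    return (-u) & MASK if signed < 0 else u

sq_val = umul(Y, Y)
sqabs_val = umul(bvabs(Y), bvabs(Y))
print(sq_val, sqabs_val)  # 18446744073709551615 0

# With real wraparound multiplication the two sides agree for this same y.
real_sq = (Y * Y) & MASK
real_sqabs = (bvabs(Y) * bvabs(Y)) & MASK
print(real_sq == real_sqabs)  # True
```

Because y has its top bit set, bvabs(y) is -y, a different bit pattern from y, so the second lookup misses the table and returns the default.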
Attempt 3: uninterpreted multiply with one axiom
The key property of real multiplication that makes sq and sqabs equivalent: squaring is unchanged by negation. $(-y) \cdot (-y) = y \cdot y$ for any y. We tell the solver just this one fact:
axiom = umul(y, y) == umul(-y, -y)
s = Solver()
s.add(axiom)
s.add(Not(sq == sqabs))
result = s.check() # unsat in 0.02 seconds
Fast and correct. The solver does not need to know anything else about multiplication. It reasons about equality between terms, one case at a time:
- If bvabs(y) = y:
  - umul(bvabs(y), bvabs(y)) = umul(y, y)
  - which is sq(y).
- If bvabs(y) = -y:
  - umul(bvabs(y), bvabs(y)) = umul(-y, -y)
  - which equals umul(y, y) by the axiom
  - which is sq(y).
Either way, sq(y) and sqabs(y) land in the same equivalence
class of terms, and the solver reports unsat for "can they
ever be different?" This kind of reasoning (tracking which terms
are equal to which other terms) is called congruence closure,
and it is exactly what the Theory phase is about after the break.
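The claim that nothing else about multiplication matters can be tested empirically at a small width. The sketch below is plain Python with a fixed random seed: it builds arbitrary 4-bit binary functions, forces them to satisfy the axiom, and checks that sq and sqabs then agree on every input. It enforces the axiom for every y, which is a stronger assumption than the single instance added to the solver, and `random_umul_with_axiom` and `equivalent_under` are hypothetical helpers.

```python
import random

BW = 4
MASK = (1 << BW) - 1
rng = random.Random(0)  # fixed seed so the run is repeatable

def bvneg(u):
    return (-u) & MASK

def bvabs(u):
    signed = u - (1 << BW) if u >= 1 << (BW - 1) else u
    return bvneg(u) if signed < 0 else u

def random_umul_with_axiom():
    """A random binary function on BW-bit values with umul(y, y) == umul(-y, -y)."""
    table = {(a, b): rng.randrange(1 << BW)
             for a in range(1 << BW) for b in range(1 << BW)}
    # Overwrite diagonal entries so that y and -y always map to the same result.
    for y in range(1 << BW):
        table[(bvneg(y), bvneg(y))] = table[(y, y)]
    return table

def equivalent_under(table):
    """True if sq(y) == sqabs(y) for every y when umul is this table."""
    return all(table[(y, y)] == table[(bvabs(y), bvabs(y))]
               for y in range(1 << BW))

results = [equivalent_under(random_umul_with_axiom()) for _ in range(100)]
print(all(results))  # True
```

Every other entry of the table is random noise, yet the equivalence holds each time. Sampling is not a proof, but it matches the case analysis above, which Z3 carries out symbolically.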
The scaling test
The full bitvector approach bit-blasts multiplication, so its cost grows with bit width. The uninterpreted function approach reasons about equality, so bit width barely matters:
| Bit width | Full BV | UF + axiom |
|---|---|---|
| 32 | ~0.04s | ~0.01s |
| 64 | ~0.6s | ~0.03s |
| 128 | ~20s | ~0.08s |
| 256 | unknown | ~0.2s |
At 256 bits, the full bitvector approach times out and Z3
returns unknown. The uninterpreted function approach finishes
in about a fifth of a second. The proof Z3 finds here is
term-level: it works on the shape of the expressions, not on
the bit values. Z3 still does a little bit-level preprocessing
on the BV arguments, which is why the UF + axiom times grow
mildly with bit width. That growth is nothing like the Full BV
column, whose blow-up is what we would expect if the bits
actually mattered for the proof.
The engineering skill
Choosing the right abstraction level is an engineering decision. Too concrete (full bitvector semantics) and the solver is slow. Too abstract (uninterpreted function with no axioms) and the solver reports a spurious counterexample. The right level of abstraction (uninterpreted function with the axioms your proof actually needs) is fast and correct.
This is the same tradeoff that shows up throughout software engineering: how much detail do you model? The solver is a tool. You decide what to tell it.
The SMT Architecture
We have used three theories so far today: integer arithmetic
(the xkcd problem), bitvector arithmetic (sq and sqabs), and
equality with uninterpreted functions (the umul move). Each
of these is a separate reasoning system inside Z3. How do they
all work together?
The picture of an SMT solver is a boolean skeleton plus a collection of specialized theory solvers:
graph TD
    accTitle: SMT solver architecture
    accDescr: An SMT solver routes the boolean structure of a formula to a CDCL engine and its theory literals to specialized theory solvers, one per theory.
    F["Query (logical formula)"]
    F --> B["Boolean skeleton"]
    F --> L["Theory literals"]
    B --> CDCL["CDCL (Week 2)"]
    L --> T1["Integer arithmetic"]
    L --> T2["Bitvector arithmetic"]
    L --> T3["Arrays"]
    L --> T4["Equality + uninterpreted functions"]
    CDCL <-.-> T1
    CDCL <-.-> T2
    CDCL <-.-> T3
    CDCL <-.-> T4
The CDCL engine from Week 2 handles the boolean structure: the
ands, ors, and nots connecting everything together. Each
theory solver handles conjunctions of literals in its own
domain: the integer solver knows about +, *, <, =; the
bitvector solver knows about bvadd, bvmul, and bit-blasting;
the array solver knows about select and store; the equality
solver knows about = and uninterpreted function symbols.
Every theory solver presents the same interface to the CDCL engine: "give me a conjunction of literals in my theory, and I will tell you whether it is satisfiable." The CDCL engine does not need to know anything else about the theory.
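To make that interface concrete, here is a toy sketch in plain Python. Everything in it is hypothetical: the `TheorySolver` protocol, the tuple format for literals, and the `IntegerBoundsSolver` class are invented for illustration and do not reflect Z3's internals. The toy theory only understands bounds on integer variables.

```python
from typing import Protocol, Sequence, Tuple

# A literal of the toy theory: (variable, op, constant) with op in {">=", "<="}.
BoundLiteral = Tuple[str, str, int]

class TheorySolver(Protocol):
    """What the boolean engine needs from a theory: a yes/no on a conjunction."""
    def is_satisfiable(self, literals: Sequence[BoundLiteral]) -> bool: ...

class IntegerBoundsSolver:
    """Decides conjunctions of integer bounds such as x >= 3 and x <= 7."""

    def is_satisfiable(self, literals):
        lower, upper = {}, {}
        for var, op, const in literals:
            if op == ">=":
                lower[var] = max(lower.get(var, const), const)
            elif op == "<=":
                upper[var] = min(upper.get(var, const), const)
            else:
                raise ValueError(f"unsupported operator: {op}")
        # Satisfiable iff every variable with both bounds has a non-empty interval.
        return all(lower[v] <= upper[v] for v in lower if v in upper)

solver: TheorySolver = IntegerBoundsSolver()
print(solver.is_satisfiable([("x", ">=", 3), ("x", "<=", 7)]))                 # True
print(solver.is_satisfiable([("x", ">=", 8), ("x", "<=", 7), ("y", ">=", 0)])) # False
```

The caller never looks inside the theory. It passes a list of literals and gets back a single boolean, which is the shape of the contract described above.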
How the CDCL engine and the theory solvers cooperate — who talks to whom, when, and in what order — is the architecture of SMT. It is called DPLL(T), and it is Week 5 material. Today we are going to look inside one of these theory solvers: the equality solver. It is the simplest one, and it is the foundation that the others all build on.
What We Learned
Theories let you talk to the solver in the language of your problem. Today's key ideas:
- Richer primitives make encoding easier. In sudoku, 81 integer variables and Distinct() replaced 729 boolean variables and over 12,000 clauses. A simpler reduction is easier to get right.
- Abstraction is a lever on performance. Uninterpreted functions hide details the solver does not need to reason about. With the right axiom, Z3 proved sq and sqabs equivalent at the level of terms, not bits. Too concrete is slow. Too abstract is wrong. Just right is fast and correct.
- UNSAT often means verified. To check that a property holds for all inputs, negate it and search for a counterexample. If Z3 returns unsat, no counterexample exists and the property holds. This is the verification pattern we will use for the rest of the course.
How Does Z3 Decide This?
Before we go to break, one more question. Consider this formula: $f(f(f(a))) = a \;\land\; f(f(f(f(f(a))))) = a \;\land\; f(a) \neq a$
Here f is a completely uninterpreted function and a is a
constant in some uninterpreted sort. Z3 knows nothing about
f: it could be anything. Is there an assignment of f and
a that satisfies all three constraints at once? Think about
it for a minute before reading on.
In Z3 we need one more piece we have not used yet: an
uninterpreted sort. Declaring S = DeclareSort('S') tells
Z3 "there is some set S; I do not care what is in it." Then
a is an element of S, and f is a function from S to
S. Nothing about S is fixed; Z3 is free to pick any set it
likes when it builds a model.
S = DeclareSort('S')
f = Function('f', S, S)
a = Const('a', S)
# Build nested applications step by step
f1 = f(a)
f2 = f(f1)
f3 = f(f2)
f4 = f(f3)
f5 = f(f4)
s = Solver()
s.add(f3 == a)
s.add(f5 == a)
s.add(f1 != a)
result = s.check() # unsat in ~5 milliseconds
Z3 says unsat. No such f and a exist. And it figures this
out in about five milliseconds. Remember that f is
uninterpreted: Z3 did not enumerate all possible functions, or
all possible values for a. It reasoned about the equality
structure of the formula and determined that the constraints
contradict each other.
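For contrast, here is what the enumeration Z3 avoided would look like. The sketch below is plain Python: it tries every function f and every element a over domains of size 1 to 4 and counts the assignments that satisfy all three constraints. The helper `count_models` is illustrative, and the size limit of 4 is an arbitrary choice to keep the search small.

```python
from itertools import product

def count_models(max_size):
    """Count (domain size, f, a) combinations satisfying all three constraints."""
    found = 0
    for size in range(1, max_size + 1):
        domain = range(size)
        # A function f: domain -> domain is a tuple where f[i] is the image of i.
        for f in product(domain, repeat=size):
            for a in domain:
                f1 = f[a]
                f3 = f[f[f1]]
                f5 = f[f[f3]]
                if f3 == a and f5 == a and f1 != a:
                    found += 1
    return found

print(count_models(4))  # 0
```

Finding no model up to size 4 is evidence, not a proof, because the sort S could have any number of elements. Z3's unsat covers every possible S and every possible f at once.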
How? That is the Theory phase. The algorithm is called congruence closure, and it is what we will spend the next fifty minutes on.
Demo Code
All demo files for this phase are in the
course code repo.
Clone the repo and run them locally with python3 sudoku-smt.py
etc. (requires z3-solver; see Setup).