  (Week 6)

DPLL(T)

CDCL is good at boolean SAT. The theory solvers from L03 and L04 decide conjunctions of atoms in one theory. Nelson-Oppen lets us combine theories, but still only a conjunction goes in. Almost every formula we have written uses disjunctions. DPLL(T) is the SMT algorithm that connects CDCL to the theory layer.

Where Nelson-Oppen Left Off

The last page ended with a formula that exposes the gap:

(x = y ∧ f(x) ≠ f(y)) ∨ (a + b > 5 ∧ a + b < 3)

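Each disjunct is unsatisfiable:

  • x = y ∧ f(x) ≠ f(y) violates congruence in EUF: equal arguments force f(x) = f(y).
  • a + b > 5 ∧ a + b < 3 has no solution in arithmetic: no value of a + b is both above 5 and below 3.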

So the whole formula is unsatisfiable. But Nelson-Oppen cannot decide this directly, because its interface accepts a conjunction of theory literals. The OR is outside its scope.

DPLL(T) splits the work by what each engine is good at. The SAT engine handles the boolean skeleton: the ANDs, ORs, and NOTs over theory atoms, treating each atom as an opaque true/false. The theory solver (or a Nelson-Oppen combination) checks the conjunction of atoms the SAT engine commits to. The two layers communicate in a loop until one terminates with a definitive answer.

The three-layer architecture from L05 is now fully populated:

graph LR
    accTitle: SMT solver architecture
    accDescr: An SMT formula enters DPLL(T), which sends conjunctions of theory literals to a Nelson-Oppen layer, which dispatches to individual theory solvers.
    F["SMT formula φ"]
    DPLLT["DPLL(T)<br/>(today)"]
    NO["Nelson-Oppen<br/>(L05)"]
    Solvers["theory solvers<br/>(L03–L04)"]
    F --> DPLLT
    DPLLT -- "conjunctions" --> NO
    NO --> Solvers

This is the last time we need this picture. After today, the full stack is visible and all three layers have algorithms behind them.

The Idea

DPLL(T) decides quantifier-free first-order formulas. The core trick: replace each theory atom in ϕ with a fresh propositional variable to get a boolean abstraction ϕP. A tiny example:

ϕ = (x = 1) ∧ (x = 2)

Each atom becomes a proposition:

ϕP = b1 ∧ b2

ϕP is propositionally satisfiable: set b1=b2=true. But ϕ is theory-unsatisfiable: x cannot equal 1 and 2. The abstraction has thrown away theory information.

The key fact: the abstraction over-approximates satisfiability. Every theory-SAT model of ϕ lifts to a propositional model of ϕP. The reverse fails: a propositional model of ϕP may have no theory witness.

flowchart TB
    accTitle: Boolean abstraction over-approximates
    accDescr: The set of theory-satisfying models of phi is contained inside the set of propositionally-satisfying models of phi_P.
    subgraph outer["Propositional models of φP"]
        inner["Theory models of φ"]
    end

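Two consequences:

  • If ϕP is propositionally UNSAT, then ϕ is theory-UNSAT. No theory call is needed.
  • If ϕP is propositionally SAT, nothing follows yet. Each propositional model is only a candidate and must be checked by the theory.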

DPLL(T) is built on this asymmetry. Enumerate candidate models of ϕP with CDCL, hand each one to the theory solver, keep going until you either run out (UNSAT) or find one the theory blesses (SAT).
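The asymmetry can be checked by brute force on the two-atom example. This is a minimal sketch rather than the course demo: the names `ATOMS`, `phi_p`, and `theory_ok` are made up for illustration, and the theory check is hard-coded for atoms of the form x = c with distinct constants.

```python
from itertools import product

# Hypothetical encoding: b1 stands for (x = 1), b2 for (x = 2).
ATOMS = {"b1": 1, "b2": 2}

def phi_p(model):
    """Boolean abstraction of (x = 1) AND (x = 2)."""
    return model["b1"] and model["b2"]

def theory_ok(model):
    """Atoms x = c with distinct constants: at most one of them may be true."""
    return sum(model[v] for v in ATOMS) <= 1

all_models = [dict(zip(ATOMS, bits))
              for bits in product([True, False], repeat=len(ATOMS))]
prop_models = [m for m in all_models if phi_p(m)]
theory_models = [m for m in prop_models if theory_ok(m)]

print(len(prop_models))    # propositional models of the abstraction
print(len(theory_models))  # those that survive the theory check
```

The abstraction has one propositional model and none of its models survive the theory check, so the theory models form a (here empty) subset of the propositional ones.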

Offline DPLL(T)

The algorithm

The simplest version is offline DPLL(T). Offline means the theory solver is called only after the SAT engine has produced a complete boolean assignment, not during the search.

The shape of the algorithm before any details:

  1. Build a boolean abstraction ϕP of ϕ.
  2. Ask CDCL: is ϕP propositionally satisfiable?
    • No: return UNSAT.
    • Yes: get a propositional model μP.
  3. Ask the theory: does μP correspond to a real solution of ϕ?
    • Yes: return SAT.
    • No: add a clause to ϕP that rules out μP. Go to 2.

The theory either blesses an abstraction model or makes us try again with one fewer candidate.

The abstraction step happens once; the other three steps loop. Each has a standard name: T2B for the abstraction, CDCL for the propositional check, B2T for the back-translation to theory literals, T-solve for the theory check. The four moves in detail:

T2B. Build the boolean abstraction ϕP by replacing each theory atom in ϕ with a fresh propositional variable. The formula structure (∧, ∨, ¬) is preserved.

CDCL. Run a SAT solver on ϕP. If the result is UNSAT, return UNSAT. If it produces a satisfying model μP, proceed.

B2T. Translate μP back to theory literals by reversing the T2B substitution. Call the result μT.

T-solve. Ask the theory solver whether μT is satisfiable. If SAT, return SAT. If UNSAT, learn the conflict clause ¬μP, add it to ϕP, and go back to CDCL.

Putting the four moves together:

Offline-DPLL_T(T-formula φ):
    φP ← T2B(φ)
    while (TRUE) do
        μP, res ← CDCL(φP)
        if res = UNSAT then return UNSAT
        μT ← B2T(μP)
        T-res ← T-solve(μT)
        if T-res = SAT then return SAT
        else φP ← φP ∧ ¬μP
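
The pseudocode above can be sketched in a few lines of Python. This is an illustration under strong assumptions, not the `01-dpllt.py` demo: CDCL is replaced by brute-force enumeration (`brute_force_sat`), ϕP is given directly as CNF with DIMACS-style integer literals, and the toy theory (`t_solve`) only understands atoms of the form x = c over the integers. All function names are hypothetical.

```python
from itertools import product

def brute_force_sat(clauses, n_vars):
    """Stand-in for CDCL: first total assignment satisfying every clause, or None."""
    for bits in product([True, False], repeat=n_vars):
        model = {v + 1: bits[v] for v in range(n_vars)}
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return model
    return None

def t_solve(mu_t):
    """Toy theory: literals are (c, polarity), meaning x = c or x != c."""
    equal = {c for c, positive in mu_t if positive}
    distinct = {c for c, positive in mu_t if not positive}
    return len(equal) <= 1 and not (equal & distinct)

def offline_dpll_t(clauses, atoms):
    """atoms maps a variable index v to the constant c of its atom x = c."""
    clauses = list(clauses)
    theory_calls = 0
    while True:
        mu_p = brute_force_sat(clauses, len(atoms))
        if mu_p is None:
            return "UNSAT", theory_calls
        mu_t = [(atoms[v], val) for v, val in mu_p.items()]      # B2T
        theory_calls += 1
        if t_solve(mu_t):
            return "SAT", theory_calls
        # Learn the blocking clause: the negation of the whole model.
        clauses.append([-v if val else v for v, val in mu_p.items()])

# (x = 1) AND ((x = 2) OR (x = 3)) as CNF over b1, b2, b3.
atoms = {1: 1, 2: 2, 3: 3}
result, calls = offline_dpll_t([[1], [2, 3]], atoms)
print(result, calls)
```

The enumeration order of this stand-in differs from a real CDCL solver, so the candidate models may appear in a different order than in a hand trace.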

The control flow as a picture:

graph TD
    accTitle: Offline DPLL(T) main loop
    accDescr: T2B builds the boolean abstraction; CDCL either returns UNSAT or a model; the model is refined by B2T and given to T-solve; T-solve either returns SAT or triggers learning a conflict clause and looping back to CDCL.
    Start(["T-formula φ"])
    T2B["φP ← T2B(φ)"]
    CDCL{"CDCL(φP)"}
    UNSAT(["return UNSAT"])
    B2T["μT ← B2T(μP)"]
    Tsolve{"T-solve(μT)"}
    SAT(["return SAT"])
    Learn["φP ← φP ∧ ¬μP"]
    Start --> T2B
    T2B --> CDCL
    CDCL -- "UNSAT" --> UNSAT
    CDCL -- "model μP" --> B2T
    B2T --> Tsolve
    Tsolve -- "SAT" --> SAT
    Tsolve -- "UNSAT" --> Learn
    Learn --> CDCL

T2B: the boolean skeleton

T2B walks ϕ and replaces each theory atom with a fresh propositional variable. The boolean connectives stay where they are.

Take the formula we will trace shortly:

ϕ = (x = 1) ∧ ((x = 2) ∨ (x = 3))

Three distinct theory atoms appear: a1=(x=1), a2=(x=2), a3=(x=3). Introduce one fresh propositional variable per atom and substitute:

ϕP = b1 ∧ (b2 ∨ b3)

The shape of ϕ carries over exactly. Only the leaves change.

Formally, T2B is recursive on the structure of the formula:

T2B(ai) = bi
T2B(ϕ1 ∧ ϕ2) = T2B(ϕ1) ∧ T2B(ϕ2)
T2B(ϕ1 ∨ ϕ2) = T2B(ϕ1) ∨ T2B(ϕ2)
T2B(¬ϕ) = ¬T2B(ϕ)

The atom case is the substitution. The connective cases say leave the structure alone.

T2B keeps a table of which propositional variable it picked for each atom. That table is what makes B2T possible: B2T reads the same table backwards.
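
The recursion and the atom table can be sketched directly. The formula encoding below (nested tuples tagged `"atom"`, `"and"`, `"or"`, `"not"`) and the function name `t2b` are assumptions made for this example, not the representation used by any particular solver.

```python
def t2b(phi, table):
    """Replace each theory atom with a propositional variable.

    `table` maps atom -> variable name and is extended in place, so a
    repeated atom always gets the same variable.
    """
    op = phi[0]
    if op == "atom":
        atom = phi[1]
        if atom not in table:
            table[atom] = f"b{len(table) + 1}"
        return ("var", table[atom])
    if op == "not":
        return ("not", t2b(phi[1], table))
    # "and" / "or": keep the connective, recurse into both children.
    return (op, t2b(phi[1], table), t2b(phi[2], table))

# (x = 1) AND ((x = 2) OR (x = 3))
phi = ("and", ("atom", "x=1"), ("or", ("atom", "x=2"), ("atom", "x=3")))
table = {}
phi_p = t2b(phi, table)
print(phi_p)
print(table)
```

Only the leaves change: the result has the same `and`/`or` shape as the input, and `table` records the three substitutions.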

B2T: from boolean back to theory

When CDCL hands back a propositional model μP, B2T turns it into a conjunction of theory literals by undoing the T2B substitution literal by literal.

| Literal in μP | B2T produces | Reading |
|---|---|---|
| bi (assigned true) | ai | the atom must hold |
| ¬bi (assigned false) | ¬ai | the atom must not hold |

The negation case is where students get tripped up. If the SAT solver assigned b2=false, that is a decision that atom a2 is false. For a2=(x=2), that means x ≠ 2.

For example, if μP = b1 ∧ ¬b2 ∧ b3, then:

B2T(μP) = (x = 1) ∧ (x ≠ 2) ∧ (x = 3)
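
A minimal sketch of the back-translation, assuming the T2B table is a plain dictionary from atom to variable name and a propositional model is a dictionary from variable name to truth value. The names `b2t` and `show` are hypothetical.

```python
def b2t(mu_p, table):
    """Read the T2B table backwards: variable -> atom, keeping the polarity."""
    var_to_atom = {var: atom for atom, var in table.items()}
    return [(var_to_atom[var], value) for var, value in sorted(mu_p.items())]

def show(mu_t):
    """Render a list of (atom, polarity) pairs as a conjunction."""
    return " ∧ ".join(atom if positive else f"¬({atom})" for atom, positive in mu_t)

table = {"x=1": "b1", "x=2": "b2", "x=3": "b3"}
mu_p = {"b1": True, "b2": False, "b3": True}
mu_t = b2t(mu_p, table)
print(show(mu_t))
```

A variable assigned false becomes a negated atom, which is the case the table above singles out.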

Pair work: T2B and B2T

Take three minutes with your neighbor on a fresh formula:

ϕ = (a = b) ∨ ((f(a) = c) ∧ (f(b) = c))
  1. List the theory atoms in ϕ.
  2. Apply T2B to get ϕP.
  3. Suppose CDCL returns μP = ¬b1 ∧ b2 ∧ b3. Apply B2T to get μT.
Answers
  1. Three atoms: a1 = (a = b), a2 = (f(a) = c), a3 = (f(b) = c).
  2. φP = b1 ∨ (b2 ∧ b3).
  3. μT = (a ≠ b) ∧ (f(a) = c) ∧ (f(b) = c).

The ¬b1 in μP is the case students often miss. b1 = false means atom a1 does not hold, so B2T produces ¬a1, which is a ≠ b.

Bonus: μT is theory-SAT in EUF. Pick a, b distinct with f(a) = f(b) = c.

Worked Example

Run the loop on a formula small enough to trace by hand:

ϕ = (x = 1) ∧ ((x = 2) ∨ (x = 3))

It is unsatisfiable: x cannot equal 1 and also equal 2 or 3.

Setup. T2B maps each atom to a fresh boolean variable:

| Theory atom | Bool var |
|---|---|
| x = 1 | b1 |
| x = 2 | b2 |
| x = 3 | b3 |

So ϕP = b1 ∧ (b2 ∨ b3). This is propositionally satisfiable. CDCL will find a model on the first call.

CDCL is non-deterministic: different implementations may enumerate satisfying assignments in a different order and reach UNSAT in a different number of iterations. The demo in 01-dpllt.py may produce a different ordering.


Iteration 1.

CDCL proposes μP = b1 ∧ b2 ∧ b3.

B2T gives μT = (x = 1) ∧ (x = 2) ∧ (x = 3).

T-solve returns UNSAT: x cannot simultaneously equal 1, 2, and 3.

Learn the conflict clause ¬b1 ∨ ¬b2 ∨ ¬b3. Update:

ϕP ← b1 ∧ (b2 ∨ b3) ∧ (¬b1 ∨ ¬b2 ∨ ¬b3)

Iteration 2.

CDCL proposes μP = b1 ∧ ¬b2 ∧ b3.

B2T gives μT = (x = 1) ∧ (x ≠ 2) ∧ (x = 3).

T-solve returns UNSAT: x=1 and x=3 are inconsistent.

Learn ¬b1 ∨ b2 ∨ ¬b3. Update ϕP to add this clause.


Iteration 3.

CDCL proposes μP = b1 ∧ b2 ∧ ¬b3.

B2T gives μT = (x = 1) ∧ (x = 2) ∧ (x ≠ 3).

T-solve returns UNSAT: x=1 and x=2 are inconsistent.

Learn ¬b1 ∨ ¬b2 ∨ b3. The accumulated ϕP is now:

ϕP = b1 ∧ (b2 ∨ b3) ∧ (¬b1 ∨ ¬b2 ∨ ¬b3) ∧ (¬b1 ∨ b2 ∨ ¬b3) ∧ (¬b1 ∨ ¬b2 ∨ b3)

Iteration 4.

CDCL says ϕP is UNSAT. With b1=true forced (it is a unit), the three learned clauses reduce to ¬b2 ∨ ¬b3, b2 ∨ ¬b3, and ¬b2 ∨ b3. Together with the original b2 ∨ b3, these are four constraints over two variables that admit no satisfying assignment.

Return UNSAT.


Summary:

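| Iter | μP | μT | T-result | Learned |
|---|---|---|---|---|
| 1 | b1 ∧ b2 ∧ b3 | (x = 1) ∧ (x = 2) ∧ (x = 3) | UNSAT | ¬b1 ∨ ¬b2 ∨ ¬b3 |
| 2 | b1 ∧ ¬b2 ∧ b3 | (x = 1) ∧ (x ≠ 2) ∧ (x = 3) | UNSAT | ¬b1 ∨ b2 ∨ ¬b3 |
| 3 | b1 ∧ b2 ∧ ¬b3 | (x = 1) ∧ (x = 2) ∧ (x ≠ 3) | UNSAT | ¬b1 ∨ ¬b2 ∨ b3 |
| 4 | CDCL: UNSAT | | | |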

The 01-dpllt.py demo runs this trace with verbose=True. Every column of the table appears in the printed output.

Soundness and Termination

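Soundness. Recall from The Idea that ϕP over-approximates ϕ: every theory model of ϕ lifts to a propositional model of ϕP, but the reverse can fail. The two ways the loop returns are both sound:

  • Return SAT. The theory solver has confirmed that μT is satisfiable, and μP satisfies ϕP, so a theory model of μT is a model of ϕ.
  • Return UNSAT. CDCL has found ϕP propositionally UNSAT. Every learned clause ¬μP excludes only an assignment the theory rejected, so ϕP still over-approximates ϕ, and ϕ has no theory model.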

Termination. There are finitely many boolean assignments to ϕP, and each T-UNSAT iteration eliminates at least one of them by learning ¬μP. The loop must run out of assignments and terminate.

Better Conflict Clauses: Unsatisfiable Cores

The learned clause ¬μP is a sledgehammer. It names every literal in μP, so it blocks exactly one assignment: the one we just rejected.

If ϕP has 100 variables and only three of them actually caused the theory conflict, the other 97 literals are along for the ride. The learned clause has 100 literals and rules out 1 assignment out of 2^100. The next CDCL call faces almost the same search space.

A smaller clause learns the same fact about the conflict but blocks more candidates. A 3-literal clause ¬b1 ∨ ¬b2 ∨ ¬b3 rules out every assignment that sets those three literals true, no matter what the other 97 do. That is 2^97 assignments eliminated in a single step.
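
The counting argument can be checked by brute force on a smaller instance. The sketch below uses 10 variables instead of 100 so that enumeration is feasible; the helper name `blocked` and the integer-literal clause encoding are assumptions for illustration.

```python
from itertools import product

def blocked(clause, n_vars):
    """Count total assignments over n_vars variables that falsify the clause."""
    count = 0
    for bits in product([False, True], repeat=n_vars):
        model = {v + 1: bits[v] for v in range(n_vars)}
        if not any(model[abs(l)] == (l > 0) for l in clause):
            count += 1
    return count

n = 10
full_clause = [-v for v in range(1, n + 1)]   # negation of the all-true model
core_clause = [-1, -2, -3]                    # negation of a 3-literal core

full_blocked = blocked(full_clause, n)
core_blocked = blocked(core_clause, n)
print(full_blocked, core_blocked)
```

With n = 10 the full clause blocks one assignment and the 3-literal clause blocks 2^(n − 3) = 128, the same ratio the text describes for n = 100.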

So we want to identify the literals that actually caused the theory conflict, and learn only those. Two literals can already be inconsistent (x=1 and x=2 on their own); the rest are bystanders.

The fix: minimum unsatisfiable core. Given a theory-UNSAT conjunction μT, find the smallest subset S ⊆ μT that is still theory-UNSAT. Convert S back to propositional variables via T2B and learn ¬T2B(S) instead of ¬μP.

The change is to the conflict-clause-learning branch on the last line of the loop body. The original else line:

else φP ← φP ∧ ¬μP

expands to three lines:

else
    S ← MinUnsatCore(μT)
    t ← T2B(S)
    φP ← φP ∧ ¬t

The rest of the algorithm is unchanged.
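
A sketch of the modified loop, under the same kind of assumptions as any toy model of DPLL(T): brute-force enumeration stands in for CDCL, the theory only understands atoms x = c, and `shrink` is a simple deletion pass standing in for MinUnsatCore. All names are hypothetical, and the order in which `shrink` tries literals is an arbitrary choice that determines which core it finds.

```python
from itertools import product

def first_model(clauses, n_vars):
    """Stand-in for CDCL: first total assignment satisfying every clause."""
    for bits in product([True, False], repeat=n_vars):
        model = {v + 1: bits[v] for v in range(n_vars)}
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return model
    return None

def t_sat(lits, atoms):
    """Literal v means x = atoms[v]; literal -v means x != atoms[v]."""
    equal = {atoms[l] for l in lits if l > 0}
    distinct = {atoms[-l] for l in lits if l < 0}
    return len(equal) <= 1 and not (equal & distinct)

def shrink(lits, atoms):
    """Drop literals (last to first) whose removal keeps the set theory-UNSAT."""
    kept = list(lits)
    for l in reversed(lits):
        trial = [k for k in kept if k != l]
        if not t_sat(trial, atoms):
            kept = trial
    return kept

def offline(clauses, atoms, use_core):
    clauses, calls = list(clauses), 0
    while True:
        mu_p = first_model(clauses, len(atoms))
        if mu_p is None:
            return "UNSAT", calls
        lits = [v if val else -v for v, val in mu_p.items()]
        calls += 1                       # counts top-level theory checks only
        if t_sat(lits, atoms):
            return "SAT", calls
        culprit = shrink(lits, atoms) if use_core else lits
        clauses.append([-l for l in culprit])

atoms = {1: 1, 2: 2, 3: 3}               # b1: x = 1, b2: x = 2, b3: x = 3
cnf = [[1], [2, 3]]                      # b1 AND (b2 OR b3)
without_core = offline(cnf, atoms, use_core=False)
with_core = offline(cnf, atoms, use_core=True)
print(without_core, with_core)
```

On this formula the run without cores rejects three candidate models and the run with cores rejects two. The extra theory checks performed inside `shrink` are not counted, which is part of the cost a real solver has to weigh.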

Pair work: minimum unsat core

Take 90 seconds. Find a minimum unsatisfiable core of:

(x ≥ 5) ∧ (x ≤ 3) ∧ (x = 4)
Answers

Three of them work:

  • {x ≥ 5, x ≤ 3}
  • {x ≥ 5, x = 4}
  • {x ≤ 3, x = 4}

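Any two of the three literals are jointly inconsistent, and each literal alone is satisfiable, so no element of a pair can be removed without losing UNSAT. Strictly, that property is minimality; here every minimal core also has the minimum size, two. Neither notion implies uniqueness. A real solver picks whichever core it finds first based on the order it tries removing literals.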

With this change, the trace on the worked example shortens to two theory calls:

Iteration 1 (with cores).

CDCL proposes μP = b1 ∧ b2 ∧ b3.

B2T gives μT = (x = 1) ∧ (x = 2) ∧ (x = 3).

T-solve returns UNSAT. The minimum unsatisfiable core is {(x=1),(x=2)}: those two are inconsistent on their own, and dropping either one makes the set satisfiable.

Learn the binary clause ¬b1 ∨ ¬b2. Update:

ϕP ← b1 ∧ (b2 ∨ b3) ∧ (¬b1 ∨ ¬b2)

Iteration 2 (with cores).

CDCL proposes μP = b1 ∧ ¬b2 ∧ b3.

B2T gives μT = (x = 1) ∧ (x ≠ 2) ∧ (x = 3).

T-solve returns UNSAT. The core is {(x=1),(x=3)}: x=1 and x=3 are inconsistent.

Learn ¬b1 ∨ ¬b3. Update ϕP to add this clause.

Iteration 3.

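CDCL finds ϕP UNSAT. The reasoning chain:

  1. b1 is a unit clause, so b1 = true.
  2. ¬b1 ∨ ¬b2 then propagates b2 = false.
  3. ¬b1 ∨ ¬b3 then propagates b3 = false.
  4. b2 ∨ b3 is now falsified, a conflict at decision level 0.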

Return UNSAT.

Three theory calls in the original trace become two with cores, on a tiny formula. On a real formula with hundreds of variables and only a handful of culprits per conflict, the savings can be exponential, because each short clause blocks exponentially many assignments at once.

Computing the true minimum unsatisfiable core is expensive in general, so real solvers use near-minimal cores: drop literals from μT one at a time, test whether the remaining set is still theory-UNSAT, and keep only the literals that the test shows are necessary. The improvement in practice is substantial even with near-minimal cores.
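
The deletion procedure just described can be sketched on a toy theory of integer bounds. The literal encoding `(op, c)` and the names `t_unsat` and `shrink_core` are assumptions for this example. Running it on two orderings of the same literals shows that the core found depends on the order in which literals are tried.

```python
def t_unsat(literals):
    """Toy theory over one integer x: literals are (op, c), op in {'>=', '<=', '=='}."""
    lo, hi = float("-inf"), float("inf")
    for op, c in literals:
        if op in (">=", "=="):
            lo = max(lo, c)
        if op in ("<=", "=="):
            hi = min(hi, c)
    return lo > hi                      # empty interval: no value of x fits

def shrink_core(mu_t):
    """Deletion-based minimization: drop each literal whose removal keeps UNSAT."""
    core = list(mu_t)
    for lit in list(mu_t):
        trial = [l for l in core if l != lit]
        if t_unsat(trial):
            core = trial                # lit was a bystander
    return core

mu_t = [(">=", 5), ("<=", 3), ("==", 4)]   # x >= 5, x <= 3, x = 4
core_forward = shrink_core(mu_t)
core_backward = shrink_core(list(reversed(mu_t)))
print(core_forward)
print(core_backward)
```

Each call performs one theory check per literal, so the cost is linear in the size of μT. The result is irreducible (no remaining literal can be dropped) but is not guaranteed to be the smallest core that exists.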

An unresolved limitation remains: offline DPLL(T) still requires a complete boolean assignment before calling the theory solver at all. The theory solver cannot prune partial assignments early. Online DPLL(T) fixes this.

Online DPLL(T)

Offline DPLL(T) waits too long. CDCL might commit at decision level 1 to two literals that are already theory-inconsistent (say b1=(x=1) and b2=(x=2)), but we keep going until CDCL assigns every variable before asking the theory. On a formula with 100 variables, that is up to 98 wasted decisions per dead branch.

Online DPLL(T) interleaves. Each time CDCL reaches a stable state (a BCP fixpoint), ask the theory whether the partial assignment is already broken. If yes, conflict now and backtrack. If the theory derives an implied literal, take it and let BCP propagate further.

The shape of the loop:

  1. T-Decide. Pick an unassigned literal and add it to μP.
  2. T-Deduce. Run BCP to a fixpoint, then hand the current partial B2T(μP) to the theory. The theory either reports CONFLICT (theory-UNSAT under the partial), reports SAT with every variable assigned (return SAT), derives a new implied literal (add to μP and keep propagating), or reports SAT with variables left unassigned (back to step 1).
  3. T-AnalyzeConflict / T-Backtrack. On a conflict, learn a clause and jump back to the right decision level.

The four standard CDCL procedures (from L02) each acquire a theory-aware version, and a new T-Preprocess step runs once before the main loop. The pseudocode signature carries the theory assignment μ with the search state:

Online-DPLL_T(T-formula φ, T-assignment μ):
    if T-Preprocess(φ, μ) = CONFLICT then return UNSAT
    φP, μP ← T2B(φ), T2B(μ)
    while (TRUE) do
        T-Decide(φP, μP)
        while (TRUE) do
            res ← T-Deduce(φP, μP)
            if res = SAT then return SAT
            else if res = CONFLICT then
                blevel ← T-AnalyzeConflict(φP, μP)
                if blevel < 0 then return UNSAT
                else T-Backtrack(blevel, φP, μP)
            else break

The correspondence with L02 CDCL:

| CDCL (L02) | Online DPLL(T) | What changes |
|---|---|---|
| (none) | T-Preprocess | One-time simplification before the main loop |
| Decide | T-Decide | May use theory semantics to guide variable choice |
| BCP | T-Deduce | Also calls the theory solver on each partial assignment; adds early pruning and theory propagation |
| AnalyzeConflict | T-AnalyzeConflict | Can generate mixed boolean-theory conflict clauses |
| Backtrack | T-Backtrack | Also undoes incremental theory solver state |

T-Preprocess simplifies ϕ once before the loop: drop redundant operators, exploit associativity, apply theory-specific simplifications. If the formula is already a conflict, return UNSAT immediately.

T-Decide picks an unassigned propositional literal and adds it to μP. The theory-aware version can use theory semantics to choose, for example preferring literals likely to propagate in the theory.

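T-Deduce is where the inlining happens. It runs BCP on ϕP and μP as usual, then calls the theory solver on the current partial assignment B2T(μP) each time BCP reaches a fixpoint:

  • Early pruning. If the partial assignment is already theory-UNSAT, report a conflict now rather than after a complete assignment.
  • Theory propagation. If the theory derives an implied literal, add it to μP and let BCP continue.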

T-AnalyzeConflict extends CDCL's first-UIP analysis. Boolean conflicts produce the same clause as ordinary CDCL. Theory conflicts produce a mixed boolean-theory conflict clause, with both propositional variables and theory-implied literals in a single learned clause.

T-Backtrack adds the learned clause to ϕP (T-learning) and undoes all assignments above the target level (T-backjumping). The theory solver also rolls back its incremental state.
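
The effect of consulting the theory on partial assignments can be seen in a stripped-down sketch. This is plain chronological backtracking, with no BCP, clause learning, backjumping, or theory propagation, so it illustrates early pruning only. The toy theory handles atoms x = c, and the names `t_consistent`, `falsified`, and `search` are hypothetical.

```python
def t_consistent(partial, atoms):
    """Variable v stands for x = atoms[v]; only assigned variables are checked."""
    equal = {atoms[v] for v, val in partial.items() if val}
    distinct = {atoms[v] for v, val in partial.items() if not val}
    return len(equal) <= 1 and not (equal & distinct)

def falsified(clause, partial):
    """True when every literal of the clause is assigned and false."""
    return all(partial.get(abs(l)) == (l < 0) for l in clause)

def search(clauses, atoms, partial, stats, early):
    stats["nodes"] += 1
    if any(falsified(c, partial) for c in clauses):
        return False                                   # boolean conflict
    complete = len(partial) == len(atoms)
    # early=True asks the theory at every node; early=False only at leaves.
    if (early or complete) and not t_consistent(partial, atoms):
        return False                                   # theory conflict
    if complete:
        return True
    v = min(u for u in atoms if u not in partial)      # decide lowest unassigned
    return any(search(clauses, atoms, {**partial, v: val}, stats, early)
               for val in (True, False))

# (x = 1) AND ((x = 2) OR ... OR (x = 6)): unsatisfiable.
atoms = {v: v for v in range(1, 7)}
cnf = [[1], [2, 3, 4, 5, 6]]
late, eager = {"nodes": 0}, {"nodes": 0}
sat_late = search(cnf, atoms, {}, late, early=False)
sat_eager = search(cnf, atoms, {}, eager, early=True)
print(sat_late, late["nodes"], sat_eager, eager["nodes"])
```

Both runs report unsatisfiable. Checking the theory only on complete assignments visits 65 search nodes on this formula; checking at every node visits 13, because each branch that sets a second equality true is cut immediately.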

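The payoff: dead branches are cut at the first BCP fixpoint where the theory can refute them, and theory-implied literals feed BCP instead of being rediscovered by search.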

Delayed Theory Combination

Online DPLL(T) inlined the theory layer into CDCL but left Nelson-Oppen as a black box on the inside. Each T-Deduce call hands a conjunction to Nelson-Oppen and waits. The two solvers inside propagate equalities to each other via the L05 protocol, but CDCL cannot see those equalities or learn from them. The clean modularity hides exactly the facts CDCL would use for early pruning and conflict analysis.

Delayed Theory Combination (DTC) breaks that boundary by lifting Nelson-Oppen's internal handshake into CDCL.

Recall from L05: when two theories T1 and T2 share constants c1, …, cn, they cooperate by exchanging equalities of the form eij = (ci = cj). These C(n, 2) = n(n − 1)/2 candidate equalities are the interface equalities: they are exactly the facts the two theories need to share to decide a joint formula. Nelson-Oppen propagates them inside its own loop, hidden from CDCL:

graph TB
    accTitle: Online DPLL(T) architecture before DTC
    accDescr: CDCL hands a joint assignment to a combined T1-T2 solver implemented as Nelson-Oppen, which propagates interface equalities between T1 and T2 internally.
    OC["CDCL"]
    ONO["T₁ ∪ T₂ solver<br/>(Nelson-Oppen)"]
    OT1["T₁"]
    OT2["T₂"]
    OC -- "μ₁ ∪ μ₂" --> ONO
    ONO -- "∨eᵢⱼ" --> OT1
    ONO -- "∨eᵢⱼ" --> OT2
    OT1 -- "sat/unsat" --> ONO
    OT2 -- "sat/unsat" --> ONO

DTC pulls the interface equalities out. Each eij gets a fresh boolean variable bij, added to the CDCL abstraction:

ϕP ← ϕP ∪ {bij}

graph TB
    accTitle: DTC architecture
    accDescr: CDCL holds interface-equality booleans b_ij in its abstraction. T1 and T2 each talk directly to CDCL, never to each other.
    DC["CDCL<br/>(+ bᵢⱼ for each eᵢⱼ)"]
    DT1["T₁"]
    DT2["T₂"]
    DC -- "μ_T₁, μ_e" --> DT1
    DC -- "μ_T₂, μ_e" --> DT2
    DT1 -- "sat/unsat, implied eᵢⱼ" --> DC
    DT2 -- "sat/unsat, implied eᵢⱼ" --> DC

Each theory solver now runs separately and talks only to CDCL, never to the other theory directly. When T1 derives ci=cj under the current partial assignment, it propagates bij=true back to CDCL. CDCL then hands bij=true to T2 the same way it hands over any other assignment. What used to be an internal Nelson-Oppen handshake is now ordinary unit propagation on a boolean variable.

The Online DPLL(T) procedures adapt to the split. T-Deduce routes the partial assignment by theory: T1-atoms to T1, T2-atoms to T2, interface equalities to both. T-AnalyzeConflict and T-Backtrack collect implied literals from each side and merge them. Early pruning and theory propagation now apply independently to each theory.
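
The bookkeeping behind this split is small enough to sketch. The code below only generates one boolean name per pair of shared constants and routes a partial assignment to the two theories; it does not implement either theory solver. The naming scheme `b_ci_cj`, the `owner` map, and both function names are assumptions for illustration.

```python
from itertools import combinations

def interface_booleans(shared_constants):
    """One fresh boolean name per unordered pair of shared constants."""
    return {
        (ci, cj): f"b_{ci}_{cj}"
        for ci, cj in combinations(sorted(shared_constants), 2)
    }

def route(assignment, owner):
    """Split a partial assignment by theory; interface equalities go to both."""
    to_t1, to_t2 = {}, {}
    for var, value in assignment.items():
        if owner[var] in ("T1", "interface"):
            to_t1[var] = value
        if owner[var] in ("T2", "interface"):
            to_t2[var] = value
    return to_t1, to_t2

shared = ["c1", "c2", "c3"]
b = interface_booleans(shared)          # 3 constants give 3 * 2 / 2 = 3 pairs

# Hypothetical ownership: b1 abstracts a T1 atom, b2 a T2 atom.
owner = {"b1": "T1", "b2": "T2", **{name: "interface" for name in b.values()}}
t1_view, t2_view = route({"b1": True, "b2": False, "b_c1_c2": True}, owner)
print(t1_view)
print(t2_view)
```

Once T1 implies c1 = c2, setting `b_c1_c2` to true in the shared assignment is all it takes for T2 to see the equality on its next call.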

This is the design Z3 and CVC5 follow. Both use DTC or close variants, in which interface equalities become literals in the SAT core, rather than textbook Nelson-Oppen.

Three Variants of DPLL(T)

Three versions of DPLL(T), three abstraction boundaries crossed for performance:

| Version | Theory called on | Theory cooperation | Modularity cost |
|---|---|---|---|
| Offline DPLL(T) | Complete assignments | Nelson-Oppen black box | None |
| Online DPLL(T) | Partial assignments | Nelson-Oppen black box | Theory solver needs incremental interface |
| DTC | Partial assignments | Through CDCL booleans | Theories expose interface equalities to CDCL |

The descent is the standard story for performance work: a clean modular system is too slow, so abstractions get inlined and contracts get rewritten. Production solvers (Z3, CVC5) live at the DTC end. Textbook Nelson-Oppen is still the right conceptual frame for understanding the system, just not the implementation.

Source

The offline and online DPLL(T) algorithms are from Roberto Nieuwenhuis, Albert Oliveras, and Cesare Tinelli, "Solving SAT and SAT Modulo Theories: From an Abstract Davis-Putnam-Logemann-Loveland Procedure to DPLL(T)" (Journal of the ACM, 53(6), 2006). This is the paper that unified the offline, online, and DTC variants in a single abstract framework. The pseudocode on this page follows Emina Torlak's CSE 507 presentation of their Algorithm 1 (offline) and Algorithm 2 (online).

DTC is from Roberto Bruttomesso, Alessandro Cimatti, Anders Franzén, Alberto Griggio, and Roberto Sebastiani, "Delayed Theory Combination vs. Nelson-Oppen for Satisfiability Modulo Theories: a Comparative Analysis" (Annals of Mathematics and Artificial Intelligence, 55(1–2), 2009).