Practice
PEC is a 2009 tool for proving compiler optimizations correct. We walk through it as a case study in solver-aided programming.
Compiler optimizations
A compiler takes source code and produces an executable. Modern compilers rewrite the program along the way for performance: reordering operations, hoisting computations out of loops, inlining functions, eliminating dead code. Every rewrite has to preserve the program's meaning. When one does not, the compiler silently changes what we wrote.
Compilers have bugs
Csmith generates random C programs, compiles each one with several compilers, and compares the outputs; any disagreement flags a compiler bug (PLDI 2011). It has reported over 325 distinct bugs in mainstream C compilers. The GCC and LLVM bugs by compiler stage (paper Table 4):
| Stage | GCC | LLVM |
|---|---|---|
| Front end | 0 | 10 |
| Middle end | 49 | 75 |
| Back end | 17 | 74 |
| Unclassified | 13 | 43 |
| Total | 79 | 202 |
EMI is complementary. It takes a real program, mutates code that does not execute on the test input, and checks that the optimizer still produces equivalent output across the variants (PLDI 2014). Over eleven months, EMI reported 147 confirmed unique bugs against GCC and LLVM. The breakdown by bug kind (paper Table 2):
| Kind | GCC | LLVM |
|---|---|---|
| Wrong code | 46 | 49 |
| Crash | 23 | 10 |
| Performance | 10 | 9 |
| Total | 79 | 68 |
Optimization passes concentrate the bugs, and most are wrong-code: the program compiles cleanly but produces wrong output.
A buggy optimization
The C program below should return immediately. GCC -O3 used to compile it into an infinite loop.
```c
int a, b, c, d, e;

int main() {
  for (b = 4; b > -30; b--)
    for (; c;)
      for (;;) {
        e = a > 2147483647 - b;
        if (d) break;
      }
  return 0;
}
```
Global variables in C default to 0. So c is 0, and the condition of the second for loop is always false. The innermost loop never executes, and main returns immediately.
Two of GCC's loop optimizations interacted badly. Partial Redundancy Elimination identified 2147483647 - b as invariant for the inner loop, and Loop Invariant Motion hoisted it out. After hoisting, the expression overflowed for the negative values b takes during execution. GCC's signed-overflow analysis flagged this as undefined behavior, and the compiler emitted non-terminating code on that path.
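The overflow can be checked by hand. The sketch below redoes the arithmetic in Python, whose integers do not wrap, assuming a 32-bit `int` so that the largest representable value is 2147483647. The names `INT_MAX`, `b_values`, and `overflowing` are illustrative only.

```python
INT_MAX = 2**31 - 1  # assumption: int is 32 bits wide

# The outer loop of the C program gives b the values 4, 3, ..., -29.
b_values = list(range(4, -30, -1))

# Values of b for which the mathematical result of INT_MAX - b no longer
# fits in an int, i.e. where evaluating the hoisted expression overflows.
overflowing = [b for b in b_values if INT_MAX - b > INT_MAX]

print(len(b_values))                     # outer-loop iterations
print(overflowing[0], overflowing[-1])   # first and last overflowing b
```

Every negative `b` overflows. In the source program this never matters, because `c` is 0 and the expression is never evaluated; hoisting moves the evaluation onto a path that does run.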
EMI found this miscompilation (PLDI 2014, GCC PR 58731).
Of course, this example feels contrived; it was automatically generated in a project to find compiler bugs. But programs an optimizer actually sees often look stranger than that. C++ template instantiation emits code no human writes. Inlining fuses caller and callee, exposing dead branches that did not exist in the source. Code generators of every kind, including autotuners, DSL compilers, and ML frameworks, produce C that humans never type. The optimizer has to be correct on all of them.
Testing finds bugs that testing happens to hit. To trust an optimization on every input, we have to prove it correct.
Verifying optimizations
Two approaches to verified compiler optimization have been around for decades:
| | Translation validation | A priori |
|---|---|---|
| Timing | After each compilation | Once, before the compiler ships |
| Scope | This specific input vs this specific output | A class of inputs vs a class of outputs |
| Cost | Per compilation | One-time |
Translation validation runs the optimizer, then checks that this specific input mapped to this specific output preserving semantics. Pnueli et al. introduced the idea (TACAS 1998); Necula extended it to GCC's optimizer (PLDI 2000); Alive2 (Lopes et al., PLDI 2021) does it at LLVM scale today and finds optimizer bugs in mainline regularly. Checking equivalence of two concrete programs is often easier than reasoning statically about the optimization code itself, but the cost is paid every compilation.
A priori verification proves the transformation correct before the compiler ships. Cobalt and Rhodium (Lerner et al., PLDI 2003 and POPL 2005) proved single-statement rewrites. CompCert (Leroy, POPL 2006) verified a whole optimizing compiler, and PEC (Kundu, Tatlock, Lerner, PLDI 2009) extended the approach to many-to-many rewrites including loop optimizations. Alive (Lopes et al., PLDI 2015) brought it to LLVM peephole rules at production scale. The cost is paid once, but the proof obligations get harder as the optimizations get more complex.
PEC
PEC verifies compiler optimizations expressed as parameterized rewrite rules. A single rule can describe an optimization that fires on infinitely many concrete programs, and PEC proves the rule correct once and for all. PEC is expressive: it can support loop optimizations (software pipelining, unrolling, peeling, interchange, fusion) and classical scalar optimizations (common subexpression elimination, copy propagation, branch folding).
PEC proves a rule correct by checking that two parameterized programs are equivalent: the program before the rewrite and the program after. If the two are equivalent under the rule's side conditions, the rewrite is sound for every concrete instance.
This raises four questions:
- What is a parameterized program?
- How can we represent compiler optimizations as rewrites between parameterized programs?
- What does it mean for two parameterized programs to be equivalent?
- How can a solver check that?
A parameterized program
A parameterized program contains metavariables: placeholders that stand for arbitrary program pieces. Here is one:
```
I := 0;
while (I < E) {
  S;
  I++;
}
```
The three metavariables stand for different kinds of program text:
- `I` stands for any program variable (a name like `k`, `i`, or `counter`).
- `E` stands for any expression (a value-producing piece like `100`, `n`, or `length - 1`).
- `S` stands for any statement (a single operation or a block of straight-line code).
A parameterized program represents the set of all concrete programs you can produce by replacing each metavariable with text drawn from its category. A concrete program matches the parameterized program under a substitution: a mapping from each metavariable to the text that fills its place. Here is one substitution for the program above:
| Metavariable | Substitutes for |
|---|---|
| `I` | `k` |
| `E` | `100` |
| `S` | `a[k] += k` |
Applying the substitution produces:
```
k := 0;
while (k < 100) {
  a[k] += k;
  k++;
}
```
The parameterized version represents this concrete program along with infinitely many others, one for each substitution.
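Applying a substitution is mechanical, and a few lines of Python can sketch it. The version below treats the parameterized program as text and replaces standalone occurrences of `I`, `E`, and `S` in a single pass. The names `TEMPLATE` and `instantiate` are hypothetical, and string replacement is only a stand-in: a real tool would more plausibly substitute on syntax trees.

```python
import re

# The parameterized loop from the text, with metavariables I, E and S.
TEMPLATE = """\
I := 0;
while (I < E) {
  S;
  I++;
}"""


def instantiate(template: str, subst: dict) -> str:
    """Replace every standalone metavariable by the text the substitution gives it.

    A single regex pass substitutes all metavariables simultaneously, so text
    that was just inserted is never rescanned for further metavariables.
    """
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, subst)) + r")\b")
    return pattern.sub(lambda m: subst[m.group(1)], template)


concrete = instantiate(TEMPLATE, {"I": "k", "E": "100", "S": "a[k] += k"})
print(concrete)
```

A different substitution, for example `{"I": "i", "E": "n", "S": "total += i"}`, yields a different member of the same infinite family.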
Rewrite rules
PEC models compiler optimizations as rewrite rules. Each rule is a pair of parameterized programs plus side conditions on how they may be instantiated. The FIND program is what the rule expects to see in the input. The REPLACE program is what the rule produces. The compiler applies the rule by matching FIND against actual code and substituting REPLACE.
The loop peeling rule:
```
FIND
  I := 0;
  while (I < E) {
    S;
    I++;
  }
REPLACE
  I := 0;
  while (I < E - 1) {
    S;
    I++;
  }
  S;
  I++;
WHERE
  E > 0
  S does not modify I, E
```
The rule shifts one iteration of the loop out of the body. The side conditions constrain when the rule applies:
- `E > 0` ensures the loop has at least one iteration to peel. If `E = 0`, the original loop runs zero times; the REPLACE version would still run `S; I++` once and produce a different result.
- `S does not modify I or E` ensures the rewrite preserves the loop's behavior. If `S` could write `I`, the peeled iteration's `I++` would step a different counter than expected. A write to `E` would shift the loop bound.
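To see what each side condition rules out, it helps to run both sides of the rule on instances that break it. The sketch below hand-translates FIND and REPLACE into Python, with the state as a dictionary, `E` as a function of the state, and `S` as a function that mutates it. All names (`run_find`, `run_replace`, `same_result`, and the sample statements) are hypothetical, and running a few instances is testing, not a proof.

```python
def run_find(state, e, s):
    """FIND: I := 0; while (I < E) { S; I++; }  with I stored under key "i"."""
    st = dict(state)
    st["i"] = 0
    while st["i"] < e(st):
        s(st)
        st["i"] += 1
    return st


def run_replace(state, e, s):
    """REPLACE: the same loop with bound E - 1, followed by one peeled S; I++."""
    st = dict(state)
    st["i"] = 0
    while st["i"] < e(st) - 1:
        s(st)
        st["i"] += 1
    s(st)
    st["i"] += 1
    return st


def same_result(state, e, s):
    return run_find(state, e, s) == run_replace(state, e, s)


def count(st):             # well-behaved S: writes only "acc"
    st["acc"] += 1


def count_and_skip(st):    # S writes the counter I
    st["acc"] += 1
    st["i"] += 1


def count_and_shrink(st):  # S writes a variable that E reads
    st["acc"] += 1
    st["n"] -= 1


start = {"i": 0, "acc": 0, "n": 4}

ok = same_result(start, lambda st: 5, count)                      # conditions hold
zero_bound = same_result(start, lambda st: 0, count)              # E > 0 violated
writes_i = same_result(start, lambda st: 4, count_and_skip)       # S modifies I
writes_e = same_result(start, lambda st: st["n"], count_and_shrink)  # S modifies E
print(ok, zero_bound, writes_i, writes_e)
```

Only the first instance satisfies the side conditions, and it is the only one where the two sides end in the same state.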
Applying the rule to a concrete loop:
```
k := 0;
while (k < 100) {
  a[k] += k;
  k++;
}
```
The substitution I ↦ k, E ↦ 100, S ↦ a[k] += k makes the FIND pattern match. The side conditions check: 100 > 0 holds, and a[k] += k writes only to a[k], not k or 100. Applying the same substitution to REPLACE produces:
```
k := 0;
while (k < 99) {
  a[k] += k;
  k++;
}
a[k] += k;
k++;
```
Both programs produce the same final state. The peeled version splits the original's 100 iterations into 99 in the loop plus one in the peeled tail.
Equivalence
L07's symbolic-execution engine reasoned about straight-line program fragments using the strongest postcondition. Each path through the code became a Z3 query. For programs with loops, the engine unrolled to a bound and reasoned within it.
L08 extended the approach to unbounded loops. The cost was loop invariants: we had to provide one for each loop, and the WP engine turned the annotated program into Z3 obligations.
PEC asks a stronger question: do two parameterized programs produce equivalent results, for every instantiation of their metavariables and every starting state?
The picture is two parallel runs from the same starting state:
```mermaid
flowchart TB
  accTitle: Two parallel runs from the same starting state
  accDescr: From σ, run FIND and REPLACE; both final states σ₁' and σ₂' must agree on live variables.
  start([starting state σ])
  F[run FIND]
  R[run REPLACE]
  e1([state σ₁'])
  e2([state σ₂'])
  agree{{agree on live vars}}
  start --> F --> e1
  start --> R --> e2
  e1 --> agree
  e2 --> agree
```
The live variables are the variables the rest of the program might use. Temporary variables introduced by the rewrite do not need to match.
The loop-peeling example above is one such pair. Under the substitution I ↦ k, E ↦ 100, S ↦ a[k] += k, both the original and the peeled loop end with k = 100 and a holding the same sums. Formally: for every substitution $\theta$ satisfying the rule's side conditions and every starting state $\sigma$, running $\theta(\text{FIND})$ and $\theta(\text{REPLACE})$ from $\sigma$ produces final states $\sigma_1'$ and $\sigma_2'$ that agree on the live variables.
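The role of live variables is easiest to see on a rewrite that introduces a temporary. The sketch below uses a small common-subexpression rewrite, chosen here for illustration and not taken from PEC's rule set, and checks agreement on the live variables over a bounded set of starting states. The names `run_find`, `run_replace`, `LIVE`, and `agree_on` are hypothetical. Enumerating sample states is a stand-in for the solver, which covers every state and every substitution.

```python
from itertools import product


def run_find(state):
    """FIND: x := a + b; y := a + b"""
    st = dict(state)
    st["x"] = st["a"] + st["b"]
    st["y"] = st["a"] + st["b"]
    return st


def run_replace(state):
    """REPLACE: t := a + b; x := t; y := t"""
    st = dict(state)
    st["t"] = st["a"] + st["b"]  # temporary introduced by the rewrite
    st["x"] = st["t"]
    st["y"] = st["t"]
    return st


LIVE = ("a", "b", "x", "y")  # assumption: the rest of the program never reads t


def agree_on(live, s1, s2):
    """True when both final states give every live variable the same value."""
    return all(s1.get(v) == s2.get(v) for v in live)


def equivalent_on_samples(values=range(-3, 4)):
    """Compare the two runs from every starting state built from `values`."""
    for a, b, x, y in product(values, repeat=4):
        start = {"a": a, "b": b, "x": x, "y": y}
        if not agree_on(LIVE, run_find(start), run_replace(start)):
            return False
    return True


sample = {"a": 1, "b": 2, "x": 0, "y": 0}
print(equivalent_on_samples())                  # agreement on live variables
print(run_find(sample) == run_replace(sample))  # full states differ because of t
```

Comparing whole states would reject this rewrite; projecting onto the live variables accepts it.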
L08 verified a program against a property (a postcondition). PEC verifies a property that links two programs: their final states must agree on the live variables. The next section shows how PEC discharges this obligation, including how it handles the loops inside FIND and REPLACE.
Pairs of programs come up beyond compiler optimization. Refactoring a function, replacing one library with a faster equivalent, lifting a binary back to C: all of these are equivalence-checking problems with the same shape. The PEC technique gives you a starting point.
PEC's algorithm
L08 reasoned about loops with a user-supplied invariant. PEC tries to find the invariants on its own.
PEC discharges the equivalence obligation in three steps:
- Find synchronization points in the two CFGs.
- Generate invariants at each sync point.
- Check that every path between sync points preserves its invariants. Strengthen on failure.
We trace the three steps through loop peeling.
Synchronization points
A synchronization point is a pair of locations, one in FIND and one in REPLACE, where PEC tracks an invariant linking the two program states. PEC seeds the sync points from the CFGs:
- An entry sync point pairs the two program starts; an exit sync point pairs the two program ends.
- Interior sync points sit at statement metavariables. PEC walks both CFGs in lockstep and inserts a sync point at each occurrence; when `S` sits inside a loop, the sync point cuts the loop into bounded path segments.
- PEC prunes paths the side conditions render infeasible. In loop peeling, the path that exits the loop immediately (`I ≥ E`) cannot fire: `I = 0` at the loop test and side condition `E > 0` together rule it out.
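The pruning step amounts to a satisfiability question: is there any state in which all the facts along the path hold at once? The sketch below asks that question for the immediate-exit path by brute force over a small range of integer values for `I` and `E`. The helper `feasible` and the constraint names are hypothetical, and bounded enumeration only stands in for the solver, which answers for all integers.

```python
from itertools import product


def feasible(constraints, domain=range(-8, 9)):
    """Return some (i, e) in the domain satisfying every constraint, or None."""
    for i, e in product(domain, repeat=2):
        if all(c(i, e) for c in constraints):
            return (i, e)
    return None


def at_entry(i, e):    # I := 0 has just run
    return i == 0


def side_cond(i, e):   # WHERE E > 0
    return e > 0


def exit_now(i, e):    # the loop test fails on the first check
    return i >= e


def enter_loop(i, e):  # the loop test succeeds on the first check
    return i < e


pruned = feasible([at_entry, side_cond, exit_now])
kept = feasible([at_entry, side_cond, enter_loop])
without_side_cond = feasible([at_entry, exit_now])
print(pruned, kept, without_side_cond)
```

With the side condition, no witness exists for the immediate exit, so that path is dropped. Without it, a witness such as a non-positive `E` appears, which is why the side condition is needed.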
Initial invariants
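An invariant at a sync point is a predicate over the two program states $(\sigma_1, \sigma_2)$. These predicates compose into the formulas PEC sends to the solver in step 3. PEC seeds each one with $\sigma_1 = \sigma_2$ (the states agree here) and conjoins any branch conditions taken along the path to this sync point.
We write $\mathit{eval}(\sigma, e)$ for the value of expression $e$ when its variables are read from state $\sigma$, the same lookup L07's SE engine did when substituting state into an expression to build a path constraint. The seeded invariants for loop peeling:
| Sync point | Invariant |
|---|---|
| Entry | $\sigma_1 = \sigma_2$ |
| A | $\sigma_1 = \sigma_2 \land \mathit{eval}(\sigma_1, I < E) \land \mathit{eval}(\sigma_2, I < E - 1)$ |
| B | $\sigma_1 = \sigma_2 \land \mathit{eval}(\sigma_1, I < E) \land \mathit{eval}(\sigma_2, I \ge E - 1)$ |
| Exit | $\sigma_1 = \sigma_2 \land \mathit{eval}(\sigma_1, I \ge E)$ |
Invariant B captures the geometry of peeling: FIND is still inside the loop ($I < E$) while REPLACE has just exited ($I \ge E - 1$). The peeled iteration on the REPLACE side runs after.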
The solver query
For each path between two sync points, PEC builds one query for the solver. The query says: if the predecessor invariant holds and we execute the FIND and REPLACE paths from there in parallel, the successor invariant holds at the other end.
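We write $\mathit{step}(\sigma, p)$ for the state after executing program fragment $p$ starting in state $\sigma$; this is the same forward execution L07's SE engine performed statement by statement. Let $p_1$ be the FIND path `S; I++; I < E` and let $p_2$ be the REPLACE path `S; I++; I >= E - 1`, one trip through the loop body. The obligation says that starting from invariant A, executing $p_1$ and $p_2$ in parallel lands at invariant B:

$$\forall \sigma_1, \sigma_2.\;\; A(\sigma_1, \sigma_2) \implies B\bigl(\mathit{step}(\sigma_1, p_1),\ \mathit{step}(\sigma_2, p_2)\bigr)$$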
PEC sends each obligation to the solver. Two outcomes:
- Valid. The invariant holds across this path. Move on.
- Invalid. The current invariants are not yet a simulation: some path does not preserve them. PEC strengthens the predecessor invariant by adding the weakest precondition of the successor invariant under the failing path, then retries every path that touches it. The strengthening loop converges in practice but is bounded to guarantee termination; exceeding the bound rejects the rule.
When every path returns Valid, the invariants hold at every sync point: the two parameterized programs agree on the live variables. The rule is proven correct once and for all, for every substitution that satisfies the side conditions.
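The shape of one obligation can be sketched without a solver. The code below checks "predecessor invariant, then both paths, implies successor invariant" by enumerating small states. Everything in it is an assumption made for illustration: the invariants are one reading of the seeded invariants for loop peeling (A pairs the two loop bodies, B pairs FIND's loop body with REPLACE's peeled tail), `S` is fixed to one concrete statement where the solver would treat it as arbitrary, and all function names are hypothetical.

```python
from itertools import product


def s_body(st):  # hypothetical S: reads I, writes only acc (never I or E)
    return {**st, "acc": 3 * st["acc"] + st["i"]}


def incr(st):    # I++
    return {**st, "i": st["i"] + 1}


def run_path(st, stmts, guard):
    """Run the statements, then assume the guard; None means the path is not taken."""
    for stmt in stmts:
        st = stmt(st)
    return st if guard(st) else None


# Assumed invariants; s1 is FIND's state, s2 is REPLACE's state.
def inv_a(s1, s2):
    return s1 == s2 and s1["i"] < s1["e"] and s2["i"] < s2["e"] - 1


def inv_b(s1, s2):
    return s1 == s2 and s1["i"] < s1["e"] and s2["i"] >= s2["e"] - 1


def inv_exit(s1, s2):
    return s1 == s2 and s1["i"] >= s1["e"]


def check(pre, path1, path2, post, domain=range(-4, 5)):
    """Return a counterexample (s1, s2), or None if none exists in the domain."""
    for i, e, acc in product(domain, repeat=3):
        # Every invariant here demands s1 == s2, so equal pairs are enough.
        s1 = s2 = {"i": i, "e": e, "acc": acc}
        if not pre(s1, s2):
            continue
        t1, t2 = run_path(s1, *path1), run_path(s2, *path2)
        if t1 is not None and t2 is not None and not post(t1, t2):
            return s1, s2
    return None


body = [s_body, incr]
stay = (body, lambda st: st["i"] < st["e"])          # FIND: S; I++; I < E
leave = (body, lambda st: st["i"] >= st["e"])        # FIND: S; I++; I >= E
loop_on = (body, lambda st: st["i"] < st["e"] - 1)   # REPLACE: S; I++; I < E - 1
loop_off = (body, lambda st: st["i"] >= st["e"] - 1) # REPLACE: S; I++; I >= E - 1
peeled = (body, lambda st: True)                     # REPLACE tail: S; I++
forgot_incr = ([s_body], lambda st: True)            # broken tail without I++

results = {
    "A -> A": check(inv_a, stay, loop_on, inv_a),
    "A -> B": check(inv_a, stay, loop_off, inv_b),
    "B -> Exit": check(inv_b, leave, peeled, inv_exit),
    "B -> Exit (broken)": check(inv_b, leave, forgot_incr, inv_exit),
}
for name, cex in results.items():
    print(name, "valid" if cex is None else f"invalid, e.g. {cex[0]}")
```

The first three obligations have no counterexample in the domain. The last one drops the peeled `I++` from REPLACE, and the check returns a state pair showing that the two runs disagree at the exit. The strengthening loop is not modeled here.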
Why this works
The sync points and their invariants form a simulation relation between FIND and REPLACE: related states step to related states. PEC's solver queries check that the candidate relation is in fact a simulation; the strengthening loop refines the candidate until they all pass. By induction over execution traces, the relation propagates from entry to exit. The exit invariant forces $\sigma_1 = \sigma_2$, so the two programs agree on the live variables. The XCert follow-up (PLDI 2010) mechanized this argument in Rocq.
Going deeper
For readers who want to dig in:
- Talk. The PLDI 2009 talk walks the same material slide by slide.
- Code. The PEC repository is about 2,000 lines of OCaml. The main pieces:
  - `src/synch.ml` finds sync points by walking the two CFGs in lockstep.
  - `src/check.ml` builds the per-path obligations and runs the strengthening loop. The `obligation` function (~30 lines) is the algorithm we walked above.
  - `src/semantics.ml` defines the operator semantics PEC hands to the solver as background axioms.
  - `test/relate/p-loop-peel-01.rwr` is the loop-peeling rule we used as the running example, in PEC's rewrite-rule DSL.
Theory next: what writing this kind of tool looks like today.