Verification Pipeline
This page documents the internal architecture of Verum's SMT verification subsystem for compiler developers. The subsystem has two invocation points in the compilation pipeline:
The language layer talks to an abstract SMT backend. The current release dispatches to one or more solver adapters through the capability router; a Verum-native solver is on the roadmap. Specific solver names below are notes about the current implementation, not part of Verum's language contract.
- Phase 3a — contract verification — discharges
contract#"..."literals before the type checker sees the annotated function. - Phase 4 — refinement / dependent verification — runs as a
sub-step of semantic analysis (the
DependentVerifier), collecting refinement-type obligations,ensures/requiresclauses, and loop invariants.
Results from both flow into Phase 5 VBC codegen as tier-promotion and bounds-elimination hints. For the user-facing model, see Gradual verification; for the solver-selection policy, see SMT routing.
Sub-phase overview
The internal numbering (5.1–5.7) below is the subsystem's own internal stages — the solver work, not the public pipeline phases.
| Sub-phase | Stage | Summary |
|---|---|---|
| 5.1 | Obligation collection | refinement types, requires / ensures, loop invariants, CBGR hints |
| 5.2 | SMT encoding | verum_smt::expr_to_smtlib + @logic axiom injection |
| 5.3 | Capability router | theory classification → adapter / portfolio |
| 5.4 | Executor | synchronous / portfolio / cross-validate, per-obligation timeout |
| 5.5 | Proof extraction & certify | solver log → Verum proof term (machine-checked if @verify(certified)) |
| 5.6 | Caching | SMT-LIB fingerprint → result, target/smt-cache/ |
| 5.7 | Bounds elim. & CBGR hints | inform Phase 6 about provably-safe array accesses and reference tiers |
5.1 — Obligation collection
Walks the typed HIR and emits one VerificationObligation per
logical claim that must hold:
Each VerificationObligation carries an identifier, a kind, the
in-scope bindings that form its context, a goal predicate, the
source span it was emitted from, and a verify mode (runtime /
static / smt / portfolio / certified).
The ObligationKind enumeration covers every reason an
obligation can be raised:
| Kind | Source |
|---|---|
RefinementWellFormed | A refinement type such as Int { self > 0 }. |
RefinementSubsumption | A narrowing / subtyping boundary. |
Precondition | A where requires clause. |
Postcondition | A where ensures clause. |
LoopInvariant | An invariant clause. |
LoopTermination | A decreases clause. |
ArrayBounds | An xs[i] access with non-trivial index. |
ContextCapability | A capability-subsumption check. |
ReferencePromotion | Tier-promotion query (&T → &checked T). |
PatternExhaustiveness | A match completeness check. |
Sources: verum_compiler::phases::verification_phase::collect_obligations
plus hooks in verum_types::infer that emit obligations during flow
analysis.
5.2 — SMT encoding
verum_smt::expr_to_smtlib translates Verum expressions to SMT-LIB
2.6. The translator handles:
- Primitive types → SMT sorts (
Int,Real,Bool,Bitvector N). - Algebraic data types → SMT
declare-datatypes. @logicfunctions →define-fun-recwith termination measure.- Generic instantiations → monomorphised SMT sorts.
- Cubical / path types → projected to their computational content.
@logic axiom injection: the subset of @logic fns reachable from
the current obligation is collected (via transitive closure), and
their bodies emitted as (define-fun-rec …) before the obligation's
(assert (not goal)).
5.3 — Capability router
verum_smt::capability_router classifies each obligation by
theory usage. The classifier walks the goal AST and tags it
with the theories it touches; the router then maps the theory
set to an adapter choice.
| AST shape encountered | Theory tag added |
|---|---|
| Add / Sub / Mul | LIA |
| Mul of two non-constants | NIA (nonlinear) |
| Bit-and / Bit-or / Shift | BV |
| Index / Length | Array |
| Concat / Matches | String |
| Forall / Exists | Quantifier |
The mapping from the resulting tag set to an adapter choice follows the rules below. The router does not name any specific solver — it picks the highest-ranked adapter whose declared capabilities cover the tag set. When portfolio mode is in effect, several adapters race against each other.
| Tag set | Adapter choice |
|---|---|
Subset of LIA ∪ BV ∪ Array ∪ Quantifier | Adapter strongest on linear arithmetic. |
Contains String or NIA | Adapter strongest on those theories. |
Contains FiniteModelFinding | Adapter advertising finite-model finding. |
| Otherwise (and portfolio mode on) | Race the available adapters in parallel. |
| Otherwise | Default-ranked adapter. |
5.4 — Executor
Three execution strategies:
| Mode | Behaviour | Trigger |
|---|---|---|
| Single | Dispatch to the router's choice of solver; wait for result | @verify(formal) |
| Portfolio | Both solvers plus proof search in parallel; first answer wins | @verify(thorough) / @verify(reliable) |
| Cross-validate | Portfolio plus an orthogonal technique; require agreement; error on disagreement | @verify(certified) |
Timeout: default 5 s per obligation, configurable via
verum.toml [verify] solver_timeout_ms. On timeout, the fallback
strategy (other-solver by default) retries with the non-preferred
solver.
5.5 — Proof extraction & certification
For @verify(certified) obligations:
- The solver is asked to emit a proof log (
(set-option :produce-proofs true)). - The log is parsed into an AST of inference steps.
- Steps are normalised into Verum's
ProofTermenum:
type ProofTerm is
| ByAssumption(obligation_id: UInt64)
| ByReflexivity
| BySymmetry(Heap<ProofTerm>)
| ByTransitivity { left: Heap<ProofTerm>, right: Heap<ProofTerm> }
| ByInduction { base: Heap<ProofTerm>, step: Heap<ProofTerm> }
| ByTactic { name: Text, args: List<ProofTerm> }
| ByAxiom { axiom: Text }
| BySubst { lhs: Expr, rhs: Expr, proof: Heap<ProofTerm> }
| ByCase { scrutinee: Expr, arms: List<ProofTerm> };
- A verifier checks each step against the axioms + context — this is the machine check.
- On success, the proof term is serialised into the VBC archive's
proof_certificatessection.
Proof erasure (default, controlled by [codegen] proof_erasure):
proof terms are marked and stripped before Phase 5 VBC codegen. The
final binary carries no runtime proof verification cost — only
metadata required to reconstruct proofs offline.
5.6 — Caching
Key: SHA-256 of the SMT-LIB query text plus the solver version tuple.
Storage: target/smt-cache/, one file per obligation, value is
Result (sat / unsat / unknown / timeout) plus optional proof blob.
Invalidation:
- Obligation text change → new hash, no hit.
- Solver upgrade → version-tuple mismatch, invalidate.
- Manual: delete
target/.verum-cache/smt/or runverum clean --all.
Observed hit rate: 60–70% on typical incremental builds.
5.7 — Bounds elimination & CBGR hints
Verifier results feed forward into Phase 5 VBC codegen and Phase 6 monomorphization:
- Array-bounds elimination: if the solver proves
i < xs.len()for every call site, the bounds check is removed before codegen. - Reference-tier promotion: if escape analysis + refinement
results prove a
&Treference is never stored beyond its scope, it's promoted to&checked T— codegen emitsRefCheckedinstead ofRef, zero-cost. - Capability elision: a
Database with [Read]that never reaches aWrite-requiring method skips the capability check.
The subsystem's output carries a HintTable consumed by VBC codegen
and by later refinement-aware passes (passes/cbgr_integration.rs).
Performance
Empirical on a 50 K-LOC mixed project, Apple M3 Max:
| Theory mix | Count | Median (ms) | p95 | Adapter dispatch |
|---|---|---|---|---|
| LIA only | 2 100 | 8 | 35 | linear-arithmetic adapter |
| LIA + bitvector | 940 | 14 | 60 | linear-arithmetic adapter |
| LIA + string | 110 | 45 | 180 | string-capable adapter |
| Nonlinear (NIA) | 42 | 320 | 1800 | nonlinear-capable adapter |
| Cubical / path | 18 | 120 | 400 | cubical tactic → SMT layer |
| overall | 3 210 | 12 | 85 | — |
Telemetry
VERUM_SMT_TELEMETRY=1 emits a JSONL stream to
.verum/telemetry/routing.jsonl:
{"obligation": "search/postcond#3", "theories": ["lia","array"], "routed": "primary", "ms": 8, "result": "unsat"}
{"obligation": "parse/postcond#1", "theories": ["lia","string"], "routed": "string-capable", "ms": 72, "result": "unsat"}
{"obligation": "crypto/invariant#7", "theories": ["lia","bv"], "routed": "portfolio", "primary_ms": 20, "complementary_ms": 35, "agreed": true}
Used to tune the capability router and to detect regressions across solver upgrades.
Trusted kernel
Everything described above — VCGen, encoding, the capability router, the SMT executor, proof extraction, caching, bounds elimination — runs outside Verum's trusted computing base. The sole trusted checker is the trusted-base kernel, held to a single-reviewer / single-session audit budget.
Surface
The kernel exposes three entry points:
| Entry point | Role |
|---|---|
infer | Given a CoreTerm and an axiom registry, return the term's inferred type or a typed KernelError. |
verify_full | Given a CoreTerm and an expected type, confirm that infer agrees with the expected type. |
replay_smt_cert | Re-derive a checkable proof term from an SmtCertificate; fails closed if the certificate cannot be replayed. |
Each takes the typing context and the axiom registry as read-only inputs; none mutates global state.
CoreTerm is the explicit typed calculus the kernel checks — Π / Σ /
Path / HComp / Transp / Glue / Refine / Inductive / Elim / Axiom /
SmtProof plus the non-dependent constructors. Every other crate
emits CoreTerm values that the kernel re-checks.
Trusted computing base
After the kernel lands, Verum's TCB is exactly:
- The Rust compiler and its linked dependencies.
- The trusted-base kernel's
check/infer/verify_fullentry points and their subroutines (substitute,structural_eq, universe rules). - Axioms registered via
AxiomRegistry::register. Each registration stores aFrameworkIdattribution (framework name + citation), soverum audit --framework-axiomscan enumerate the entire trusted boundary of any proof corpus.
Consequences
SMT out of TCB. Every solver adapter — current external
adapters and the forthcoming native solver alike — produces an
SmtCertificate: a backend-neutral proof trace normalised by
verum_smt::proof_extraction. The kernel's replay_smt_cert
re-derives a CoreTerm witness from the certificate. A bug that
lets a solver emit a spurious "proof" fails the replay here and
cannot leak into an accepted theorem.
Tactics out of TCB. Every tactic — all 22 built-ins plus user-
defined tactics — produces a CoreTerm, which the kernel re-checks
via infer. A buggy tactic can refuse to build, or build an ill-
typed term that infer rejects, but it can never lie to the kernel.
Elaborator out of TCB. The bidirectional elaborator in
verum_types converts surface .vr syntax into CoreTerms; a bug
in elaboration manifests as "legal program refused" or "kernel
rejected the elaborated term", not "false theorem accepted".
Landing status
All CoreTerm constructors have real typing rules today except
SmtProof, whose checker is the dedicated replay_smt_cert path.
That routine is implemented per-adapter (one solver's proof format
first), completing the "SMT out of TCB" story. Kernel-side test
coverage tracks every replay path.
See also
- SMT integration — the surrounding SMT subsystem (solver adapters, proof search tactics).
- Verification → gradual verification — user-facing model,
@framework(...)attribute, trusted-boundary audit. - Verification → SMT routing — solver selection policy.
- proof stdlib → PCC — certificate format.