Skip to main content

Verification Pipeline

This page documents the internal architecture of Verum's SMT verification subsystem for compiler developers. The subsystem has two invocation points in the compilation pipeline:

Solver choice is an implementation detail

The language layer talks to an abstract SMT backend. The current release dispatches to one or more solver adapters through the capability router; a Verum-native solver is on the roadmap. Specific solver names below are notes about the current implementation, not part of Verum's language contract.

  • Phase 3a — contract verification — discharges contract#"..." literals before the type checker sees the annotated function.
  • Phase 4 — refinement / dependent verification — runs as a sub-step of semantic analysis (the DependentVerifier), collecting refinement-type obligations, ensures / requires clauses, and loop invariants.

Results from both flow into Phase 5 VBC codegen as tier-promotion and bounds-elimination hints. For the user-facing model, see Gradual verification; for the solver-selection policy, see SMT routing.

Sub-phase overview

The internal numbering (5.1–5.7) below is the subsystem's own internal stages — the solver work, not the public pipeline phases.

Sub-phaseStageSummary
5.1Obligation collectionrefinement types, requires / ensures, loop invariants, CBGR hints
5.2SMT encodingverum_smt::expr_to_smtlib + @logic axiom injection
5.3Capability routertheory classification → adapter / portfolio
5.4Executorsynchronous / portfolio / cross-validate, per-obligation timeout
5.5Proof extraction & certifysolver log → Verum proof term (machine-checked if @verify(certified))
5.6CachingSMT-LIB fingerprint → result, target/smt-cache/
5.7Bounds elim. & CBGR hintsinform Phase 6 about provably-safe array accesses and reference tiers

5.1 — Obligation collection

Walks the typed HIR and emits one VerificationObligation per logical claim that must hold:

Each VerificationObligation carries an identifier, a kind, the in-scope bindings that form its context, a goal predicate, the source span it was emitted from, and a verify mode (runtime / static / smt / portfolio / certified).

The ObligationKind enumeration covers every reason an obligation can be raised:

KindSource
RefinementWellFormedA refinement type such as Int { self > 0 }.
RefinementSubsumptionA narrowing / subtyping boundary.
PreconditionA where requires clause.
PostconditionA where ensures clause.
LoopInvariantAn invariant clause.
LoopTerminationA decreases clause.
ArrayBoundsAn xs[i] access with non-trivial index.
ContextCapabilityA capability-subsumption check.
ReferencePromotionTier-promotion query (&T → &checked T).
PatternExhaustivenessA match completeness check.

Sources: verum_compiler::phases::verification_phase::collect_obligations plus hooks in verum_types::infer that emit obligations during flow analysis.

5.2 — SMT encoding

verum_smt::expr_to_smtlib translates Verum expressions to SMT-LIB 2.6. The translator handles:

  • Primitive types → SMT sorts (Int, Real, Bool, Bitvector N).
  • Algebraic data types → SMT declare-datatypes.
  • @logic functions → define-fun-rec with termination measure.
  • Generic instantiations → monomorphised SMT sorts.
  • Cubical / path types → projected to their computational content.

@logic axiom injection: the subset of @logic fns reachable from the current obligation is collected (via transitive closure), and their bodies emitted as (define-fun-rec …) before the obligation's (assert (not goal)).

5.3 — Capability router

verum_smt::capability_router classifies each obligation by theory usage. The classifier walks the goal AST and tags it with the theories it touches; the router then maps the theory set to an adapter choice.

AST shape encounteredTheory tag added
Add / Sub / MulLIA
Mul of two non-constantsNIA (nonlinear)
Bit-and / Bit-or / ShiftBV
Index / LengthArray
Concat / MatchesString
Forall / ExistsQuantifier

The mapping from the resulting tag set to an adapter choice follows the rules below. The router does not name any specific solver — it picks the highest-ranked adapter whose declared capabilities cover the tag set. When portfolio mode is in effect, several adapters race against each other.

Tag setAdapter choice
Subset of LIA ∪ BV ∪ Array ∪ QuantifierAdapter strongest on linear arithmetic.
Contains String or NIAAdapter strongest on those theories.
Contains FiniteModelFindingAdapter advertising finite-model finding.
Otherwise (and portfolio mode on)Race the available adapters in parallel.
OtherwiseDefault-ranked adapter.

5.4 — Executor

Three execution strategies:

ModeBehaviourTrigger
SingleDispatch to the router's choice of solver; wait for result@verify(formal)
PortfolioBoth solvers plus proof search in parallel; first answer wins@verify(thorough) / @verify(reliable)
Cross-validatePortfolio plus an orthogonal technique; require agreement; error on disagreement@verify(certified)

Timeout: default 5 s per obligation, configurable via verum.toml [verify] solver_timeout_ms. On timeout, the fallback strategy (other-solver by default) retries with the non-preferred solver.

5.5 — Proof extraction & certification

For @verify(certified) obligations:

  1. The solver is asked to emit a proof log ((set-option :produce-proofs true)).
  2. The log is parsed into an AST of inference steps.
  3. Steps are normalised into Verum's ProofTerm enum:
type ProofTerm is
| ByAssumption(obligation_id: UInt64)
| ByReflexivity
| BySymmetry(Heap<ProofTerm>)
| ByTransitivity { left: Heap<ProofTerm>, right: Heap<ProofTerm> }
| ByInduction { base: Heap<ProofTerm>, step: Heap<ProofTerm> }
| ByTactic { name: Text, args: List<ProofTerm> }
| ByAxiom { axiom: Text }
| BySubst { lhs: Expr, rhs: Expr, proof: Heap<ProofTerm> }
| ByCase { scrutinee: Expr, arms: List<ProofTerm> };
  1. A verifier checks each step against the axioms + context — this is the machine check.
  2. On success, the proof term is serialised into the VBC archive's proof_certificates section.

Proof erasure (default, controlled by [codegen] proof_erasure): proof terms are marked and stripped before Phase 5 VBC codegen. The final binary carries no runtime proof verification cost — only metadata required to reconstruct proofs offline.

5.6 — Caching

Key: SHA-256 of the SMT-LIB query text plus the solver version tuple.

Storage: target/smt-cache/, one file per obligation, value is Result (sat / unsat / unknown / timeout) plus optional proof blob.

Invalidation:

  • Obligation text change → new hash, no hit.
  • Solver upgrade → version-tuple mismatch, invalidate.
  • Manual: delete target/.verum-cache/smt/ or run verum clean --all.

Observed hit rate: 60–70% on typical incremental builds.

5.7 — Bounds elimination & CBGR hints

Verifier results feed forward into Phase 5 VBC codegen and Phase 6 monomorphization:

  • Array-bounds elimination: if the solver proves i < xs.len() for every call site, the bounds check is removed before codegen.
  • Reference-tier promotion: if escape analysis + refinement results prove a &T reference is never stored beyond its scope, it's promoted to &checked T — codegen emits RefChecked instead of Ref, zero-cost.
  • Capability elision: a Database with [Read] that never reaches a Write-requiring method skips the capability check.

The subsystem's output carries a HintTable consumed by VBC codegen and by later refinement-aware passes (passes/cbgr_integration.rs).

Performance

Empirical on a 50 K-LOC mixed project, Apple M3 Max:

Theory mixCountMedian (ms)p95Adapter dispatch
LIA only2 100835linear-arithmetic adapter
LIA + bitvector9401460linear-arithmetic adapter
LIA + string11045180string-capable adapter
Nonlinear (NIA)423201800nonlinear-capable adapter
Cubical / path18120400cubical tactic → SMT layer
overall3 2101285

Telemetry

VERUM_SMT_TELEMETRY=1 emits a JSONL stream to .verum/telemetry/routing.jsonl:

{"obligation": "search/postcond#3", "theories": ["lia","array"], "routed": "primary", "ms": 8, "result": "unsat"}
{"obligation": "parse/postcond#1", "theories": ["lia","string"], "routed": "string-capable", "ms": 72, "result": "unsat"}
{"obligation": "crypto/invariant#7", "theories": ["lia","bv"], "routed": "portfolio", "primary_ms": 20, "complementary_ms": 35, "agreed": true}

Used to tune the capability router and to detect regressions across solver upgrades.

Trusted kernel

Everything described above — VCGen, encoding, the capability router, the SMT executor, proof extraction, caching, bounds elimination — runs outside Verum's trusted computing base. The sole trusted checker is the trusted-base kernel, held to a single-reviewer / single-session audit budget.

Surface

The kernel exposes three entry points:

Entry pointRole
inferGiven a CoreTerm and an axiom registry, return the term's inferred type or a typed KernelError.
verify_fullGiven a CoreTerm and an expected type, confirm that infer agrees with the expected type.
replay_smt_certRe-derive a checkable proof term from an SmtCertificate; fails closed if the certificate cannot be replayed.

Each takes the typing context and the axiom registry as read-only inputs; none mutates global state.

CoreTerm is the explicit typed calculus the kernel checks — Π / Σ / Path / HComp / Transp / Glue / Refine / Inductive / Elim / Axiom / SmtProof plus the non-dependent constructors. Every other crate emits CoreTerm values that the kernel re-checks.

Trusted computing base

After the kernel lands, Verum's TCB is exactly:

  1. The Rust compiler and its linked dependencies.
  2. The trusted-base kernel's check / infer / verify_full entry points and their subroutines (substitute, structural_eq, universe rules).
  3. Axioms registered via AxiomRegistry::register. Each registration stores a FrameworkId attribution (framework name + citation), so verum audit --framework-axioms can enumerate the entire trusted boundary of any proof corpus.

Consequences

SMT out of TCB. Every solver adapter — current external adapters and the forthcoming native solver alike — produces an SmtCertificate: a backend-neutral proof trace normalised by verum_smt::proof_extraction. The kernel's replay_smt_cert re-derives a CoreTerm witness from the certificate. A bug that lets a solver emit a spurious "proof" fails the replay here and cannot leak into an accepted theorem.

Tactics out of TCB. Every tactic — all 22 built-ins plus user- defined tactics — produces a CoreTerm, which the kernel re-checks via infer. A buggy tactic can refuse to build, or build an ill- typed term that infer rejects, but it can never lie to the kernel.

Elaborator out of TCB. The bidirectional elaborator in verum_types converts surface .vr syntax into CoreTerms; a bug in elaboration manifests as "legal program refused" or "kernel rejected the elaborated term", not "false theorem accepted".

Landing status

All CoreTerm constructors have real typing rules today except SmtProof, whose checker is the dedicated replay_smt_cert path. That routine is implemented per-adapter (one solver's proof format first), completing the "SMT out of TCB" story. Kernel-side test coverage tracks every replay path.

See also