Verification Pipeline

This page documents the internal architecture of Verum's SMT verification subsystem for compiler developers. The subsystem has two invocation points in the compilation pipeline:

Solver choice is an implementation detail

The language layer talks to an abstract SMT backend. The current release dispatches to one or more solver adapters through the capability router; a Verum-native solver is on the roadmap. Specific solver names below are notes about the current implementation, not part of Verum's language contract.

Phase 3a — contract verification — discharges contract#"..." literals before the type checker sees the annotated function.
Phase 4 — refinement / dependent verification — runs as a sub-step of semantic analysis (the DependentVerifier), collecting refinement-type obligations, ensures / requires clauses, and loop invariants.

Results from both flow into Phase 5 VBC codegen as tier-promotion and bounds-elimination hints. For the user-facing model, see Gradual verification; for the solver-selection policy, see SMT routing.

Sub-phase overview

The internal numbering (5.1–5.7) below is the subsystem's own internal stages — the solver work, not the public pipeline phases.

Sub-phase	Stage	Summary
5.1	Obligation collection	refinement types, `requires` / `ensures`, loop invariants, CBGR hints
5.2	SMT encoding	`verum_smt::expr_to_smtlib` + `@logic` axiom injection
5.3	Capability router	theory classification → adapter / portfolio
5.4	Executor	synchronous / portfolio / cross-validate, per-obligation timeout
5.5	Proof extraction & certify	solver log → Verum proof term (machine-checked if `@verify(certified)`)
5.6	Caching	SMT-LIB fingerprint → result, `target/smt-cache/`
5.7	Bounds elim. & CBGR hints	inform Phase 6 about provably-safe array accesses and reference tiers

5.1 — Obligation collection

Walks the typed HIR and emits one VerificationObligation per logical claim that must hold:

Each VerificationObligation carries an identifier, a kind, the in-scope bindings that form its context, a goal predicate, the source span it was emitted from, and a verify mode (runtime / static / smt / portfolio / certified).

The ObligationKind enumeration covers every reason an obligation can be raised:

Kind	Source
`RefinementWellFormed`	A refinement type such as `Int { self > 0 }`.
`RefinementSubsumption`	A narrowing / subtyping boundary.
`Precondition`	A `where requires` clause.
`Postcondition`	A `where ensures` clause.
`LoopInvariant`	An `invariant` clause.
`LoopTermination`	A `decreases` clause.
`ArrayBounds`	An `xs[i]` access with non-trivial index.
`ContextCapability`	A capability-subsumption check.
`ReferencePromotion`	Tier-promotion query (`&T → &checked T`).
`PatternExhaustiveness`	A `match` completeness check.

Sources: verum_compiler::phases::verification_phase::collect_obligations plus hooks in verum_types::infer that emit obligations during flow analysis.

5.2 — SMT encoding

verum_smt::expr_to_smtlib translates Verum expressions to SMT-LIB 2.6. The translator handles:

Primitive types → SMT sorts (Int, Real, Bool, Bitvector N).
Algebraic data types → SMT declare-datatypes.
@logic functions → define-fun-rec with termination measure.
Generic instantiations → monomorphised SMT sorts.
Cubical / path types → projected to their computational content.

@logic axiom injection: the subset of @logic fns reachable from the current obligation is collected (via transitive closure), and their bodies emitted as (define-fun-rec …) before the obligation's (assert (not goal)).

5.3 — Capability router

verum_smt::capability_router classifies each obligation by theory usage. The classifier walks the goal AST and tags it with the theories it touches; the router then maps the theory set to an adapter choice.

AST shape encountered	Theory tag added
Add / Sub / Mul	LIA
Mul of two non-constants	NIA (nonlinear)
Bit-and / Bit-or / Shift	BV
Index / Length	Array
Concat / Matches	String
Forall / Exists	Quantifier

The mapping from the resulting tag set to an adapter choice follows the rules below. The router does not name any specific solver — it picks the highest-ranked adapter whose declared capabilities cover the tag set. When portfolio mode is in effect, several adapters race against each other.

Tag set	Adapter choice
Subset of `LIA ∪ BV ∪ Array ∪ Quantifier`	Adapter strongest on linear arithmetic.
Contains `String` or `NIA`	Adapter strongest on those theories.
Contains `FiniteModelFinding`	Adapter advertising finite-model finding.
Otherwise (and portfolio mode on)	Race the available adapters in parallel.
Otherwise	Default-ranked adapter.

5.4 — Executor

Three execution strategies:

Mode	Behaviour	Trigger
Single	Dispatch to the router's choice of solver; wait for result	`@verify(formal)`
Portfolio	Both solvers plus proof search in parallel; first answer wins	`@verify(thorough)` / `@verify(reliable)`
Cross-validate	Portfolio plus an orthogonal technique; require agreement; error on disagreement	`@verify(certified)`

Timeout: default 5 s per obligation, configurable via verum.toml [verify] solver_timeout_ms. On timeout, the fallback strategy (other-solver by default) retries with the non-preferred solver.

5.5 — Proof extraction & certification

For @verify(certified) obligations:

The solver is asked to emit a proof log ((set-option :produce-proofs true)).
The log is parsed into an AST of inference steps.
Steps are normalised into Verum's ProofTerm enum:

type ProofTerm is
    | ByAssumption(obligation_id: UInt64)
    | ByReflexivity
    | BySymmetry(Heap<ProofTerm>)
    | ByTransitivity { left: Heap<ProofTerm>, right: Heap<ProofTerm> }
    | ByInduction    { base: Heap<ProofTerm>, step: Heap<ProofTerm> }
    | ByTactic       { name: Text, args: List<ProofTerm> }
    | ByAxiom        { axiom: Text }
    | BySubst        { lhs: Expr, rhs: Expr, proof: Heap<ProofTerm> }
    | ByCase         { scrutinee: Expr, arms: List<ProofTerm> };

A verifier checks each step against the axioms + context — this is the machine check.
On success, the proof term is serialised into the VBC archive's proof_certificates section.

Proof erasure (default, controlled by [codegen] proof_erasure): proof terms are marked and stripped before Phase 5 VBC codegen. The final binary carries no runtime proof verification cost — only metadata required to reconstruct proofs offline.

5.6 — Caching

Key: SHA-256 of the SMT-LIB query text plus the solver version tuple.

Storage: target/smt-cache/, one file per obligation, value is Result (sat / unsat / unknown / timeout) plus optional proof blob.

Invalidation:

Obligation text change → new hash, no hit.
Solver upgrade → version-tuple mismatch, invalidate.
Manual: delete target/.verum-cache/smt/ or run verum clean --all.

Observed hit rate: 60–70% on typical incremental builds.

5.7 — Bounds elimination & CBGR hints

Verifier results feed forward into Phase 5 VBC codegen and Phase 6 monomorphization:

Array-bounds elimination: if the solver proves i < xs.len() for every call site, the bounds check is removed before codegen.
Reference-tier promotion: if escape analysis + refinement results prove a &T reference is never stored beyond its scope, it's promoted to &checked T — codegen emits RefChecked instead of Ref, zero-cost.
Capability elision: a Database with [Read] that never reaches a Write-requiring method skips the capability check.

The subsystem's output carries a HintTable consumed by VBC codegen and by later refinement-aware passes (passes/cbgr_integration.rs).

Performance

Empirical on a 50 K-LOC mixed project, Apple M3 Max:

Theory mix	Count	Median (ms)	p95	Adapter dispatch
LIA only	2 100	8	35	linear-arithmetic adapter
LIA + bitvector	940	14	60	linear-arithmetic adapter
LIA + string	110	45	180	string-capable adapter
Nonlinear (NIA)	42	320	1800	nonlinear-capable adapter
Cubical / path	18	120	400	cubical tactic → SMT layer
overall	3 210	12	85	—

Telemetry

VERUM_SMT_TELEMETRY=1 emits a JSONL stream to .verum/telemetry/routing.jsonl:

{"obligation": "search/postcond#3", "theories": ["lia","array"], "routed": "primary", "ms": 8, "result": "unsat"}
{"obligation": "parse/postcond#1", "theories": ["lia","string"], "routed": "string-capable", "ms": 72, "result": "unsat"}
{"obligation": "crypto/invariant#7", "theories": ["lia","bv"], "routed": "portfolio", "primary_ms": 20, "complementary_ms": 35, "agreed": true}

Used to tune the capability router and to detect regressions across solver upgrades.

Trusted kernel

Everything described above — VCGen, encoding, the capability router, the SMT executor, proof extraction, caching, bounds elimination — runs outside Verum's trusted computing base. The sole trusted checker is the trusted-base kernel, held to a single-reviewer / single-session audit budget.

Surface

The kernel exposes three entry points:

Entry point	Role
`infer`	Given a `CoreTerm` and an axiom registry, return the term's inferred type or a typed `KernelError`.
`verify_full`	Given a `CoreTerm` and an expected type, confirm that `infer` agrees with the expected type.
`replay_smt_cert`	Re-derive a checkable proof term from an `SmtCertificate`; fails closed if the certificate cannot be replayed.

Each takes the typing context and the axiom registry as read-only inputs; none mutates global state.

CoreTerm is the explicit typed calculus the kernel checks — Π / Σ / Path / HComp / Transp / Glue / Refine / Inductive / Elim / Axiom / SmtProof plus the non-dependent constructors. Every other crate emits CoreTerm values that the kernel re-checks.

Trusted computing base

After the kernel lands, Verum's TCB is exactly:

The Rust compiler and its linked dependencies.
The trusted-base kernel's check / infer / verify_full entry points and their subroutines (substitute, structural_eq, universe rules).
Axioms registered via AxiomRegistry::register. Each registration stores a FrameworkId attribution (framework name + citation), so verum audit --framework-axioms can enumerate the entire trusted boundary of any proof corpus.

Consequences

SMT out of TCB. Every solver adapter — current external adapters and the forthcoming native solver alike — produces an SmtCertificate: a backend-neutral proof trace normalised by verum_smt::proof_extraction. The kernel's replay_smt_cert re-derives a CoreTerm witness from the certificate. A bug that lets a solver emit a spurious "proof" fails the replay here and cannot leak into an accepted theorem.

Tactics out of TCB. Every tactic — all 22 built-ins plus user- defined tactics — produces a CoreTerm, which the kernel re-checks via infer. A buggy tactic can refuse to build, or build an ill- typed term that infer rejects, but it can never lie to the kernel.

Elaborator out of TCB. The bidirectional elaborator in verum_types converts surface .vr syntax into CoreTerms; a bug in elaboration manifests as "legal program refused" or "kernel rejected the elaborated term", not "false theorem accepted".

Landing status

All CoreTerm constructors have real typing rules today except SmtProof, whose checker is the dedicated replay_smt_cert path. That routine is implemented per-adapter (one solver's proof format first), completing the "SMT out of TCB" story. Kernel-side test coverage tracks every replay path.

Sub-phase overview​

5.1 — Obligation collection​

5.2 — SMT encoding​

5.3 — Capability router​

5.4 — Executor​

5.5 — Proof extraction & certification​

5.6 — Caching​

5.7 — Bounds elimination & CBGR hints​

Performance​

Telemetry​

Trusted kernel​

Surface​

Trusted computing base​

Consequences​

Landing status​

See also​