Soundness gates in proof validation
The proof validator (proof_validator module)
sits between user-supplied proof terms and the kernel. Its job is to
turn every "trust the user" path into an explicit verification gate
that either accepts a proof for a real reason or rejects it with a
diagnostic. This page enumerates the soundness gates that guard each
proof rule, what happens when the gate trips, and how to extend the
gate set when adding a new rule.
The architectural rule across every gate is the same: the validator never advances state from a proof term unless the term structurally corresponds to the goal it claims to discharge. Syntactic similarity is not a proof of mathematical property; shared shape under the wrong operator is not equality.
Catalogue of gates
apply — user-defined inference rules
Path: validate_apply → apply_inference_rule → catch-all arm.
The catch-all arm routes through lookup_registered_rule, which
reads self.inference_rules populated via
register_inference_rule(name, premises, conclusion). There is
no fallback to "trust the user" when the rule name is unknown:
the gate returns
ValidationError::ValidationFailed { "unknown inference rule '<name>'" }
with a hint pointing at register_inference_rule as the
remediation.
Arity check happens at the same gate: the user must supply exactly
the number of premises the rule schema declares, otherwise
"rule '<name>' expects N premises, got M".
forall_elim / exists_intro — quantifier rules
Path: apply_inference_rule → "forall_elim" | "univ_elim" and
"exists_intro" | "exist_intro" arms.
The forall_elim gate requires premises[0] to be syntactically
ExprKind::Forall { .. }. The exists_intro gate requires the
expected conclusion to be syntactically ExprKind::Exists { .. }.
Without these gates, both rules accepted any premise + any
expected, mirroring the catch-all soundness leak.
Full higher-order matching (verifying expected = body[x := t] for some t) is tracked separately; the current shape gates catch the common misuse of applying these rules to non-quantified terms.
Hypothesis { id, formula } — referencing a hypothesis
Path: validate_hypothesis.
Three gates fire:
- Sanity:
formula == expected(user's formula matches the claim). - Reference:
h{id}is in scope. - Content: the hypothesis at
h{id}actually carriesformulaas its proposition. Without this gate, a hypothesish0 : Pcould be re-labeled by the user as provingQwheneverQ == expectedsyntactically.
Mismatch on (3) returns
ValidationError::PropositionMismatch { expected, actual }.
Rewrite { hypothesis, ... } — substitution via equality
Path: try_rewrite (proof_search) and try_rewrite_at.
The named hypothesis is resolved through find_hypothesis_index.
The resolved hypothesis must structurally be a Binary { op: Eq, .. }; otherwise the gate returns
"rewrite: hypothesis '<name>' is not an equality". There is no
fallback to scanning for the first equality regardless of name —
the user's choice of hypothesis is honoured, including the
diagnostic when that choice is wrong.
The recursive walker try_rewrite_once covers the full ExprKind
shape — Call, MethodCall, Field, TupleIndex, Index, Tuple, Array,
Cast, Pipeline, NullCoalesce, Try — so subterm matching reaches
into function calls and structured literals, not just
arithmetic / Boolean trees.
Exact { term } — discharging by exhibition
Path: apply_exact (tactic_evaluation).
Discharges the goal only when the term satisfies one of:
- A
Pathwhose name resolves to a hypothesis whose proposition is structurally equal to the goal — the canonicalexact hshape. - The goal expression itself, by structural equality (covers
literal
true ⊢ true, reflexive equalities, and verbatim re-statement of the goal).
Anything else returns TacticError::Failed naming the proof term
shape and the goal it failed to discharge.
Commutativity { left, right } — algebraic equation
Path: validate_commutativity → is_commutative_pair.
The operator must be on a whitelist of mathematically commutative
operators: Add, Mul, And, Or, Eq, Ne, BitAnd, BitOr,
BitXor. Anything else (notably Sub, Div, Imply, Lt,
Le, Gt, Ge, Concat, Shl, Shr, In) returns false from
the structural matcher even when operands are syntactically swapped.
Without this gate, "commutativity" proofs of 5 - 3 = 3 - 5 were
accepted.
The Call arm of is_commutative_pair returns false unconditionally
— arbitrary user-defined functions are not assumed commutative by
the validator. Registering specific commutative functions is a
future extension.
QuickExplain { unsat_core, explanation } — unsat-core conclusion
Path: validate_quick_explain.
QuickExplain's semantics: the conjunction of unsat_core is
unsatisfiable, so the rule concludes False. The conclusion is
the literal false regardless of core size. Pre-fix only the
empty-core branch checked expected == False; the non-empty
branch fell through to unconditional accept, so a user could
supply any unsat-core list and claim arbitrary propositions.
Post-fix the gate is uniform: expr_eq(make_false(), expected)
for both empty and non-empty cores. Mismatch returns a
diagnostic naming the core size and what the user claimed.
Full unsat-core soundness — re-checking the conjunction with SMT to confirm it really is unsatisfiable — is tracked separately.
ApplyDef { def_proof, original, name } — definition unfolding
Path: validate_apply_def.
The rule's semantics is "unfold definition name in original
to get expected." The gate requires def_proof.conclusion() to
be a Binary { op: Eq, .. } — the shape of a definitional
equality name == body. Without this gate, a user could pair
any proof of expected with arbitrary original/name claims;
pre-fix def_proof could conclude anything as long as it equalled
expected.
Full apply-def soundness — verifying
expected = original[name := body] — requires substitution
machinery and is tracked separately. The shape gate catches the
most common misuse: a def_proof that isn't a definitional
equality at all.
SkHack { formula, skolemized } — Skolemization premise shape
Path: validate_sk_hack.
Skolemization is the transformation ∃x. P(x) → P(f(free_vars)).
The gate requires formula.kind to be ExprKind::Exists { .. } —
otherwise SkHack is being applied to a non-existential formula
and the rule cannot soundly fire. Pre-fix the formula parameter
was _-bound and completely ignored, so a user could Skolemize
any term (literal true, an arithmetic expression, anything) as
long as skolemized == expected matched syntactically.
Full Skolemization soundness — verifying
skolemized = body[x := f(free_vars)] for some fresh f —
requires higher-order substitution and is tracked separately. The
shape gate covers the most common misuse: applying SkHack to a
formula that isn't existential at all.
PullQuantifier / PushQuantifier / ElimUnusedVars — quantifier rewrites
Path: validate_pull_quantifier, validate_push_quantifier,
validate_elim_unused_vars.
Each rule has a structural prerequisite on its input formula:
- Pull:
(∀x.P) ∧ Q → ∀x.(P ∧ Q)requires the input to be aBinary { left, right, .. }where at least one operand is a quantifier. Otherwise there is nothing to pull out. - Push:
∀x.(P ∧ Q) → (∀x.P) ∧ (∀x.Q)requires a quantifier whose body is aBinary { .. }. Otherwise there is no inner structure to push the quantifier into. - ElimUnusedVars:
∀x.P → P(when x ∉ FV(P)) requires the input to be a quantifier. Non-quantified formulas have nothing to eliminate.
Pre-fix all three validators had their _formula parameter bound
to _ and ignored it entirely; only result == expected was
checked. A user could apply any of these "rewrites" to a literal
true and the rule passed. Post-fix each rule rejects
shape-mismatched inputs with a diagnostic naming the formula that
failed the shape check. Full free-variable / freshness checks are
tracked separately.
IffOEq { iff_proof, left, right } — biconditional to oriented equality
Path: validate_iff_oeq.
After validating iff_proof internally and checking the claim
shape (left = right) == expected, the gate verifies that
iff_proof.conclusion() is structurally Binary { op: Iff | Eq, left: l, right: r } where l == left and r == right. Without
this gate, a user could pair an iff proof of P <=> Q with a
claim A = B for unrelated A, B — pre-fix accepted the
mismatch since the iff-proof internally validated and (A = B) == expected matched.
Symmetry is NOT silently inferred: B <=> A does not satisfy a
claim of A = B even though equality is commutative, because the
gate checks structural shape rather than semantic equivalence —
explicit symmetry must come from a separate proof step.
Distributivity { formula } — algebraic equation
Path: validate_distributivity → is_distributivity_shape.
The formula must structurally match one of the canonical distributivity shapes for one of the recognised (outer, inner) operator pairs:
| Outer | Inner | Domain |
|---|---|---|
Mul | Add | arithmetic |
And | Or | logical |
Or | And | logical, dual |
BitAnd | BitOr | bitwise |
BitOr | BitAnd | bitwise, dual |
Both directions check: left distributivity
a OUTER (b INNER c) = (a OUTER b) INNER (a OUTER c) and right
distributivity (a INNER b) OUTER c = (a OUTER c) INNER (b OUTER c).
Sub-expression comparison uses expr_eq, so arbitrary
sub-expressions work — the rule is structural, not literal. A
formula like BoolLit(true) does not match any shape and is
rejected.
Rewrite-rule conditions — discharge_rewrite_condition
Path: validate_rewrite_rule_application → for each
rule.conditions[i]: discharge_rewrite_condition.
A conditional rewrite (e.g. safe_div(a, b) → a / b requiring
b ≠ 0) is only sound when each instantiated condition holds.
The gate accepts exactly four shapes:
- The literal
true. - Reflexive equality
x == x. - The condition matches a registered axiom.
- The condition matches a hypothesis currently in scope.
Anything else returns
ValidationError::RewriteError { "unverified condition" }
naming the condition. There is no silent acceptance — without one
of these structural matches, the rewrite is rejected with a
remediation hint pointing at register_axiom /
add_hypothesis.
recheck_with_smt — translating hypotheses
Path: recheck_with_smt.
Each hypothesis in scope is translated to a the SMT backend expression via the
shared translator and asserted on the solver. Hypotheses that
don't translate cleanly are skipped (sound conservative — operates
without that assumption rather than asserting vacuous truth).
Pre-fix the loop bound a fresh Bool::new_const(name) and
asserted that, completely discarding the hypothesis's
proposition. the SMT backend saw every hypothesis as h0 := true regardless of
content; re-checks that should have found counterexamples
silently passed.
recheck_with_smt — sort tracking for free variables
Path: interpolation::quantifier_eliminate.
Walks the formula's AST via
collect_typed_variables_from_bool to harvest each variable's
actual sort (Int, Bool, Real, Bool, …). Bound-variable
construction uses Dynamic::fresh_const(name, &sort) so the
existential quantifier binds the same the SMT backend constants that appear
free in the body. Pre-fix all variables were defaulted to Bool,
which left the QE tactic operating on a vacuous quantifier (the SMT backend
distinguishes constants by name AND sort, so Bool::new_const("x")
does not bind a free Int::new_const("x")).
Hypothesis context
HypothesisContext::iter_propositions walks every hypothesis in
every scope, giving soundness-checking code an O(n) way to
recognise a target proposition is currently assumed regardless of
which name it was bound to. Used by discharge_rewrite_condition
and by ad-hoc gates that need to recognise an instantiated
condition as already-in-context.
Bool-typed hypotheses
The proof search engine carries an externally-populated
bool_typed_hypotheses: HashSet<Text> registry. Callers register a
hypothesis as Bool via register_bool_hypothesis(name). When the
cases_on h flow encounters a Path-shaped hypothesis whose name is
in this set, it splits into two subgoals: one carrying
hyp == true and one carrying hyp == false as derived
hypotheses (not bare boolean literals — the equalities connect
the chosen branch back to the variable being analysed).
The engine ships with the registry empty: there are no hardcoded "this name looks like a Bool" heuristics. Bool-ness comes from the caller's typing context.
Variant-type metadata
A single registry, variant_map: HashMap<TypeName, Vec<CtorName>>
populated via register_variant_type(type_name, ctors), drives
multiple gates:
- Constructor case-analysis (
try_cases_on): aCallwhose head is in some registered variant produces one subgoal per registered constructor of that variant. - Variable-type inference (
infer_variable_type): walks the goal/hypotheses forvar == Ctor(...)shapes, looks Ctor up invariant_map, returns the parent type. No name heuristics, no default to "Nat". - Constructor lookup for induction (
get_type_constructors): reads ctor names from the registry; recursive-arg metadata comes from the parallelvariant_recursive_argsregistry populated viaregister_variant_recursion.
The proof engine carries no hardcoded knowledge of stdlib type names. Every variant-type-aware decision flows through the registries.
Reading a soundness diagnostic
When a gate trips, the diagnostic carries enough information to locate the offending proof step and the remediation. Typical shapes:
unknown inference rule '<name>'. Register it via ProofValidator::register_inference_rule before referencing it in a proof.rule '<name>' expects N premises, got Mrewrite: hypothesis '<name>' is not an equality (got <kind>)rewrite rule '<name>' has unverified condition '<expr>' — register it as an axiom or introduce it as a hypothesis before rewritingcommutativity: expressions are not commutative variantsdistributivity: formula <expr> does not match a canonical distributivity shapeinduction: type '<name>' is not registered. Call ProofSearchEngine::register_variant_type first.induction: cannot infer type of variable '<var>'. Register the variant type via register_variant_type, or attach the type to a hypothesis like '<var> == SomeCtor(...)'.
Each diagnostic names what failed, why, and the API call that fixes it. The validator never emits a generic "proof failed" — if the user's tactic is wrong, they know which gate tripped.
Adding a new rule
When implementing validate_<new_rule>:
- Identify the structural premise the rule depends on (e.g. "the formula must be an equality with two halves", "the premise must be quantified").
- Implement that premise as a separate
is_<shape>predicate so it can be exercised independently in tests. - The validator function calls
is_<shape>(args)first; on mismatch returnsValidationError::ValidationFailedwith a diagnostic that names the actual shape and the expected shape. - Never return
Ok(expected.clone())after a no-op pass. The catch-all arm ofapply_inference_rulewas the source of multiple soundness leaks (seeinference_rule_soundness.rsandcommutativity_soundness.rsregression suites) precisely because it took that shortcut.
Regression test corpus
Each gate is locked by a regression test that constructs a proof
term that pre-fix would have silently passed and asserts the
post-fix rejects it. The corpus lives under
the verification test suite:
inference_rule_soundness.rs— unknown rules + arity + quantifier shape gates.hypothesis_content_check.rs— hypothesis-content gate.rewrite_condition_soundness.rs— conditional rewrite gate.recheck_smt_hypothesis_translation.rs— SMT recheck context.expr_eq_alpha_equivalence.rs— α-equivalence inexpr_eq_with_binding.commutativity_soundness.rs— commutativity-operator whitelist.exact_tactic_soundness.rs—exactstructural verification.vcgen_errdefer_normal_path.rs— errdefer no-op on normal path.iff_oeq_soundness.rs— iff_proof-to-claimed-pair link inIffOEq.apply_def_soundness.rs— def_proof-must-conclude-equality gate inApplyDef.
Across the corpus, every "for now, just trust the user" path documented in the source has at least one negative test that exercises the soundness gate.