This PR replaces the previous tactic configuration system with a
significantly more efficient one that supports custom configuration
syntaxes and processing. On a simple benchmark, configuration evaluation
takes 6.2% of the time it used to. The `declare_config_elab` command
generates a configuration elaborator that now directly constructs
configuration objects; previously it relied on `Meta.evalExpr'`, which
involved running a configuration through the full term elaboration,
compilation, and evaluation processes. The generated configuration
elaborators now also have the capability to do direct `Syntax`
evaluation in common cases, skipping term elaboration. Furthermore, the
elaborator accepts configurations more liberally: any user-defined
syntax that has the form of an `optConfig`-style configuration or
configuration item (including, e.g., `namedArgument`s) is accepted.
Import `Lean.Elab.ConfigEval` to use the system; see this module for
some documentation in addition to the docstrings in
`Lean.Elab.ConfigEval.Commands`. Furthermore, the `simp` tactic now also
has `(user.optionName := ...)` user configuration options, which can be
declared using a global `tactic.simp.user.optionName` option; use
`getUserConfigOption` and `withUserConfig` to access and set these in
metaprograms.
Other features:
- `declare_config_elab` creates a function that exposes an `init`
parameter for the configuration that will be modified. It also now has a
`where` clause, enabling defining custom handlers for specific keys.
- Elaborators can be given additional binders, to make parameterized
elaborators. This is used by `simp` and `grind ` to support multiple
default configurations with the correct expected type for `(config :=
...)` elaboration.
- The `EvalTerm` class supports direct evaluation of `Syntax`, skipping
term elaboration. The system will attempt to automatically derive this
class when generating the elaborator.
- In case `EvalTerm` does not recognize the term, then the syntax is
elaborated to an expression and an `EvalExpr` instance is applied to
evaluate the expression. The system will similarly automatically derive
these instances if possible.
- Automatic derivation is *transitive*. It is able to seek instances
through other instances; e.g. if it needs an `EvalTerm (List T)`
instance it will be able to reduce this to seeking an `EvalTerm T`
instance.
- The system is designed to be flexible, and the various components can
be combined to construct a configuration elaborator. There are also now
`declare_core_config_elab` and `declare_term_config_elab` for
conveniently generating elaborators for `CoreM` and `TermElabM`. The
difference is that the first takes an explicit flag for whether to log
exceptions, and the second uses the current `errToSorry` state.
**Warning:** if you use the `TermElabM` one from `TacticM`, it will be
unaware of the current `recover` state. The only differences between
these macros are the ways error recovery is handled per monad.
Other changes:
- `#reduce` tactic configuration now makes use of this system and has
more options
- The module `Lean.Elab.Tactic.ConfigSetter` is removed; the
`declare_config_elab`-family macros subsume its functionality.
- The module `Lean.Elab.Tactic.Config` is deprecated and will be
removed; migration notes appear in the module docs there. Import
`Lean.Elab.ConfigEval` instead.
- One of the mvcgen benchmarks got significantly slower, but it turned
out to be caused by the new tactic configuration elaboration no longer
resetting the MetaM caches. Adding an explicit `resetCache` into the
test driver fixed the benchmark.
### Notes for metaprogram authors
If you are using the module system, you just need a `meta import
Lean.Elab.ConfigEval` to use the macros, and it should serve as a
drop-in replacement to the previous system so long as
1. your configuration type is a `structure` with no parameters, indices,
or universes (only `Type` is supported);
2. default values are self-contained and not dependent on other fields;
and
3. all fields have types that are composed from `Option`, `List`,
`Array`, `String`, `DataValue`, and inductive types in `Type` with no
parameters or indices, whose fields are similarly composed.
The macros synthesize a self-contained configuration elaboration
procedure, analyzing the `EvalTerm` and `EvalExpr` instances that are
currently available or can be automatically derived. These are the
components of the system:
- `EvalTerm` instances provide `Term → TermElabM α` functions for
evaluation of raw syntax; these handle the common cases where an option
value is a identifier, application, or other simple expression. They are
responsible for adding TermInfo when info is enabled, to support hovers.
One can invoke derivation of a `EvalTerm T` instance with the
`ensure_eval_term_instance T` command (after `open scoped
Lean.Elab.ConfigEval`).
- `EvalExpr` instances provide `Expr → TermElabM α` functions for
evaluation of elaborated expressions; these handle cases where an option
value may require reduction to evaluate. Similarly, one can invoke
derivation of an `EvalExpr T` instance with the
`ensure_eval_expr_instance T` command. If needed, there's also
`derive_eval_expr_instance_using_meta_eval T` for creating a
`Meta.evalExpr'`-based evaluator.
- Functions like `ConfigEval.evalExprWithElab` compose `EvalTerm` and
`EvalExpr` instances into a single procedure that first attempts
`EvalTerm`, and, if that fails, applies the standard term elaborator and
then attempts `EvalExpr`. This way term elaboration can be skipped in
all but uncommon cases.
- Configuration item interpretation is through `ConfigEval.foldConfigM`,
which is a procedure with a lax specification for what counts as a
configuration item, calling the provided function on each recognized
configuration item. The idea is:
- Null nodes are lists of configurations
- One-argument nodes are considered to be wrappers like `optConfig` or
`configItem`
- Two-argument nodes of the form `("+"<|>"-") (atom<|>ident)` are
considered to be boolean options
- Five-argument nodes of the form `"(" (atom<|>ident) ":=" syntax ")"`
are considered to be general configuration items. (It only checks for
the presence of `(` and that there are two-to-five arguments.)
- Bare atoms are considered to be positive boolean options
- Configuration evaluation then uses `EvalConfigItem.set` on each item
produced by the fold, for an `EvalConfigItem` structure defined for the
given configuration type. The `def_eval_config_item` command can be used
to generate this structure. It analyzes which `EvalTerm` and `EvalExpr`
instances exist and derives missing ones, then builds an efficient
procedure to process configuration items and apply evaluators.
- Lastly, there are the `declare_core_config_elab`,
`declare_term_config_elab`, `declare_config_elab`, and
`declare_command_config_elab` macros for conveniently running the
`def_eval_config_item` command and constructing a self-contained
elaboration function.
The derivation procedures analyze which `EvalTerm`/`EvalExpr` instances
already exist and only derive the "leaf" instances that are necessary to
construct `EvalTerm` and `EvalExpr` instances. The derived instances are
made `private local` — since configuration elaborators are meant to be
self-contained, we decided not to let the additional instances be a side
effect of the macros. The instances can be globally added by manually
using the `ensure_*` commands.
The macros support making parameterized elaborators with arbitrary
additional binders. See `make_elab_grind_config` and
`make_elab_simp_config` in core Lean for examples of generating a single
elaborator that's used with multiple default value configurations.
To see how to create a key handler that matches all configuration keys
with a given prefix, see `make_elab_simp_config`.
There is a todo item at `Lean.Elab.ConfigEval.ReflectConfigItems` for
reflecting configurations back to syntax, which is not yet supported.
### Performance evaluation
A legacy configuration parser was temporarily added to
`Lean.Elab.Tactic.Grind.Config` using `declare_term_config_elab_legacy
elabGrindConfigLegacy Grind.Config`, and then this file was used for
measuring elaboration time:
```lean
import Lean
open Lean Elab Meta Tactic Parser
def cfgs : Array Syntax := Unhygienic.run do
return #[
← `(Tactic.optConfig| ),
← `(Tactic.optConfig| +clean),
← `(Tactic.optConfig| +trace +markInstances -lookahead -useSorry),
← `(Tactic.optConfig| (trace := true) (markInstances := true) (lookahead := false) (useSorry := false)),
← `(Tactic.optConfig| -trace (splits := 20) +revert (maxSuggestions := some 3) (ematch := 2)),
← `(Tactic.optConfig| (gen := 5) -reducible +splitImp -funCC),
← `(Tactic.optConfig| (config := { trace := true, lookahead := false, maxSuggestions := some 3 })),
]
def testGrindElab (cfgs : Array Syntax) (n : Nat) : TacticM Unit := do
profileitM Exception "test grind elab" (← getOptions) do
let mut ematch := 0
for _ in [0:n] do
for cfg in cfgs do
let c ← Tactic.elabGrindConfig cfg
ematch := ematch + c.ematch
logInfo m!"sum = {ematch}"
def testGrindElabLegacy (cfgs : Array Syntax) (n : Nat) : TacticM Unit := do
profileitM Exception "test grind elab legacy" (← getOptions) do
let mut ematch := 0
for _ in [0:n] do
for cfg in cfgs do
let c ← Tactic.elabGrindConfigLegacy cfg
ematch := ematch + c.ematch
logInfo m!"sum = {ematch}"
def runTest (info : Bool) (test : TacticM Unit) : TermElabM Unit := do
withEnableInfoTree info do
let mvar ← mkFreshExprMVar none
discard <| Tactic.run mvar.mvarId! test
set_option maxHeartbeats 0
set_option profiler true
set_option profiler.threshold 1
def iters : Nat := 1000
#eval runTest false <| testGrindElab cfgs iters
#eval runTest true <| testGrindElab cfgs iters
#eval runTest false <| testGrindElabLegacy cfgs iters
#eval runTest true <| testGrindElabLegacy cfgs iters
```
A representative output is
```
test grind elab took 315ms
test grind elab took 333ms
test grind elab legacy took 5.22s
test grind elab legacy took 5.33s
```
Computing `(315.0 + 333.0) / (5220 + 5330)` and rounding up to the
nearest tenth gives the 6.2% figure.
---
The #13426 draft PR includes some LSP modifications to support
completions for `simp` user configuration options.
This PR adds an experimental tactic `mvcgen'` that will soon replace
`mvcgen`. It has been reimplemented from the ground up using the new
`SymM`-based framework for efficient symbolic evaluation and can
outperform `mvcgen` by a factor of >100x for some synthetic benchmarks.
`mvcgen'` aspires to be feature-complete with `mvcgen`. Known exceptions
currently are join point sharing, introduction of local specs and
smaller bugs.
The implementation of `mvgen'` used to live in the benchmark suite for
rapid prototyping; this commit merely moves it into the Lean toolchain.
Doing so results in an build time instruction count increase in
seemingly unrelated tests such as `elab/delayed_assign//instructions`;
the reason is that the builtin elaborator attribute now pulls in
substantially more import code on startup.
---------
Co-authored-by: Sebastian Graf <sg@lean-fro.org>
This PR fixes the semantics of `mvcgen' with grind`: it now logs an
error per VC that `grind` cannot close and throws at the end, instead of
silently leaving the unsolved VC as a residual. The previous
silent-fallback behaviour is preserved as `mvcgen' with (try grind)`,
which `elabPreTac` recognises and routes to the same efficient `.grind`
path with a `silent` flag.
Co-authored-by: Sebastian Graf <sg@lean-fro.org>
This PR adds a `+debug` config option to `mvcgen'` and a
`BackwardRule.applyChecked` wrapper around `BackwardRule.apply`. On
apply failure with `+debug` set, the wrapper retries on the
`unfoldReducible`-normalized goal type; if the retry succeeds, an
earlier step missed a normalization and `mvcgen'` raises a hard error
naming the rule (auto-derived from `rule.expr.getAppFn` when available)
and showing the original vs. normalized types.
---------
Co-authored-by: Sebastian Graf <sg@lean-fro.org>
This PR cleans up `tests/bench/mvcgen/sym/lib/VCGen/Util.lean`.
- Drops the four `mkApp{Rev,RevRange,Range,N}S` helpers that were
vendored locally; they are now public in `Lean.Meta.Sym.Internal`
(`Lean.Meta.Sym.AlphaShareBuilder`).
- Moves `Std.HashMap.getDM` out of the root `Std.HashMap` namespace into
`namespace VCGen` and relocates it to `RuleCache.lean`, where its only
call sites live.
---------
Co-authored-by: Sebastian Graf <sg@lean-fro.org>
This PR fixes an issue in the `mvcgen'` prototype where user-provided
invariant values were elaborated against a goal type containing
reducible abbreviations like `PostShape.args ps`, baking
`(PostShape.args PostShape.pure)` into the assignment instead of `[]`.
After `tryInlineInvariant` confirms the user tactic assigned the
invariant metavariable, reduce its assignment with `unfoldReducible`.
Fixes the `mvcgen'` migration path for several `tests/elab/*` proofs
that had been blocked on the entailment-phase apply failure.
Co-authored-by: Sebastian Graf <sg@lean-fro.org>
This PR brings the Sym-based `mvcgen'` to feature parity with `mvcgen`;
the only remaining gap is `+jp` (join-point handling).
The slight benchmark regressions are due to simplifying VCs out of
`SPred` form, hitting hardest on cases with linearly many VCs like
`PurePreCond`. The ~10% vcgen slowdown is worth it for the cleaner
user-visible VCs.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR drops `LEAN_NUM_THREADS=1` from the `run_test.sh` of
`bench/mvcgen/sym`. The single-threaded restriction was originally there
to get reproducible benchmark timings, but `run_test.sh` runs as a test
rather than a benchmark, and we do not care about timing reproducibility
for tests. Allowing the default thread count cuts the wall time of what
was the slowest ctest entry from ~30s to ~20s.
This PR applies two minor tweaks:
- `tests/bench/sym/simp_1.lean`: share-common the proof term before
counting objects in `getProofSize`, so the reported size reflects the
shared representation.
- `tests/elab/sym_simp_3.lean`: use `>>` instead of `.andThen` when
composing `Sym.Simp` methods.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This PR updates two Sym benchmarks (`add_sub_cancel.lean` and
`meta_simp_1.lean`) to use the current `SymM.run` API. Both files still
referenced `run'`, which no longer exists, so they failed to elaborate.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This PR removes a FIXME in Sym-based mvcgen' concerning a hardcoded step
limit for grind simplification. I tested that this is no longer
necessary even at highest setting for the GetThrowSetGrind benchmark.
This PR replaces eager let-expression zeta-reduction in the sym-based
mvcgen with on-demand unfolding that mirrors the production mvcgen's
behavior.
Previously, all let-expressions in the program head were immediately
zeta-reduced. Now, let-expressions are hoisted to the top of the goal
target, and the value is only inlined if it is duplicable (literals,
fvars, consts, `OfNat.ofNat`). Complex values are introduced into the
local context via `introsSimp`, preserving SymM's maximal sharing
invariants, and unfolded on demand when the fvar later appears as the
program head.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR switches the mvcgen worklist from BFS (queue) to DFS (stack)
ordering for subgoal processing.
With the new do elaborator, `if`-without-`else` generates asymmetric
bind depth between branches (`pure () >>= cont` is optimized to just
`cont` in the else branch). This caused BFS-based VC numbering to depend
on elaborator internals, swapping vc10/vc11 in test cases. DFS ordering
follows the syntactic program structure more naturally and is robust to
such bind-depth asymmetries.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR adds `simplifying_assumptions [Nat.add_assoc]` to the
`vcgen_get_throw_set_grind` benchmark, recovering hypothesis
simplification lost in a54eafb ("refactor: decouple solve from grind").
That refactor introduced `PreTac.processHypotheses` which wraps
`simpNewHyps`, but the call sites in `work` and `main` call
`Grind.processHypotheses` directly, leaving `simpNewHyps` as dead code.
Without it, long `s + 1 + … + 1` chains are never collapsed, causing an
asymptotic slowdown visible by a factor of 2 at n=150 (largest radar
input size).
Benchmark results (VCGen time in ms):
| n | Before | After | Speedup |
|---|--------|-------|---------|
| 50 | 222 | 186 | 1.2× |
| 100 | 391 | 251 | 1.6× |
| 150 | 647 | 329 | 2.0× |
| 200 | 995 | 415 | 2.4× |
| 300 | 1894 | 589 | 3.2× |
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR replaces the manual `simpForallDomains` / `implies_congr`
machinery in `introsSimp` with `Sym.Simp.simpTelescope` as the
pre-combinator, which already handles simplifying forall telescope
domains.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR adds a `simplifying_assumptions` clause to the `mvcgen'` tactic
that allows users to specify Sym.simp rewrite theorems for simplifying
hypotheses during VC generation. The syntax is `mvcgen'
simplifying_assumptions [thm₁, thm₂, ...]`. This replaces the previous
approach of hardcoding `reassocNatAdd` in `mvcgen' with grind` mode,
making hypothesis simplification user-extensible and independent of
grind.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR refactors the sym-based VCGen (`tests/bench/mvcgen/sym`) to
separate concerns between
goal decomposition and VC discharge, following the architecture of
loom2's `mvcgen'`.
- `solve` now operates on plain `MVarId` with no knowledge of grind,
returning `List MVarId`
in `SolveResult.goals`.
- `work` handles grind E-graph internalization: after `solve` returns
multiple subgoals, it
calls `processHypotheses` on the parent goal to share context before
forking.
- `emitVC` dispatches on a new `PreTac` enum (`.none`, `.grind`,
`.tactic`) to try solving
each VC, replacing the previous inline grind logic and post-hoc tactic
loop in the elaborator.
- The redundant `WorkItem` wrapper (which duplicated `Grind.Goal`'s
`mvarId`) is removed; the
worklist operates directly on `Grind.Goal`.
- `GrindContext` is replaced by `PreTac` + `hypSimpMethods` fields in
`VCGen.Context`, cleanly
separating hypothesis simplification from the discharge strategy.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR fixes the sym mvcgen benchmark script and tunes input sizes.
**run_bench.sh**: Replace `| tee` with the `capture` helper from
`util.sh`.
Without `pipefail`, piping through `tee` masks non-zero exit codes from
`lake build`, so build failures (OOM, stack overflow) go unnoticed.
**Benchmark sizes**: Scale down inputs for benchmarks that exceeded the
2s
budget so each benchmark completes in 1-2s across its 3 linearly
increasing
inputs.
**Metric collision**: Copy `GetThrowSet.Goal` into a `GetThrowSetGrind`
namespace so the grind variant reports as `GetThrowSetGrind(n)` instead
of
colliding with `GetThrowSet(n)` in `measurements.jsonl`.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR adds iota reduction to the sym-based `mvcgen'` tactic by calling
`reduceRecMatcher?` before falling back to the match split backward
rule.
When a matcher/recursor has a concrete discriminant, it is reduced
directly
instead of constructing and applying a splitting backward rule, which is
significantly faster for benchmarks like `MatchIota` (previously
`MatchSplit`)
where `loop n` unrolls into `n` nested matches with known `Nat`
discriminants.
The old `MatchSplit` test case (concrete discriminants) is renamed to
`MatchIota`
and a new `MatchSplit` test case with symbolic discriminants (matching
on state)
is added to keep exercising the split backward rule code path.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR extends the sym-based `mvcgen'` tactic with two new modes:
1. `mvcgen' with <tac>`: run VCGen, then apply `<tac>` to each remaining
VC.
2. `mvcgen' with grind`: integrate grind into the VCGen loop for
incremental context internalization. Each VC inherits the parent's
E-graph state, so hypothesis processing is shared across sibling VCs,
avoiding O(n) re-internalization per VC.
The grind mode accepts the full grind configuration syntax (`mvcgen'
with grind (config := { ... }) [params]`).
A persistent `Sym.Simp` cache with a `reassocNatAdd` simproc normalizes
hypothesis types (e.g., `s + 1 + 1 + 1` → `s + 3`) before grind
internalization, achieving O(1) amortized simplification per VC.
Benchmark results for GetThrowSet (`mvcgen' with grind`):
- n=100: 400ms total, 180ms kernel
- n=250: 855ms total, 1.8s kernel
- n=500: 1.9s total, 11.8s kernel
Kernel checking time grows superlinearly and is the dominant cost at
larger sizes. This is a separate issue from VCGen performance.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The ultimate goal of this work is to turn production of `.ir` files into
separate build step so that it does not block non-`meta` imports and can
be skipped entirely when not needed. This PR implements the main logic
of this new `leanir` compiler executable and runs it after `lean` inside
the same Lake build step but leaves its use disabled behind a
`compiler.postponeCompile` flag until further Lake adjustments move it
to a separate facet so that its use can be actually beneficial.
---------
Co-authored-by: Joscha <joscha@plugh.de>
This PR adds per-result `contextDependent` tracking to `Sym.Simp.Result`
and splits the simplifier cache into persistent (context-independent)
and transient (context-dependent, cleared on binder entry). This
replaces the coarse `wellBehavedMethods` flag.
Key changes:
- Add `contextDependent : Bool := false` to `Result.rfl` and
`Result.step`
- Split `State.cache` into `persistentCache` and `transientCache`
- Remove `wellBehavedMethods` from `Methods`
- Replace `withoutModifyingCacheIfNotWellBehaved` with
`withFreshTransientCache`
- Change `DischargeResult` to an inductive (`.failed`/`.solved`)
- Add `dischargeAssumption` (context-dependent discharger for testing)
- Add `sym.simp.debug.cache` trace class
- Propagate `contextDependent` through all combinators (congruence,
transitivity, control flow, arrows, rewriting)
- Add `mkRflResult`/`mkRflResultCD` to avoid dynamic allocation of rfl
results
- Fix `isRfl` to ignore `contextDependent` (was silently broken by the
extra field)
Propagation invariant: when combining sub-results, `cd` is the
disjunction of ALL sub-results' flags — including `.rfl` results. If
`simp` returned `.rfl (contextDependent := true)`, it means `simp` might
take a completely different code path in another local context, so all
downstream results must be marked context-dependent.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The tests need to run with certain environment variables set that only
cmake really knows and that differ between stages. Cmake could just set
the variables directly when running the tests and benchmarks, but that
would leave no good way to manually run a single benchmark. So cmake
generates some stage-specific scripts instead that set the required
environment variables.
Previously, those scripts were sourced directly by the individual
`run_*` scripts, so the env scripts of different stages would overwrite
each other. This PR changes the setup so they can instead be generated
next to each other. This also simplifies the `run_*` scripts themselves
a bit, and makes `tests/bench/build` less of a hack.
This PR generalizes the sym MVCGen's match splitting from `ite`-only to
`ite`, `dite`, and arbitrary matchers. Previously, only `ite` was
supported; `dite` and match expressions were rejected with an error.
`mkBackwardRuleForSplit` uses `SplitInfo.splitWith` to build the
splitting proof. Hypothesis types are discovered via `rwIfOrMatcher`
inside the splitter telescope, and `TransformAltFVars.all` provides the
proper fvars for `mkForallFVars`. Subgoal type metavariables use
`mkFreshExprSyntheticOpaqueMVar` so that `rwIfOrMatcher`'s internal
`assumption` tactic cannot assign them.
Adds `DiteSplit`, `MatchSplit`, and `MatchSplitState` test cases and a
`vcgen_match_split` benchmark.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR adds support for simp/equational spec theorems in the SymM-based
`mvcgen'` tactic,
catching up with a feature that the original `mvcgen` has supported for
a long time.
Users can write `@[spec] theorem : get (m := StateT σ m) = fun s => pure
(s, s) := rfl`
instead of manually specifying equivalent Hoare triples. The equational
form is more
concise and natural for specs that simply unfold definitions.
The universe level normalization (`normalizeLevelsExpr`) applied in
`work` and the backward
rule constructors is a workaround; ideally this should be integrated
into
`preprocessMVar`/`preprocessExpr` in the SymM framework so all users
benefit.
Changes:
- Add `SpecTheoremKind` to distinguish triple vs simp specs in
`SpecTheoremNew`
- Add `mkSpecTheoremNewFromSimpDecl?` to create spec entries from
equational lemmas, filtering no-op equations
- Add `mkBackwardRuleFromSimpSpec` to build backward rules via
`Eq.mpr`/`congrArg`, with instance synthesis, projection reduction, and
`unfoldReducible` on the RHS
- Migrate simp theorems from `SimpTheorems` database during
`migrateSpecTheoremsDatabase`
- Normalize universe levels so structural matching in
`BackwardRule.apply` succeeds when `max u v` vs `max v u` arise from
different code paths
- Simplify `mkSpecContext` by removing the mock `simp` context
construction
- Use `mkBackwardRuleFromExpr` instead of `mkAuxLemma` for triple specs,
since the proof may contain free variables from the goal context
- Add `AddSubCancelSimp` benchmark case and test exercising the simp
spec code path
- Change `AddSubCancel` spec proofs from `mvcgen` to `mvcgen'`
(dogfooding)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR extracts the example programs from the sym mvcgen benchmarks
into
shared `Cases.*` modules so that both benchmarks and a new fast test
suite
can reuse them. It also renames `vcgen_deep_add_sub_cancel` to
`vcgen_add_sub_cancel_deep` for consistency.
The test suite (`test_vcgen.lean`) runs all cases at n=10, completing in
~2s vs minutes for the full benchmarks. It is wired up as a `lake test`
driver and integrated with the lean4 test/bench infrastructure via
`run_test`/`run_bench` scripts registered in `CMakeLists.txt`.
Benchmark output now uses aligned `CaseName(n):` labels. The `run_bench`
script extracts per-case vcgen and kernel timings into
`measurements.jsonl`.
Benchmarks run single-threaded (`LEAN_NUM_THREADS=1`) for
reproducibility.
`vcgen_get_throw_set` is excluded from benchmarks due to pathological
`instantiateMVars` behavior.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR implements a merge sort algorithm on arrays. It has been
measured to be about twice as fast as `List.mergeSort` for large arrays
with random elements, but for small or almost sorted ones, the list
implementation is faster. Compared to `Array.qsort`, it is stable and
has O(n log n) worst-case cost. Note: There is still a lot of potential
for optimization. The current implementation allocates O(n log n)
arrays, one per recursive call.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This PR deprecates `levelZero` in favor of `Level.zero` and `levelOne`
in favor of the new `Level.one`, and updates all usages throughout the
codebase. The `levelZero` alias was previously required for computed
field `data` to work, but this is no longer needed.
🤖 Prepared with Claude Code
This PR changes the spec lookup procedure in Sym-based mvcgen so that
1. Spec candidates are sorted first before being filtered
2. Instead of filtering the whole set of candidates using
`spec.pattern.match?`, we take the first match with the highest
priority.
The second point means we will do a lot fewer matches when the highest
priority spec matches immediately. In this case, the one match is still
partially redundant with the final application of the backward rule
application. It would be great if could somehow specialize the backward
rule after it has been created. Still, this yields some welcome
speedups. Before and after for each.
```
vcgen_add_sub_cancel:
goal_1000: 865 ms, 1 VCs by grind: 228 ms, kernel: 435 ms
goal_1000: 540 ms, 1 VCs by grind: 229 ms, kernel: 426 ms
vcgen_ping_pong:
goal_1000: 458 ms, 0 VCs, kernel: 431 ms
goal_1000: 454 ms, 0 VCs, kernel: 443 ms (unchanged, because there is only ever one candidate spec)
vcgen_deep_add_sub_cancel:
goal_1000: 986 ms, 1 VCs by grind: 234 ms, kernel: 735 ms
goal_1000: 728 ms, 1 VCs by grind: 231 ms, kernel: 708 ms
vcgen_reader_state:
goal_1000: 746 ms, 1 VCs by sorry: 1 ms, kernel: 803 ms
goal_1000: 525 ms, 1 VCs by sorry: 1 ms, kernel: 840 ms
```
This PR adds support for generating and discharging postcondition VCs in
Sym-based `mvcgen`. It also adds a new benchmark case
`vcgen_ping_pong.lean` that tests this functionality. This benchmark
required a more diligent approach to maintain maximal sharing in goal
preprocessing. Goal preprocessing was subsequently merged into the main
VC generation function.
This PR sets up the new integrated test/bench suite. It then migrates
all benchmarks and some related tests to the new suite. There's also
some documentation and some linting.
For now, a lot of the old tests are left alone so this PR doesn't become
even larger than it already is. Eventually, all tests should be migrated
to the new suite though so there isn't a confusing mix of two systems.
This PR changes all `lean-toolchain` to use relative toolchain paths
instead of `lean4` and `lean4-stage0` identifiers, which removes the
need for manually linking toolchains via Elan.
After this PR, at least Elan 4.2.0 and 0.0.224 of the Lean VS Code
extension will be needed to edit core.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
This PR adds the benchmark vcgen_reader_state that is a variant of
vcgen_add_sub_cancel that takes the value to subtract from a `ReaderT`
layer. Measurements:
```
goal_100: 201 ms, 1 VCs by sorry: 0 ms, kernel: 52 ms
goal_500: 382 ms, 1 VCs by sorry: 0 ms, kernel: 327 ms
goal_1000: 674 ms, 1 VCs by sorry: 1 ms, kernel: 741 ms
```
Which suggests it scales linearly. The generated VC triggers superlinear
behavior in `grind`, though, hence it is discharged by `sorry`.
This PR adds a System F formalization as a `cbv` tactic benchmark. It is
a translation of the Rocq case study from:
*Definitional Proof Irrelevance Made Accessible* by Thiago Felicissimo,
Yann Leray, Loïc Pujet, Nicolas Tabareau, Éric Tanter, Théo Winterhalter
The authors have given permission to use their development.
The benchmark includes:
- A full System F formalization (substitution lemmas, confluence of
λ-calculus, strong normalization)
- A `pow2DoubleEq` benchmark that verifies 2^(n+1) = 2^n + 2^n via
normalization in System F, measuring both `cbv` tactic time and kernel
checking time for n = 0..6
Co-Authored-By: @david-christiansen
Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
This PR improves the Sym VCGen such that we can use Sym.simp to unfold
definitions in the benchmark driver. To do so, it adds support for
zeta-reduction in the VCGen and ensures that proof terms are maximally
shared before being sent to the kernel.