lean4-htt

Author	SHA1	Message	Date
Sebastian Graf	422920f643	feat: mvcgen', an experimental SymM-based implementation mvcgen (#13644 ) This PR adds an experimental tactic `mvcgen'` that will soon replace `mvcgen`. It has been reimplemented from the ground up using the new `SymM`-based framework for efficient symbolic evaluation and can outperform `mvcgen` by a factor of >100x for some synthetic benchmarks. `mvcgen'` aspires to be feature-complete with `mvcgen`. Known exceptions currently are join point sharing, introduction of local specs and smaller bugs. The implementation of `mvgen'` used to live in the benchmark suite for rapid prototyping; this commit merely moves it into the Lean toolchain. Doing so results in an build time instruction count increase in seemingly unrelated tests such as `elab/delayed_assign//instructions`; the reason is that the builtin elaborator attribute now pulls in substantially more import code on startup. --------- Co-authored-by: Sebastian Graf <sg@lean-fro.org>	2026-05-07 12:53:02 +00:00
Sebastian Graf	36a54dbe9c	test: `mvcgen' with grind` should fail when `grind` fails (#13672 ) This PR fixes the semantics of `mvcgen' with grind`: it now logs an error per VC that `grind` cannot close and throws at the end, instead of silently leaving the unsolved VC as a residual. The previous silent-fallback behaviour is preserved as `mvcgen' with (try grind)`, which `elabPreTac` recognises and routes to the same efficient `.grind` path with a `silent` flag. Co-authored-by: Sebastian Graf <sg@lean-fro.org>	2026-05-06 21:41:38 +00:00
Sebastian Graf	a81628ddbf	test: add `BackwardRule.applyChecked` debug wrapper for `mvcgen'` (#13671 ) This PR adds a `+debug` config option to `mvcgen'` and a `BackwardRule.applyChecked` wrapper around `BackwardRule.apply`. On apply failure with `+debug` set, the wrapper retries on the `unfoldReducible`-normalized goal type; if the retry succeeds, an earlier step missed a normalization and `mvcgen'` raises a hard error naming the rule (auto-derived from `rule.expr.getAppFn` when available) and showing the original vs. normalized types. --------- Co-authored-by: Sebastian Graf <sg@lean-fro.org>	2026-05-06 21:33:27 +00:00
Sebastian Graf	d34811b5c9	test: tidy `Util.lean` in the `mvcgen'` prototype (#13667 ) This PR cleans up `tests/bench/mvcgen/sym/lib/VCGen/Util.lean`. - Drops the four `mkApp{Rev,RevRange,Range,N}S` helpers that were vendored locally; they are now public in `Lean.Meta.Sym.Internal` (`Lean.Meta.Sym.AlphaShareBuilder`). - Moves `Std.HashMap.getDM` out of the root `Std.HashMap` namespace into `namespace VCGen` and relocates it to `RuleCache.lean`, where its only call sites live. --------- Co-authored-by: Sebastian Graf <sg@lean-fro.org>	2026-05-06 18:00:25 +00:00
Sebastian Graf	65906d9ccc	test: unfold reducible abbreviations in inline-elaborated invariant values (#13668 ) This PR fixes an issue in the `mvcgen'` prototype where user-provided invariant values were elaborated against a goal type containing reducible abbreviations like `PostShape.args ps`, baking `(PostShape.args PostShape.pure)` into the assignment instead of `[]`. After `tryInlineInvariant` confirms the user tactic assigned the invariant metavariable, reduce its assignment with `unfoldReducible`. Fixes the `mvcgen'` migration path for several `tests/elab/*` proofs that had been blocked on the entailment-phase apply failure. Co-authored-by: Sebastian Graf <sg@lean-fro.org>	2026-05-06 17:58:03 +00:00
Sebastian Graf	42eb0385a5	test: bring Sym-based `mvcgen'` on par with `mvcgen` (#13578 ) This PR brings the Sym-based `mvcgen'` to feature parity with `mvcgen`; the only remaining gap is `+jp` (join-point handling). The slight benchmark regressions are due to simplifying VCs out of `SPred` form, hitting hardest on cases with linearly many VCs like `PurePreCond`. The ~10% vcgen slowdown is worth it for the cleaner user-visible VCs. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-05 13:17:20 +00:00
Sebastian Graf	cae4decead	test: speed up `bench/mvcgen/sym` ctest entry (#13498 ) This PR drops `LEAN_NUM_THREADS=1` from the `run_test.sh` of `bench/mvcgen/sym`. The single-threaded restriction was originally there to get reproducible benchmark timings, but `run_test.sh` runs as a test rather than a benchmark, and we do not care about timing reproducibility for tests. Allowing the default thread count cuts the wall time of what was the slowest ctest entry from ~30s to ~20s.	2026-04-22 13:41:07 +00:00
Sebastian Ullrich	80cbab1642	chore: don't fail on running build bench on built stage3 (#13467 )	2026-04-18 22:07:21 +00:00
Leonardo de Moura	c0a53ffe97	chore: minor tweaks to `Sym.simp` test and benchmark (#13468 ) This PR applies two minor tweaks: - `tests/bench/sym/simp_1.lean`: share-common the proof term before counting objects in `getProofSize`, so the reported size reflects the shared representation. - `tests/elab/sym_simp_3.lean`: use `>>` instead of `.andThen` when composing `Sym.Simp` methods. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 21:11:30 +00:00
Leonardo de Moura	615f45ad7a	chore: fix Sym benchmarks using stale `run'` API (#13450 ) This PR updates two Sym benchmarks (`add_sub_cancel.lean` and `meta_simp_1.lean`) to use the current `SymM.run` API. Both files still referenced `run'`, which no longer exists, so they failed to elaborate. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 21:31:57 +00:00
Sebastian Ullrich	b09d39a766	chore: build bench: replace warmup with selective build (#13432 )	2026-04-17 08:01:17 +00:00
Jason Yuen	3770b3dcb8	chore: fix spelling errors (#13274 ) This PR fixed typos: ``` pip install codespell --upgrade codespell --summary --ignore-words-list enew,forin,fro,happend,hge,ihs,iterm,spred --skip stage0 --check-filenames codespell --summary --ignore-words-list enew,forin,fro,happend,hge,ihs,iterm,spred --skip stage0 --check-filenames --regex '[A-Z][a-z]' codespell --summary --ignore-words-list enew,forin,fro,happend,hge,ihs,iterm,spred --skip stage0 --check-filenames --regex "\b[a-z']" ```	2026-04-04 07:34:34 +00:00
Sebastian Ullrich	5bf590e710	chore: fixes from #13103 "enable separate codegen" (#13241 )	2026-04-02 11:13:22 +00:00
Sebastian Graf	ffc2c0ab1a	chore: remove hardcoded maxSteps limit in Sym mvcgen' (#13252 ) This PR removes a FIXME in Sym-based mvcgen' concerning a hardcoded step limit for grind simplification. I tested that this is no longer necessary even at highest setting for the GetThrowSetGrind benchmark.	2026-04-02 09:16:43 +00:00
Sebastian Graf	504e099c5d	test: lazy let-binding unfolding in sym mvcgen (#13210 ) This PR replaces eager let-expression zeta-reduction in the sym-based mvcgen with on-demand unfolding that mirrors the production mvcgen's behavior. Previously, all let-expressions in the program head were immediately zeta-reduced. Now, let-expressions are hoisted to the top of the goal target, and the value is only inlined if it is duplicable (literals, fvars, consts, `OfNat.ofNat`). Complex values are introduced into the local context via `introsSimp`, preserving SymM's maximal sharing invariants, and unfolded on demand when the fvar later appears as the program head. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-31 15:29:11 +00:00
Sebastian Graf	a88f81bc28	test: use DFS ordering for subgoals in mvcgen (#13193 ) This PR switches the mvcgen worklist from BFS (queue) to DFS (stack) ordering for subgoal processing. With the new do elaborator, `if`-without-`else` generates asymmetric bind depth between branches (`pure () >>= cont` is optimized to just `cont` in the else branch). This caused BFS-based VC numbering to depend on elaborator internals, swapping vc10/vc11 in test cases. DFS ordering follows the syntactic program structure more naturally and is robust to such bind-depth asymmetries. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 17:11:13 +00:00
Sebastian Graf	75ec8e42c8	test: simplify assumptions in `mvcgen' with grind` benchmark (#13186 ) This PR adds `simplifying_assumptions [Nat.add_assoc]` to the `vcgen_get_throw_set_grind` benchmark, recovering hypothesis simplification lost in `a54eafb` ("refactor: decouple solve from grind"). That refactor introduced `PreTac.processHypotheses` which wraps `simpNewHyps`, but the call sites in `work` and `main` call `Grind.processHypotheses` directly, leaving `simpNewHyps` as dead code. Without it, long `s + 1 + … + 1` chains are never collapsed, causing an asymptotic slowdown visible by a factor of 2 at n=150 (largest radar input size). Benchmark results (VCGen time in ms): \| n \| Before \| After \| Speedup \| \|---\|--------\|-------\|---------\| \| 50 \| 222 \| 186 \| 1.2× \| \| 100 \| 391 \| 251 \| 1.6× \| \| 150 \| 647 \| 329 \| 2.0× \| \| 200 \| 995 \| 415 \| 2.4× \| \| 300 \| 1894 \| 589 \| 3.2× \| 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 10:03:43 +00:00
Sebastian Graf	42854412c3	refactor: use `simpTelescope` in `mvcgen' simplifying_assumptions` (#13159 ) This PR replaces the manual `simpForallDomains` / `implies_congr` machinery in `introsSimp` with `Sym.Simp.simpTelescope` as the pre-combinator, which already handles simplifying forall telescope domains. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 15:48:35 +00:00
Sebastian Graf	210d4d00fa	test: add `simplifying_assumptions` clause to `mvcgen'` tactic (#13156 ) This PR adds a `simplifying_assumptions` clause to the `mvcgen'` tactic that allows users to specify Sym.simp rewrite theorems for simplifying hypotheses during VC generation. The syntax is `mvcgen' simplifying_assumptions [thm₁, thm₂, ...]`. This replaces the previous approach of hardcoding `reassocNatAdd` in `mvcgen' with grind` mode, making hypothesis simplification user-extensible and independent of grind. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 13:16:23 +00:00
Sebastian Graf	a54eafb84f	refactor: decouple `solve` from grind in sym-based mvcgen (#13133 ) This PR refactors the sym-based VCGen (`tests/bench/mvcgen/sym`) to separate concerns between goal decomposition and VC discharge, following the architecture of loom2's `mvcgen'`. - `solve` now operates on plain `MVarId` with no knowledge of grind, returning `List MVarId` in `SolveResult.goals`. - `work` handles grind E-graph internalization: after `solve` returns multiple subgoals, it calls `processHypotheses` on the parent goal to share context before forking. - `emitVC` dispatches on a new `PreTac` enum (`.none`, `.grind`, `.tactic`) to try solving each VC, replacing the previous inline grind logic and post-hoc tactic loop in the elaborator. - The redundant `WorkItem` wrapper (which duplicated `Grind.Goal`'s `mvarId`) is removed; the worklist operates directly on `Grind.Goal`. - `GrindContext` is replaced by `PreTac` + `hypSimpMethods` fields in `VCGen.Context`, cleanly separating hypothesis simplification from the discharge strategy. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-26 11:19:08 +00:00
Sebastian Graf	51f67be2bd	chore: remove unnecessary level normalization from Sym-based mvcgen (#13119 ) This PR removes level normalization from Sym-based mvcgen that is unnecessary after #12923.	2026-03-25 16:06:57 +00:00
Sebastian Graf	e60078db3b	test: harden sym mvcgen bench script and tune benchmark sizes (#13107 ) This PR fixes the sym mvcgen benchmark script and tunes input sizes. run_bench.sh: Replace `\| tee` with the `capture` helper from `util.sh`. Without `pipefail`, piping through `tee` masks non-zero exit codes from `lake build`, so build failures (OOM, stack overflow) go unnoticed. Benchmark sizes: Scale down inputs for benchmarks that exceeded the 2s budget so each benchmark completes in 1-2s across its 3 linearly increasing inputs. Metric collision: Copy `GetThrowSet.Goal` into a `GetThrowSetGrind` namespace so the grind variant reports as `GetThrowSetGrind(n)` instead of colliding with `GetThrowSet(n)` in `measurements.jsonl`. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-24 21:14:36 +00:00
Sebastian Graf	a824e5b85e	test: add iota reduction via reduceRecMatcher? to sym-based mvcgen' (#13100 ) This PR adds iota reduction to the sym-based `mvcgen'` tactic by calling `reduceRecMatcher?` before falling back to the match split backward rule. When a matcher/recursor has a concrete discriminant, it is reduced directly instead of constructing and applying a splitting backward rule, which is significantly faster for benchmarks like `MatchIota` (previously `MatchSplit`) where `loop n` unrolls into `n` nested matches with known `Nat` discriminants. The old `MatchSplit` test case (concrete discriminants) is renamed to `MatchIota` and a new `MatchSplit` test case with symbolic discriminants (matching on state) is added to keep exercising the split backward rule code path. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-24 12:52:01 +00:00
Sebastian Graf	83c6f6e5ac	test: add `mvcgen' with <tac>` and `mvcgen' with grind` to sym-based VCGen (#12893 ) This PR extends the sym-based `mvcgen'` tactic with two new modes: 1. `mvcgen' with <tac>`: run VCGen, then apply `<tac>` to each remaining VC. 2. `mvcgen' with grind`: integrate grind into the VCGen loop for incremental context internalization. Each VC inherits the parent's E-graph state, so hypothesis processing is shared across sibling VCs, avoiding O(n) re-internalization per VC. The grind mode accepts the full grind configuration syntax (`mvcgen' with grind (config := { ... }) [params]`). A persistent `Sym.Simp` cache with a `reassocNatAdd` simproc normalizes hypothesis types (e.g., `s + 1 + 1 + 1` → `s + 3`) before grind internalization, achieving O(1) amortized simplification per VC. Benchmark results for GetThrowSet (`mvcgen' with grind`): - n=100: 400ms total, 180ms kernel - n=250: 855ms total, 1.8s kernel - n=500: 1.9s total, 11.8s kernel Kernel checking time grows superlinearly and is the dominant cost at larger sizes. This is a separate issue from VCGen performance. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-24 11:27:13 +00:00
Henrik Böving	9676f54cc5	chore: pass the previous stage libLake as plugin (#13000 ) This PR avoids bootstrapping headaches when ABI breakages affect lake.	2026-03-20 12:23:20 +00:00
Sebastian Ullrich	c6a89cc716	feat: experimental option to move non-meta compilation out of `lean` build step (#10291 ) The ultimate goal of this work is to turn production of `.ir` files into separate build step so that it does not block non-`meta` imports and can be skipped entirely when not needed. This PR implements the main logic of this new `leanir` compiler executable and runs it after `lean` inside the same Lake build step but leaves its use disabled behind a `compiler.postponeCompile` flag until further Lake adjustments move it to a separate facet so that its use can be actually beneficial. --------- Co-authored-by: Joscha <joscha@plugh.de>	2026-03-20 10:39:39 +00:00
Leonardo de Moura	d2907b5c96	feat: add `contextDependent` to Sym.simp Result with two-tier cache (#12996 ) This PR adds per-result `contextDependent` tracking to `Sym.Simp.Result` and splits the simplifier cache into persistent (context-independent) and transient (context-dependent, cleared on binder entry). This replaces the coarse `wellBehavedMethods` flag. Key changes: - Add `contextDependent : Bool := false` to `Result.rfl` and `Result.step` - Split `State.cache` into `persistentCache` and `transientCache` - Remove `wellBehavedMethods` from `Methods` - Replace `withoutModifyingCacheIfNotWellBehaved` with `withFreshTransientCache` - Change `DischargeResult` to an inductive (`.failed`/`.solved`) - Add `dischargeAssumption` (context-dependent discharger for testing) - Add `sym.simp.debug.cache` trace class - Propagate `contextDependent` through all combinators (congruence, transitivity, control flow, arrows, rewriting) - Add `mkRflResult`/`mkRflResultCD` to avoid dynamic allocation of rfl results - Fix `isRfl` to ignore `contextDependent` (was silently broken by the extra field) Propagation invariant: when combining sub-results, `cd` is the disjunction of ALL sub-results' flags — including `.rfl` results. If `simp` returned `.rfl (contextDependent := true)`, it means `simp` might take a completely different code path in another local context, so all downstream results must be marked context-dependent. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 00:22:08 +00:00
Garmelon	e14230c0f3	chore: import measure.py instead of calling it (#12962 ) Thereby avoiding the overhead of one python interpreter per wrapped lean call during the build benchmarks.	2026-03-18 10:23:32 +00:00
Garmelon	49715fe63c	chore: improve how test suite interacts with stages (#12913 ) The tests need to run with certain environment variables set that only cmake really knows and that differ between stages. Cmake could just set the variables directly when running the tests and benchmarks, but that would leave no good way to manually run a single benchmark. So cmake generates some stage-specific scripts instead that set the required environment variables. Previously, those scripts were sourced directly by the individual `run_` scripts, so the env scripts of different stages would overwrite each other. This PR changes the setup so they can instead be generated next to each other. This also simplifies the `run_` scripts themselves a bit, and makes `tests/bench/build` less of a hack.	2026-03-16 15:20:03 +00:00
Sebastian Graf	e6d9220eee	test: add dite and match splitting to sym-based MVCGen (#12903 ) This PR generalizes the sym MVCGen's match splitting from `ite`-only to `ite`, `dite`, and arbitrary matchers. Previously, only `ite` was supported; `dite` and match expressions were rejected with an error. `mkBackwardRuleForSplit` uses `SplitInfo.splitWith` to build the splitting proof. Hypothesis types are discovered via `rwIfOrMatcher` inside the splitter telescope, and `TransformAltFVars.all` provides the proper fvars for `mkForallFVars`. Subgoal type metavariables use `mkFreshExprSyntheticOpaqueMVar` so that `rwIfOrMatcher`'s internal `assumption` tactic cannot assign them. Adds `DiteSplit`, `MatchSplit`, and `MatchSplitState` test cases and a `vcgen_match_split` benchmark. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 22:39:43 +00:00
Garmelon	6a2a884372	chore: migrate pkg tests (#12889 ) Also refactor util.sh in the process, so test scripts become easier to write (inspired in part by lake's test suite).	2026-03-11 18:55:46 +00:00
Sebastian Graf	b626c6d326	test: apply simp theorems in SymM mvcgen' (#12872 ) This PR adds support for simp/equational spec theorems in the SymM-based `mvcgen'` tactic, catching up with a feature that the original `mvcgen` has supported for a long time. Users can write `@[spec] theorem : get (m := StateT σ m) = fun s => pure (s, s) := rfl` instead of manually specifying equivalent Hoare triples. The equational form is more concise and natural for specs that simply unfold definitions. The universe level normalization (`normalizeLevelsExpr`) applied in `work` and the backward rule constructors is a workaround; ideally this should be integrated into `preprocessMVar`/`preprocessExpr` in the SymM framework so all users benefit. Changes: - Add `SpecTheoremKind` to distinguish triple vs simp specs in `SpecTheoremNew` - Add `mkSpecTheoremNewFromSimpDecl?` to create spec entries from equational lemmas, filtering no-op equations - Add `mkBackwardRuleFromSimpSpec` to build backward rules via `Eq.mpr`/`congrArg`, with instance synthesis, projection reduction, and `unfoldReducible` on the RHS - Migrate simp theorems from `SimpTheorems` database during `migrateSpecTheoremsDatabase` - Normalize universe levels so structural matching in `BackwardRule.apply` succeeds when `max u v` vs `max v u` arise from different code paths - Simplify `mkSpecContext` by removing the mock `simp` context construction - Use `mkBackwardRuleFromExpr` instead of `mkAuxLemma` for triple specs, since the proof may contain free variables from the goal context - Add `AddSubCancelSimp` benchmark case and test exercising the simp spec code path - Change `AddSubCancel` spec proofs from `mvcgen` to `mvcgen'` (dogfooding) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 17:15:04 +00:00
Sebastian Graf	49ed556479	test: add VCGen test suite for sym mvcgen benchmarks (#12855 ) This PR extracts the example programs from the sym mvcgen benchmarks into shared `Cases.*` modules so that both benchmarks and a new fast test suite can reuse them. It also renames `vcgen_deep_add_sub_cancel` to `vcgen_add_sub_cancel_deep` for consistency. The test suite (`test_vcgen.lean`) runs all cases at n=10, completing in ~2s vs minutes for the full benchmarks. It is wired up as a `lake test` driver and integrated with the lean4 test/bench infrastructure via `run_test`/`run_bench` scripts registered in `CMakeLists.txt`. Benchmark output now uses aligned `CaseName(n):` labels. The `run_bench` script extracts per-case vcgen and kernel timings into `measurements.jsonl`. Benchmarks run single-threaded (`LEAN_NUM_THREADS=1`) for reproducibility. `vcgen_get_throw_set` is excluded from benchmarks due to pathological `instantiateMVars` behavior. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 13:32:13 +00:00
Sebastian Graf	e9e46f4199	chore: fix two semantic merge errors in SymM mvcgen (#12845 )	2026-03-09 11:00:01 +00:00
Paul Reichert	68ea28c24f	feat: `Array.mergeSort` (#12385 ) This PR implements a merge sort algorithm on arrays. It has been measured to be about twice as fast as `List.mergeSort` for large arrays with random elements, but for small or almost sorted ones, the list implementation is faster. Compared to `Array.qsort`, it is stable and has O(n log n) worst-case cost. Note: There is still a lot of potential for optimization. The current implementation allocates O(n log n) arrays, one per recursive call. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-06 13:18:13 +00:00
Kim Morrison	66bc9ae177	chore: deprecate levelZero and levelOne (#12720 ) This PR deprecates `levelZero` in favor of `Level.zero` and `levelOne` in favor of the new `Level.one`, and updates all usages throughout the codebase. The `levelZero` alias was previously required for computed field `data` to work, but this is no longer needed. 🤖 Prepared with Claude Code	2026-03-04 01:03:08 +00:00
Sebastian Graf	4cd7a85334	test: speed up Sym mvcgen by doing fewer redundant program matches (#12712 ) This PR changes the spec lookup procedure in Sym-based mvcgen so that 1. Spec candidates are sorted first before being filtered 2. Instead of filtering the whole set of candidates using `spec.pattern.match?`, we take the first match with the highest priority. The second point means we will do a lot fewer matches when the highest priority spec matches immediately. In this case, the one match is still partially redundant with the final application of the backward rule application. It would be great if could somehow specialize the backward rule after it has been created. Still, this yields some welcome speedups. Before and after for each. ``` vcgen_add_sub_cancel: goal_1000: 865 ms, 1 VCs by grind: 228 ms, kernel: 435 ms goal_1000: 540 ms, 1 VCs by grind: 229 ms, kernel: 426 ms vcgen_ping_pong: goal_1000: 458 ms, 0 VCs, kernel: 431 ms goal_1000: 454 ms, 0 VCs, kernel: 443 ms (unchanged, because there is only ever one candidate spec) vcgen_deep_add_sub_cancel: goal_1000: 986 ms, 1 VCs by grind: 234 ms, kernel: 735 ms goal_1000: 728 ms, 1 VCs by grind: 231 ms, kernel: 708 ms vcgen_reader_state: goal_1000: 746 ms, 1 VCs by sorry: 1 ms, kernel: 803 ms goal_1000: 525 ms, 1 VCs by sorry: 1 ms, kernel: 840 ms ```	2026-02-27 03:24:34 +00:00
Sebastian Graf	668f07039c	chore: do not use Sym.inferType in mvcgen if inputs are not shared (#12713 )	2026-02-27 01:15:09 +00:00
Sebastian Graf	f2438a1830	test: support postcondition VCs in Sym VCGen (#12711 ) This PR adds support for generating and discharging postcondition VCs in Sym-based `mvcgen`. It also adds a new benchmark case `vcgen_ping_pong.lean` that tests this functionality. This benchmark required a more diligent approach to maintain maximal sharing in goal preprocessing. Goal preprocessing was subsequently merged into the main VC generation function.	2026-02-26 16:34:15 +00:00
Garmelon	08eb78a5b2	chore: switch to new test/bench suite (#12590 ) This PR sets up the new integrated test/bench suite. It then migrates all benchmarks and some related tests to the new suite. There's also some documentation and some linting. For now, a lot of the old tests are left alone so this PR doesn't become even larger than it already is. Eventually, all tests should be migrated to the new suite though so there isn't a confusing mix of two systems.	2026-02-25 13:51:53 +00:00
Marc Huisinga	168c125cf5	chore: relative lean-toolchains (#12652 ) This PR changes all `lean-toolchain` to use relative toolchain paths instead of `lean4` and `lean4-stage0` identifiers, which removes the need for manually linking toolchains via Elan. After this PR, at least Elan 4.2.0 and 0.0.224 of the Lean VS Code extension will be needed to edit core. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-25 10:23:35 +00:00
Sebastian Graf	bb8d8da1af	test: add benchmark vcgen_reader_state (#12671 ) This PR adds the benchmark vcgen_reader_state that is a variant of vcgen_add_sub_cancel that takes the value to subtract from a `ReaderT` layer. Measurements: ``` goal_100: 201 ms, 1 VCs by sorry: 0 ms, kernel: 52 ms goal_500: 382 ms, 1 VCs by sorry: 0 ms, kernel: 327 ms goal_1000: 674 ms, 1 VCs by sorry: 1 ms, kernel: 741 ms ``` Which suggests it scales linearly. The generated VC triggers superlinear behavior in `grind`, though, hence it is discharged by `sorry`.	2026-02-24 13:19:15 +00:00
Sebastian Graf	8916246be5	test: speed up vcgen_get_throw_set.lean by partially evaluating specs (#12670 ) This PR speeds up the vcgen_get_throw_set benchmark by a factor of 4 by partially evaluating specs.	2026-02-24 13:10:42 +00:00
Wojciech Różowski	65f112a165	chore: rename prime filter benchmark and fix the merge sort benchmark (#12669 ) This PR renames the "Eratosthenes' sieve" benchmark description to "prime filter" in the speedcenter config (following the discussion in https://leanprover.zulipchat.com/#narrow/channel/270676-lean4/topic/sieve.20of.20Eratosthenes.20benchmark/with/575310824), and adds the missing `#eval runBenchmarks` call to the merge sort benchmark so it actually executes.	2026-02-24 10:57:47 +00:00
Wojciech Różowski	722813105d	test: add System F cbv benchmark (#12623 ) This PR adds a System F formalization as a `cbv` tactic benchmark. It is a translation of the Rocq case study from: Definitional Proof Irrelevance Made Accessible by Thiago Felicissimo, Yann Leray, Loïc Pujet, Nicolas Tabareau, Éric Tanter, Théo Winterhalter The authors have given permission to use their development. The benchmark includes: - A full System F formalization (substitution lemmas, confluence of λ-calculus, strong normalization) - A `pow2DoubleEq` benchmark that verifies 2^(n+1) = 2^n + 2^n via normalization in System F, measuring both `cbv` tactic time and kernel checking time for n = 0..6 Co-Authored-By: @david-christiansen Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>	2026-02-20 16:46:07 +00:00
Sebastian Graf	06f36b61b8	test: use Sym.simp to unfold in VCGen benchmarks (#12593 ) This PR improves the Sym VCGen such that we can use Sym.simp to unfold definitions in the benchmark driver. To do so, it adds support for zeta-reduction in the VCGen and ensures that proof terms are maximally shared before being sent to the kernel.	2026-02-19 14:42:54 +00:00
Wojciech Różowski	fad343d9ef	test: add `List.mergeSort` benchmark for `cbv` tactic (#12588 ) This PR adds a benchmark for `cbv` tactic that involves evaluating `List.mergeSort` on a reversed list on natural numbers.	2026-02-19 13:59:42 +00:00
Sebastian Graf	14c973db4e	test: use `Sym.Pattern`s for discrimination tree matching in Sym VCGen (#12579 )	2026-02-19 08:08:53 +00:00
Sebastian Graf	172c5c3ba8	test: flag to use Sym.simp in Sym mvcgen benchmark driver (#12578 )	2026-02-19 07:37:01 +00:00
Sebastian Graf	e639b66d62	chore: rename SpecTheorems.add to SpecTheorems.insert, add SpecProof.getProof (#12574 ) This PR renames `SpecTheorems.add` to `SpecTheorems.insert`	2026-02-19 07:04:27 +00:00

1 2 3 4 5 ...

408 commits