lean4-htt

Author	SHA1	Message	Date
Wojciech Różowski	fad343d9ef	test: add `List.mergeSort` benchmark for `cbv` tactic (#12588 ) This PR adds a benchmark for `cbv` tactic that involves evaluating `List.mergeSort` on a reversed list on natural numbers.	2026-02-19 13:59:42 +00:00
Sebastian Graf	14c973db4e	test: use `Sym.Pattern`s for discrimination tree matching in Sym VCGen (#12579 )	2026-02-19 08:08:53 +00:00
Sebastian Graf	172c5c3ba8	test: flag to use Sym.simp in Sym mvcgen benchmark driver (#12578 )	2026-02-19 07:37:01 +00:00
Sebastian Graf	e639b66d62	chore: rename SpecTheorems.add to SpecTheorems.insert, add SpecProof.getProof (#12574 ) This PR renames `SpecTheorems.add` to `SpecTheorems.insert`	2026-02-19 07:04:27 +00:00
Sebastian Graf	c0b9ff1148	test: measure VC discharging separately in Sym mvcgen benchmarks (#12551 ) This PR refactors the benchmark driver of the Sym mvcgen benchmarks such that time spent for discharging VCs and instantiation of MVars is measured separately from VC generation. Example output: ``` baseline_add_sub_cancel goal_100: 57 ms, 0 VCs, kernel: 22 ms goal_500: 353 ms, 0 VCs, kernel: 160 ms goal_1000: 755 ms, 0 VCs, kernel: 437 ms vcgen_add_sub_cancel goal_100: 36 ms, 1 VCs by grind: 21 ms, kernel: 35 ms goal_500: 149 ms, 1 VCs by grind: 115 ms, kernel: 214 ms goal_1000: 314 ms, 1 VCs by grind: 249 ms, kernel: 478 ms vcgen_deep_add_sub_cancel goal_100: 65 ms, 1 VCs by grind: 23 ms, kernel: 82 ms goal_500: 262 ms, 1 VCs by grind: 123 ms, kernel: 539 ms goal_1000: 611 ms, 1 VCs by grind: 292 ms, kernel: 1075 ms vcgen_get_throw_set goal_100: 87 ms, 101 VCs by sorry: 16 ms, kernel: 93 ms goal_500: 332 ms, 501 VCs by sorry: 289 ms, instantiate > 1000ms: 23363 ms, kernel: 770 ms goal_1000: 794 ms, 1001 VCs by sorry: 1332 ms, instantiate > 1000ms: 334614 ms, kernel: 1882 ms ```	2026-02-18 12:03:47 +00:00
Sebastian Graf	c5c0ddcc56	test: remove `let` handling from Sym mvcgen (#12505 ) This PR removes the unnecessary and potentially broken handling of `let`s by zeta-reduction in Sym-based `mvcgen`. It turns out to be unnecessary for the benchmarks so far, so there is a lack of motivation to publicize `betaRevS` which would be needed to fix it.	2026-02-16 15:58:36 +00:00
Sebastian Graf	f084ce1497	test: share benchmark driver for Sym mvcgen; don't measure unfolding (#12501 ) This PR shares the driver code from the Sym-based mvcgen benchmarks. It also moves the `simp only [loop, step]` call out of the measured section, so that we measure purely the overhead of VC generation. The new benchmark results are as follows. All measurements for n=1000: ``` baseline_add_sub_cancel: 719.318425 ms, kernel: 382.708178 ms vcgen_add_sub_cancel: 306.883079 ms, kernel: 455.050825 ms vcgen_deep_add_sub_cancel: 543.350543 ms, kernel: 896.926298 ms vcgen_get_throw_set: 669.566541 ms, kernel: 60754.202714 ms ``` Note that `vcgen_add_sub_cancel` sped up by 100% because we no longer measure unfolding `loop` and `step`. The baseline didn't speed up as much because it unfolded in the same `Sym.simp` call that also does other rewrites, so there was no `simp` pass that could be eliminated.	2026-02-16 13:17:00 +00:00
Wojciech Różowski	f3b8f76ec4	test: add `cbv` benchmark for evaluating `Decidable.decide` (#12467 ) This PR adds a benchmark for `cbv` tactic for evaluating `Decidable.decide` for a `Decidable` instance for a problem of checking if a number is not a prime power. The test has been inspired by a recent discussion on the [leanprover zulip](https://leanprover.zulipchat.com/#narrow/channel/287929-mathlib4/topic/Can't.20decide.20if.2015.20is.20a.20prime.20power/with/572513330).	2026-02-13 13:25:35 +00:00
Sebastian Graf	204555ba83	test: add Sym vcgen benchmarks get_throw_set and deep_add_sub_cancel (#12447 ) This PR adds two more benchmarks for the Sym-based mvcgen prototype in the style of `add_sub_cancel`. The first is `deep_add_sub_cancel`, which is like `add_sub_cancel` but with a much deeper monad stack: ```lean abbrev M := ExceptT String <\| ReaderT String <\| ExceptT Nat <\| StateT Nat <\| ExceptT Unit <\| StateM Unit ``` By specializing the specs for `get` and `set`, we get competitive performance: ``` goal_100: 180.365086 ms, kernel: 79.634989 ms goal_200: 313.465611 ms, kernel: 187.808631 ms goal_300: 478.278585 ms, kernel: 270.210634 ms goal_400: 638.884320 ms, kernel: 380.381127 ms goal_500: 759.802772 ms, kernel: 472.662882 ms goal_600: 933.575180 ms, kernel: 649.040746 ms goal_700: 1174.367200 ms, kernel: 759.470010 ms goal_800: 1298.866482 ms, kernel: 864.420171 ms goal_900: 1475.315552 ms, kernel: 1008.662783 ms goal_1000: 1627.957444 ms, kernel: 1078.627830 ms ``` Recall that `add_sub_cancel` had `goal_1000: 824.476962 ms, kernel: 477.069045 ms`, but that doesn't need to repeatedly unwrap 3 layers of the monad. The second benchmark is `get_throw_set`. Its kernel is ```lean def step (lim : Nat) : ExceptT String (StateM Nat) Unit := do let s ← get if s > lim then throw "s is too large" set (s + 1) def loop (n : Nat) : ExceptT String (StateM Nat) Unit := do match n with \| 0 => pure () \| n+1 => loop n; step n def Goal (n : Nat) : Prop := ⦃fun s => ⌜s = 0⌝⦄ loop n ⦃⇓_ s => ⌜s = n⌝⦄ ``` It will generate `n+1` VCs. We get `n` VCs of the form ``` s✝ : Nat _ : ¬0 < s✝ ... _ : n < s✝ + 1 ...<n times>... + 1 ⊢ ⌜s✝ = 0⌝ ⊢ₛ ⌜False⌝ (s✝ + ...<n times>...) ``` and one VC of the form ``` ⌜s✝ = 0⌝ ⊢ₛ ⌜s✝ + 1 + <n times> ... + 1 = n⌝ ``` which can be discharged by `grind`, but presently are discharged with `sorry`. Statistics: ``` goal_100: 209.435869 ms, kernel: 128.768919 ms goal_200: 386.639441 ms, kernel: 482.244717 ms goal_300: 559.795137 ms, kernel: 1251.777405 ms goal_400: 753.243978 ms, kernel: 3020.878177 ms goal_500: 1014.939522 ms, kernel: 5182.120327 ms goal_600: 1229.173622 ms, kernel: 9296.551442 ms goal_700: 1410.024180 ms, kernel: 16655.954682 ms goal_800: 1684.059305 ms, kernel: 32065.951705 ms goal_900: 1905.602401 ms, kernel: 55299.942894 ms goal_1000: 2172.823244 ms, kernel: 84082.492485 ms ``` Need to look at kernel times here, but tactic time looks about alright. Using `grind` to discharge just `n=100` goals took 8s.	2026-02-11 19:54:23 +00:00
Wojciech Różowski	9bfd16ef5e	refactor: main loop of the `cbv` tactic (#12417 ) This PR refactors the main loop of the `cbv` tactic. Rather than using multiple simprocs, a central pre simproc is introduced. Moreover, let expressions are no longer immediately zeta-reduced due to performance on one of the benchmarks (`leroy.lean`). Stacked on top of #12416	2026-02-11 11:47:18 +00:00
Sebastian Graf	7a8e011603	test: support ite splitting and lifting through ExceptT to Sym mvcgen (#12392 )	2026-02-09 13:41:35 +00:00
Paul Reichert	4070ee3b8e	fix: sigmaIterator benchmark (#12364 ) This PR fixes breakage in the sigmaIterator benchmark.	2026-02-06 19:45:42 +00:00
Sebastian Graf	fa7f51038b	test: document Sym-based mvcGen and tidy up benchmark project (#12350 )	2026-02-06 14:12:29 +00:00
Sebastian Graf	8b5293382b	test: copy shallow_add_sub_cancel into mvcgen benchmark for precompilation (#12347 )	2026-02-06 13:07:25 +00:00
Wojciech Różowski	d2176cb5fb	test: add tests and benchmarks for `cbv` tactic (#12345 ) This PR adds two benchmarks (sieve of Eratosthenes, removing duplicates from the list) and one test (a function with sublinear complexity defined via well-founded recursion evaluated on large naturals with up to `60` digits). The tests have been suggested by @b-mehta.	2026-02-06 11:41:03 +00:00
Wojciech Różowski	bd8c861316	test: add program evaluation benchmark to `cbv` (#12280 ) This PR adds a benchmark based on Xavier Leroy's compiler verification course to test call-by-value tactic. Stacked on top of #12279	2026-02-04 15:37:00 +00:00
Sebastian Graf	00c1f0d3a9	test: create aux theorems for backward rules in SymM-based mvcgen (#12295 ) This PR improves and simplifies the SymM-based mvcgen prototype by creating `BackwardRule.apply`-ready auxiliary theorems for spec theorems. These auxiliary theorems have types that have reducible definitions unfolded and shared, just like the rest of the SymM world assumes. Furthermore, in order to aid kernel checking times, definitional reductions leave behind expected type hints. With #12290, we get the following numbers: ``` goal_100: 100.671964 ms, kernel: 34.104676 ms goal_200: 152.650808 ms, kernel: 70.653251 ms goal_300: 222.973242 ms, kernel: 105.874266 ms goal_400: 294.032333 ms, kernel: 150.025106 ms goal_500: 366.748098 ms, kernel: 193.483843 ms goal_600: 442.509542 ms, kernel: 236.845115 ms goal_700: 517.527685 ms, kernel: 268.804230 ms goal_800: 601.657910 ms, kernel: 310.765606 ms goal_900: 681.020759 ms, kernel: 357.428032 ms goal_1000: 762.212989 ms, kernel: 403.789517 ms ``` The baseline is `shallow_add_sub_cancel`: ``` goal_100: 62.721757 ms, kernel: 22.109237 ms goal_200: 140.118652 ms, kernel: 45.219512 ms goal_300: 241.077690 ms, kernel: 78.779379 ms goal_400: 363.274462 ms, kernel: 128.951250 ms goal_500: 517.350791 ms, kernel: 155.498217 ms goal_600: 678.291416 ms, kernel: 212.325487 ms goal_700: 881.479043 ms, kernel: 258.690695 ms goal_800: 1092.357375 ms, kernel: 351.996079 ms goal_900: 1247.759480 ms, kernel: 319.197608 ms goal_1000: 1497.203628 ms, kernel: 364.532560 ms ``` The latter is with the main solving loop in interpreter mode, but the kernel checking times are still representative. Earlier experiments suggest that the precompiled baseline performs at roughly 650ms for `goal_1000`, so the new mvcgen is getting close.	2026-02-03 17:43:33 +00:00
Sebastian Graf	7416309805	test: teach SymM mvcgen to recognize specialized theorem applications (#12256 ) This PR recognizes certain kinds of composite proof terms of the form `hpre.trans hspec \|> (wp prog).mono _ _ hpost` and abstracts them into bespoke theorems. This should yield smaller proof terms. Sadly, kernel checking time is unaffected, even regressing a bit. The number of shared terms stays almost the same (+- a constant). Hence I deactivate the code path in this patch. We keep the code, though, because it might be useful in the future, also there are a few other improvements.	2026-01-30 17:00:59 +00:00
Sebastian Graf	59f3abd0bd	test: add SymM `mvcgen` and port the `add_sub_cancel` benchmark (#12251 ) This PR adds a clone of the `mvcgen` tactic based on `SymM` and evaluates it based on a ported `add_sub_cancel` benchmark. Notably, it can reuse all the existing `@[spec]`-annotated theorems to generate VCs. (It doesn't do control-flow splitting, simp rules on the program expression or handling of lets; we'll get there.) It is quite fast already, with the kernel being the bottle-neck: ``` goal_50: 69.524305 ms, kernel: 155.327778 ms goal_100: 93.834221 ms, kernel: 407.370786 ms goal_150: 131.364098 ms, kernel: 762.936720 ms goal_200: 169.577172 ms, kernel: 1181.199093 ms goal_250: 206.421738 ms, kernel: 1707.539380 ms ``` ``` goal_200: 169.458637 ms, kernel: 1186.221085 ms goal_400: 322.819718 ms, kernel: 3791.613854 ms goal_600: 474.929013 ms, kernel: 7763.373757 ms goal_800: 634.379422 ms, kernel: 13107.810430 ms ``` It is best compared to the `solveUsingSym <n> false true` measurements of the SymM `add_sub_cancel` benchmark (`false`: without intermediate eager simplification). For `n=200`, it reports ``` goal_200: 779.482300 ms, kernel: 742.097404 ms ``` suggesting that the generated proof term could be improved for kernel reduction. (TODO.) I'm unsure whether `solveUsingSym` is run in interpreted mode, so take the >400% speedup with a grain of salt. We can definitely conclude that VC generation time is currently not a bottleneck compared to kernel checking time. Plot for discharging goals of sizes 100..800: <img width="1000" height="600" alt="Code_Generated_Image(1)" src="https://github.com/user-attachments/assets/90e76a45-fa46-4d02-912a-c3355e2aa094" /> Plot comparing Kernel and Goal time: <img width="1000" height="600" alt="Code_Generated_Image(2)" src="https://github.com/user-attachments/assets/5849ba0f-1d83-4f2d-98dd-fa65b840bb4e" />	2026-01-30 15:12:54 +00:00
Paul Reichert	16919852d9	refactor: remove last appearances of `allowNontermination` (#12211 ) This PR updates docstrings and function signatures in order to complete the transition from `Iter.Partial` to `Iter.Total` (extrinsically terminating by default). It also deprecates `allowNontermination` and adds `Iter.Total.atIdxSlow?`.	2026-01-29 07:22:19 +00:00
Joachim Breitner	08f43acefb	perf: add introSubstEq shortcut (#12190 ) This PR adds the `introSubstEq` MetaM tactic, as an optimization over `intro h; subst h` that avoids introducing `h : a = b` if it can be avoided, which is the case when `b` can be reverted without reverting anything else. Speeds up the generation of `injEq` theorem.	2026-01-28 12:33:14 +00:00
Leonardo de Moura	50bbf101be	test: delayed assignment performance issue (#12201 ) This PR adds a benchmark that exposes a performance issue at `instantiateMVars` when there are many nested delayed assignments.	2026-01-28 02:08:39 +00:00
Joachim Breitner	36bae38e6b	test: add big_struct_dep1 benchmark (#12191 )	2026-01-27 14:36:09 +00:00
Sebastian Graf	e8870da205	chore: improve performance of mpure_intro and mvcgen by avoiding whnfD (#12165 ) New measurements: ``` goal_10: 181.910200 ms, kernel: 37.241050 ms goal_20: 386.540215 ms, kernel: 83.497428 ms goal_30: 648.282057 ms, kernel: 117.038447 ms goal_40: 946.733191 ms, kernel: 168.369124 ms goal_50: 1325.846873 ms, kernel: 223.838786 ms goal_60: 1734.175705 ms, kernel: 285.594486 ms goal_70: 2199.522317 ms, kernel: 351.659865 ms goal_80: 2700.077802 ms, kernel: 428.303337 ms goal_90: 3260.446641 ms, kernel: 515.976499 ms goal_100: 3865.503733 ms, kernel: 600.229962 ms ``` Previously, goal_100 took 7.8s.	2026-01-26 17:58:33 +00:00
Sebastian Graf	b44c7e161c	test: add two benchmarks for mvcgen in the style of SymM (#12163 ) This PR adds two benchmarks for mvcgen in the style of Leo's SymM benchmarks. While performance on add_sub_cancel_StateM.lean is in the same order of magnitude as the corresponding MetaM benchmark, add_if_sub_StateM.lean is far slower. Measurements for add_sub_cancel: ``` goal_10: 245.576221 ms, kernel: 134.134182 ms goal_20: 613.945320 ms, kernel: 115.453811 ms goal_30: 1074.053596 ms, kernel: 179.076070 ms goal_40: 1680.678302 ms, kernel: 252.066677 ms goal_50: 2457.209584 ms, kernel: 293.974096 ms goal_60: 3271.773330 ms, kernel: 368.394386 ms goal_70: 3981.247921 ms, kernel: 434.297822 ms goal_80: 5077.300540 ms, kernel: 507.047772 ms goal_90: 6486.990060 ms, kernel: 556.952095 ms goal_100: 7791.399986 ms, kernel: 623.605163 ms ``` Measurements for add_if_sub: ``` goal_2: 89.762349 ms, kernel: 43.320205 ms goal_3: 190.655546 ms, kernel: 38.888499 ms goal_4: 434.461936 ms, kernel: 75.234581 ms goal_5: 1110.295284 ms, kernel: 161.698707 ms goal_6: 3241.383031 ms, kernel: 326.137173 ms goal_7: 11675.609970 ms, kernel: 684.907188 ms ``` Much room for improvement.	2026-01-26 13:17:47 +00:00
Leonardo de Moura	45862d5486	feat: improves `simpArrowTelescope` simproc (#12153 ) This PR improves the `simpArrowTelescope` simproc that simplifies non-dependent arrow telescopes: `p₁ → p₂ → ... → q`. The simproc now also applies telescope-specific simplifications: - `False → q` to `True` (when `q : Prop`) - `True → q` to `q` (when `q : Prop`) - `p → True` to `True`	2026-01-25 22:29:38 +00:00
Leonardo de Moura	ba8c2ed4ee	feat: add `simpArrowTelescope` for compact proofs of arrow simplification (#12152 ) This PR adds `simpArrowTelescope`, a simproc that simplifies telescopes of non-dependent arrows (p₁ → p₂ → ... → q) while avoiding quadratic proof growth. When using `Expr.forallE` to represent nested implications, each nesting level bumps de Bruijn indices in subterms, destroying sharing even with hash-consing. For example, a free variable `x` gets different de Bruijn representations at each depth, causing proof terms to grow. `simpArrowTelescope` works by: - Converting arrows to `Arrow p q` (a definitional wrapper) - Simplifying each component - Converting back to `→` form Since `Arrow` arguments are not under binders, subterms remain identical across nesting levels and can be shared. The `simp_4` benchmark demonstrates the improvement: With `forallE`: ~160ms, proof_size ≈ 173k With `Arrow`: ~43ms, proof_size ≈ 16k Tradeoff: `simpArrowTelescope` misses simplifications that depend on the arrow structure (e.g., `p → p` to `True`), since post-methods aren't applied to intermediate arrows. Thus, it is not used by default. to use it, one has to set `simpArrowTelescope` as a `pre`-method.	2026-01-25 20:43:59 +00:00
Leonardo de Moura	e90f6f77db	test: local rewrite with `Sym.simp` (#12147 ) This PR adds a new API for helping users write focused rewrites.	2026-01-25 01:32:50 +00:00
Leonardo de Moura	6de7100f69	feat: add `Goal` API for `SymM` + `grind` (#12143 ) This PR adds an API for building symbolic simulation engines and verification condition generators that leverage `grind`. The API wraps `Sym` operations to work with `grind`'s `Goal` type, enabling lightweight symbolic execution while carrying `grind` state for discharge steps. New operations on `Goal`: - `mkGoal`: create a `Goal` from an `MVarId` - `introN`, `intros`: introduce binders - `apply`: apply backward rules - `simp`, `simpIgnoringNoProgress`: simplify using `Sym.Simp` - `internalize`, `internalizeAll`: add hypotheses to the E-graph - `grind`: attempt to close the goal using `grind` - `assumption`: close by matching a hypothesis A new test demonstrates the API on a stateful program with conditionals, using `grind` to discharge arithmetic side conditions.	2026-01-24 20:30:08 +00:00
Leonardo de Moura	4c1e4a77b4	test: `MetaM` vs `SymM` on `do` notation (#12134 ) This PR adds a new benchmark `shallow_add_sub_cancel.lean` that demonstrates symbolic simulation using a shallow embedding into monadic `do` notation, as opposed to the deep embedding approach in `add_sub_cancel.lean`. The shallow embedding approach: - Uses Lean's `StateM` monad directly instead of a custom command language - Defines `Exec s k post` as a simple predicate: `post (k s).1 (k s).2` - Proves helper theorems for reasoning about monadic operations (`pure`, `bind`, `get`, `set`, `modify`, `ite`) - Programs are written in actual `do`-notation rather than a custom AST The benchmark solves goals using both the `MetaM` and `SymM` frameworks, showing that the shallow embedding integrates well with the symbolic simulation infrastructure. `SymM` is again way faster than `MetaM` ### Symbolic simulation benchmark — tactic time only Problem size `n` corresponds to a program with `4·n` monadic actions. \| n \| MetaM tactic (ms) \| SymM tactic (ms) \| Speedup \| \|-----\|-------------------\|------------------\|---------\| \| 10 \| 82.10 \| 11.37 \| ~7.2× \| \| 20 \| 176.21 \| 17.71 \| ~9.9× \| \| 30 \| 306.47 \| 25.39 \| ~12.1× \| \| 40 \| 509.52 \| 34.53 \| ~14.7× \| \| 50 \| 689.19 \| 43.51 \| ~15.8× \| \| 60 \| 905.86 \| 52.47 \| ~17.3× \| \| 70 \| 1172.31 \| 62.50 \| ~18.8× \| \| 80 \| 1448.48 \| 70.65 \| ~20.5× \| \| 90 \| 1787.15 \| 80.89 \| ~22.1× \| \| 100 \| 2128.12 \| 90.77 \| ~23.5× \| <img width="580" height="455" alt="image" src="https://github.com/user-attachments/assets/3511aaab-4d53-4520-8302-65d2d100df4a" />	2026-01-24 03:38:02 +00:00
Leonardo de Moura	c81a8897a9	feat: improve `Sym.simp` APIs and new benchmark data (#12101 ) This PR improves the the `Sym.simp` APIs. It is now easier to reuse the simplifier cache between different simplification steps. We use the APIs to improve the benchmark at #12100. ### Symbolic simulation with simplifier cache reuse (SymM) Problem size `n` corresponds to a program with `2·n + 2` instructions. \| n \| Tactic time (ms) \| Kernel time (ms) \| \|-----\|------------------\|------------------\| \| 10 \| 4.53 \| 4.29 \| \| 20 \| 5.56 \| 6.91 \| \| 30 \| 6.46 \| 8.67 \| \| 40 \| 8.07 \| 11.20 \| \| 50 \| 9.37 \| 13.63 \| \| 60 \| 11.89 \| 15.43 \| \| 70 \| 12.43 \| 18.28 \| \| 80 \| 14.07 \| 20.72 \| \| 90 \| 15.62 \| 23.41 \| \| 100 \| 17.39 \| 24.80 \| \| 200 \| 30.35 \| 48.39 \| \| 300 \| 45.41 \| 72.84 \| \| 400 \| 59.17 \| 97.67 \| \| 500 \| 79.63 \| 138.99 \| \| 600 \| 100.05 \| 173.67 \| \| 700 \| 119.77 \| 208.80 \| <img width="571" height="455" alt="image" src="https://github.com/user-attachments/assets/70da7ea2-b5d2-405e-985c-bfa358455afc" />	2026-01-22 03:37:16 +00:00
Leonardo de Moura	fa40491c78	test: benchmark `MetaM` vs `SymM` (#12100 ) This PR adds a comparison between `MetaM` and `SymM` for a benchmark was proposed during the Lean@Google Hackathon. ### Benchmark description In this benchmark, we define the semantics of a very simple imperative language using an inductive predicate ``` Exec prog events mem lctx post ``` The predicate holds if, when executing the program `prog` with an initial list of events `events`, memory `mem`, and local context `lctx`, the postcondition `post` holds. We then consider the following program: ``` input b a := b a := a + a a := a - b ... a := a + a a := a - b ``` That is, after reading an input value `b`, the program repeatedly updates the variable `a` by doubling it and then subtracting `b`. We prove that, for any initial memory `m` and local context `l`, and starting from the empty list of events, the following postcondition holds: ``` fun t' m' l' => m' = m ∧ -- memory did not change ∃ v : Word, t' = [IOEvent.IN v] ∧ -- exactly one input event l'.get "a" = some v -- `a` contains the input value ``` In other words, executing the program produces exactly one input event, leaves the memory unchanged, and ensures that the final value of `a` is equal to the input value. ### Symbolic simulation benchmark (problem size `n`, with `2·n + 2` instructions) \| Problem size (n) \| MetaM time (ms) \| MetaM kernel (ms) \| SymM time (ms) \| SymM kernel (ms) \| Total speedup \| \|------------------\|------------------\|-------------------\|----------------\|------------------\|---------------\| \| 10 \| 94.83 \| 6.60 \| 7.04 \| 6.18 \| ~13.5× \| \| 20 \| 218.92 \| 13.33 \| 14.15 \| 13.02 \| ~15.5× \| \| 30 \| 375.10 \| 22.95 \| 26.51 \| 19.81 \| ~14.2× \| \| 40 \| 563.82 \| 34.99 \| 40.42 \| 29.55 \| ~14.0× \| \| 50 \| 815.89 \| 53.78 \| 60.84 \| 42.25 \| ~13.4× \| \| 60 \| 1081.09 \| 73.46 \| 80.99 \| 53.52 \| ~13.3× \| \| 70 \| 1400.80 \| 102.70 \| 106.02 \| 68.61 \| ~13.2× \| \| 80 \| 1772.19 \| 126.65 \| 134.23 \| 87.64 \| ~13.2× \| \| 90 \| 2203.41 \| 161.68 \| 168.26 \| 115.52 \| ~13.1× \| \| 100 \| 2474.09 \| 191.23 \| 209.13 \| 143.86 \| ~11.8× \| <img width="580" height="455" alt="image" src="https://github.com/user-attachments/assets/bc7058fa-e71a-4c2c-be28-860f39166965" /> ### Symbolic simulation with extra simplification (SymM) Problem size `n` corresponds to a program with `2·n + 2` instructions. \| n \| Total time (ms) \| Kernel time (ms) \| Non-kernel time (ms) \| \|-----\|------------------\|------------------\|----------------------\| \| 10 \| 6.33 \| 3.97 \| 2.36 \| \| 20 \| 10.30 \| 5.59 \| 4.71 \| \| 30 \| 13.72 \| 7.38 \| 6.34 \| \| 40 \| 17.85 \| 8.84 \| 9.01 \| \| 50 \| 21.90 \| 10.63 \| 11.27 \| \| 60 \| 27.00 \| 12.56 \| 14.44 \| \| 70 \| 32.02 \| 14.04 \| 17.98 \| \| 80 \| 37.25 \| 15.76 \| 21.49 \| \| 90 \| 42.55 \| 17.95 \| 24.60 \| \| 100 \| 49.30 \| 20.03 \| 29.27 \| \| 200 \| 125.56 \| 38.21 \| 87.36 \| \| 300 \| 293.58 \| 66.79 \| 226.79 \| \| 400 \| 361.87 \| 78.96 \| 282.91 \| \| 500 \| 518.51 \| 102.51 \| 416.00 \| \| 600 \| 716.63 \| 122.81 \| 593.82 \|	2026-01-22 01:38:56 +00:00
Leonardo de Moura	af438425d5	perf: avoid `mkAppM` in `Sym.simp` (#12099 ) This PR ensures `Sym.simpGoal` does not use `mkAppM`. It also increases the default number of maximum steps in `Sym.simp`.	2026-01-22 00:01:43 +00:00
Leonardo de Moura	f84aa23d6d	feat: metavar cleanup in `Sym.simp` (#12096 ) This PR cleanups temporary metavariables generated when applying rewriting rules in `Sym.simp`.	2026-01-21 21:36:17 +00:00
Leonardo de Moura	34d8eeb3be	chore: fix and rename `sym_add_sub_cancel` benchmark (#12092 )	2026-01-21 17:47:40 +00:00
Leonardo de Moura	08e6f714ca	chore: normalize `Sym` APIs (#12088 ) This PR cleanups the Sym APIs for `apply` and `simp`.	2026-01-21 17:02:22 +00:00
Joachim Breitner	ad43266357	test: add a big dependent struct test (#12061 ) This PR adds a test for a big dependent structure, exhibiting some bad performance in `injEq` generation.	2026-01-20 12:00:25 +00:00
Leonardo de Moura	e9a1c9ef63	feat: offset terms in `Sym` (#12053 ) This PR adds support for offset terms in `SymM`. This is essential for handling equational theorems for functions that pattern match on natural numbers in `Sym.simp`. Without this, it cannot handle simple examples such as ```lean def pw (n : Nat) : Nat := match n with \| 0 => 1 \| n+1 => 2 * pw n example : pw 4 = 16 := by sym_simp [pw.eq_1, pw.eq_2] example : pw (a + 2) = 2 * (2 * pw a) := by sym_simp [pw.eq_2] ```	2026-01-20 04:57:52 +00:00
Leonardo de Moura	df8ff255cb	test: benchmark from Lean Hackathon (#12051 )	2026-01-20 01:32:41 +00:00
Joachim Breitner	c2918b2701	test: add benchmark for #11992 (#11997 )	2026-01-13 21:15:32 +00:00
Leonardo de Moura	58e599f2f9	perf: optimize congruence proof construction in `Sym.simp` (#11974 ) This PR optimizes congruence proof construction in `Sym.simp` by avoiding `inferType` calls on expressions that are less likely to be cached. Instead of inferring types of expressions like `@HAdd.hAdd Nat Nat Nat instAdd 5`, we infer the type of the function prefix `@HAdd.hAdd Nat Nat Nat instAdd` and traverse the forall telescope. The key insight is that function prefixes are more likely shared across many call sites (e.g., all `Nat` additions use the same `@HAdd.hAdd Nat Nat Nat instAdd`), so they benefit from `inferType` caching. Benchmark results show improvements on workloads with shared function prefixes: - `many_rewrites_5000`: 48.8ms → 43.1ms (-12%) - `term_tree_5000`: 53.4ms → 30.5ms (-43%)	2026-01-11 23:00:19 +00:00
Leonardo de Moura	d7cbdebf0b	chore: cleanup `simp` benchmark (#11971 )	2026-01-11 19:55:39 +00:00
Leonardo de Moura	d57f71c1c0	perf: optimize kernel type-checking for `have`-telescope simplification in `Sym.simp` (#11967 ) This PR implements a new strategy for simplifying `have`-telescopes in `Sym.simp` that achieves linear kernel type-checking time instead of quadratic. ## Problem When simplifying deep `have`-telescopes, the previous approach using `have_congr'` produced proofs that type-checked in quadratic time. The simplifier itself was fast, but the kernel became the bottleneck for large telescopes. For example, at n=100: - Before: simp = 2.4ms, kernel = 225ms - After: simp = 3.5ms, kernel = 10ms The quadratic behavior occurred because the kernel creates fresh free variables for each binder when type-checking, destroying sharing and producing O(n²) intermediate terms. ## Solution We transform sequential `have`-telescopes into a parallel beta-application form: ``` have x₁ := v₁; have x₂ := v₂[x₁]; b[x₁, x₂] ↓ (definitionally equal) (fun x₁ x₂' => b[x₁, x₂' x₁]) v₁ (fun x₁ => v₂[x₁]) ``` This parallel form leverages the efficient simplifier for lambdas in `Sym.simp`. This form enables: 1. Independent simplification of each argument 2. Proof construction using standard congruence lemmas 3. Linear kernel type-checking time The algorithm has three phases: 1. `toBetaApp`: Transform telescope → parallel beta-application 2. `simpBetaApp`: Simplify using `congr`/`congrArg`/`congrFun'` and `simpLambda` 3. `toHave`: Convert back to `have` form ## Benchmark Results ### Benchmark 1: Chain with all variables used in body \| n \| Before (simp) \| Before (kernel) \| After (simp) \| After (kernel) \| \|---\|---------------\|-----------------\|--------------\|----------------\| \| 50 \| 1.2ms \| 32ms \| 1.6ms \| 4.4ms \| \| 100 \| 2.4ms \| 225ms \| 3.5ms \| 10ms \| \| 200 \| 4.5ms \| — \| 8.4ms \| 27ms \| \| 500 \| 11.7ms \| — \| 33.6ms \| 128ms \| ### Benchmark 3: Parallel declarations (simplified values) \| n \| Before (simp) \| Before (kernel) \| After (simp) \| After (kernel) \| \|---\|---------------\|-----------------\|--------------\|----------------\| \| 50 \| 0.5ms \| 24ms \| 0.8ms \| 1.8ms \| \| 100 \| 1.2ms \| 169ms \| 1.8ms \| 5.3ms \| \| 200 \| 2.2ms \| — \| 3.9ms \| 17ms \| \| 500 \| 5.9ms \| — \| 12.3ms \| 93ms \| ### Benchmark 5: Chain with single dependency \| n \| Before (simp) \| Before (kernel) \| After (simp) \| After (kernel) \| \|---\|---------------\|-----------------\|--------------\|----------------\| \| 100 \| 1.6ms \| 6.2ms \| 1.8ms \| 6.2ms \| \| 200 \| 2.8ms \| 21.6ms \| 4.4ms \| 16.5ms \| \| 500 \| 7.3ms \| 125ms \| 12.8ms \| 72ms \| Key observations: - Kernel time is now linear in telescope depth (previously quadratic) - Simp time increases slightly due to the transformation overhead - Total time (simp + kernel) is dramatically reduced for large telescopes - The improvement is most pronounced when the body depends on many variables ## Trade-offs - Proof sizes are larger (more congruence lemma applications) - Simp time has ~1.5x overhead from the transformation - For very small telescopes (n < 10), the overhead may not pay off The optimization targets the critical path: kernel type-checking was the bottleneck preventing scaling to realistic symbolic simulation workloads.	2026-01-11 02:20:47 +00:00
Sebastian Ullrich	eaf8cf15ff	test: add `leanchecker` benchmark (#11959 )	2026-01-10 20:52:11 +00:00
Leonardo de Moura	cae739c27c	test: `implies` vs `Arrow` `Sym.simp` benchmark (#11966 )	2026-01-10 18:51:54 +00:00
Leonardo de Moura	d92cdae8e9	feat: `simpForall` and `simpArrow` in `Sym.simp` (#11950 ) This PR implements `simpForall` and `simpArrow` in `Sym.simp`.	2026-01-09 06:20:04 +00:00
Leonardo de Moura	0e4794a1a9	test: benchmarks for `lambda`-telescopes (#11929 )	2026-01-08 00:20:03 +00:00
Leonardo de Moura	8484dbad5d	test: benchmarks for `have`-telescopes (#11927 )	2026-01-07 23:24:46 +00:00
Leonardo de Moura	ff87bcb8e5	feat: add option for simplifying `have` decls in two passes (#11923 ) This PR adds a new option to the function `simpHaveTelescope` in which the `have` telescope is simplified in two passes: * In the first pass, only the values and the body are simplified. * In the second pass, unused declarations are eliminated. This new mode eliminates superlinear behavior in the benchmark `simp_3.lean`. Note that the kernel type checker still exhibits quadratic behavior in this example, because it does not have support for expanding a `have`/`let` telescope in a single step.	2026-01-07 01:58:36 +00:00
Leonardo de Moura	8154453bb5	feat: simplify `have` blocks in `Sym.simp` (#11920 ) This PR implements support for simplifying `have` telescopes in `Sym.simp`.	2026-01-07 00:10:47 +00:00

1 2 3 4 5 ...

362 commits