lean4-htt

Author	SHA1	Message	Date
Leonardo de Moura	45862d5486	feat: improves `simpArrowTelescope` simproc (#12153 ) This PR improves the `simpArrowTelescope` simproc that simplifies non-dependent arrow telescopes: `p₁ → p₂ → ... → q`. The simproc now also applies telescope-specific simplifications: - `False → q` to `True` (when `q : Prop`) - `True → q` to `q` (when `q : Prop`) - `p → True` to `True`	2026-01-25 22:29:38 +00:00
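A minimal Lean sketch of the logical content behind the three telescope rules listed in #12153. It does not invoke the simproc itself; the term-mode proofs are only an illustration that each rewrite is a valid propositional equality.

```lean
-- `False → q` rewrites to `True`: the implication holds vacuously.
example (q : Prop) : (False → q) = True :=
  propext ⟨fun _ => trivial, fun _ h => h.elim⟩

-- `True → q` rewrites to `q`: the hypothesis carries no information.
example (q : Prop) : (True → q) = q :=
  propext ⟨fun h => h trivial, fun hq _ => hq⟩

-- `p → True` rewrites to `True`: the conclusion is always provable.
example (p : Prop) : (p → True) = True :=
  propext ⟨fun _ => trivial, fun _ _ => trivial⟩
```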
Leonardo de Moura	ba8c2ed4ee	feat: add `simpArrowTelescope` for compact proofs of arrow simplification (#12152 ) This PR adds `simpArrowTelescope`, a simproc that simplifies telescopes of non-dependent arrows (p₁ → p₂ → ... → q) while avoiding quadratic proof growth. When using `Expr.forallE` to represent nested implications, each nesting level bumps de Bruijn indices in subterms, destroying sharing even with hash-consing. For example, a free variable `x` gets different de Bruijn representations at each depth, causing proof terms to grow. `simpArrowTelescope` works by: - Converting arrows to `Arrow p q` (a definitional wrapper) - Simplifying each component - Converting back to `→` form Since `Arrow` arguments are not under binders, subterms remain identical across nesting levels and can be shared. The `simp_4` benchmark demonstrates the improvement: With `forallE`: ~160ms, proof_size ≈ 173k With `Arrow`: ~43ms, proof_size ≈ 16k Tradeoff: `simpArrowTelescope` misses simplifications that depend on the arrow structure (e.g., `p → p` to `True`), since post-methods aren't applied to intermediate arrows. Thus, it is not used by default. To use it, one has to set `simpArrowTelescope` as a `pre`-method.	2026-01-25 20:43:59 +00:00
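The "definitional wrapper" idea from #12152 can be sketched in Lean as follows. `ArrowSketch` is a hypothetical stand-in introduced here for illustration; the actual `Arrow` definition (its name, namespace, and universe levels) may differ.

```lean
/-- Hypothetical stand-in for the `Arrow` wrapper; not the definition used by the PR. -/
def ArrowSketch (p q : Prop) : Prop := p → q

-- The wrapper unfolds to the plain implication, so converting back and forth is free.
example (p q : Prop) : ArrowSketch p q = (p → q) := rfl

-- A nested telescope `p₁ → p₂ → q` in wrapped form. Both arguments of the wrapper
-- are ordinary application arguments, so no subterm sits under a binder.
example (p₁ p₂ q : Prop) :
    ArrowSketch p₁ (ArrowSketch p₂ q) = (p₁ → p₂ → q) := rfl
```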
Leonardo de Moura	e90f6f77db	test: local rewrite with `Sym.simp` (#12147 ) This PR adds a new API for helping users write focused rewrites.	2026-01-25 01:32:50 +00:00
Leonardo de Moura	6de7100f69	feat: add `Goal` API for `SymM` + `grind` (#12143 ) This PR adds an API for building symbolic simulation engines and verification condition generators that leverage `grind`. The API wraps `Sym` operations to work with `grind`'s `Goal` type, enabling lightweight symbolic execution while carrying `grind` state for discharge steps. New operations on `Goal`: - `mkGoal`: create a `Goal` from an `MVarId` - `introN`, `intros`: introduce binders - `apply`: apply backward rules - `simp`, `simpIgnoringNoProgress`: simplify using `Sym.Simp` - `internalize`, `internalizeAll`: add hypotheses to the E-graph - `grind`: attempt to close the goal using `grind` - `assumption`: close by matching a hypothesis A new test demonstrates the API on a stateful program with conditionals, using `grind` to discharge arithmetic side conditions.	2026-01-24 20:30:08 +00:00
Leonardo de Moura	4c1e4a77b4	test: `MetaM` vs `SymM` on `do` notation (#12134 ) This PR adds a new benchmark `shallow_add_sub_cancel.lean` that demonstrates symbolic simulation using a shallow embedding into monadic `do` notation, as opposed to the deep embedding approach in `add_sub_cancel.lean`. The shallow embedding approach: - Uses Lean's `StateM` monad directly instead of a custom command language - Defines `Exec s k post` as a simple predicate: `post (k s).1 (k s).2` - Proves helper theorems for reasoning about monadic operations (`pure`, `bind`, `get`, `set`, `modify`, `ite`) - Programs are written in actual `do`-notation rather than a custom AST The benchmark solves goals using both the `MetaM` and `SymM` frameworks, showing that the shallow embedding integrates well with the symbolic simulation infrastructure. `SymM` is again much faster than `MetaM`. ### Symbolic simulation benchmark — tactic time only Problem size `n` corresponds to a program with `4·n` monadic actions. \| n \| MetaM tactic (ms) \| SymM tactic (ms) \| Speedup \| \|-----\|-------------------\|------------------\|---------\| \| 10 \| 82.10 \| 11.37 \| ~7.2× \| \| 20 \| 176.21 \| 17.71 \| ~9.9× \| \| 30 \| 306.47 \| 25.39 \| ~12.1× \| \| 40 \| 509.52 \| 34.53 \| ~14.7× \| \| 50 \| 689.19 \| 43.51 \| ~15.8× \| \| 60 \| 905.86 \| 52.47 \| ~17.3× \| \| 70 \| 1172.31 \| 62.50 \| ~18.8× \| \| 80 \| 1448.48 \| 70.65 \| ~20.5× \| \| 90 \| 1787.15 \| 80.89 \| ~22.1× \| \| 100 \| 2128.12 \| 90.77 \| ~23.5× \| <img width="580" height="455" alt="image" src="https://github.com/user-attachments/assets/3511aaab-4d53-4520-8302-65d2d100df4a" />	2026-01-24 03:38:02 +00:00
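The `Exec` predicate described in #12134 could be written roughly as below. Only the body `post (k s).1 (k s).2` comes from the commit message; the implicit arguments, the argument order of `post`, and the universe choices are assumptions.

```lean
/-- Sketch of the shallow-embedding predicate: run the `StateM` action `k` from
state `s` and require `post` to hold of the returned value and the final state. -/
def Exec {σ α : Type} (s : σ) (k : StateM σ α) (post : α → σ → Prop) : Prop :=
  post (k s).1 (k s).2
```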
Leonardo de Moura	c81a8897a9	feat: improve `Sym.simp` APIs and new benchmark data (#12101 ) This PR improves the `Sym.simp` APIs. It is now easier to reuse the simplifier cache between different simplification steps. We use the APIs to improve the benchmark at #12100. ### Symbolic simulation with simplifier cache reuse (SymM) Problem size `n` corresponds to a program with `2·n + 2` instructions. \| n \| Tactic time (ms) \| Kernel time (ms) \| \|-----\|------------------\|------------------\| \| 10 \| 4.53 \| 4.29 \| \| 20 \| 5.56 \| 6.91 \| \| 30 \| 6.46 \| 8.67 \| \| 40 \| 8.07 \| 11.20 \| \| 50 \| 9.37 \| 13.63 \| \| 60 \| 11.89 \| 15.43 \| \| 70 \| 12.43 \| 18.28 \| \| 80 \| 14.07 \| 20.72 \| \| 90 \| 15.62 \| 23.41 \| \| 100 \| 17.39 \| 24.80 \| \| 200 \| 30.35 \| 48.39 \| \| 300 \| 45.41 \| 72.84 \| \| 400 \| 59.17 \| 97.67 \| \| 500 \| 79.63 \| 138.99 \| \| 600 \| 100.05 \| 173.67 \| \| 700 \| 119.77 \| 208.80 \| <img width="571" height="455" alt="image" src="https://github.com/user-attachments/assets/70da7ea2-b5d2-405e-985c-bfa358455afc" />	2026-01-22 03:37:16 +00:00
Leonardo de Moura	fa40491c78	test: benchmark `MetaM` vs `SymM` (#12100 ) This PR adds a comparison between `MetaM` and `SymM` for a benchmark that was proposed during the Lean@Google Hackathon. ### Benchmark description In this benchmark, we define the semantics of a very simple imperative language using an inductive predicate ``` Exec prog events mem lctx post ``` The predicate holds if, when executing the program `prog` with an initial list of events `events`, memory `mem`, and local context `lctx`, the postcondition `post` holds. We then consider the following program: ``` input b a := b a := a + a a := a - b ... a := a + a a := a - b ``` That is, after reading an input value `b`, the program repeatedly updates the variable `a` by doubling it and then subtracting `b`. We prove that, for any initial memory `m` and local context `l`, and starting from the empty list of events, the following postcondition holds: ``` fun t' m' l' => m' = m ∧ -- memory did not change ∃ v : Word, t' = [IOEvent.IN v] ∧ -- exactly one input event l'.get "a" = some v -- `a` contains the input value ``` In other words, executing the program produces exactly one input event, leaves the memory unchanged, and ensures that the final value of `a` is equal to the input value. 
### Symbolic simulation benchmark (problem size `n`, with `2·n + 2` instructions) \| Problem size (n) \| MetaM time (ms) \| MetaM kernel (ms) \| SymM time (ms) \| SymM kernel (ms) \| Total speedup \| \|------------------\|------------------\|-------------------\|----------------\|------------------\|---------------\| \| 10 \| 94.83 \| 6.60 \| 7.04 \| 6.18 \| ~13.5× \| \| 20 \| 218.92 \| 13.33 \| 14.15 \| 13.02 \| ~15.5× \| \| 30 \| 375.10 \| 22.95 \| 26.51 \| 19.81 \| ~14.2× \| \| 40 \| 563.82 \| 34.99 \| 40.42 \| 29.55 \| ~14.0× \| \| 50 \| 815.89 \| 53.78 \| 60.84 \| 42.25 \| ~13.4× \| \| 60 \| 1081.09 \| 73.46 \| 80.99 \| 53.52 \| ~13.3× \| \| 70 \| 1400.80 \| 102.70 \| 106.02 \| 68.61 \| ~13.2× \| \| 80 \| 1772.19 \| 126.65 \| 134.23 \| 87.64 \| ~13.2× \| \| 90 \| 2203.41 \| 161.68 \| 168.26 \| 115.52 \| ~13.1× \| \| 100 \| 2474.09 \| 191.23 \| 209.13 \| 143.86 \| ~11.8× \| <img width="580" height="455" alt="image" src="https://github.com/user-attachments/assets/bc7058fa-e71a-4c2c-be28-860f39166965" /> ### Symbolic simulation with extra simplification (SymM) Problem size `n` corresponds to a program with `2·n + 2` instructions. \| n \| Total time (ms) \| Kernel time (ms) \| Non-kernel time (ms) \| \|-----\|------------------\|------------------\|----------------------\| \| 10 \| 6.33 \| 3.97 \| 2.36 \| \| 20 \| 10.30 \| 5.59 \| 4.71 \| \| 30 \| 13.72 \| 7.38 \| 6.34 \| \| 40 \| 17.85 \| 8.84 \| 9.01 \| \| 50 \| 21.90 \| 10.63 \| 11.27 \| \| 60 \| 27.00 \| 12.56 \| 14.44 \| \| 70 \| 32.02 \| 14.04 \| 17.98 \| \| 80 \| 37.25 \| 15.76 \| 21.49 \| \| 90 \| 42.55 \| 17.95 \| 24.60 \| \| 100 \| 49.30 \| 20.03 \| 29.27 \| \| 200 \| 125.56 \| 38.21 \| 87.36 \| \| 300 \| 293.58 \| 66.79 \| 226.79 \| \| 400 \| 361.87 \| 78.96 \| 282.91 \| \| 500 \| 518.51 \| 102.51 \| 416.00 \| \| 600 \| 716.63 \| 122.81 \| 593.82 \|	2026-01-22 01:38:56 +00:00
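The postcondition in #12100 rests on a small arithmetic invariant: once `a = b`, doubling `a` and then subtracting `b` gives back `a`, so every loop iteration leaves `a` equal to the input. A sketch over `Nat` follows; the benchmark itself works with a `Word` type, so this is an analogy rather than the lemma used there.

```lean
-- One iteration of `a := a + a; a := a - b` preserves `a` when `a = b`.
example (a b : Nat) (h : a = b) : a + a - b = a := by
  omega
```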
Leonardo de Moura	af438425d5	perf: avoid `mkAppM` in `Sym.simp` (#12099 ) This PR ensures `Sym.simpGoal` does not use `mkAppM`. It also increases the default maximum number of steps in `Sym.simp`.	2026-01-22 00:01:43 +00:00
Leonardo de Moura	f84aa23d6d	feat: metavar cleanup in `Sym.simp` (#12096 ) This PR cleans up temporary metavariables generated when applying rewriting rules in `Sym.simp`.	2026-01-21 21:36:17 +00:00
Leonardo de Moura	34d8eeb3be	chore: fix and rename `sym_add_sub_cancel` benchmark (#12092 )	2026-01-21 17:47:40 +00:00
Leonardo de Moura	08e6f714ca	chore: normalize `Sym` APIs (#12088 ) This PR cleans up the `Sym` APIs for `apply` and `simp`.	2026-01-21 17:02:22 +00:00
Leonardo de Moura	e9a1c9ef63	feat: offset terms in `Sym` (#12053 ) This PR adds support for offset terms in `SymM`. This is essential for handling equational theorems for functions that pattern match on natural numbers in `Sym.simp`. Without this, it cannot handle simple examples such as ```lean def pw (n : Nat) : Nat := match n with \| 0 => 1 \| n+1 => 2 * pw n example : pw 4 = 16 := by sym_simp [pw.eq_1, pw.eq_2] example : pw (a + 2) = 2 * (2 * pw a) := by sym_simp [pw.eq_2] ```	2026-01-20 04:57:52 +00:00
Leonardo de Moura	df8ff255cb	test: benchmark from Lean Hackathon (#12051 )	2026-01-20 01:32:41 +00:00
Leonardo de Moura	58e599f2f9	perf: optimize congruence proof construction in `Sym.simp` (#11974 ) This PR optimizes congruence proof construction in `Sym.simp` by avoiding `inferType` calls on expressions that are less likely to be cached. Instead of inferring types of expressions like `@HAdd.hAdd Nat Nat Nat instAdd 5`, we infer the type of the function prefix `@HAdd.hAdd Nat Nat Nat instAdd` and traverse the forall telescope. The key insight is that function prefixes are more likely shared across many call sites (e.g., all `Nat` additions use the same `@HAdd.hAdd Nat Nat Nat instAdd`), so they benefit from `inferType` caching. Benchmark results show improvements on workloads with shared function prefixes: - `many_rewrites_5000`: 48.8ms → 43.1ms (-12%) - `term_tree_5000`: 53.4ms → 30.5ms (-43%)	2026-01-11 23:00:19 +00:00
Leonardo de Moura	d7cbdebf0b	chore: cleanup `simp` benchmark (#11971 )	2026-01-11 19:55:39 +00:00
Leonardo de Moura	d57f71c1c0	perf: optimize kernel type-checking for `have`-telescope simplification in `Sym.simp` (#11967 ) This PR implements a new strategy for simplifying `have`-telescopes in `Sym.simp` that achieves linear kernel type-checking time instead of quadratic. ## Problem When simplifying deep `have`-telescopes, the previous approach using `have_congr'` produced proofs that type-checked in quadratic time. The simplifier itself was fast, but the kernel became the bottleneck for large telescopes. For example, at n=100: - Before: simp = 2.4ms, kernel = 225ms - After: simp = 3.5ms, kernel = 10ms The quadratic behavior occurred because the kernel creates fresh free variables for each binder when type-checking, destroying sharing and producing O(n²) intermediate terms. ## Solution We transform sequential `have`-telescopes into a parallel beta-application form: ``` have x₁ := v₁; have x₂ := v₂[x₁]; b[x₁, x₂] ↓ (definitionally equal) (fun x₁ x₂' => b[x₁, x₂' x₁]) v₁ (fun x₁ => v₂[x₁]) ``` This parallel form leverages the efficient simplifier for lambdas in `Sym.simp`. This form enables: 1. Independent simplification of each argument 2. Proof construction using standard congruence lemmas 3. Linear kernel type-checking time The algorithm has three phases: 1. `toBetaApp`: Transform telescope → parallel beta-application 2. `simpBetaApp`: Simplify using `congr`/`congrArg`/`congrFun'` and `simpLambda` 3. 
`toHave`: Convert back to `have` form ## Benchmark Results ### Benchmark 1: Chain with all variables used in body \| n \| Before (simp) \| Before (kernel) \| After (simp) \| After (kernel) \| \|---\|---------------\|-----------------\|--------------\|----------------\| \| 50 \| 1.2ms \| 32ms \| 1.6ms \| 4.4ms \| \| 100 \| 2.4ms \| 225ms \| 3.5ms \| 10ms \| \| 200 \| 4.5ms \| — \| 8.4ms \| 27ms \| \| 500 \| 11.7ms \| — \| 33.6ms \| 128ms \| ### Benchmark 3: Parallel declarations (simplified values) \| n \| Before (simp) \| Before (kernel) \| After (simp) \| After (kernel) \| \|---\|---------------\|-----------------\|--------------\|----------------\| \| 50 \| 0.5ms \| 24ms \| 0.8ms \| 1.8ms \| \| 100 \| 1.2ms \| 169ms \| 1.8ms \| 5.3ms \| \| 200 \| 2.2ms \| — \| 3.9ms \| 17ms \| \| 500 \| 5.9ms \| — \| 12.3ms \| 93ms \| ### Benchmark 5: Chain with single dependency \| n \| Before (simp) \| Before (kernel) \| After (simp) \| After (kernel) \| \|---\|---------------\|-----------------\|--------------\|----------------\| \| 100 \| 1.6ms \| 6.2ms \| 1.8ms \| 6.2ms \| \| 200 \| 2.8ms \| 21.6ms \| 4.4ms \| 16.5ms \| \| 500 \| 7.3ms \| 125ms \| 12.8ms \| 72ms \| Key observations: - Kernel time is now linear in telescope depth (previously quadratic) - Simp time increases slightly due to the transformation overhead - Total time (simp + kernel) is dramatically reduced for large telescopes - The improvement is most pronounced when the body depends on many variables ## Trade-offs - Proof sizes are larger (more congruence lemma applications) - Simp time has ~1.5x overhead from the transformation - For very small telescopes (n < 10), the overhead may not pay off The optimization targets the critical path: kernel type-checking was the bottleneck preventing scaling to realistic symbolic simulation workloads.	2026-01-11 02:20:47 +00:00
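The claim in #11967 that the sequential `have`-telescope and the parallel beta-application form are definitionally equal can be checked on a two-binder instance. In this sketch `v₂` is modelled as a function of `x₁` and `b` as a function of both variables; the names and the `Nat` types are illustrative assumptions.

```lean
-- Sequential `have`s on the left, parallel beta-application on the right.
-- `rfl` succeeds because both sides reduce to `b v₁ (v₂ v₁)`.
example (v₁ : Nat) (v₂ : Nat → Nat) (b : Nat → Nat → Nat) :
    (have x₁ := v₁; have x₂ := v₂ x₁; b x₁ x₂)
      = (fun (x₁ : Nat) (x₂' : Nat → Nat) => b x₁ (x₂' x₁)) v₁ (fun x₁ => v₂ x₁) :=
  rfl
```

In the right-hand side, `v₁` and `fun x₁ => v₂ x₁` are independent arguments of a single application, which is what allows each one to be simplified on its own with standard congruence lemmas.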
Leonardo de Moura	cae739c27c	test: `implies` vs `Arrow` `Sym.simp` benchmark (#11966 )	2026-01-10 18:51:54 +00:00
Leonardo de Moura	d92cdae8e9	feat: `simpForall` and `simpArrow` in `Sym.simp` (#11950 ) This PR implements `simpForall` and `simpArrow` in `Sym.simp`.	2026-01-09 06:20:04 +00:00
Leonardo de Moura	0e4794a1a9	test: benchmarks for `lambda`-telescopes (#11929 )	2026-01-08 00:20:03 +00:00
Leonardo de Moura	8484dbad5d	test: benchmarks for `have`-telescopes (#11927 )	2026-01-07 23:24:46 +00:00
Leonardo de Moura	ff87bcb8e5	feat: add option for simplifying `have` decls in two passes (#11923 ) This PR adds a new option to the function `simpHaveTelescope` in which the `have` telescope is simplified in two passes: * In the first pass, only the values and the body are simplified. * In the second pass, unused declarations are eliminated. This new mode eliminates superlinear behavior in the benchmark `simp_3.lean`. Note that the kernel type checker still exhibits quadratic behavior in this example, because it does not have support for expanding a `have`/`let` telescope in a single step.	2026-01-07 01:58:36 +00:00
Leonardo de Moura	8154453bb5	feat: simplify `have` blocks in `Sym.simp` (#11920 ) This PR implements support for simplifying `have` telescopes in `Sym.simp`.	2026-01-07 00:10:47 +00:00
Leonardo de Moura	175661b6c3	refactor: reorganize `SymM` and `GrindM` monad hierarchy (#11909 ) This PR reorganizes the monad hierarchy for symbolic computation in Lean. ## Motivation We want a clean layering where: 1. A foundational monad (`SymM`) provides maximally shared terms and structural/syntactic `isDefEq` 2. `GrindM` builds on this foundation, adding E-graphs, congruence closure, and decision procedures 3. Symbolic execution / VCGen uses `GrindM` directly without introducing a third monad ## Changes The core symbolic computation layer still lives in `Lean.Meta.Sym`. This monad (`SymM`) provides: - Maximally shared terms with pointer-based equality - Structural/syntactic `isDefEq` and matching (no reduction, predictable cost) - Monotonic local contexts (no `revert` or `clear`), enabling O(1) metavariable validation - Efficient `intro`, `apply`, and `simp` implementations The name "Sym" reflects that this is infrastructure for symbolic computation: symbolic simulation, verification condition generation, and decision procedures. ### Updated hierarchy ``` Lean.Meta.Sym -- SymM: shared terms, syntactic isDefEq, intro, apply, simp Lean.Meta.Grind -- GrindM: E-graphs, congruence closure (extends SymM) ``` Symbolic execution is a usage pattern of `GrindM` operating on `Grind.Goal`, not a separate monad. This keeps the API surface minimal: users learn two monads, and VCGen is "how you use `GrindM`" (for users that want to use `grind`) rather than a third abstraction to understand.	2026-01-06 01:12:07 +00:00
Leonardo de Moura	82f60a7ff3	feat: `pre` and `post` may return "done" in `Sym.simp` (#11900 ) This PR adds a `done` flag to the result returned by `Simproc`s in `Sym.simp`. The `done` flag controls whether simplification should continue after the result: - `done = false` (default): Continue with subsequent simplification steps - `done = true`: Stop processing, return this result as final ## Use cases for `done = true` ### In `pre` simprocs Skip simplification of certain subterms entirely: ``` def skipLambdas : Simproc := fun e => if e.isLambda then return .rfl (done := true) else return .rfl ``` ### In `post` simprocs Perform single-pass normalization without recursive simplification: ``` def singlePassNormalize : Simproc := fun e => if let some (e', h) ← tryNormalize e then return .step e' h (done := true) else return .rfl ``` With `done = true`, the result `e'` won't be recursively simplified.	2026-01-05 02:10:06 +00:00
Leonardo de Moura	f1c903ca65	feat: simplify lambdas in `Sym.simp` (#11898 ) This PR adds support for simplifying lambda expressions in `Sym.simp`. It is much more efficient than standard simp for very large lambda expressions with many binders. The key idea is to generate a custom function extensionality theorem for the type of the lambda being simplified. This technique is compatible with the standard `simp` tactic, and will be ported in a separate PR. <img width="581" height="455" alt="image" src="https://github.com/user-attachments/assets/5911dc6c-03f0-48ed-843b-b8cb4f67ee61" /> ### `lambda` benchmark summary \| Lambda size \| MetaM (ms) \| SymM (ms) \| Speedup \| \|-------------\|------------\|-----------\|---------\| \| 50 \| 22.7 \| 0.74 \| ~31× \| \| 100 \| 120.5 \| 1.75 \| ~69× \| \| 150 \| 359.6 \| 2.90 \| ~124× \| \| 200 \| 809.5 \| 4.51 \| ~180× \|	2026-01-05 01:00:30 +00:00
Leonardo de Moura	609d99e860	chore: include free variables (#11894 ) This PR includes free variables in a `simp` benchmark to stress the default `simp` matching procedure.	2026-01-04 18:51:18 +00:00
Leonardo de Moura	78c9a01bb2	feat: check `Sym.simp` thresholds (#11890 ) This PR ensures that `Sym.simp` checks thresholds for maximum recursion depth and maximum number of steps. It also invokes `checkSystem`. Additionally, this PR simplifies the main loop. Assigned metavariables and `zetaDelta` reduction are now handled by installing `pre`/`post` methods.	2026-01-04 04:27:46 +00:00
Leonardo de Moura	bc72487aed	refactor: `Sym.simp` (#11888 ) This PR refactors `Sym.simp` to make it more general and customizable. It also moves the code to its own subdirectory `Meta/Sym/Simp`.	2026-01-04 02:17:23 +00:00
Leonardo de Moura	b40dabdecd	feat: add discrimination tree retrieval for `Sym` (#11886 ) This PR adds `getMatch` and `getMatchWithExtra` for retrieving patterns from discrimination trees in the symbolic simulation framework. The PR also uses `DiscrTree` to implement indexing in `Sym.simp`.	2026-01-03 20:28:07 +00:00
Leonardo de Moura	4e8b5cfc46	test: benchmark `Sym` and `Meta` simplifiers (#11870 ) This PR adds simple benchmarks for comparing the `MetaM` and `SymM` simplifiers. The `SymM` simplifier is still a work in progress. ### Big picture across benchmarks \| Benchmark \| MetaM scaling \| SymM scaling \| Speedup (approx.) \| \|-------------------------\|-------------------\|--------------\|-------------------\| \| `trans_chain` \| Linear \| Linear \| ~8–9× \| \| `congr_arg_explosion` \| Super-linear \| Linear \| ~100× \| \| `many_rewrites` \| Super-linear \| Linear \| ~10–16× \| <img width="598" height="455" alt="image" src="https://github.com/user-attachments/assets/8bd9021b-b9cf-4fc0-aab4-3118d87f7c22" /> <img width="644" height="455" alt="image" src="https://github.com/user-attachments/assets/0234dc11-0be7-441a-83b6-c309d20a2663" /> <img width="611" height="455" alt="image" src="https://github.com/user-attachments/assets/df79d057-25ed-49d9-a8f3-5285e5fc7013" />	2026-01-02 03:59:54 +00:00

30 commits