This PR updates docstrings and function signatures in order to complete
the transition from `Iter.Partial` to `Iter.Total` (extrinsically
terminating by default). It also deprecates `allowNontermination` and
adds `Iter.Total.atIdxSlow?`.
This PR adds the `introSubstEq` MetaM tactic, as an optimization over
`intro h; subst h` that avoids introducing `h : a = b` if it can be
avoided,
which is the case when `b` can be reverted without reverting anything
else. Speeds up the generation of `injEq` theorem.
This PR adds two benchmarks for mvcgen in the style of Leo's SymM
benchmarks.
While performance on add_sub_cancel_StateM.lean is in the same order of
magnitude as the corresponding MetaM benchmark, add_if_sub_StateM.lean
is far slower.
Measurements for add_sub_cancel:
```
goal_10: 245.576221 ms, kernel: 134.134182 ms
goal_20: 613.945320 ms, kernel: 115.453811 ms
goal_30: 1074.053596 ms, kernel: 179.076070 ms
goal_40: 1680.678302 ms, kernel: 252.066677 ms
goal_50: 2457.209584 ms, kernel: 293.974096 ms
goal_60: 3271.773330 ms, kernel: 368.394386 ms
goal_70: 3981.247921 ms, kernel: 434.297822 ms
goal_80: 5077.300540 ms, kernel: 507.047772 ms
goal_90: 6486.990060 ms, kernel: 556.952095 ms
goal_100: 7791.399986 ms, kernel: 623.605163 ms
```
Measurements for add_if_sub:
```
goal_2: 89.762349 ms, kernel: 43.320205 ms
goal_3: 190.655546 ms, kernel: 38.888499 ms
goal_4: 434.461936 ms, kernel: 75.234581 ms
goal_5: 1110.295284 ms, kernel: 161.698707 ms
goal_6: 3241.383031 ms, kernel: 326.137173 ms
goal_7: 11675.609970 ms, kernel: 684.907188 ms
```
Much room for improvement.
This PR adds `simpArrowTelescope`, a simproc that simplifies telescopes
of non-dependent arrows (p₁ → p₂ → ... → q) while avoiding quadratic
proof growth.
When using `Expr.forallE` to represent nested implications, each nesting
level bumps de Bruijn indices in subterms, destroying sharing even with
hash-consing. For example, a free variable `x` gets different de Bruijn
representations at each depth, causing proof terms to grow.
`simpArrowTelescope` works by:
- Converting arrows to `Arrow p q` (a definitional wrapper)
- Simplifying each component
- Converting back to `→` form
Since `Arrow` arguments are not under binders, subterms remain identical
across nesting levels and can be shared.
The `simp_4` benchmark demonstrates the improvement:
With `forallE`: ~160ms, proof_size ≈ 173k
With `Arrow`: ~43ms, proof_size ≈ 16k
Tradeoff: `simpArrowTelescope` misses simplifications that depend on the
arrow structure (e.g., `p → p` to `True`), since post-methods aren't
applied to intermediate arrows. Thus, it is not used by default. to use
it, one has to set `simpArrowTelescope` as a `pre`-method.
This PR adds an API for building symbolic simulation engines and
verification
condition generators that leverage `grind`. The API wraps `Sym`
operations to
work with `grind`'s `Goal` type, enabling lightweight symbolic execution
while
carrying `grind` state for discharge steps.
New operations on `Goal`:
- `mkGoal`: create a `Goal` from an `MVarId`
- `introN`, `intros`: introduce binders
- `apply`: apply backward rules
- `simp`, `simpIgnoringNoProgress`: simplify using `Sym.Simp`
- `internalize`, `internalizeAll`: add hypotheses to the E-graph
- `grind`: attempt to close the goal using `grind`
- `assumption`: close by matching a hypothesis
A new test demonstrates the API on a stateful program with conditionals,
using `grind` to discharge arithmetic side conditions.
This PR adds a new benchmark `shallow_add_sub_cancel.lean` that
demonstrates symbolic simulation using a shallow embedding into monadic
`do` notation, as opposed to the deep embedding approach in
`add_sub_cancel.lean`.
The shallow embedding approach:
- Uses Lean's `StateM` monad directly instead of a custom command
language
- Defines `Exec s k post` as a simple predicate: `post (k s).1 (k s).2`
- Proves helper theorems for reasoning about monadic operations (`pure`,
`bind`, `get`, `set`, `modify`, `ite`)
- Programs are written in actual `do`-notation rather than a custom AST
The benchmark solves goals using both the `MetaM` and `SymM` frameworks,
showing that the shallow embedding integrates well with the symbolic
simulation infrastructure. `SymM` is again way faster than `MetaM`
### Symbolic simulation benchmark — tactic time only
Problem size `n` corresponds to a program with `4·n` monadic actions.
| n | MetaM tactic (ms) | SymM tactic (ms) | Speedup |
|-----|-------------------|------------------|---------|
| 10 | 82.10 | 11.37 | ~7.2× |
| 20 | 176.21 | 17.71 | ~9.9× |
| 30 | 306.47 | 25.39 | ~12.1× |
| 40 | 509.52 | 34.53 | ~14.7× |
| 50 | 689.19 | 43.51 | ~15.8× |
| 60 | 905.86 | 52.47 | ~17.3× |
| 70 | 1172.31 | 62.50 | ~18.8× |
| 80 | 1448.48 | 70.65 | ~20.5× |
| 90 | 1787.15 | 80.89 | ~22.1× |
| 100 | 2128.12 | 90.77 | ~23.5× |
<img width="580" height="455" alt="image"
src="https://github.com/user-attachments/assets/3511aaab-4d53-4520-8302-65d2d100df4a"
/>
This PR adds a comparison between `MetaM` and `SymM` for a benchmark was
proposed during the Lean@Google Hackathon.
### Benchmark description
In this benchmark, we define the semantics of a very simple imperative
language using an inductive predicate
```
Exec prog events mem lctx post
```
The predicate holds if, when executing the program `prog` with an
initial list of events `events`, memory `mem`, and local context `lctx`,
the postcondition `post` holds.
We then consider the following program:
```
input b
a := b
a := a + a
a := a - b
...
a := a + a
a := a - b
```
That is, after reading an input value `b`, the program repeatedly
updates the variable `a` by doubling it and then subtracting `b`.
We prove that, for any initial memory `m` and local context `l`, and
starting from the empty list of events, the following postcondition
holds:
```
fun t' m' l' =>
m' = m ∧ -- memory did not change
∃ v : Word,
t' = [IOEvent.IN v] ∧ -- exactly one input event
l'.get "a" = some v -- `a` contains the input value
```
In other words, executing the program produces exactly one input event,
leaves the memory unchanged, and ensures that the final value of `a` is
equal to the input value.
### Symbolic simulation benchmark (problem size `n`, with `2·n + 2`
instructions)
| Problem size (n) | MetaM time (ms) | MetaM kernel (ms) | SymM time
(ms) | SymM kernel (ms) | Total speedup |
|------------------|------------------|-------------------|----------------|------------------|---------------|
| 10 | 94.83 | 6.60 | 7.04 | 6.18 | ~13.5× |
| 20 | 218.92 | 13.33 | 14.15 | 13.02 | ~15.5× |
| 30 | 375.10 | 22.95 | 26.51 | 19.81 | ~14.2× |
| 40 | 563.82 | 34.99 | 40.42 | 29.55 | ~14.0× |
| 50 | 815.89 | 53.78 | 60.84 | 42.25 | ~13.4× |
| 60 | 1081.09 | 73.46 | 80.99 | 53.52 | ~13.3× |
| 70 | 1400.80 | 102.70 | 106.02 | 68.61 | ~13.2× |
| 80 | 1772.19 | 126.65 | 134.23 | 87.64 | ~13.2× |
| 90 | 2203.41 | 161.68 | 168.26 | 115.52 | ~13.1× |
| 100 | 2474.09 | 191.23 | 209.13 | 143.86 | ~11.8× |
<img width="580" height="455" alt="image"
src="https://github.com/user-attachments/assets/bc7058fa-e71a-4c2c-be28-860f39166965"
/>
### Symbolic simulation with extra simplification (SymM)
Problem size `n` corresponds to a program with `2·n + 2` instructions.
| n | Total time (ms) | Kernel time (ms) | Non-kernel time (ms) |
|-----|------------------|------------------|----------------------|
| 10 | 6.33 | 3.97 | 2.36 |
| 20 | 10.30 | 5.59 | 4.71 |
| 30 | 13.72 | 7.38 | 6.34 |
| 40 | 17.85 | 8.84 | 9.01 |
| 50 | 21.90 | 10.63 | 11.27 |
| 60 | 27.00 | 12.56 | 14.44 |
| 70 | 32.02 | 14.04 | 17.98 |
| 80 | 37.25 | 15.76 | 21.49 |
| 90 | 42.55 | 17.95 | 24.60 |
| 100 | 49.30 | 20.03 | 29.27 |
| 200 | 125.56 | 38.21 | 87.36 |
| 300 | 293.58 | 66.79 | 226.79 |
| 400 | 361.87 | 78.96 | 282.91 |
| 500 | 518.51 | 102.51 | 416.00 |
| 600 | 716.63 | 122.81 | 593.82 |
This PR adds support for offset terms in `SymM`. This is essential for
handling equational theorems for functions that pattern match on natural
numbers in `Sym.simp`. Without this, it cannot handle simple examples
such as
```lean
def pw (n : Nat) : Nat :=
match n with
| 0 => 1
| n+1 => 2 * pw n
example : pw 4 = 16 := by
sym_simp [pw.eq_1, pw.eq_2]
example : pw (a + 2) = 2 * (2 * pw a) := by
sym_simp [pw.eq_2]
```
This PR optimizes congruence proof construction in `Sym.simp` by
avoiding
`inferType` calls on expressions that are less likely to be cached.
Instead of
inferring types of expressions like `@HAdd.hAdd Nat Nat Nat instAdd 5`,
we infer
the type of the function prefix `@HAdd.hAdd Nat Nat Nat instAdd` and
traverse
the forall telescope.
The key insight is that function prefixes are more likely shared across
many call sites
(e.g., all `Nat` additions use the same `@HAdd.hAdd Nat Nat Nat
instAdd`), so they
benefit from `inferType` caching.
Benchmark results show improvements on workloads with shared function
prefixes:
- `many_rewrites_5000`: 48.8ms → 43.1ms (-12%)
- `term_tree_5000`: 53.4ms → 30.5ms (-43%)
This PR implements a new strategy for simplifying `have`-telescopes in
`Sym.simp` that achieves linear kernel type-checking time instead of
quadratic.
## Problem
When simplifying deep `have`-telescopes, the previous approach using
`have_congr'` produced proofs that type-checked in quadratic time. The
simplifier itself was fast, but the kernel became the bottleneck for
large telescopes.
For example, at n=100:
- **Before**: simp = 2.4ms, kernel = **225ms**
- **After**: simp = 3.5ms, kernel = **10ms**
The quadratic behavior occurred because the kernel creates fresh free
variables for each binder when type-checking, destroying sharing and
producing O(n²) intermediate terms.
## Solution
We transform sequential `have`-telescopes into a parallel
beta-application form:
```
have x₁ := v₁; have x₂ := v₂[x₁]; b[x₁, x₂]
↓ (definitionally equal)
(fun x₁ x₂' => b[x₁, x₂' x₁]) v₁ (fun x₁ => v₂[x₁])
```
This parallel form leverages the efficient simplifier for lambdas in
`Sym.simp`. This form enables:
1. Independent simplification of each argument
2. Proof construction using standard congruence lemmas
3. Linear kernel type-checking time
The algorithm has three phases:
1. **`toBetaApp`**: Transform telescope → parallel beta-application
2. **`simpBetaApp`**: Simplify using `congr`/`congrArg`/`congrFun'` and
`simpLambda`
3. **`toHave`**: Convert back to `have` form
## Benchmark Results
### Benchmark 1: Chain with all variables used in body
| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 50 | 1.2ms | 32ms | 1.6ms | 4.4ms |
| 100 | 2.4ms | **225ms** | 3.5ms | **10ms** |
| 200 | 4.5ms | — | 8.4ms | 27ms |
| 500 | 11.7ms | — | 33.6ms | 128ms |
### Benchmark 3: Parallel declarations (simplified values)
| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 50 | 0.5ms | 24ms | 0.8ms | 1.8ms |
| 100 | 1.2ms | **169ms** | 1.8ms | **5.3ms** |
| 200 | 2.2ms | — | 3.9ms | 17ms |
| 500 | 5.9ms | — | 12.3ms | 93ms |
### Benchmark 5: Chain with single dependency
| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 100 | 1.6ms | 6.2ms | 1.8ms | 6.2ms |
| 200 | 2.8ms | 21.6ms | 4.4ms | 16.5ms |
| 500 | 7.3ms | **125ms** | 12.8ms | **72ms** |
Key observations:
- Kernel time is now **linear** in telescope depth (previously
quadratic)
- Simp time increases slightly due to the transformation overhead
- Total time (simp + kernel) is dramatically reduced for large
telescopes
- The improvement is most pronounced when the body depends on many
variables
## Trade-offs
- Proof sizes are larger (more congruence lemma applications)
- Simp time has ~1.5x overhead from the transformation
- For very small telescopes (n < 10), the overhead may not pay off
The optimization targets the critical path: kernel type-checking was the
bottleneck preventing scaling to realistic symbolic simulation
workloads.
This PR adds a new option to the function `simpHaveTelescope` in which
the `have` telescope is simplified in two passes:
* In the first pass, only the values and the body are simplified.
* In the second pass, unused declarations are eliminated.
This new mode eliminates **superlinear** behavior in the benchmark
`simp_3.lean`. Note that the kernel type checker still **exhibits**
quadratic behavior in this example, because it **does not have support**
for expanding a `have`/`let` telescope in a single step.
This PR reorganizes the monad hierarchy for symbolic computation in
Lean.
## Motivation
We want a clean layering where:
1. A foundational monad (`SymM`) provides maximally shared terms and
structural/syntactic `isDefEq`
2. `GrindM` builds on this foundation, adding E-graphs, congruence
closure, and decision procedures
3. Symbolic execution / VCGen uses `GrindM` directly without introducing
a third monad
## Changes
The core symbolic computation layer still lives in `Lean.Meta.Sym`. This
monad (`SymM`) provides:
- Maximally shared terms with pointer-based equality
- Structural/syntactic `isDefEq` and matching (no reduction, predictable
cost)
- Monotonic local contexts (no `revert` or `clear`), enabling O(1)
metavariable validation
- Efficient `intro`, `apply`, and `simp` implementations
The name "Sym" reflects that this is infrastructure for symbolic
computation: symbolic simulation, verification condition generation, and
decision procedures.
### Updated hierarchy
```
Lean.Meta.Sym -- SymM: shared terms, syntactic isDefEq, intro, apply, simp
Lean.Meta.Grind -- GrindM: E-graphs, congruence closure (extends SymM)
```
Symbolic execution is a usage pattern of `GrindM` operating on
`Grind.Goal`, not a separate monad. This keeps the API surface minimal:
users learn two monads, and VCGen is "how you use `GrindM`" (for users
that want to use `grind`) rather than a third abstraction to understand.
This PR adds a `done` flag to the result returned by `Simproc`s in
`Sym.simp`.
The `done` flag controls whether simplification should continue after
the result:
- `done = false` (default): Continue with subsequent simplification
steps
- `done = true`: Stop processing, return this result as final
## Use cases for `done = true`
### In `pre` simprocs
Skip simplification of certain subterms entirely:
```
def skipLambdas : Simproc := fun e =>
if e.isLambda then return .rfl (done := true)
else return .rfl
```
### In `post` simprocs
Perform single-pass normalization without recursive simplification:
```
def singlePassNormalize : Simproc := fun e =>
if let some (e', h) ← tryNormalize e then
return .step e' h (done := true)
else return .rfl
```
With `done = true`, the result `e'` won't be recursively simplified.
This PR adds support for simplifying lambda expressions in `Sym.simp`.
It is much more efficient than standard simp for very large lambda
expressions with many binders. The key idea is to generate a custom
function extensionality theorem for the type of the lambda being
simplified.
This technique is compatible with the standard `simp` tactic, and will
be ported in a separate PR.
<img width="581" height="455" alt="image"
src="https://github.com/user-attachments/assets/5911dc6c-03f0-48ed-843b-b8cb4f67ee61"
/>
### `lambda` benchmark summary
| Lambda size | MetaM (ms) | SymM (ms) | Speedup |
|-------------|------------|-----------|---------|
| 50 | 22.7 | 0.74 | ~31× |
| 100 | 120.5 | 1.75 | ~69× |
| 150 | 359.6 | 2.90 | ~124× |
| 200 | 809.5 | 4.51 | ~180× |
This PR ensures that `Sym.simp` checks thresholds for maximum recursion
depth and maximum number of steps. It also invokes `checkSystem`.
Additionally, this PR simplifies the main loop. Assigned metavariables
and `zetaDelta` reduction are now handled by installing `pre`/`post`
methods.
This PR adds `getMatch` and `getMatchWithExtra` for retrieving patterns
from
discrimination trees in the symbolic simulation framework.
The PR also adds uses `DiscrTree` to implement indexing in `Sym.simp`.
This PR moves many constants of the iterator API from `Std.Iterators` to
the `Std` namespace in order to make them more convenient to use. These
constants include, but are not limited to, `Iter`, `IterM` and
`IteratorLoop`. This is a breaking change. If something breaks, try
adding `open Std` in order to make these constants available again. If
some constants in the `Std.Iterators` namespace cannot be found, they
can be found directly in `Std` now.
This PR introduces a new fixpoint combinator,
`WellFounded.extrinsicFix`. A termination proof, if provided at all, can
be given extrinsically, i.e., looking at the term from the outside, and
is only required if one intends to formally verify the behavior of the
fixpoint. The new combinator is then applied to the iterator API.
Consumers such as `toList` or `ForIn` no longer require a proof that the
underlying iterator is finite. If one wants to ensure the termination of
them intrinsically, there are strictly terminating variants available
as, for example, `it.ensureTermination.toList` instead of `it.toList`.
This PR fixes various typos across the codebase in documentation and
comments.
- `infered` → `inferred` (ParserCompiler.lean)
- `declartation` → `declaration` (Cleanup.lean)
- `certian` → `certain` (CasesInfo.lean)
- `wil` → `will` (Cache.lean)
- `the the` → `the` (multiple files - PrefixTree.lean, Sum/Basic.lean,
List/Nat/Perm.lean, Time.lean, Bounded.lean, Lake files)
- `to to` → `to` (MutualInductive.lean, simp_bubblesort_256.lean)
- Grammar improvements in Bounded.lean and Time.lean
All changes are to comments and documentation only - no functional
changes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
The bench script expected no output on stdout from `compile.sh`, which
was not always the case. Now, it separates the compilation and size
measurement steps.