This PR optimizes congruence proof construction in `Sym.simp` by
avoiding
`inferType` calls on expressions that are less likely to be cached.
Instead of
inferring types of expressions like `@HAdd.hAdd Nat Nat Nat instAdd 5`,
we infer
the type of the function prefix `@HAdd.hAdd Nat Nat Nat instAdd` and
traverse
the forall telescope.
The key insight is that function prefixes are more likely shared across
many call sites
(e.g., all `Nat` additions use the same `@HAdd.hAdd Nat Nat Nat
instAdd`), so they
benefit from `inferType` caching.
Benchmark results show improvements on workloads with shared function
prefixes:
- `many_rewrites_5000`: 48.8ms → 43.1ms (-12%)
- `term_tree_5000`: 53.4ms → 30.5ms (-43%)
This PR implements a new strategy for simplifying `have`-telescopes in
`Sym.simp` that achieves linear kernel type-checking time instead of
quadratic.
## Problem
When simplifying deep `have`-telescopes, the previous approach using
`have_congr'` produced proofs that type-checked in quadratic time. The
simplifier itself was fast, but the kernel became the bottleneck for
large telescopes.
For example, at n=100:
- **Before**: simp = 2.4ms, kernel = **225ms**
- **After**: simp = 3.5ms, kernel = **10ms**
The quadratic behavior occurred because the kernel creates fresh free
variables for each binder when type-checking, destroying sharing and
producing O(n²) intermediate terms.
## Solution
We transform sequential `have`-telescopes into a parallel
beta-application form:
```
have x₁ := v₁; have x₂ := v₂[x₁]; b[x₁, x₂]
↓ (definitionally equal)
(fun x₁ x₂' => b[x₁, x₂' x₁]) v₁ (fun x₁ => v₂[x₁])
```
This parallel form leverages the efficient simplifier for lambdas in
`Sym.simp`. This form enables:
1. Independent simplification of each argument
2. Proof construction using standard congruence lemmas
3. Linear kernel type-checking time
The algorithm has three phases:
1. **`toBetaApp`**: Transform telescope → parallel beta-application
2. **`simpBetaApp`**: Simplify using `congr`/`congrArg`/`congrFun'` and
`simpLambda`
3. **`toHave`**: Convert back to `have` form
## Benchmark Results
### Benchmark 1: Chain with all variables used in body
| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 50 | 1.2ms | 32ms | 1.6ms | 4.4ms |
| 100 | 2.4ms | **225ms** | 3.5ms | **10ms** |
| 200 | 4.5ms | — | 8.4ms | 27ms |
| 500 | 11.7ms | — | 33.6ms | 128ms |
### Benchmark 3: Parallel declarations (simplified values)
| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 50 | 0.5ms | 24ms | 0.8ms | 1.8ms |
| 100 | 1.2ms | **169ms** | 1.8ms | **5.3ms** |
| 200 | 2.2ms | — | 3.9ms | 17ms |
| 500 | 5.9ms | — | 12.3ms | 93ms |
### Benchmark 5: Chain with single dependency
| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 100 | 1.6ms | 6.2ms | 1.8ms | 6.2ms |
| 200 | 2.8ms | 21.6ms | 4.4ms | 16.5ms |
| 500 | 7.3ms | **125ms** | 12.8ms | **72ms** |
Key observations:
- Kernel time is now **linear** in telescope depth (previously
quadratic)
- Simp time increases slightly due to the transformation overhead
- Total time (simp + kernel) is dramatically reduced for large
telescopes
- The improvement is most pronounced when the body depends on many
variables
## Trade-offs
- Proof sizes are larger (more congruence lemma applications)
- Simp time has ~1.5x overhead from the transformation
- For very small telescopes (n < 10), the overhead may not pay off
The optimization targets the critical path: kernel type-checking was the
bottleneck preventing scaling to realistic symbolic simulation
workloads.
This PR adds a new option to the function `simpHaveTelescope` in which
the `have` telescope is simplified in two passes:
* In the first pass, only the values and the body are simplified.
* In the second pass, unused declarations are eliminated.
This new mode eliminates **superlinear** behavior in the benchmark
`simp_3.lean`. Note that the kernel type checker still **exhibits**
quadratic behavior in this example, because it **does not have support**
for expanding a `have`/`let` telescope in a single step.
This PR reorganizes the monad hierarchy for symbolic computation in
Lean.
## Motivation
We want a clean layering where:
1. A foundational monad (`SymM`) provides maximally shared terms and
structural/syntactic `isDefEq`
2. `GrindM` builds on this foundation, adding E-graphs, congruence
closure, and decision procedures
3. Symbolic execution / VCGen uses `GrindM` directly without introducing
a third monad
## Changes
The core symbolic computation layer still lives in `Lean.Meta.Sym`. This
monad (`SymM`) provides:
- Maximally shared terms with pointer-based equality
- Structural/syntactic `isDefEq` and matching (no reduction, predictable
cost)
- Monotonic local contexts (no `revert` or `clear`), enabling O(1)
metavariable validation
- Efficient `intro`, `apply`, and `simp` implementations
The name "Sym" reflects that this is infrastructure for symbolic
computation: symbolic simulation, verification condition generation, and
decision procedures.
### Updated hierarchy
```
Lean.Meta.Sym -- SymM: shared terms, syntactic isDefEq, intro, apply, simp
Lean.Meta.Grind -- GrindM: E-graphs, congruence closure (extends SymM)
```
Symbolic execution is a usage pattern of `GrindM` operating on
`Grind.Goal`, not a separate monad. This keeps the API surface minimal:
users learn two monads, and VCGen is "how you use `GrindM`" (for users
that want to use `grind`) rather than a third abstraction to understand.
This PR adds a `done` flag to the result returned by `Simproc`s in
`Sym.simp`.
The `done` flag controls whether simplification should continue after
the result:
- `done = false` (default): Continue with subsequent simplification
steps
- `done = true`: Stop processing, return this result as final
## Use cases for `done = true`
### In `pre` simprocs
Skip simplification of certain subterms entirely:
```
def skipLambdas : Simproc := fun e =>
if e.isLambda then return .rfl (done := true)
else return .rfl
```
### In `post` simprocs
Perform single-pass normalization without recursive simplification:
```
def singlePassNormalize : Simproc := fun e =>
if let some (e', h) ← tryNormalize e then
return .step e' h (done := true)
else return .rfl
```
With `done = true`, the result `e'` won't be recursively simplified.
This PR adds support for simplifying lambda expressions in `Sym.simp`.
It is much more efficient than standard simp for very large lambda
expressions with many binders. The key idea is to generate a custom
function extensionality theorem for the type of the lambda being
simplified.
This technique is compatible with the standard `simp` tactic, and will
be ported in a separate PR.
<img width="581" height="455" alt="image"
src="https://github.com/user-attachments/assets/5911dc6c-03f0-48ed-843b-b8cb4f67ee61"
/>
### `lambda` benchmark summary
| Lambda size | MetaM (ms) | SymM (ms) | Speedup |
|-------------|------------|-----------|---------|
| 50 | 22.7 | 0.74 | ~31× |
| 100 | 120.5 | 1.75 | ~69× |
| 150 | 359.6 | 2.90 | ~124× |
| 200 | 809.5 | 4.51 | ~180× |
This PR ensures that `Sym.simp` checks thresholds for maximum recursion
depth and maximum number of steps. It also invokes `checkSystem`.
Additionally, this PR simplifies the main loop. Assigned metavariables
and `zetaDelta` reduction are now handled by installing `pre`/`post`
methods.
This PR adds `getMatch` and `getMatchWithExtra` for retrieving patterns
from
discrimination trees in the symbolic simulation framework.
The PR also adds uses `DiscrTree` to implement indexing in `Sym.simp`.
This PR moves many constants of the iterator API from `Std.Iterators` to
the `Std` namespace in order to make them more convenient to use. These
constants include, but are not limited to, `Iter`, `IterM` and
`IteratorLoop`. This is a breaking change. If something breaks, try
adding `open Std` in order to make these constants available again. If
some constants in the `Std.Iterators` namespace cannot be found, they
can be found directly in `Std` now.
This PR introduces a new fixpoint combinator,
`WellFounded.extrinsicFix`. A termination proof, if provided at all, can
be given extrinsically, i.e., looking at the term from the outside, and
is only required if one intends to formally verify the behavior of the
fixpoint. The new combinator is then applied to the iterator API.
Consumers such as `toList` or `ForIn` no longer require a proof that the
underlying iterator is finite. If one wants to ensure the termination of
them intrinsically, there are strictly terminating variants available
as, for example, `it.ensureTermination.toList` instead of `it.toList`.
This PR fixes various typos across the codebase in documentation and
comments.
- `infered` → `inferred` (ParserCompiler.lean)
- `declartation` → `declaration` (Cleanup.lean)
- `certian` → `certain` (CasesInfo.lean)
- `wil` → `will` (Cache.lean)
- `the the` → `the` (multiple files - PrefixTree.lean, Sum/Basic.lean,
List/Nat/Perm.lean, Time.lean, Bounded.lean, Lake files)
- `to to` → `to` (MutualInductive.lean, simp_bubblesort_256.lean)
- Grammar improvements in Bounded.lean and Time.lean
All changes are to comments and documentation only - no functional
changes.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <noreply@anthropic.com>
The bench script expected no output on stdout from `compile.sh`, which
was not always the case. Now, it separates the compilation and size
measurement steps.
This PR reduces the memory consumption of the language server (the
watchdog process in particular). In Mathlib, it reduces memory
consumption by about 1GB.
It also fixes two bugs in the call hierarchy:
- When an open file had import errors (e.g. from a transitive build
failure), the call hierarchy would not display any usages in that file.
Now we use the reference information from the .ilean instead.
- When a command would not set a parent declaration (e.g. `#check`), the
result was filtered from the call hierarchy. Now we display it as
`[anonymous]` instead.
This PR adds a new [radar]-based [temci]-less bench suite that replaces
the `stdlib` benchmarks from the old suite and also measures per-module
instruction counts. All other benchmarks from the old suite are
unaffected.
The readme at `tests/bench-radar/README.md` explains in more detail how
the bench suite is structured and how it works. The readmes in the
benchmark subdirectories explain what each benchmark does and which
metrics it collects.
All metrics except `stdlib//max dynamic symbols` were ported to the new
suite, though most have been renamed.
[radar]: https://github.com/leanprover/radar
[temci]: https://github.com/parttimenerd/temci
This PR fixes freeing memory accidentally retained for each document
version in the language server on certain elaboration workloads. The
issue must have existed since 4.18.0.
This PR significantly changes the signature of the `ToIterator` type
class. The obtained iterators' state is no longer dependently typed and
is an `outParam` instead of being bundled inside the class. Among other
benefits, `simp` can now rewrite inside of `Slice.toList` and
`Slice.toArray`. The downside is that we lose flexibility. For example,
the former combinator-based implementation of `Subarray`'s iterators is
no longer feasible because the states are dependently typed. Therefore,
this PR provides a hand-written iterator for `Subarray`, which does not
require a dependently typed state and is faster than the previous one.
Converting a family of dependently typed iterators into a simply typed
one using a `Sigma`-state iterator generates forbiddingly bad code, so
that we do provide such a combinator. This PR adds a benchmark for this
problem.
Creates an inductive data type with 100 constructors, and a function
that does
matches on half of its constructors, with a catch-all for the other
half, and generates the splitter.
Related to #11183.
This PR lets the match compilation procedure use sparse case analysis
when the patterns only match on some but not all constructors of an
inductive type. This way, less code is produce. Before, code handling
each of the other cases was then optimized and commoned-up by later
compilation pipeline, but that is wasteful to do.
In some cases this will prevent Lean from noticing that a match
statement is complete
because it performs less case-splitting for the unreachable case. In
this case, give explicit
patterns to perform the deeper split with `by contradiction` as the
right-hand side.
At least temporarily, there is also the option to disable this behaviour
with
```
set_option backwards.match.sparseCases false
```
This PR fixes name mangling to be unambiguous / injective by adding `00`
for disambiguation where necessary. Additionally, the inverse function,
`Lean.Name.unmangle` has been added which can be used to unmangle a
mangled identifier. This unmangler has been added to demonstrate the
injectivity but also to allow unmangling identifiers e.g. for debugging
purposes.
Closes#10724
This PR renames `String.endPos` to `String.rawEndPos`, as in a future
release the name `String.endPos` will be taken by the function that is
currently called `String.endValidPos`.
This PR introduces a no-op version of `Shrink`, a type that should allow
shrinking small types into smaller universes given a proof that the type
is small enough, and uses it in the iterator library. Because this type
would require special compiler support, the current version is just a
wrapper around the inner type so that the wrapper is equivalent, but not
definitionally equivalent.
While `Shrink` is unable to shrink universes right now, but introducing
it now will allow us to generalize the universes in the iterator library
with fewer breaking changes as soon as an actual `Shrink` is possible.
This PR "monomorphizes" the structure `Std.PRange shape α`, replacing it
with nine distinct structures `Std.Rcc`, `Std.Rco`, `Std.Rci` etc., one
for each possible shape of a range's bounds. This change was necessary
because the shape polymorphism is detrimental to attempts of automation.
**BREAKING CHANGE:** While range/slice notation itself is unchanged,
this essentially breaks the entire remaining (polymorphic) range and
slice API except for the dot-notation(`toList`, `iter`, ...). It is not
possible to deprecate old declarations that were formulated in a
shape-polymorphic way that is not available anymore.
This PR significantly improves the test coverage of the language server,
providing at least a single basic test for every request that is used by
the client. It also implements infrastructure for testing all of these
requests, e.g. the ability to run interactive tests in a project context
and refactors the interactive test runner to be more maintainable.
Finally, it also fixes a small bug with the recently implemented unknown
identifier code actions for auto-implicits (#10442) that was discovered
in testing, where the "import all unambiguous unknown identifiers" code
action didn't work correctly on auto-implicit identifiers.
This PR disables `trace.profiler` in `bench/riskv-ast.lean`. We don't
want to optimize the trace profiler, but normal code.
While at it, I removed the `#exit` to cover more of the file.
While at it, also import the latest from from upstream.
The radar bench scripts at
https://github.com/leanprover/radar-bench-lean4/ split up the benchmarks
between the two runners based on the tags: One runner filters by the tag
`stdlib` while the other filters by the tag `other`. Only benchmarks
using one of these tags will be run, and any benchmark tagged with both
will waste electricity.
As far as I know, the tags are unused otherwise, so I just replaced all
the old tags.