Commit graph

30 commits

Author SHA1 Message Date
Leonardo de Moura
45862d5486
feat: improves simpArrowTelescope simproc (#12153)
This PR improves the `simpArrowTelescope` simproc that simplifies
non-dependent arrow telescopes: `p₁ → p₂ → ... → q`.

The simproc now also applies telescope-specific simplifications:
- `False → q` to `True` (when `q : Prop`)
- `True → q` to `q` (when `q : Prop`)
- `p → True` to `True`
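
Each of the three rules corresponds to a propositional equality that plain `simp` already proves. A minimal check, which uses the standard `simp` tactic and does not invoke the `simpArrowTelescope` simproc itself:

```lean
-- Each equation is the propositional fact behind one telescope rule.
example (q : Prop) : (False → q) = True := by simp
example (q : Prop) : (True → q) = q := by simp
example (p : Prop) : (p → True) = True := by simp
```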
2026-01-25 22:29:38 +00:00
Leonardo de Moura
ba8c2ed4ee
feat: add simpArrowTelescope for compact proofs of arrow simplification (#12152)
This PR adds `simpArrowTelescope`, a simproc that simplifies telescopes
of non-dependent arrows (p₁ → p₂ → ... → q) while avoiding quadratic
proof growth.

When using `Expr.forallE` to represent nested implications, each nesting
level bumps de Bruijn indices in subterms, destroying sharing even with
hash-consing. For example, a free variable `x` gets different de Bruijn
representations at each depth, causing proof terms to grow.

`simpArrowTelescope` works by:

- Converting arrows to `Arrow p q` (a definitional wrapper)
- Simplifying each component
- Converting back to `→` form

Since `Arrow` arguments are not under binders, subterms remain identical
across nesting levels and can be shared.
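
A minimal sketch of the wrapper idea, assuming a definition of this shape. `MyArrow` is a hypothetical stand-in; the actual `Arrow` definition may differ (for example, in universe levels or attributes):

```lean
-- Hypothetical stand-in for the `Arrow` wrapper described above.
def MyArrow (p q : Prop) : Prop := p → q

-- The wrapper is definitionally equal to the plain implication.
example (p q : Prop) : MyArrow p q = (p → q) := rfl

-- A nested telescope `p₁ → p₂ → q` written with the wrapper:
-- `p₂` and `q` are ordinary arguments here, not terms under a binder.
example (p₁ p₂ q : Prop) : MyArrow p₁ (MyArrow p₂ q) = (p₁ → p₂ → q) := rfl
```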

The `simp_4` benchmark demonstrates the improvement:

- With `forallE`: ~160ms, proof_size ≈ 173k
- With `Arrow`: ~43ms, proof_size ≈ 16k

Tradeoff: `simpArrowTelescope` misses simplifications that depend on the
arrow structure (e.g., `p → p` to `True`), since post-methods aren't
applied to intermediate arrows. Thus, it is not used by default. To use
it, one has to set `simpArrowTelescope` as a `pre`-method.
2026-01-25 20:43:59 +00:00
Leonardo de Moura
e90f6f77db
test: local rewrite with Sym.simp (#12147)
This PR adds a new API for helping users write focused rewrites.
2026-01-25 01:32:50 +00:00
Leonardo de Moura
6de7100f69
feat: add Goal API for SymM + grind (#12143)
This PR adds an API for building symbolic simulation engines and
verification condition generators that leverage `grind`. The API wraps
`Sym` operations to work with `grind`'s `Goal` type, enabling lightweight
symbolic execution while carrying `grind` state for discharge steps.

New operations on `Goal`:
- `mkGoal`: create a `Goal` from an `MVarId`
- `introN`, `intros`: introduce binders
- `apply`: apply backward rules
- `simp`, `simpIgnoringNoProgress`: simplify using `Sym.Simp`
- `internalize`, `internalizeAll`: add hypotheses to the E-graph
- `grind`: attempt to close the goal using `grind`
- `assumption`: close by matching a hypothesis

A new test demonstrates the API on a stateful program with conditionals,
using `grind` to discharge arithmetic side conditions.
2026-01-24 20:30:08 +00:00
Leonardo de Moura
4c1e4a77b4
test: MetaM vs SymM on do notation (#12134)
This PR adds a new benchmark `shallow_add_sub_cancel.lean` that
demonstrates symbolic simulation using a shallow embedding into monadic
`do` notation, as opposed to the deep embedding approach in
`add_sub_cancel.lean`.

The shallow embedding approach:
- Uses Lean's `StateM` monad directly instead of a custom command language
- Defines `Exec s k post` as a simple predicate: `post (k s).1 (k s).2`
- Proves helper theorems for reasoning about monadic operations (`pure`,
  `bind`, `get`, `set`, `modify`, `ite`)
- Writes programs in actual `do`-notation rather than a custom AST

The benchmark solves goals using both the `MetaM` and `SymM` frameworks,
showing that the shallow embedding integrates well with the symbolic
simulation infrastructure. `SymM` is again much faster than `MetaM`.
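
The `Exec` predicate and one helper theorem could look roughly as follows. This is a sketch under assumptions: the binder names, the restriction to `Type`, and the theorem name `exec_pure` are illustrative and not taken from the benchmark file.

```lean
-- Sketch of the shallow-embedding predicate: run `k` from state `s`
-- and require `post` of the returned value and the final state.
def Exec {σ α : Type} (s : σ) (k : StateM σ α) (post : α → σ → Prop) : Prop :=
  post (k s).1 (k s).2

-- Helper theorem for `pure`: `pure a` returns `a` and leaves the state
-- unchanged, so the goal reduces to `post a s` by unfolding definitions.
theorem exec_pure {σ α : Type} (a : α) (s : σ) (post : α → σ → Prop)
    (h : post a s) : Exec s (pure a) post :=
  h
```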

### Symbolic simulation benchmark — tactic time only

Problem size `n` corresponds to a program with `4·n` monadic actions.

| n   | MetaM tactic (ms) | SymM tactic (ms) | Speedup |
|-----|-------------------|------------------|---------|
| 10  | 82.10  | 11.37 | ~7.2×  |
| 20  | 176.21 | 17.71 | ~9.9×  |
| 30  | 306.47 | 25.39 | ~12.1× |
| 40  | 509.52 | 34.53 | ~14.7× |
| 50  | 689.19 | 43.51 | ~15.8× |
| 60  | 905.86 | 52.47 | ~17.3× |
| 70  | 1172.31 | 62.50 | ~18.8× |
| 80  | 1448.48 | 70.65 | ~20.5× |
| 90  | 1787.15 | 80.89 | ~22.1× |
| 100 | 2128.12 | 90.77 | ~23.5× |

<img width="580" height="455" alt="image"
src="https://github.com/user-attachments/assets/3511aaab-4d53-4520-8302-65d2d100df4a"
/>
2026-01-24 03:38:02 +00:00
Leonardo de Moura
c81a8897a9
feat: improve Sym.simp APIs and new benchmark data (#12101)
This PR improves the `Sym.simp` APIs. It is now easier to reuse the
simplifier cache between different simplification steps. We use the APIs
to improve the benchmark at #12100.

### Symbolic simulation with simplifier cache reuse (SymM)

Problem size `n` corresponds to a program with `2·n + 2` instructions.

| n   | Tactic time (ms) | Kernel time (ms) |
|-----|------------------|------------------|
| 10  | 4.53  | 4.29  |
| 20  | 5.56  | 6.91  |
| 30  | 6.46  | 8.67  |
| 40  | 8.07  | 11.20 |
| 50  | 9.37  | 13.63 |
| 60  | 11.89 | 15.43 |
| 70  | 12.43 | 18.28 |
| 80  | 14.07 | 20.72 |
| 90  | 15.62 | 23.41 |
| 100 | 17.39 | 24.80 |
| 200 | 30.35 | 48.39 |
| 300 | 45.41 | 72.84 |
| 400 | 59.17 | 97.67 |
| 500 | 79.63 | 138.99 |
| 600 | 100.05 | 173.67 |
| 700 | 119.77 | 208.80 |

<img width="571" height="455" alt="image"
src="https://github.com/user-attachments/assets/70da7ea2-b5d2-405e-985c-bfa358455afc"
/>
2026-01-22 03:37:16 +00:00
Leonardo de Moura
fa40491c78
test: benchmark MetaM vs SymM (#12100)
This PR adds a comparison between `MetaM` and `SymM` for a benchmark that was
proposed during the Lean@Google Hackathon.

### Benchmark description

In this benchmark, we define the semantics of a very simple imperative
language using an inductive predicate

```
Exec prog events mem lctx post
```

The predicate holds if, when executing the program `prog` with an
initial list of events `events`, memory `mem`, and local context `lctx`,
the postcondition `post` holds.

We then consider the following program:

```
input b
a := b
a := a + a
a := a - b
...
a := a + a
a := a - b
```

That is, after reading an input value `b`, the program repeatedly
updates the variable `a` by doubling it and then subtracting `b`.

We prove that, for any initial memory `m` and local context `l`, and
starting from the empty list of events, the following postcondition
holds:

```
fun t' m' l' =>
  m' = m ∧                      -- memory did not change
  ∃ v : Word,
    t' = [IOEvent.IN v] ∧       -- exactly one input event
    l'.get "a" = some v         -- `a` contains the input value
```

In other words, executing the program produces exactly one input event,
leaves the memory unchanged, and ensures that the final value of `a` is
equal to the input value.
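
The postcondition relies on a simple invariant: if `a = b` before one doubling-and-subtracting step, then `a = b` afterwards. A minimal check of that step, assuming `Nat` for illustration (the benchmark itself works over `Word`):

```lean
-- One iteration of the loop body: `a := a + a; a := a - b`.
example (a b : Nat) (h : a = b) : (a + a) - b = b := by omega
```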

### Symbolic simulation benchmark (problem size `n`, with `2·n + 2` instructions)

| Problem size (n) | MetaM time (ms) | MetaM kernel (ms) | SymM time (ms) | SymM kernel (ms) | Total speedup |
|------------------|------------------|-------------------|----------------|------------------|---------------|
| 10  | 94.83  | 6.60  | 7.04  | 6.18  | ~13.5× |
| 20  | 218.92 | 13.33 | 14.15 | 13.02 | ~15.5× |
| 30  | 375.10 | 22.95 | 26.51 | 19.81 | ~14.2× |
| 40  | 563.82 | 34.99 | 40.42 | 29.55 | ~14.0× |
| 50  | 815.89 | 53.78 | 60.84 | 42.25 | ~13.4× |
| 60  | 1081.09 | 73.46 | 80.99 | 53.52 | ~13.3× | 
| 70  | 1400.80 | 102.70 | 106.02 | 68.61 | ~13.2× | 
| 80  | 1772.19 | 126.65 | 134.23 | 87.64 | ~13.2× |
| 90  | 2203.41 | 161.68 | 168.26 | 115.52 | ~13.1× | 
| 100 | 2474.09 | 191.23 | 209.13 | 143.86 | ~11.8× |

<img width="580" height="455" alt="image"
src="https://github.com/user-attachments/assets/bc7058fa-e71a-4c2c-be28-860f39166965"
/>

### Symbolic simulation with extra simplification (SymM)

Problem size `n` corresponds to a program with `2·n + 2` instructions.

| n   | Total time (ms) | Kernel time (ms) | Non-kernel time (ms) |
|-----|------------------|------------------|----------------------|
| 10  | 6.33  | 3.97 | 2.36 |
| 20  | 10.30 | 5.59 | 4.71 |
| 30  | 13.72 | 7.38 | 6.34 |
| 40  | 17.85 | 8.84 | 9.01 |
| 50  | 21.90 | 10.63 | 11.27 |
| 60  | 27.00 | 12.56 | 14.44 |
| 70  | 32.02 | 14.04 | 17.98 |
| 80  | 37.25 | 15.76 | 21.49 |
| 90  | 42.55 | 17.95 | 24.60 |
| 100 | 49.30 | 20.03 | 29.27 |
| 200 | 125.56 | 38.21 | 87.36 |
| 300 | 293.58 | 66.79 | 226.79 |
| 400 | 361.87 | 78.96 | 282.91 |
| 500 | 518.51 | 102.51 | 416.00 |
| 600 | 716.63 | 122.81 | 593.82 |
2026-01-22 01:38:56 +00:00
Leonardo de Moura
af438425d5
perf: avoid mkAppM in Sym.simp (#12099)
This PR ensures `Sym.simpGoal` does not use `mkAppM`. It also increases
the default number of maximum steps in `Sym.simp`.
2026-01-22 00:01:43 +00:00
Leonardo de Moura
f84aa23d6d
feat: metavar cleanup in Sym.simp (#12096)
This PR cleans up temporary metavariables generated when applying
rewriting rules in `Sym.simp`.
2026-01-21 21:36:17 +00:00
Leonardo de Moura
34d8eeb3be
chore: fix and rename sym_add_sub_cancel benchmark (#12092) 2026-01-21 17:47:40 +00:00
Leonardo de Moura
08e6f714ca
chore: normalize Sym APIs (#12088)
This PR cleans up the Sym APIs for `apply` and `simp`.
2026-01-21 17:02:22 +00:00
Leonardo de Moura
e9a1c9ef63
feat: offset terms in Sym (#12053)
This PR adds support for offset terms in `SymM`. This is essential for
handling equational theorems for functions that pattern match on natural
numbers in `Sym.simp`. Without this, it cannot handle simple examples
such as

```lean
def pw (n : Nat) : Nat :=
  match n with
  | 0 => 1
  | n+1 => 2 * pw n

example : pw 4 = 16 := by
  sym_simp [pw.eq_1, pw.eq_2]

example : pw (a + 2) = 2 * (2 * pw a) := by
  sym_simp [pw.eq_2]
```
2026-01-20 04:57:52 +00:00
Leonardo de Moura
df8ff255cb
test: benchmark from Lean Hackathon (#12051) 2026-01-20 01:32:41 +00:00
Leonardo de Moura
58e599f2f9
perf: optimize congruence proof construction in Sym.simp (#11974)
This PR optimizes congruence proof construction in `Sym.simp` by avoiding
`inferType` calls on expressions that are less likely to be cached.
Instead of inferring types of expressions like
`@HAdd.hAdd Nat Nat Nat instAdd 5`, we infer the type of the function
prefix `@HAdd.hAdd Nat Nat Nat instAdd` and traverse the forall telescope.

The key insight is that function prefixes are more likely shared across
many call sites (e.g., all `Nat` additions use the same
`@HAdd.hAdd Nat Nat Nat instAdd`), so they benefit from `inferType`
caching.
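
The idea can be sketched for a single application `f a` as below. The function name `inferAppTypeViaPrefix` is hypothetical and this is not the code from the PR; it only illustrates inferring the type of the prefix and instantiating its forall binder.

```lean
import Lean
open Lean Meta

/-- Sketch: compute the type of `f a` from the (likely cached) type of `f`. -/
def inferAppTypeViaPrefix (e : Expr) : MetaM Expr := do
  -- Fall back to plain `inferType` when `e` is not an application.
  let .app f a := e | inferType e
  -- The prefix `f` is shared by many call sites, so this call tends to hit the cache.
  match (← whnf (← inferType f)) with
  | .forallE _ _ body _ => return body.instantiate1 a
  | _ => inferType e
```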

Benchmark results show improvements on workloads with shared function
prefixes:
- `many_rewrites_5000`: 48.8ms → 43.1ms (-12%)
- `term_tree_5000`: 53.4ms → 30.5ms (-43%)
2026-01-11 23:00:19 +00:00
Leonardo de Moura
d7cbdebf0b
chore: cleanup simp benchmark (#11971) 2026-01-11 19:55:39 +00:00
Leonardo de Moura
d57f71c1c0
perf: optimize kernel type-checking for have-telescope simplification in Sym.simp (#11967)
This PR implements a new strategy for simplifying `have`-telescopes in
`Sym.simp` that achieves linear kernel type-checking time instead of
quadratic.

## Problem

When simplifying deep `have`-telescopes, the previous approach using
`have_congr'` produced proofs that type-checked in quadratic time. The
simplifier itself was fast, but the kernel became the bottleneck for
large telescopes.

For example, at n=100:
- **Before**: simp = 2.4ms, kernel = **225ms**
- **After**: simp = 3.5ms, kernel = **10ms**

The quadratic behavior occurred because the kernel creates fresh free
variables for each binder when type-checking, destroying sharing and
producing O(n²) intermediate terms.

## Solution

We transform sequential `have`-telescopes into a parallel
beta-application form:

```
have x₁ := v₁; have x₂ := v₂[x₁]; b[x₁, x₂]
  ↓ (definitionally equal)
(fun x₁ x₂' => b[x₁, x₂' x₁]) v₁ (fun x₁ => v₂[x₁])
```

This parallel form leverages the efficient simplifier for lambdas in
`Sym.simp`. This form enables:
1. Independent simplification of each argument
2. Proof construction using standard congruence lemmas
3. Linear kernel type-checking time

The algorithm has three phases:
1. **`toBetaApp`**: Transform telescope → parallel beta-application
2. **`simpBetaApp`**: Simplify using `congr`/`congrArg`/`congrFun'` and
`simpLambda`
3. **`toHave`**: Convert back to `have` form
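
For a telescope of depth two, the definitional equality between the sequential and the parallel form can be checked directly. The concrete body `x₁ + x₂` and the function `f` are illustrative choices:

```lean
-- Sequential `have` form versus the parallel beta-application form.
example (v₁ : Nat) (f : Nat → Nat) :
    (have x₁ := v₁; have x₂ := f x₁; x₁ + x₂)
      = (fun (x₁ : Nat) (x₂' : Nat → Nat) => x₁ + x₂' x₁) v₁ (fun x₁ => f x₁) :=
  rfl
```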

## Benchmark Results

### Benchmark 1: Chain with all variables used in body

| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 50 | 1.2ms | 32ms | 1.6ms | 4.4ms |
| 100 | 2.4ms | **225ms** | 3.5ms | **10ms** |
| 200 | 4.5ms | — | 8.4ms | 27ms |
| 500 | 11.7ms | — | 33.6ms | 128ms |

### Benchmark 3: Parallel declarations (simplified values)

| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 50 | 0.5ms | 24ms | 0.8ms | 1.8ms |
| 100 | 1.2ms | **169ms** | 1.8ms | **5.3ms** |
| 200 | 2.2ms | — | 3.9ms | 17ms |
| 500 | 5.9ms | — | 12.3ms | 93ms |

### Benchmark 5: Chain with single dependency

| n | Before (simp) | Before (kernel) | After (simp) | After (kernel) |
|---|---------------|-----------------|--------------|----------------|
| 100 | 1.6ms | 6.2ms | 1.8ms | 6.2ms |
| 200 | 2.8ms | 21.6ms | 4.4ms | 16.5ms |
| 500 | 7.3ms | **125ms** | 12.8ms | **72ms** |

Key observations:
- Kernel time is now **linear** in telescope depth (previously
quadratic)
- Simp time increases slightly due to the transformation overhead
- Total time (simp + kernel) is dramatically reduced for large
telescopes
- The improvement is most pronounced when the body depends on many
variables

## Trade-offs

- Proof sizes are larger (more congruence lemma applications)
- Simp time has ~1.5x overhead from the transformation
- For very small telescopes (n < 10), the overhead may not pay off

The optimization targets the critical path: kernel type-checking was the
bottleneck preventing scaling to realistic symbolic simulation
workloads.
2026-01-11 02:20:47 +00:00
Leonardo de Moura
cae739c27c
test: implies vs Arrow Sym.simp benchmark (#11966) 2026-01-10 18:51:54 +00:00
Leonardo de Moura
d92cdae8e9
feat: simpForall and simpArrow in Sym.simp (#11950)
This PR implements `simpForall` and `simpArrow` in `Sym.simp`.
2026-01-09 06:20:04 +00:00
Leonardo de Moura
0e4794a1a9
test: benchmarks for lambda-telescopes (#11929) 2026-01-08 00:20:03 +00:00
Leonardo de Moura
8484dbad5d
test: benchmarks for have-telescopes (#11927) 2026-01-07 23:24:46 +00:00
Leonardo de Moura
ff87bcb8e5
feat: add option for simplifying have decls in two passes (#11923)
This PR adds a new option to the function `simpHaveTelescope` in which
the `have` telescope is simplified in two passes:

* In the first pass, only the values and the body are simplified.
* In the second pass, unused declarations are eliminated.

This new mode eliminates **superlinear** behavior in the benchmark
`simp_3.lean`. Note that the kernel type checker still **exhibits**
quadratic behavior in this example, because it **does not have support**
for expanding a `have`/`let` telescope in a single step.
2026-01-07 01:58:36 +00:00
Leonardo de Moura
8154453bb5
feat: simplify have blocks in Sym.simp (#11920)
This PR implements support for simplifying `have` telescopes in
`Sym.simp`.
2026-01-07 00:10:47 +00:00
Leonardo de Moura
175661b6c3
refactor: reorganize SymM and GrindM monad hierarchy (#11909)
This PR reorganizes the monad hierarchy for symbolic computation in
Lean.

## Motivation

We want a clean layering where:
1. A foundational monad (`SymM`) provides maximally shared terms and
structural/syntactic `isDefEq`
2. `GrindM` builds on this foundation, adding E-graphs, congruence
closure, and decision procedures
3. Symbolic execution / VCGen uses `GrindM` directly without introducing
a third monad

## Changes

The core symbolic computation layer still lives in `Lean.Meta.Sym`. This
monad (`SymM`) provides:
- Maximally shared terms with pointer-based equality
- Structural/syntactic `isDefEq` and matching (no reduction, predictable
cost)
- Monotonic local contexts (no `revert` or `clear`), enabling O(1)
metavariable validation
- Efficient `intro`, `apply`, and `simp` implementations

The name "Sym" reflects that this is infrastructure for symbolic
computation: symbolic simulation, verification condition generation, and
decision procedures.

### Updated hierarchy

```
Lean.Meta.Sym   -- SymM: shared terms, syntactic isDefEq, intro, apply, simp
Lean.Meta.Grind -- GrindM: E-graphs, congruence closure (extends SymM)
```

Symbolic execution is a usage pattern of `GrindM` operating on
`Grind.Goal`, not a separate monad. This keeps the API surface minimal:
users learn two monads, and VCGen is "how you use `GrindM`" (for users
that want to use `grind`) rather than a third abstraction to understand.
2026-01-06 01:12:07 +00:00
Leonardo de Moura
82f60a7ff3
feat: pre and post may return "done" in Sym.simp (#11900)
This PR adds a `done` flag to the result returned by `Simproc`s in
`Sym.simp`.

The `done` flag controls whether simplification should continue after
the result:
- `done = false` (default): Continue with subsequent simplification
steps
- `done = true`: Stop processing, return this result as final

## Use cases for `done = true`

### In `pre` simprocs
Skip simplification of certain subterms entirely:
```
def skipLambdas : Simproc := fun e =>
  if e.isLambda then return .rfl (done := true)
  else return .rfl
```

### In `post` simprocs
Perform single-pass normalization without recursive simplification:
```
def singlePassNormalize : Simproc := fun e =>
  if let some (e', h) ← tryNormalize e then
    return .step e' h (done := true)
  else return .rfl
```
With `done = true`, the result `e'` won't be recursively simplified.
2026-01-05 02:10:06 +00:00
Leonardo de Moura
f1c903ca65
feat: simplify lambdas in Sym.simp (#11898)
This PR adds support for simplifying lambda expressions in `Sym.simp`.
It is much more efficient than standard simp for very large lambda
expressions with many binders. The key idea is to generate a custom
function extensionality theorem for the type of the lambda being
simplified.
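
As an illustration of what such a theorem looks like, here is an extensionality principle specialized to the type `Nat → Nat → Nat`. The name `funext_nat₂` is hypothetical, and the theorems generated by `Sym.simp` may be stated differently:

```lean
-- Extensionality for a two-binder lambda type, built from `funext`.
theorem funext_nat₂ {f g : Nat → Nat → Nat}
    (h : ∀ x y, f x y = g x y) : f = g :=
  funext fun x => funext fun y => h x y
```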

This technique is compatible with the standard `simp` tactic, and will
be ported in a separate PR.

<img width="581" height="455" alt="image"
src="https://github.com/user-attachments/assets/5911dc6c-03f0-48ed-843b-b8cb4f67ee61"
/>

### `lambda` benchmark summary

| Lambda size | MetaM (ms) | SymM (ms) | Speedup |
|-------------|------------|-----------|---------|
| 50          | 22.7       | 0.74      | ~31×    |
| 100         | 120.5      | 1.75      | ~69×    |
| 150         | 359.6      | 2.90      | ~124×   |
| 200         | 809.5      | 4.51      | ~180×   |
2026-01-05 01:00:30 +00:00
Leonardo de Moura
609d99e860
chore: include free variables (#11894)
This PR includes free variables in a `simp` benchmark to stress the
default `simp` matching procedure.
2026-01-04 18:51:18 +00:00
Leonardo de Moura
78c9a01bb2
feat: check Sym.simp thresholds (#11890)
This PR ensures that `Sym.simp` checks thresholds for maximum recursion
depth and maximum number of steps. It also invokes `checkSystem`.
Additionally, this PR simplifies the main loop. Assigned metavariables
and `zetaDelta` reduction are now handled by installing `pre`/`post`
methods.
2026-01-04 04:27:46 +00:00
Leonardo de Moura
bc72487aed
refactor: Sym.simp (#11888)
This PR refactors `Sym.simp` to make it more general and customizable.
It also moves the code to its own subdirectory `Meta/Sym/Simp`.
2026-01-04 02:17:23 +00:00
Leonardo de Moura
b40dabdecd
feat: add discrimination tree retrieval for Sym (#11886)
This PR adds `getMatch` and `getMatchWithExtra` for retrieving patterns
from discrimination trees in the symbolic simulation framework.
The PR also uses `DiscrTree` to implement indexing in `Sym.simp`.
2026-01-03 20:28:07 +00:00
Leonardo de Moura
4e8b5cfc46
test: benchmark Sym and Meta simplifiers (#11870)
This PR adds simple benchmarks for comparing the `MetaM` and `SymM`
simplifiers. The `SymM` simplifier is still a work in progress.

### Big picture across benchmarks

| Benchmark | MetaM scaling | SymM scaling | Speedup (approx.) |
|-------------------------|-------------------|--------------|-------------------|
| `trans_chain` | Linear | Linear | ~8–9× |
| `congr_arg_explosion` | Super-linear | Linear | ~100× |
| `many_rewrites` | Super-linear | Linear | ~10–16× |

<img width="598" height="455" alt="image"
src="https://github.com/user-attachments/assets/8bd9021b-b9cf-4fc0-aab4-3118d87f7c22"
/>

<img width="644" height="455" alt="image"
src="https://github.com/user-attachments/assets/0234dc11-0be7-441a-83b6-c309d20a2663"
/>

<img width="611" height="455" alt="image"
src="https://github.com/user-attachments/assets/df79d057-25ed-49d9-a8f3-5285e5fc7013"
/>
2026-01-02 03:59:54 +00:00