This PR introduces a fast pattern matching and unification module for
the symbolic simulation framework (`Sym`). The design prioritizes
performance by using a two-phase approach:
**Phase 1 (Syntactic Matching)**
- Patterns use de Bruijn indices for expression variables and renamed
level params (`_uvar.0`, `_uvar.1`, ...) for universe variables
- Matching is purely structural after reducible definitions are unfolded
during preprocessing
- Universe levels treat `max` and `imax` as uninterpreted functions (no
AC reasoning)
- Binders and term metavariables are deferred to Phase 2
**Phase 2 (Pending Constraints)** [WIP]
- Handles binders (Miller patterns) and metavariable unification
- Converts remaining de Bruijn variables to metavariables
- Falls back to `isDefEq` when necessary
**Key design decisions:**
- Preprocessing unfolds reducible definitions and performs beta/zeta
reduction
- Kernel projections are expected to be folded as projection
applications before matching
- Assignment conflicts are deferred to pending rather than invoking
`isDefEq` inline
- `instantiateRevS` ensures maximal sharing of result expressions
**TODO:**
- Skip instance arguments during matching, synthesize later
- Skip proof arguments (proof irrelevance)
- Implement `processPending` for Phase 2 constraints
This PR refactors the `Goal` type used in `grind`. The new
representation allows multiple goals with different metavariables to
share the same `GoalState`. This is useful for automation such as
symbolic simulator, where applying theorems create multiple goals that
inherit the same E-graph, congruence closure and solvers state, and
other accumulated facts.
This PR implements `intro` (and its variants) for `SymM`. These versions
do not use reduction or infer types, and ensure expressions are
maximally shared.
This PR adds the function `Sym.instantiateS` and its variants, which are
similar to `Expr.instantiate` but assumes the input is maximally shared
and ensures the output is also maximally shared.
This PR adds the function `Sym.replaceS`, which is similar to
`replace_fn` available in the kernel but assumes the input is maximally
shared and ensures the output is also maximally shared. The PR also
generalizes the `AlphaShareBuilder` API.
This PR simplifies `AlphaShareCommon.State` by separating the persistent
and transient parts of the state.
The `map` field caches visited sub-expressions during a single
`shareCommonAlpha` call to handle DAGs efficiently, the input expression
may contain shared sub-expressions that are not yet maximally shared.
However, this cache does not need to persist between different
`shareCommonAlpha` calls.
**Changes:**
- Moved `map` from the persistent `AlphaShareCommon.State` to a private
`State` used only within individual `shareCommonAlpha` calls.
- Replaced `PHashMap ExprPtr Expr` with (the more efficient)
`Std.HashMap ExprPtr Expr` for `map`, since it is now local to each call
and does not need persistence.
- The public `AlphaShareCommon.State` now only contains the `set` of
alpha-equivalent expressions that should persist
This PR adds functions for creating maximally shared terms from
maximally shared terms. It is more efficient than creating an expression
and then invoking `shareCommon`. We are going to use these functions for
implementing the symbolic simulation primitives.
This PR introduces `SymM`, a new monad for implementing symbolic
simulators (e.g., verification condition generators) in Lean. The monad
addresses performance issues found in symbolic simulators built on top
of user-facing tactics like `apply` and `intros`.
**Key features:**
- Goals are represented by `Grind.Goal` objects, enabling incremental
hypothesis processing
- No `revert` or `clear` operations, allowing O(1) local context checks
instead of O(n log n)
- Carries `GrindM` state across goals to avoid reprocessing shared
hypotheses
- Provides `mkGoal` for creating new goals within the monad
This is the foundational infrastructure for `SymM`. Future PRs will add
operations like `intro`, `apply`, and the optimized definitional
equality test.
This PR adds support for incrementally processing local declarations in
`grind`. Instead of processing all hypotheses at once during goal
initialization, `grind` now tracks which local declarations have been
processed via `Goal.nextDeclIdx` and provides APIs to process new
hypotheses incrementally.
This feature will be used by the new `SymM` monad for efficient symbolic
simulation.
This PR disables closed term extraction in the reflection terms used by
`bv_decide`. These terms do
not profit at all from closed term extraction but can in practice cause
thousands of new closed term
declarations which in turn slows down the compiler.
This PR improves the performance of and flattening in `bv_decide`.
The two main insights of this PR are:
1. When embedded constraint substitution is disabled it makes no sense
to have and flattening on in
the first place, given that we do not profit from it in any way.
2. The new fvars produced by and flattening can also be inserted into
the rewriting caches of the
preprocessing pipeline if the fvar they were derived from is already in
the cache. This
drastically decreases the amount of work we have to do in the second
rewriting pass after running
and flattening.
This PR adds the attributes `[grind norm]` and `[grind unfold]` for
controlling the `grind` normalizer/preprocessor.
The `norm` modifier instructs `grind` to use a theorem as a
normalization rule. That is, the theorem is applied during the
preprocessing step. This feature is meant for advanced users who
understand how the preprocessor and `grind`'s search procedure interact
with each other.
New users can still benefit from this feature by restricting its use to
theorems that completely eliminate a symbol from the goal. Example:
```lean
theorem max_def : max n m = if n ≤ m then m else n
```
For a negative example, consider:
```lean
opaque f : Int → Int → Int → Int
theorem fax1 : f x 0 1 = 1 := sorry
theorem fax2 : f 1 x 1 = 1 := sorry
attribute [grind norm] fax1
attribute [grind =] fax2
example (h : c = 1) : f c 0 c = 1 := by
grind -- fails
```
In this example, `fax1` is a normalization rule, but it is not
applicable to the input goal since `f c 0 c` is not an instance of `f x
0 1`. However, `f c 0 c` matches the pattern `f 1 x 1` modulo the
equality `c = 1`. Thus, `grind` instantiates `fax2` with `x := 0`,
producing the equality `f 1 0 1 = 1`, which the normalizer simplifies to
`True`. As a result, nothing useful is learned. In the future, we plan
to include linters to automatically detect issues like these. Example:
```lean
opaque f : Nat → Nat
opaque g : Nat → Nat
@[grind norm] axiom fax : f x = x + 2
@[grind norm ←] axiom fg : f x = g x
example : f x ≥ 2 := by grind
example : f x ≥ g x := by grind
example : f x + g x ≥ 4 := by grind
```
The `unfold` modifier instructs `grind` to unfold the given definition
during the preprocessing step. Example:
```lean
@[grind unfold] def h (x : Nat) := 2 * x
example : 6 ∣ 3*h x := by grind
```
This PR implements support for user-defined attributes at
`grind_pattern`. Suppose we have declared the `grind` attribute
```lean
register_grind_attr my_grind
```
Then, we can now write
```lean
opaque f : Nat → Nat
opaque g : Nat → Nat
axiom fg : g (f x) = x
grind_pattern [my_grind] fg => g (f x)
```
This PR uses the new support for user-defined `grind` attributes to
implement the default `[grind]` attribute.
A manual update-stage0 is required because it affects the .olean files.
This PR implements user-defined `grind` attributes. They are useful for
users that want to implement tactics using the `grind` infrastructure
(e.g., `progress*` in Aeneas). New `grind` attributes are declared using
the command
```lean
register_grind_attr my_grind
```
The command is similar to `register_simp_attr`. After the new attribute
is declared. Recall that similar to `register_simp_attr`, the new
attribute cannot be used in the same file it is declared.
```lean
opaque f : Nat → Nat
opaque g : Nat → Nat
@[my_grind] theorem fax : f (f x) = f x := sorry
example theorem fax2 : f (f (f x)) = f x := by
fail_if_success grind
grind [my_grind]
```
TODO: remove leftovers after update stage0
This PR allows `grind` to use `List.eq_nil_of_length_eq_zero` (and
`Array.eq_empty_of_size_eq_zero`), but only when it has already proved
the length is zero.
This PR fixes an issue where `grind` fails when trying to unfold a
definition by pattern matching imported by `import all` (or from a
non-`module`).
Fixes#11715
---------
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
This PR turns even more commonly used bv_decide theorems that require
unification into fast simprocs
using syntactic equality. This pushes the overall performance across
sage/app7 to <= 1min10s for
every problem.
This PR upstreams dependency-management commands from Mathlib:
- `#import_path Foo` prints the transitive import chain that brings
`Foo` into scope
- `assert_not_exists Foo` errors if declaration `Foo` exists (for
dependency management)
- `assert_not_imported Module` warns if `Module` is transitively
imported
- `#check_assertions` verifies all pending assertions are eventually
satisfied
These commands help maintain the independence of different parts of a
library by catching unintended transitive dependencies early.
### Example usage
```lean
-- Find out how Nat got into scope
#import_path Nat
-- Declaration Nat is imported via
-- Init.Prelude,
-- which is imported by Init.Coe,
-- which is imported by Init.Notation,
-- ...
-- which is imported by this file.
-- Assert that a declaration should not be in scope yet
assert_not_exists SomeAdvancedType
-- Assert that a module should not be imported
assert_not_imported Some.Heavy.Module
-- Verify all assertions are eventually satisfied
#check_assertions
```
Addresses
https://lean-fro.zulipchat.com/#narrow/channel/398861-general/topic/path.20of.20an.20import🤖 Prepared with Claude Code
---------
Co-authored-by: Claude <noreply@anthropic.com>
This PR fixes an issue where `exact?` would not suggest private
declarations defined in the current module.
## Problem
When using `exact?` in a file with private declarations, those private
declarations were not being suggested even though they are valid and
accessible:
```lean
module
axiom P : Prop
private axiom p : P
example : P := by exact? -- error: could not find lemma
```
The problem was that `blacklistInsertion` in `LazyDiscrTree` was
filtering out all declarations whose names matched `isInternalDetail`,
which includes private names due to their `_private.Module.0.name`
structure.
## Solution
The fix adds a helper function `isPrivateNameOf` that checks if a
private declaration belongs to a specific module. The
`blacklistInsertion` function now allows private declarations belonging
to the current module (`env.header.mainModule`) to pass through the
filter.
Private declarations from imported modules are still filtered out, as
they may reference internal declarations that aren't accessible (which
would cause processing errors).
Zulip discussion:
https://leanprover.zulipchat.com/#narrow/channel/270676-lean4/topic/.60exact.3F.60.20and.20private.20declarations/near/564586152🤖 Prepared with Claude Code
---------
Co-authored-by: Claude <noreply@anthropic.com>
This PR internalizes all arguments of Quot.lift during LCNF conversion,
preventing panics in certain
non trivial programs that use quotients.
Fixes#11719.
This PR improves the performance of the functions for generating
congruence lemmas, used by `simp`
and a few other components.
It is a followup to (though not dependent on) #11717 and improves the
performance of `bv_decide` on the benchmark
in question further down to 20 seconds (from 1min 23s in #11717 and 8min
originally). We are thus at approximately a 24x speedup from the
original run.
This PR improves the performance of `bv_decide`'s rewriter on large
problems.
The baseline for this PR is `QF_BV/sage/app7/bench_1222.smt2` on
`chonk3` at 8 minutes. After this
PR it takes about 1min and 23 seconds. This improvement is achieved by
turning frequently used simp
rules into simprocs in order to avoid spending time performing
unification to see if they are
applicable.
This PR renames the namespace `Std.Range` to `Std.Legacy.Range`. Instead
of using `Std.Range` and `[a:b]` notation, the new range type `Std.Rco`
and its corresponding `a...b` notation should be used. There are also
other ranges with open/closed/infinite boundary shapes in
`Std.Data.Range.Polymorphic` and the new range notation also works for
`Int`, `Int8`, `UInt8`, `Fin` etc.
This PR adds more MPL spec lemmas for all combinations of `for` loops,
`fold(M)` and the `filter(M)/filterMap(M)/map(M)` iterator combinators.
These kinds of loops over these combinators (e.g. `it.mapM`) are first
transformed into loops over their base iterators (`it`), and if the base
iterator is of type `Iter _` or `IterM Id _`, then another spec lemma
exists for proving Hoare triples about it using an invariant and the
underlying list (`it.toList`). The PR also fixes a bug that MPL always
assigns the default priority to spec lemmas if `Std.Tactic.Do.Syntax` is
not imported and a bug that low-priority lemmas are preferred about
high-priority ones.
For context, the MPL bug was related to the fact that the `Attr.spec`
syntax is not built-in. Therefore, Lean falls back to the `Attr.simple`
syntax, which *basically* also works, but which stores the priority at a
different position. The routine to extract the priority does not
consider this and so it falls back to the default priority given an
`Attr.simple` syntax object.
This PR improves the performance of autocompletion and fuzzy matching by
introducing an ASCII fast path into one of their core loops and making
Char.toLower/toUpper more efficient.
Co-authored-by: Rob23oba <152706811+Rob23oba@users.noreply.github.com>
This PR gives a focused error message when a user tries to name an
example, and tweaks error messages for attempts to define multiple
opaque names at once.
## Example errors
```
example x : 1 == 1 := by grind
```
Current message:
```
Failed to infer type of binder `x`
Note: Because this declaration's type has been explicitly provided, all parameter types and holes (e.g., `_`) in its header are resolved before its body is processed; information from the declaration body cannot be used to infer what these values should be
```
New message:
```
Failed to infer type of binder `x`
Note: Examples don't have names. The identifier `x` is being interpreted as a parameter `(x : _)`.
```
## Plural-aware identifier lists
Both the example errors and opaque errors understand pluralization and
use oxford commas.
```
opaque a b c : Nat
```
Current message:
```
Failed to infer type of binder `c`
Note: Multiple constants cannot be declared in a single declaration. The identifier(s) `b`, `c` are being interpreted as parameters `(b : _)`, `(c : _)`.
```
New message:
```
Failed to infer type of binder `c`
Note: Multiple constants cannot be declared in a single declaration. The identifiers `b` and `c` are being interpreted as parameters `(b : _)` and `(c : _)`.```
This PR avoids invoking TC synthesis and other inference mechanisms in
the simprocs of bv_decide. This can give significant speedups on
problems that pressure these simprocs.
This PR enables the specializer to also recursively specialize in some
non trivial higher order situations.
The main motivation for this change is the upcoming changes to do
notation by sgraf. In there he uses combinators such as
```lean
@[specialize, expose]
def List.newForIn {α β γ} (l : List α) (b : β) (kcons : α → (β → γ) → β → γ) (knil : β → γ) : γ :=
match l with
| [] => knil b
| a :: l => kcons a (l.newForIn · kcons knil) b
```
in programs such as
```lean
def testing :=
let x := 42;
List.newForIn (β := Nat) (γ := Id Nat)
[1,2,3]
x
(fun i kcontinue s =>
let x := s;
List.newForIn
[i:10].toList x
(fun j kcontinue s =>
let x := s;
let x := x + i + j;
kcontinue x)
kcontinue)
pure
```
inspecting this IR right before we get to the specializer in the current
compiler we get:
```
[Compiler.eagerLambdaLifting] size: 22
def testing : Nat :=
fun _f.1 _y.2 : Nat :=
return _y.2;
let x := 42;
let _x.3 := 1;
fun _f.4 i kcontinue s : Nat :=
fun _f.5 j kcontinue s : Nat :=
let _x.6 := Nat.add s i;
let x := Nat.add _x.6 j;
let _x.7 := kcontinue x;
return _x.7;
let _x.8 := 10;
let _x.9 := Nat.sub _x.8 i;
let _x.10 := Nat.add _x.9 _x.3;
let _x.11 := 1;
let _x.12 := Nat.sub _x.10 _x.11;
let _x.13 := Nat.mul _x.3 _x.12;
let _x.14 := Nat.add i _x.13;
let _x.15 := @List.nil _;
let _x.16 := List.range'TR.go _x.3 _x.12 _x.14 _x.15;
let _x.17 := @List.newForIn _ _ _ _x.16 s _f.5 kcontinue;
return _x.17;
let _x.18 := 2;
let _x.19 := 3;
let _x.20 := @List.nil _;
let _x.21 := @List.cons _ _x.19 _x.20;
let _x.22 := @List.cons _ _x.18 _x.21;
let _x.23 := @List.cons _ _x.3 _x.22;
let _x.24 := @List.newForIn _ _ _ _x.23 x _f.4 _f.1;
return _x.24
```
Here the `kcontinue` higher order functions pose a special challenge
because they delay the discovery of new specialization opportunities.
Inspecting the IR after the current specializer (and a cleanup simp
step) we get functions that look as follows:
```
[simp] size: 7
def List.newForIn._at_.testing.spec_0 i kcontinue l b : Nat :=
cases l : Nat
| List.nil =>
let _x.1 := kcontinue b;
return _x.1
| List.cons head.2 tail.3 =>
let _x.4 := Nat.add b i;
let x := Nat.add _x.4 head.2;
let _x.5 := List.newForIn._at_.testing.spec_0 i kcontinue tail.3 x;
return _x.5
[simp] size: 14
def List.newForIn._at_.List.newForIn._at_.testing.spec_1.spec_1 _x.1 l b : Nat :=
cases l : Nat
| List.nil =>
return b
| List.cons head.2 tail.3 =>
fun _f.4 x.5 : Nat :=
let _x.6 := List.newForIn._at_.List.newForIn._at_.testing.spec_1.spec_1 _x.1 tail.3 x.5;
return _x.6;
let _x.7 := 10;
let _x.8 := Nat.sub _x.7 head.2;
let _x.9 := Nat.add _x.8 _x.1;
let _x.10 := 1;
let _x.11 := Nat.sub _x.9 _x.10;
let _x.12 := Nat.mul _x.1 _x.11;
let _x.13 := Nat.add head.2 _x.12;
let _x.14 := @List.nil _;
let _x.15 := List.range'TR.go _x.1 _x.11 _x.13 _x.14;
let _x.16 := List.newForIn._at_.testing.spec_0 head.2 _f.4 _x.15 b;
return _x.16
```
Observe that the specializer decided to abstract over `kcontinue`
instead of specializing further recursively. Thus this tight loop is now
going through an indirect call.
This PR now changes the specializer somewhat fundamentally to handle
situations like this. The most notable change is going to a fixpoint
loop of:
1. Specialize all current declarations in the worklist
2. If a declaration
- succeeded in specializing run the simplifier on it and put it back
onto the worklist
- if it didn't don't put it back onto the worklist anymore
3. Put all newly generated specialisations on the worklist
4. Recompute fixed parameters for the current SCC
5. Repeat until the worklist is empty
Furthermore, declarations that were already specialized:
- only consider `fixedHO` parameters for specialization, in order to
avoid termination issues with repeated specialization and abstraction of
type class parameters under binders
- recursively specialized declarations only allow specialization if at
least one of their fixedHO arguments is not a parameter itself. The
reason for allowing this in first generation specialization is that we
refrain from specializing inside the body of a declaration marked as
`@[specialize]`. Thus we need to specialize them even if their arguments
don't actually contain anything of interest in order to ensure that type
classes etc. are correctly cleaned up within their bodies.
There is one last trade-off to consider. When specializing code
generated by the new do elaborator we sometimes generate intermediate
specializations that are not actually part of any call graph after we
are done specializing. We could in principle detect these functions and
delete them but having them in cache is potentially helpful for further
specializations later. Once the new do elaborator lands we plan to test
this trade-off.
Closes#10924
This PR refactors match compilation, to handle “side-effect free”
patterns (`.var`, `.inaccessible`, `.as`) eagerly and for each
alternative separately. The idea is that there should be less interplay
between different alternatives, and prepares the ground for #11105.
This may cause some corner case match statements to compiler or fail
compile that behaved differently before. For example, it can now use a
sparse case where previously was using a full case, and pattern
completeness may not be clear to lean now. On the other hand, using a
sparse case can mean that match statements mixing matching in indicies
with matching on the indexed datatype can work.
This PR fixes `grind` to support dot notation on declarations in the
lemma list.
When using `grind only [foo.le]` where `foo.le` is dot notation applying
`LT.lt.le` to a theorem `foo`, grind previously failed with "Unknown
constant `foo.le`" because it tried to look up `foo.le` as a constant
name rather than elaborating it as a term.
The fix adds a fallback in `processParam`: when constant lookup fails,
it now falls back to `processTermParam` which elaborates the identifier
as a term. This allows dot notation expressions like `log_two_lt_d9.le`
to work correctly.
Closes#11690🤖 Prepared with Claude Code
---------
Co-authored-by: Claude <noreply@anthropic.com>
This PR improves `match` generalization such that it abstracts
metavariables in types of local variables and in the result type of the
match over the match discriminants. Previously, a metavariable in the
result type would silently default to the behavior of `generalizing :=
false`, and a metavariable in the type of a free variable would lead to
an error (#8099). Example of a `match` that elaborates now but
previously wouldn't:
```lean
example (a : Nat) (ha : a = 37) :=
(match a with | 42 => by contradiction | n => n) = 37
```
This is because the result type of the `match` is a metavariable that
was not abstracted over `a` and hence generalization failed; the result
is that `contradiction` cannot pick up the proof `ha : 42 = 37`.
The old behavior can be recovered by passing `(generalizing := false)`
to the `match`.
Furthermore, programs such as the following can now be elaborated:
```lean
example (n : Nat) : Id (Fin (n + 1)) :=
have jp : ?m := ?rhs
match n with
| 0 => ?jmp1
| n + 1 => ?jmp2
where finally
case m => exact Fin (n + 1) → Id (Fin (n + 1))
case jmp1 => exact jp ⟨0, by decide⟩
case jmp2 => exact jp ⟨n, by omega⟩
case rhs => exact pure
```
This is useful for the `do` elaborator.
Fixes#8099.
This PR makes `simpH`, used in the match equation generator, produce a
proof term. This is in preparation for a bigger refactoring in #11512.
This removes some cases, these are no longer necessary since #11196.
This PR changes the "declaration uses 'sorry'" warning to use backticks
instead of single quotes, consistent with Lean's conventions for
formatting code identifiers in diagnostic messages.
Fix a typo in the error message when an unknown role is used in a
docstring.
- Changes "Unkown role" to "Unknown role" in
`src/Lean/Elab/DocString.lean`
This PR fixes the `grind` support for `Nat.ctorIdx`. Nat constructors
appear in `grind` as offsets or literals, and not as a node marked
`.constr`, so handle that case as well.