This PR replaces `MatcherInfo.numAltParams` with a more detailed data
structure that allows us, in particular, to distinguish between an
alternative for a constructor with a `Unit` field and the alternative
for a nullary constructor, where an artificial `Unit` argument is
introduced.
This PR introduces a function `String.split` which is based on
`String.Slice.split` and therefore supports all pattern types and
returns a `Std.Iter String.Slice`.
This supersedes the functions `String.splitOn` and `String.splitToList`,
and we remove all all uses of these functions from core. They will be
deprecated in a future PR.
Migrating from `String.splitOn` and `String.splitToList` is easy: we
introduce functions `Iter.toStringList` and `Iter.toStringArray` that
can be used to conveniently go from `Std.Iter String.Slice` to `List
String` and `Array String`, so for example `s.splitOn "foo"` can be
replaced by `s.split "foo" |>.toStringList`.
This PR adds a `Unit` assumption to alternatives of the splitter that
would otherwise not have arguments. This fixes#11211.
In practice these argument-less alternatives did not cause wrong
behavior, as the motive when used with `split` is always a function
type. But it is better to be safe here (maybe someone uses splitters in
other ways), it may increase the effectiveness of #10184 and simplifies
#11220.
The perf impact is insignificant in the grand scheme of things on
stdlib, but the change is effective:
```
~/lean4 $ build/release/stage1/bin/lean tests/lean/run/matchSplitStats.lean
969 splitters found
455 splitters are const defs
~/lean4 $ build/release/stage2/bin/lean tests/lean/run/matchSplitStats.lean
969 splitters found
829 splitters are const defs
```
This PR implements the option `revert`, which is set to `false` by
default. To recover the old `grind` behavior, you should use `grind
+revert`. Previously, `grind` used the `RevSimpIntro` idiom, i.e., it
would revert all hypotheses and then re-introduce them while simplifying
and applying eager `cases`. This idiom created several problems:
* Users reported that `grind` would include unnecessary parameters. See
[here](https://leanprover.zulipchat.com/#narrow/channel/270676-lean4/topic/Grind.20aggressively.20includes.20local.20hypotheses.2E/near/554887715).
* Unnecessary section variables were also being introduced. See the new
test contributed by Sebastian Graf.
* Finally, it prevented us from supporting arbitrary parameters as we do
in `simp`. In `simp`, I implemented a mechanism that simulates local
universe-polymorphic theorems, but this approach could not be used in
`grind` because there is no mechanism for reverting (and re-introducing)
local universe-polymorphic theorems. Adding such a mechanism would
require substantial work: I would need to modify the local context
object. I considered maintaining a substitution from the original
variables to the new ones, but this is also tricky, because the mapping
would have to be stored in the `grind` goal objects, and it is not just
a simple mapping. After reverting everything, I would need to keep a
sequence of original variables that must be added to the mapping as we
re-introduce them, but eager case splits complicate this quite a bit.
The whole approach felt overly messy.
The new behavior `grind -revert` addresses all these issues. None of the
`grind` proofs in our test suite broke after we fixed the bugs exposed
by the new feature. That said, the traces and counterexamples produced
by `grind` are different. The new proof terms are also different.
This PR introduces a clarifying note to "undefined identifier" error
messages when the undefined identifier is in a syntactic position where
autobinding might generally apply, but where and autobinding is
disabled. A corresponding note is made in the `lean.unknownIdentifier`
error explanation.
The core intended audience for this error message change is "newcomer
who would otherwise be baffled why the thing that works in this Mathlib
project gets 'unknown identifier' errors in this non-Mathlib project."
## Modified behavior
### Example 1
```lean4
set_option autoImplicit true in
set_option relaxedAutoImplicit false in
def thisBreaks (x : α₂) (y : size₂) := ()
```
Before:
```
Unknown identifier `size₂`
```
After:
```
Unknown identifier `size₂`
Note: It is not possible to treat `size₂` as an implicitly bound variable here because it has multiple characters while the `relaxedAutoImplicit` option is set to `false`.
```
### Example 2
```lean4
set_option autoImplicit false in
def thisAlsoBreaks (x : α₃) (y : size₃) := ()
```
Before:
```
Unknown identifier `α₃`
Unknown identifier `size₃`
```
After:
```
Unknown identifier `α₃`
Note: It is not possible to treat `α₃` as an implicitly bound variable here because the `autoImplicit` option is set to `false`.
Unknown identifier `size₃`
Note: It is not possible to treat `size₃` as an implicitly bound variable here because the `autoImplicit` option is set to `false`.
```
## How this works
The elaboration process knows whether it is considering syntax where we
be able to auto-bind implicits thanks to information in the
`Lean.Elab.Term.Context`.
Before this PR, this contains:
* `autoBoundImplicit`, a boolean that is true when we are considering
syntax that might be able to auto-bind implicit AND when the
`autoImplicit` flag is set to true
* `autoBoundImplicits`, an array of `Expr` variables that we've
autobound
After this PR, this contains:
* `autoBoundImplicitCtx`, an option which is `some` **whenever** we are
considering syntax that might be able to auto-bind implicit, and carries
the array of exprs as well as a copy of the `autoImplicit` flag's value.
(The latter lets us re-implement the `autoBoundImplicit` flag for
backward compatibility.)
Therefore, rather than having access to "elaboration is in an
autobinding context && flag is enabled", it's possible to recover both
of those individual values, and give different information to the user
in cases where we didn't attempt autobinding but would have if different
options had been set.
## Rationale
The revised error message avoids offering much guidance — it doesn't
actively suggest setting the option to a different value or suggest
adding an implicit binding. Care needs to be taken here to make sure
advice is not misleading; as the accepted RFC in #6462 points out, a
substantial portion of autobinding failures are just going to be
misspellings.
I considered and then rejected a code action here to that would add a
local `set_option autoImplicit true`. This seems undesirable or
counterproductive — if a project like Mathlib has proactively disabled
`autoImplicit`, its odd to be pushing local exceptions.
A hint prompting the user to add an implicit binding would be more
proper, but only in certain circumstances — we want to be conservative
in suggesting specific code actions! In a situation like this one, we'd
want to _avoid_ giving the suggestion of adding a `{HasArr}` binding,
which I think either requires tricky heuristics or means we'd want the
elaboration to play through the consequences of auto-binding and make
sure it doesn't cause any follow-on errors before suggesting adding an
implicit binding.
```
set_option autoImplicit true
set_option relaxedAutoImplicit false
instance has_arr : HasArr Preorder := { Arr := Function }
```
Additionally, it seems like it would make the most sense to offer to
auto-bind _all_ the relevant unknown identifiers at once. To avoid being
misleading, this too would seem to require playing through the
consequences of autobinding before being able to safely suggest the
change. This is enough additional complexity that I'm leaving it for
future work.
---------
Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
This PR prevents symbol clashes between (non-`@[export]`) definitions
from different Lean packages.
Previously, if two modules define a function with the same name and were
transitively imported (even privately) by some downstream module,
linking would fail due to a symbol clash. Similarly, if a user defined a
symbol with the same name as one in the `Lean` library, Lean would use
the core symbol even if one did not import `Lean`.
This is solved by changing Lean's name mangling algorithm to include an
optional package identifier. This identifier is provided by Lake via
`--setup` when building a module. This information is weaved through the
elaborator, interpreter, and compiler via a persistent environment
extension that associates modules with their package identifier.
With a package identifier, standard symbols have the form
`lp_<pkg-id>_<mangled-def>`. Without one, the old scheme is used (i.e.,
`l_<mangled-def>`). Module initializers are also prefixed with package
identifier (if any). For example, the initializer for a module `Foo` in
a package `test` is now `initialize_test_Foo` (instead of
`initialize_Foo`). Lake's default for native library names has also been
adjusted accordingly, so that libraries can still, by default, be used
as plugins. Thus, the default library name of the `lean_lib Foo` in
`package test` will now be `libtest_Foo`.
When using Lake to build the Lean core (i.e., `bootstrap = true`), no
package identifier will be used. Thus, definitions in user packages can
never have symbol clashes with core.
Closes#222.
This PR redefines `String.take` and variants to operate on
`String.Slice`. While previously functions returning a substring of the
input sometimes returned `String` and sometimes returned
`Substring.Raw`, they now uniformly return `String.Slice`.
This is a BREAKING change, because many functions now have a different
return type. So for example, if `s` is a string and `f` is a function
accepting a string, `f (s.drop 1)` will no longer compile because
`s.drop 1` is a `String.Slice`. To fix this, insert a call to `copy` to
restore the old behavior: `f (s.drop 1).copy`.
Of course, in many cases, there will be more efficient options. For
example, don't write `f <| s.drop 1 |>.copy |>.dropEnd 1 |>.copy`, write
`f <| s.drop 1 |>.dropEnd 1 |>.copy` instead. Also, instead of `(s.drop
1).copy = "Hello"`, write `s.drop 1 == "Hello".toSlice` instead.
This PR extracts two modules from `Match.MatchEqs`, in preparation of
#11220
and to use the module system to draw clear boundaries between concerns
here.
This PR registers a node kind for `Lean.Parser.Term.elabToSyntax` in
order to support the `Lean.Elab.Term.elabToSyntax` functionality without
registering a dedicated parser for user-accessible syntax.
This PR implements `elabToSyntax` for creating scoped syntax `s :
Syntax` for an arbitrary elaborator `el : Option Expr -> TermElabM Expr`
such that `elabTerm s = el`.
Roundtripping example implementing an elaborator imitating `let`:
```lean
elab "lett " decl:letDecl ";" e:term : term <= ty? => do
let elabE (ty? : Option Expr) : TermElabM Expr := do elabTerm e ty?
elabToSyntax elabE fun body => do
elabTerm (← `(let $decl:letDecl; $body)) ty?
#guard lett x := 42; (x + 1) = 43
```
This PR avoids match splitter calculation from testing all quadratically
many pairs of alternatives for overlaps, by keeping track of possible
overlaps during matcher calculation, storing that information in the
`MatcherInfo`, and using that during matcher calculation.
This PR fixes a bug in the LCNF simplifier unearthed while working on
#11078. In some situations caused by `unsafeCast`, the simplifier would
record incorrect information about `cases`, leading to further bugs down
the line.
Suppose we have `v : NonScalar` due to an `unsafeCast` and we run
`cases` on it, expecting `Prod.mk fst snd`. The current code attempts to
record both the arguments from the constructor application in the case
arm `fst`, `snd` and the parameters for the type by inspecting the discr
`v`. However, `NonScalar` does of course not have any parameters,
causing the simplifier to record wrong information. This patch makes the
`cases` infrastructure more cautious when extracting information from
the type of `v`.
This PR changes how sparse case expressions represent the
none-of-the-above information. Instead of of many `x.ctorIdx ≠ i`
hypotheses, it introduces a single `Nat.hasNotBit mask x.ctorIdx`
hypothesis which compresses that information into a bitmask. This avoids
a quadratic overhead during splitter generation, where all n assumptions
would be refined through `.subst` and `.cases` constructions for all n
assumption of the splitter alternative.
The definition of `Nat.hasNotBit` uses `Nat.rightShift` which is fiddly
to get to reduce well, especially on open terms and with `Meta.whnf`.
Some experimentation was needed to find proof terms that work, these are
all put together in the `Lean.Meta.HasNotBit` module.
Fixes#11183
---------
Co-authored-by: Rob23oba <152706811+Rob23oba@users.noreply.github.com>
This PR fixes the `reduceArity` compiler pass to consider
over-applications to functions that have their arity reduced.
Previously, this pass assumed that the amount of arguments to
applications was always the same as the number of parameters in the
signature. This is usually true, since the compiler eagerly introduces
parameters as long as the return type is a function type, resulting in a
function with a return type that isn't a function type. However, for
dependent types that sometimes are function types and sometimes not,
this assumption is broken, resulting in the additional parameters to be
dropped.
Closes#11131
This ensures that no `grind` annotated theorem, simply by being
instantiated, causes a chain of >20 further instantiations, with a small
list of documented exceptions.
This PR modifies the `try?` framework, so each subsidiary tactic runs
with a separate `maxHeartbeats` budget.
---------
Co-authored-by: Rob23oba <152706811+Rob23oba@users.noreply.github.com>
This PR has `#grind_list check` produce a "Try this:" suggestion with
`#grind_list inspect` commands, as this is usually the next step in
dealing with problematic cases. We also fix the grind pattern for one
theorem, as part of testing the workflow. More to follow.
This PR fixes a few minor issues in the new `Action` framework used in
`grind`. The goal is to eventually delete the old `SearchM`
infrastructure. The main `solve` function used by `grind` is now based
on the `Action` framework. The PR also deletes dead code in `SearchM`.
This PR renames `Substring` to `Substring.Raw`.
This is to signify its status as a second-class citizen (not deprecated,
but no real plans for verification, like `String.Pos.Raw`) and to free
up the name `Substring` for a possible future type `String.Substring :
String -> Type` so that `s.Substring` is the type of substrings of `s`.
The functions `String.toSubstring` and `String.toSubstring'` will remain
for now for bootstrapping reasons.
This PR implements `try?` using the new `finish?` infrastructure. It
also removes the old tracing infrastructure, which is now obsolete.
Example:
```lean
/--
info: Try these:
[apply] grind
[apply] grind only [findIdx, insert, = mem_indices_of_mem, = getElem?_neg, = getElem?_pos, = HashMap.mem_insert,
= HashMap.getElem_insert, #1bba]
[apply] grind only [findIdx, insert, = mem_indices_of_mem, = getElem?_neg, = getElem?_pos, = HashMap.mem_insert,
= HashMap.getElem_insert]
[apply] grind =>
instantiate only [findIdx, insert, = mem_indices_of_mem]
instantiate only [= getElem?_neg, = getElem?_pos]
cases #1bba
· instantiate only [findIdx]
· instantiate only
instantiate only [= HashMap.mem_insert, = HashMap.getElem_insert]
-/
#guard_msgs in
example (m : IndexMap α β) (a : α) (b : β) :
(m.insert a b).findIdx a = if h : a ∈ m then m.findIdx a else m.size := by
try?
```
This PR modifies the error message that is returned when more than one
synthetic metavariable can't be resolved.
The two heuristics used for prioritization are:
- prefer typeclass problems associated with small ranges over typeclass
problems associated with large ranges (I'm pretty confident in this
heuristic)
- do not prefer typeclass problems over other kinds of errors (not as
confident in this heuristic)
This PR implements `grind_pattern` constraints. They are useful for
controlling theorem instantiation in `grind`. As an example, consider
the following two theorems:
```lean
theorem extract_empty {start stop : Nat} :
(#[] : Array α).extract start stop = #[] := …
theorem extract_extract {as : Array α} {i j k l : Nat} :
(as.extract i j).extract k l = as.extract (i + k) (min (i + l) j) := …
```
If both are used for theorem instantiation, an unbounded number of
instances is generated as soon as we add the term `#[].extract i j` to
the `grind` context.
We can now prevent this by adding a `grind_pattern` constraint to
`extract_extract`:
```lean
grind_pattern extract_extract => (as.extract i j).extract k l where
as =/= #[]
```
With this constraint, only one instance is generated, as expected:
```lean
/-- trace: [grind.ematch.instance] extract_empty: #[].extract i j = #[] -/
#guard_msgs (drop error, trace) in
set_option trace.grind.ematch.instance true in
example (as : Array Nat) (h : #[].extract i j = as) : False := by
grind only [= extract_empty, usr extract_extract]
```
This PR adds syntax for specifying `grind_pattern` constraints and
extends the `EMatchTheorem` object.
---
Note: We need a manual stage0 update because it affects the .olean
files.
This PR removes most cases where an error message explained that it was
"probably due to metavariables," giving more explanation and a hint.
## Example
```
def square x := x * x
```
Before:
```lean4
typeclass instance problem is stuck, it is often due to metavariables
HMul ?m.9 ?m.9 (?m.3 x)
```
After:
```
typeclass instance problem is stuck
HMul ?m.9 ?m.9 (?m.3 x)
Note: Lean will not try to resolve this typeclass instance problem because the
first and second type arguments to `HMul` are metavariables. These arguments
must be fully determined before Lean will try to resolve the typeclass.
Hint: Adding type annotations and supplying implicit arguments to functions
can give Lean more information for typeclass resolution. For example, if you
have a variable `x` that you intend to be a `Nat`, but Lean reports it as
having an unresolved type like `?m`, replacing `x` with `(x : Nat)` can get
typeclass resolution un-stuck.
```
In addition to providing beginner-and-intermediate-friendly explanation
about **why** typeclass instance problems are treated as "stuck" when
metavariables appear in output positions, this PR provides
potentially-valuable improvement even to expert users: it explains
**which of the typeclass arguments are inputs** and therefore need to be
fully specified before typeclass resolution will be attempted. This
information can be tricky to find otherwise.
## Next steps, but probably after this PR
* error explanation
* detecting when the syntactic source is a binop and giving a
special-cased explanation on the binary operators and their associated
typeclasses
* detecting when the syntactic source is a function call, inspecting the
function call's type somewhat, and replacing the generic "replace `x`
with `(x : Nat)` hint with a specialized "replace `foo` with `foo (tyArg
:= Nat)`" hint
This PR adds tactic and term mode macros for `∎` (typed `\qed`) which
expand to `try?`. The term mode version captures any produced
suggestions and prepends `by`.
Co-authored-by: Claude <noreply@anthropic.com>
This PR removes `simp_all? +suggestions` from `try?` for now. It's
really slow out in Mathlib; too often the suggestions cause `simp` to
loop. Until we have the ability for `try?` to move past a timeing-out
tactic (or maybe even until we have parallelism), it needs to be
removed.
Alternatively, we could try modifying `simp` so that e.g. it won't use a
premise more than once. This might help avoid loops, but it would
produce less-reproducible proofs.
Co-authored-by: Claude <noreply@anthropic.com>
This PR ensures that tactics using library suggestions set the caller
field, so the premise selection engine has access to this. We'll later
use this to filter out some modules for grind, which we know have
already been fully annotated.
Co-authored-by: Claude <noreply@anthropic.com>
This PR implements support for `#grind_lint check in module <module>`.
Mathlib does not use namespaces, so we need to restrict the
`#grind_lint` search space using module (prefix) names. Example:
```lean
/--
info: instantiating `Array.filterMap_some` triggers more than 100 additional `grind` theorem instantiations
---
info: Array.filterMap_some
[thm] instances
[thm] Array.filterMap_filterMap ↦ 94
[thm] Array.size_filterMap_le ↦ 5
[thm] Array.filterMap_some ↦ 1
---
info: instantiating `Array.range_succ` triggers 22 additional `grind` theorem instantiations
-/
#guard_msgs in
#grind_lint check (min := 20) in module Init.Data.Array
```
This PR changes the default library suggestions (e.g. for `grind
+suggestions` or `simp_all? +suggestions) to include the theorems from
the current file in addition to the output of Sine Qua Non.
This PR implements the following improvements to the `#grind_lint`
command:
1. More informative messages when the number of instances exceeds the
minimum threshold.
2. A code action for `#grind_lint inspect` that inserts
`set_option trace.grind.ematch.instance true` whenever the number of
instances exceeds
the minimum threshold.
3. Displaying doc strings for `grind` configuration options in
`#grind_lint`.
4. Improve doc strings for `#grind_lint inspect` and `#grind_lint
check`.
Example:
```lean
/--
info: instantiating `Array.filterMap_some` triggers more than 100 additional `grind` theorem instantiations
---
info: Array.filterMap_some
[thm] instances
[thm] Array.filterMap_filterMap ↦ 94
[thm] Array.size_filterMap_le ↦ 5
[thm] Array.filterMap_some ↦ 1
---
info: Try this to display the actual theorem instances:
[apply] set_option trace.grind.ematch.instance true in
#grind_lint inspect Array.filterMap_some
-/
#guard_msgs in
#grind_lint inspect Array.filterMap_some
```
This PR fixes some details in the Markdown renderings of Verso
docstrings, and adds tests to keep them correct. Also adds tests for
Verso docstring metadata.
This PR implements the `#grind_lint` command, a diagnostic tool for
analyzing the behavior of theorems annotated for theorem instantiation.
The command helps identify problematic theorems that produce excessive
or unbounded instance generation during E-matching, which can lead to
performance issues.
The main entry point is:
```
#grind_lint check
```
which analyzes all theorems marked with the `@[grind]` attribute.
For each theorem, it creates an artificial goal and runs `grind`,
collecting statistics about the number of instances produced.
Results are summarized using info messages, and detailed breakdowns are
shown for lemmas exceeding a configurable threshold.
Additional subcommands are provided for targeted inspection and control:
* `#grind_lint inspect thm`: analyzes one or more specific theorems in
detail
* `#grind_lint mute thm`: excludes a theorem from instantiation during
analysis
* `#grind_lint skip thm`: omits a theorem from being analyzed by
`#grind_lint check`
This PR adds a user-extension mechanism for the `try?` tactic. You can
either use the `@[try_suggestion]` attribute on a declaration with
signature ``MVarId -> Try.Info -> MetaM (Array (TSyntax `tactic))`` to
produce suggestions, or the `register_try?_tactic <stx>` command with a
fixed piece of syntax. User-extensions are only tried *after* the
built-in try strategies have been tried and failed.
I wanted to ensure that if the user provides a tactic that produces a
"Try this:" suggestion, we both emit the original tactic and the
suggested replacement (this is what we already do with `grind` and
`simp`). I have this working, but it is quite hacky: we grab the message
log and parse it. I fear this will break when the "Try this:" format is
inevitably changed in the future.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Adds user-defined suggestion generators for `try?` via
`@[try_suggestion]` and `register_try?_tactic`, executed after built-ins
with priority and double-suggestion handling.
>
> - **Parser/Command**:
> - Add command syntax `register_try?_tactic (priority := n)?
<tacticSeq>` in `Lean.Parser.Command`.
> - **Suggestion registry**:
> - Introduce `@[try_suggestion (prio)]` attribute with a scoped env
extension to register generators (`MVarId → Try.Info → MetaM (Array
(TSyntax `tactic))`).
> - Priority ordering (higher first); supports local/global scope.
> - **Tactic engine (`try?`)**:
> - New unsafe pipeline to collect and run user generators after
built-in tactics; expands nested "Try this" outputs from user tactics.
> - `mkTryEvalSuggestStx` now takes `(goal, info)`; integrates user
tactics as fallback via `attempt_all`.
> - Suppress intermediate "Try this" messages during `evalAndSuggest` by
restoring the message log.
> - **Imports**:
> - Add `meta import Lean.Elab.Command` for command elaboration.
> - **Tests**:
> - `try_register_builtin.lean`: command availability and warning
without import.
> - `try_user_suggestions.lean`: basic, priority, built-in fallback,
double-suggestion, and command registration cases.
> - Update `versoDocMissing.lean.expected.out` to include
`register_try?_tactic` in expected commands.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
302dc9454450eb29ad4ea9b01d87ac60365299ad. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
This PR adds a new, inactive and unused `doElem_elab` attribute that
will allow users to register custom elaborators for `doElem`s in the
form of the new type `DoElab`. The old `do` elaborator is active by
default but can be switched off by disabling the new option
`backward.do.legacy`.