This PR defines `String.Slice.replace` and redefines `String.replace` to
use the `Slice` version.
The new implementation is generic in the pattern, so it supports things
like `"education".replace isVowel "☃!" = "☃!d☃!c☃!t☃!☃!n"`. Since it
uses the `ForwardSearcher` infrastructure, `String` patterns are
searched using KMP, unlike the previous implementation which had
quadratic runtime. As a side effect, the behavior when replacing an
empty string now matches that of most other programming languages,
namely `"abc".replace "" "k" = "kakbkck"`.
This PR splits some low-hanging fruit out of `Init.Data.String.Basic`:
basic material about `String.Pos.Raw`, `String.Substrig`, and
`String.Iterator`.
More splitting required and the remaining material is quite unorganized,
but it's a start.
This PR renames `String.endPos` to `String.rawEndPos`, as in a future
release the name `String.endPos` will be taken by the function that is
currently called `String.endValidPos`.
This PR moves many operations involving `String.Pos.Raw` to a the
`String.Pos.Raw` namespace with the eventual aim of freeing up the
`String` namespace to contain operations using `String.ValidPos` (to be
renamed to `String.Pos`) instead.
This PR adds the `String.ValidPos.set` and `String.ValidPos.modify`
functions.
After this PR, `String.pos_lt_eq` is no longer a `simp` lemma. Add
`String.Pos.Raw.lt_iff` as a `simp` lemma if your proofs break.
This PR introduces the `backward.privateInPublic` option to aid in
porting projects to the module system by temporarily allowing access to
private declarations from the public scope, even across modules. A
warning will be generated by such accesses unless
`backward.privateInPublic.warn` is disabled.
This PR fixes a bug in combination with VS Code where Lean code that
looks like CSS color codes would display a color picker decoration.
VS Code displays this decoration by default for all languages, not just
CSS. Due to https://github.com/microsoft/vscode/issues/91533, this
setting cannot be disabled in the client on a per-language basis.
However, we can override the default behavior by providing a color
provider of our own. This PR implements an empty color provider to
override the VS Code one.
This PR enforces rules around arithmetic of `String.Pos.Raw`.
Specifically, it adopts the following conventions:
- Byte indices ("ordinals") in strings should be represented using
`String.Pos.Raw`
- Amounts of bytes ("cardinals") in strings should be represented using
`Nat`.
For example, `String.Slice.utf8ByteSize` now returns `Nat` instead of
`String.Pos.Raw`, and there is a new function `String.Slice.rawEndPos`.
Finally, the `HAdd` and `HSub` instances for `String.Pos.Raw` are
reorganized. This is a **breaking change**.
The `HAdd/HSub String.Pos.Raw String.Pos.Raw String.Pos.Raw` instances
have been removed. For the use case of tracking positions relative to
some other position, we instead provide `offsetBy` and `unoffsetBy`
functions. For the use case of advancing/unadvancing a position by an
arbitrary number of bytes, we instead provide `increaseBy` and
`decreaseBy` functions. For
offsetting/unoffsetting/advancing/unadvancing a position `p` by the size
of a string `s` (resp. character `c`), use `s + p`/`p - s`/`p + s`/`p -
s` (resp. `c + p`/`p - c`/`p + c`/`p - c`).
This PR significantly improves the test coverage of the language server,
providing at least a single basic test for every request that is used by
the client. It also implements infrastructure for testing all of these
requests, e.g. the ability to run interactive tests in a project context
and refactors the interactive test runner to be more maintainable.
Finally, it also fixes a small bug with the recently implemented unknown
identifier code actions for auto-implicits (#10442) that was discovered
in testing, where the "import all unambiguous unknown identifiers" code
action didn't work correctly on auto-implicit identifiers.
This PR cuts some edges from the import graph.
Specifically:
- `TreeMap` and `HashMap` no longer depend on `String`, so now the
expensive things are all in parallel instead of partially in sequence
- `Omega` no longer relies on `List` lemmas
- The section of the import graph between `Init.Omega` and
`Init.Data.Bitvec.Lemmas` is cleaned up a bit
This PR introduces safe alternatives to `String.Pos` and `Substring`
that can only represent valid positions/slices.
Specifically, the PR
- introduces the predicate `String.Pos.IsValid`;
- proves several nontrivial equivalent conditions for
`String.Pos.IsValid`;
- introduces `String.ValidPos`, which is a `String.Pos` with an
`IsValid` proof;
- introduces `String.Slice`, which is like `Substring` but made from
`String.ValidPos` instead of `Pos`;
- introduces `String.Pos.IsValidForSlice`, which is like
`String.Pos.IsValid` but for slices;
- introduces `String.Slice.Pos`, which is like `String.ValidPos` but for
slices;
- introduces various functions for converting between the two types of
positions.
The API added in this PR is not complete. It will be expanded in future
PRs with addional operations and verification.
This PR resolves a potential bad interaction between the compiler and
the module system where references to declarations not imported are
brought into scope by inlining or specializing. We now proactively check
that declarations to be inlined/specialized only reference public
imports. The intention is to later resolve this limitation by moving out
compilation into a separate build step with its own import/incremental
system.
This PR adds a simple implementation of MePo, from "Lightweight
relevance filtering for machine-generated resolution problems" by Meng
and Paulson.
This needs tuning, but is already useful as a baseline or test case.
---------
Co-authored-by: Thomas Zhu <thomas.zhu.sh@hotmail.com>
This PR uses the per-constructor `noConfusion` principles (from #10315)
in the `mkNoConfusion` app builder, if possible. This means they are
used by `injection`, `grind`, `simp` and other places. This brings
notable performance improvements when dealing with inductives with a
large number of constructors.
This PR upstreams the Verso parser and adds preliminary support for
Verso in docstrings. This will allow the compiler to check examples and
cross-references in documentation.
After a `stage0` update, a follow-up PR will add the appropriate
attributes that allow the feature to be used. The parser tests from
Verso also remain to be upstreamed, and user-facing documentation will
be added once the feature has been used on more internals.
This PR speeds up auto-completion by a factor of ~3.5x through various
performance improvements in the language server. On one machine, with
`import Mathlib`, completing `i` used to take 3200ms and now instead
yields a result in 920ms.
Specifically, the following improvements are made:
- The watchdog process no longer de-serializes and re-serializes most
messages from the file worker before passing them on to the user - a
fast partial de-serialization procedure is now used to determine whether
the message needs to be de-serialized in full or not.
- `escapePart` is optimized to perform better on ASCII strings that do
not need escaping.
- `Json.compress` is optimized to allocate fewer objects.
- A faster JSON compression specifically for completion responses is
implemented that skips allocating `Json` altogether.
- The JSON compression has been moved to the task where we convert a
request response to `Json` so that converting to a string won't block
the output task of the FileWorker and so the `Json` value is not marked
as multi-threaded when we compress is, which drastically increases the
cost of reference-counting.
- The JSON representation of the `data?` field of each completion item
is optimized.
- Both the completion kind and the set of completion tags for each
imported completion item is now cached.
- The filtering of duplicate completion items is optimized.
Other adjustments:
- `LT UInt8` and `LE UInt8` are moved to Prelude so that they can be
used in `Init.Meta` for the name part escaping fast path.
- `Array.usize` is exposed since it was marked as `@[simp]`.
This PR generalizes the monadic operations for `HashMap`, `TreeMap`, and
`HashSet` to work for `m : Type u → Type v`.
This upstreams [a workaround from
Aesop](66a992130e/Aesop/Util/Basic.lean (L57-L66)),
and seems to continue a pattern already established in other files, such
as:
```lean
Array.forM.{u, v, w} {α : Type u} {m : Type v → Type w} [Monad m] (f : α → m PUnit) (as : Array α) (start : Nat := 0)
(stop : Nat := as.size) : m PUnit
```
This PR normalizes the published diagnostics in the test runner so that
messages published out of order (due to parallelism) cannot cause test
failures. Clients can handle out-of-order messages just fine.
This PR allows Lean's parser to run with a final position prior to the
end of the string, so it can be invoked on a sub-region of the input.
This has applications in Verso proper, which parses Lean syntax in
contexts such as code blocks and docstrings, and it is a prerequisite to
parsing the contents of Lean docstrings.
This PR changes macro scope numbering from per-module to per-command,
ensuring that unrelated changes to other commands do not affect macro
scopes generated by a command, which improves `prefer_native` hit rates
on bootstrapping as well as avoids further rebuilds under the module
system.
In detail, instead of always using the current module name as a macro
scope prefix, each command now introduces a new macro scope prefix
(called "context") of the shape `<main module>._hygCtx_<uniq>` where
`uniq` is a `UInt32` derived from the command but automatically
incremented in case of conflicts (which must be local to the current
module). In the current implementation, `uniq` is the hash of the
declaration name, if any, or else the hash of the full command's syntax.
Thus, it is always independent of syntactic changes to other commands
(except in case of hash conflicts, which should only happen in practice
for syntactically identical commands) and, in the case of declarations,
also independent of syntactic changes to any private parts of the
declaration.
This PR makes `IsPreorder`, `IsPartialOrder`, `IsLinearPreorder` and
`IsLinearOrder` extend `BEq` and `Ord` as appropriate, adds the
`LawfulOrderBEq` and `LawfulOrderOrd` typeclasses relating `BEq` and
`Ord` to `LE`, and adds many lemmas and instances.
Note: This PR contains a refactoring where `Init.Data.Ord` is moved to
`Init.Data.Ord.Basic`. If I added `Init.Data.Ord` simply importing all
submodules, git would not be able to determine that `Init.Data.Ord` was
renamed to `Init.Data.Ord.Basic`. This could lead to unnecessary merge
conflicts in the future. Hence, I chose the name `Init.Data.OrdRoot`
instead of `Init.Data.Ord` temporarily. After this PR, I will rename
this module back to `Init.Data.Ord` in a separate PR.
(This is a copy of #9430: I will not touch that PR because it currently
allows to debug a CI problem and pushing commits might break the
reproducibility.)
This PR addresses a missing check in the module system where private
names that remain in the public environment map for technical reasons
(e.g. inductive constructors generated by the kernel and relied on by
the code generator) accidentally were accessible in the public scope.
This PR performs some micro optimizations on fuzzy matching for a `~20%`
instructions win.
The three key changes are:
- try to remove some unnecessary allocations of things such as tuples
- change `containsInOrderLower` to use the efficient `get'` and `next'`
primitives. I hope that we can replace these with iterators on strings
in the second half of this quarter
- Do the same thing as clangd and use `Int16` with the `minValue` being
used for "worst score" while this does have the potential to
over/underflow, if the user is working with a score in the 10000s
something weird is certainly going on already (the score usually seems
to be in the 2 digit area based on some).
As an additional bonus, once we finally have unboxed arrays we will get
some additional cache wins on the 16 bit arrays!
This PR upstreams some helper instances for `NameSet` from Batteries.
(These could be generalized to an arbitrary TreeSet, but I'll leave that
for someone else.)
This PR improves the 'Go to Definition' UX, specifically:
- Using 'Go to Definition' on a type class projection will now extract
the specific instances that were involved and provide them as locations
to jump to. For example, using 'Go to Definition' on the `toString` of
`toString 0` will yield results for `ToString.toString` and `ToString
Nat`.
- Using 'Go to Definition' on a macro that produces syntax with type
class projections will now also extract the specific instances that were
involved and provide them as locations to jump to. For example, using
'Go to Definition' on the `+` of `1 + 1` will yield results for
`HAdd.hAdd`, `HAdd α α α` and `Add Nat`.
- Using 'Go to Declaration' will now provide all the results of 'Go to
Definition' in addition to the elaborator and the parser that were
involved. For example, using 'Go to Declaration' on the `+` of `1 + 1`
will yield results for `HAdd.hAdd`, `HAdd α α α`, `Add Nat`,
``macro_rules | `($x + $y) => ...`` and `infixl:65 " + " => HAdd.hAdd`.
- Using 'Go to Type Definition' on a value with a type that contains
multiple constants will now provide 'Go to Definition' results for each
constant. For example, using 'Go to Type Definition' on `x` for `x :
Array Nat` will yield results for `Array` and `Nat`.
### Details
'Go to Definition' for type class projections was first implemented by
#1767, but there were still a couple of shortcomings with the
implementation. E.g. in order to jump to the instance in `toString 0`,
one had to add another space within the application and then use 'Go to
Definition' on that, or macros would block instances from being
displayed. Then, when the .ilean format was added, most 'Go to
Definition' requests were already handled using the .ileans in the
watchdog process, and so the file worker never received them to handle
them with the semantic information that it has available.
This PR resolves most of the issues with the previous implementation and
refactors the 'Go to Definition' control flow so that 'Go to Definition'
requests are always handled by the file worker, with the watchdog merely
using its .ilean position information to update the positions in the
response to a more up-to-date state. This is necessary because the file
worker obtains its position information from the .oleans, which need to
be rebuilt in order to be up-to-date, while the watchdog always receives
.ilean update notifications from each active file worker with the
current position information in the editor.
Finally, all of the 'Go to Definition' code is refactored to be easier
to maintain.
### Breaking changes
`InfoTree.hoverableInfoAt?` has been generalized to
`InfoTree.hoverableInfoAtM?` and now takes a general `filter` argument
instead of several boolean flags, as was the case before.
This PR removes uses of `Lean.RBMap` in Lean itself.
Furthermore some massaging of the import graph is done in order to avoid
having `Std.Data.TreeMap.AdditionalOperations` (which is quite
expensive) be the critical path for a large chunk of Lean. In particular
we can build `Lean.Meta.Simp` and `Lean.Meta.Grind` without it thanks to
these changes.
We did previously not conduct this change as `Std.TreeMap` was not
outperforming `Lean.RBMap` yet, however this has changed with the new
code generator.
This PR migrates usages of `Std.Range` to the new polymorphic ranges.
This PR unfortunately increases the transitive imports for
frequently-used parts of `Init` because the ranges now rely on iterators
in order to provide their functionality for types other than `Nat`.
However, iteration over ranges in compiled code is as efficient as
before in the examples I checked. This is because of a special
`IteratorLoop` implementation provided in the PR for this purpose.
There were two issues that were uncovered during migration:
* In `IndPredBelow.lean`, migrating the last remaining range causes
`compilerTest1.lean` to break. I have minimized the issue and came to
the conclusion it's a compiler bug. Therefore, I have not replaced said
old range usage yet (see #9186).
* In `BRecOn.lean`, we are publicly importing the ranges. Making this
import private should theoretically work, but there seems to be a
problem with the module system, causing the build to panic later in
`Init.Data.Grind.Poly` (see #9185).
* In `FuzzyMatching.lean`, inlining fails with the new ranges, which
would have led to significant slowdown. Therefore, I have not migrated
this file either.
This PR replaces all usages of `[:]` slice notation in `src` with the
new `[...]` notation in production code, tests and comments. The
underlying implementation of the `Subarray` functions stays the same.
Notation cheat sheet:
* `*...*` is the doubly-unbounded range.
* `*...a` or `*...<a` contains all elements that are less than `a`.
* `*...=a` contains all elements that are less than or equal to `a`.
* `a...*` contains all elements that are greater than or equal to `a`.
* `a...b` or `a...<b` contains all elements that are greater than or
equal to `a` and less than `b`.
* `a...=b` contains all elements that are greater than or equal to `a`
and less than or equal to `b`.
* `a<...*` contains all elements that are greater than `a`.
* `a<...b` or `a<...<b` contains all elements that are greater than `a`
and less than `b`.
* `a<...=b` contains all elements that are greater than `a` and less
than or equal to `b`.
Benchmarks have shown that importing the iterator-backed parts of the
polymorphic slice library in `Init` impacts build performance. This PR
avoids this problem by separating those parts of the library that do not
rely on iterators from those those that do. Whereever the new slice
notation is used, only the iterator-independent files are imported.
This PR allows `simp` to recognize and warn about simp lemmas that are
likely looping in the current simp set. It does so automatically
whenever simplification fails with the dreaded “max recursion depth”
error fails, but it can be made to do it always with `set_option
linter.loopingSimpArgs true`. This check is not on by default because it
is somewhat costly, and can warn about simp calls that still happen to
work.
This closes#5111. In the end, this implemented much simpler logic than
described there (and tried in the abandoned #8688; see that PR
description for more background information), but it didn’t work as well
as I thought. The current logic is:
“Simplify the RHS of the simp theorem, complain if that fails”.
It is a reasonable policy for a Lean project to say that all simp
invocation should be so that this linter does not complain. Often it is
just a matter of explicitly disabling some simp theorems from the
default simp set, to make it clear and robust that in this call, we do
not want them to trigger. But given that often such simp call happen to
work, it’s too pedantic to impose it on everyone.
This PR filters out all declarations from `Lean.*`, `*.Tactic.*`, and
`*.Linter.*` from the results of `exact?` and `rw?`.
---------
Co-authored-by: damiano <adomani@gmail.com>
Co-authored-by: Markus Himmel <markus@lean-fro.org>
This PR implements signature help support. When typing a function
application, editors with support for signature help will now display a
popup that designates the current (remaining) function type. This
removes the need to remember the function signature while typing the
function application, or having to constantly cycle between hovering
over the function identifier and typing the application. In VS Code, the
signature help can be triggered manually using `Ctrl+Shift+Space`.

### Other changes
- In order to support signature help for the partial syntax `f a <|` or
`f a $`, these notations now elaborate as `f a`, not `f a .missing`.
- The logic in `delabConstWithSignature` that delaborates parameters is
factored out into a function `delabForallParamsWithSignature` so that it
can be used for arbitrary `forall`s, not just constants.
- The `InfoTree` formatter is adjusted to produce output where it is
easier to identify the kind of `Info` in the `InfoTree`.
- A bug in `InfoTree.smallestInfo?` is fixed so that it doesn't panic
anymore when its predicate `p` does not ensure that both `pos?` and
`tailPos?` of the `Info` are present.
This PR reworks the `simp` set around the `Id` monad, to not elide or
unfold `pure` and `Id.run`
In particular, it stops encoding the "defeq abuse" of `Id X = X` in the
statements of theorems, instead using `Id.run` and `pure` to pass back
and forth between these two spellings. Often when writing these with
`pure`, they generalize to other lawful monads; though such changes were
split off to other PRs.
This fixes the problem with the current simp set where `Id.run (pure x)`
is simplified to `Id.run x`, instead of the desirable `x`.
This is particularly bad because the` x` is sometimes inferred with type
`Id X` instead of `X`, which prevents other `simp` lemmas about `X` from
firing.
Making `Id` reducible instead is not an option, as then the `Monad`
instances would have nothing to key on.
---------
Co-authored-by: Sebastian Graf <sg@lean-fro.org>
Co-authored-by: Kim Morrison <kim@tqft.net>
Co-authored-by: Paul Reichert <6992158+datokrat@users.noreply.github.com>