This PR introduces a function `String.split` which is based on
`String.Slice.split` and therefore supports all pattern types and
returns a `Std.Iter String.Slice`.
This supersedes the functions `String.splitOn` and `String.splitToList`,
and we remove all all uses of these functions from core. They will be
deprecated in a future PR.
Migrating from `String.splitOn` and `String.splitToList` is easy: we
introduce functions `Iter.toStringList` and `Iter.toStringArray` that
can be used to conveniently go from `Std.Iter String.Slice` to `List
String` and `Array String`, so for example `s.splitOn "foo"` can be
replaced by `s.split "foo" |>.toStringList`.
This PR redefines `String.take` and variants to operate on
`String.Slice`. While previously functions returning a substring of the
input sometimes returned `String` and sometimes returned
`Substring.Raw`, they now uniformly return `String.Slice`.
This is a BREAKING change, because many functions now have a different
return type. So for example, if `s` is a string and `f` is a function
accepting a string, `f (s.drop 1)` will no longer compile because
`s.drop 1` is a `String.Slice`. To fix this, insert a call to `copy` to
restore the old behavior: `f (s.drop 1).copy`.
Of course, in many cases, there will be more efficient options. For
example, don't write `f <| s.drop 1 |>.copy |>.dropEnd 1 |>.copy`, write
`f <| s.drop 1 |>.dropEnd 1 |>.copy` instead. Also, instead of `(s.drop
1).copy = "Hello"`, write `s.drop 1 == "Hello".toSlice` instead.
This PR renames `Substring` to `Substring.Raw`.
This is to signify its status as a second-class citizen (not deprecated,
but no real plans for verification, like `String.Pos.Raw`) and to free
up the name `Substring` for a possible future type `String.Substring :
String -> Type` so that `s.Substring` is the type of substrings of `s`.
The functions `String.toSubstring` and `String.toSubstring'` will remain
for now for bootstrapping reasons.
This PR removes all uses of `String.Iterator` from core, preferring
`String.ValidPos` instead.
In an upcoming PR, `String.Iterator` will be renamed to
`String.Legacy.Iterator`.
This PR defines `String.Slice.replace` and redefines `String.replace` to
use the `Slice` version.
The new implementation is generic in the pattern, so it supports things
like `"education".replace isVowel "☃!" = "☃!d☃!c☃!t☃!☃!n"`. Since it
uses the `ForwardSearcher` infrastructure, `String` patterns are
searched using KMP, unlike the previous implementation which had
quadratic runtime. As a side effect, the behavior when replacing an
empty string now matches that of most other programming languages,
namely `"abc".replace "" "k" = "kakbkck"`.
This PR splits some low-hanging fruit out of `Init.Data.String.Basic`:
basic material about `String.Pos.Raw`, `String.Substrig`, and
`String.Iterator`.
More splitting required and the remaining material is quite unorganized,
but it's a start.
This PR renames `String.endPos` to `String.rawEndPos`, as in a future
release the name `String.endPos` will be taken by the function that is
currently called `String.endValidPos`.
This PR moves many operations involving `String.Pos.Raw` to a the
`String.Pos.Raw` namespace with the eventual aim of freeing up the
`String` namespace to contain operations using `String.ValidPos` (to be
renamed to `String.Pos`) instead.
This PR adds the `String.ValidPos.set` and `String.ValidPos.modify`
functions.
After this PR, `String.pos_lt_eq` is no longer a `simp` lemma. Add
`String.Pos.Raw.lt_iff` as a `simp` lemma if your proofs break.
This PR introduces the `backward.privateInPublic` option to aid in
porting projects to the module system by temporarily allowing access to
private declarations from the public scope, even across modules. A
warning will be generated by such accesses unless
`backward.privateInPublic.warn` is disabled.
This PR fixes a bug in combination with VS Code where Lean code that
looks like CSS color codes would display a color picker decoration.
VS Code displays this decoration by default for all languages, not just
CSS. Due to https://github.com/microsoft/vscode/issues/91533, this
setting cannot be disabled in the client on a per-language basis.
However, we can override the default behavior by providing a color
provider of our own. This PR implements an empty color provider to
override the VS Code one.
This PR enforces rules around arithmetic of `String.Pos.Raw`.
Specifically, it adopts the following conventions:
- Byte indices ("ordinals") in strings should be represented using
`String.Pos.Raw`
- Amounts of bytes ("cardinals") in strings should be represented using
`Nat`.
For example, `String.Slice.utf8ByteSize` now returns `Nat` instead of
`String.Pos.Raw`, and there is a new function `String.Slice.rawEndPos`.
Finally, the `HAdd` and `HSub` instances for `String.Pos.Raw` are
reorganized. This is a **breaking change**.
The `HAdd/HSub String.Pos.Raw String.Pos.Raw String.Pos.Raw` instances
have been removed. For the use case of tracking positions relative to
some other position, we instead provide `offsetBy` and `unoffsetBy`
functions. For the use case of advancing/unadvancing a position by an
arbitrary number of bytes, we instead provide `increaseBy` and
`decreaseBy` functions. For
offsetting/unoffsetting/advancing/unadvancing a position `p` by the size
of a string `s` (resp. character `c`), use `s + p`/`p - s`/`p + s`/`p -
s` (resp. `c + p`/`p - c`/`p + c`/`p - c`).
This PR significantly improves the test coverage of the language server,
providing at least a single basic test for every request that is used by
the client. It also implements infrastructure for testing all of these
requests, e.g. the ability to run interactive tests in a project context
and refactors the interactive test runner to be more maintainable.
Finally, it also fixes a small bug with the recently implemented unknown
identifier code actions for auto-implicits (#10442) that was discovered
in testing, where the "import all unambiguous unknown identifiers" code
action didn't work correctly on auto-implicit identifiers.
This PR cuts some edges from the import graph.
Specifically:
- `TreeMap` and `HashMap` no longer depend on `String`, so now the
expensive things are all in parallel instead of partially in sequence
- `Omega` no longer relies on `List` lemmas
- The section of the import graph between `Init.Omega` and
`Init.Data.Bitvec.Lemmas` is cleaned up a bit
This PR introduces safe alternatives to `String.Pos` and `Substring`
that can only represent valid positions/slices.
Specifically, the PR
- introduces the predicate `String.Pos.IsValid`;
- proves several nontrivial equivalent conditions for
`String.Pos.IsValid`;
- introduces `String.ValidPos`, which is a `String.Pos` with an
`IsValid` proof;
- introduces `String.Slice`, which is like `Substring` but made from
`String.ValidPos` instead of `Pos`;
- introduces `String.Pos.IsValidForSlice`, which is like
`String.Pos.IsValid` but for slices;
- introduces `String.Slice.Pos`, which is like `String.ValidPos` but for
slices;
- introduces various functions for converting between the two types of
positions.
The API added in this PR is not complete. It will be expanded in future
PRs with addional operations and verification.
This PR resolves a potential bad interaction between the compiler and
the module system where references to declarations not imported are
brought into scope by inlining or specializing. We now proactively check
that declarations to be inlined/specialized only reference public
imports. The intention is to later resolve this limitation by moving out
compilation into a separate build step with its own import/incremental
system.
This PR adds a simple implementation of MePo, from "Lightweight
relevance filtering for machine-generated resolution problems" by Meng
and Paulson.
This needs tuning, but is already useful as a baseline or test case.
---------
Co-authored-by: Thomas Zhu <thomas.zhu.sh@hotmail.com>
This PR uses the per-constructor `noConfusion` principles (from #10315)
in the `mkNoConfusion` app builder, if possible. This means they are
used by `injection`, `grind`, `simp` and other places. This brings
notable performance improvements when dealing with inductives with a
large number of constructors.
This PR upstreams the Verso parser and adds preliminary support for
Verso in docstrings. This will allow the compiler to check examples and
cross-references in documentation.
After a `stage0` update, a follow-up PR will add the appropriate
attributes that allow the feature to be used. The parser tests from
Verso also remain to be upstreamed, and user-facing documentation will
be added once the feature has been used on more internals.
This PR speeds up auto-completion by a factor of ~3.5x through various
performance improvements in the language server. On one machine, with
`import Mathlib`, completing `i` used to take 3200ms and now instead
yields a result in 920ms.
Specifically, the following improvements are made:
- The watchdog process no longer de-serializes and re-serializes most
messages from the file worker before passing them on to the user - a
fast partial de-serialization procedure is now used to determine whether
the message needs to be de-serialized in full or not.
- `escapePart` is optimized to perform better on ASCII strings that do
not need escaping.
- `Json.compress` is optimized to allocate fewer objects.
- A faster JSON compression specifically for completion responses is
implemented that skips allocating `Json` altogether.
- The JSON compression has been moved to the task where we convert a
request response to `Json` so that converting to a string won't block
the output task of the FileWorker and so the `Json` value is not marked
as multi-threaded when we compress is, which drastically increases the
cost of reference-counting.
- The JSON representation of the `data?` field of each completion item
is optimized.
- Both the completion kind and the set of completion tags for each
imported completion item is now cached.
- The filtering of duplicate completion items is optimized.
Other adjustments:
- `LT UInt8` and `LE UInt8` are moved to Prelude so that they can be
used in `Init.Meta` for the name part escaping fast path.
- `Array.usize` is exposed since it was marked as `@[simp]`.
This PR generalizes the monadic operations for `HashMap`, `TreeMap`, and
`HashSet` to work for `m : Type u → Type v`.
This upstreams [a workaround from
Aesop](66a992130e/Aesop/Util/Basic.lean (L57-L66)),
and seems to continue a pattern already established in other files, such
as:
```lean
Array.forM.{u, v, w} {α : Type u} {m : Type v → Type w} [Monad m] (f : α → m PUnit) (as : Array α) (start : Nat := 0)
(stop : Nat := as.size) : m PUnit
```
This PR normalizes the published diagnostics in the test runner so that
messages published out of order (due to parallelism) cannot cause test
failures. Clients can handle out-of-order messages just fine.
This PR allows Lean's parser to run with a final position prior to the
end of the string, so it can be invoked on a sub-region of the input.
This has applications in Verso proper, which parses Lean syntax in
contexts such as code blocks and docstrings, and it is a prerequisite to
parsing the contents of Lean docstrings.
This PR changes macro scope numbering from per-module to per-command,
ensuring that unrelated changes to other commands do not affect macro
scopes generated by a command, which improves `prefer_native` hit rates
on bootstrapping as well as avoids further rebuilds under the module
system.
In detail, instead of always using the current module name as a macro
scope prefix, each command now introduces a new macro scope prefix
(called "context") of the shape `<main module>._hygCtx_<uniq>` where
`uniq` is a `UInt32` derived from the command but automatically
incremented in case of conflicts (which must be local to the current
module). In the current implementation, `uniq` is the hash of the
declaration name, if any, or else the hash of the full command's syntax.
Thus, it is always independent of syntactic changes to other commands
(except in case of hash conflicts, which should only happen in practice
for syntactically identical commands) and, in the case of declarations,
also independent of syntactic changes to any private parts of the
declaration.
This PR makes `IsPreorder`, `IsPartialOrder`, `IsLinearPreorder` and
`IsLinearOrder` extend `BEq` and `Ord` as appropriate, adds the
`LawfulOrderBEq` and `LawfulOrderOrd` typeclasses relating `BEq` and
`Ord` to `LE`, and adds many lemmas and instances.
Note: This PR contains a refactoring where `Init.Data.Ord` is moved to
`Init.Data.Ord.Basic`. If I added `Init.Data.Ord` simply importing all
submodules, git would not be able to determine that `Init.Data.Ord` was
renamed to `Init.Data.Ord.Basic`. This could lead to unnecessary merge
conflicts in the future. Hence, I chose the name `Init.Data.OrdRoot`
instead of `Init.Data.Ord` temporarily. After this PR, I will rename
this module back to `Init.Data.Ord` in a separate PR.
(This is a copy of #9430: I will not touch that PR because it currently
allows to debug a CI problem and pushing commits might break the
reproducibility.)
This PR addresses a missing check in the module system where private
names that remain in the public environment map for technical reasons
(e.g. inductive constructors generated by the kernel and relied on by
the code generator) accidentally were accessible in the public scope.
This PR performs some micro optimizations on fuzzy matching for a `~20%`
instructions win.
The three key changes are:
- try to remove some unnecessary allocations of things such as tuples
- change `containsInOrderLower` to use the efficient `get'` and `next'`
primitives. I hope that we can replace these with iterators on strings
in the second half of this quarter
- Do the same thing as clangd and use `Int16` with the `minValue` being
used for "worst score" while this does have the potential to
over/underflow, if the user is working with a score in the 10000s
something weird is certainly going on already (the score usually seems
to be in the 2 digit area based on some).
As an additional bonus, once we finally have unboxed arrays we will get
some additional cache wins on the 16 bit arrays!
This PR upstreams some helper instances for `NameSet` from Batteries.
(These could be generalized to an arbitrary TreeSet, but I'll leave that
for someone else.)
This PR improves the 'Go to Definition' UX, specifically:
- Using 'Go to Definition' on a type class projection will now extract
the specific instances that were involved and provide them as locations
to jump to. For example, using 'Go to Definition' on the `toString` of
`toString 0` will yield results for `ToString.toString` and `ToString
Nat`.
- Using 'Go to Definition' on a macro that produces syntax with type
class projections will now also extract the specific instances that were
involved and provide them as locations to jump to. For example, using
'Go to Definition' on the `+` of `1 + 1` will yield results for
`HAdd.hAdd`, `HAdd α α α` and `Add Nat`.
- Using 'Go to Declaration' will now provide all the results of 'Go to
Definition' in addition to the elaborator and the parser that were
involved. For example, using 'Go to Declaration' on the `+` of `1 + 1`
will yield results for `HAdd.hAdd`, `HAdd α α α`, `Add Nat`,
``macro_rules | `($x + $y) => ...`` and `infixl:65 " + " => HAdd.hAdd`.
- Using 'Go to Type Definition' on a value with a type that contains
multiple constants will now provide 'Go to Definition' results for each
constant. For example, using 'Go to Type Definition' on `x` for `x :
Array Nat` will yield results for `Array` and `Nat`.
### Details
'Go to Definition' for type class projections was first implemented by
#1767, but there were still a couple of shortcomings with the
implementation. E.g. in order to jump to the instance in `toString 0`,
one had to add another space within the application and then use 'Go to
Definition' on that, or macros would block instances from being
displayed. Then, when the .ilean format was added, most 'Go to
Definition' requests were already handled using the .ileans in the
watchdog process, and so the file worker never received them to handle
them with the semantic information that it has available.
This PR resolves most of the issues with the previous implementation and
refactors the 'Go to Definition' control flow so that 'Go to Definition'
requests are always handled by the file worker, with the watchdog merely
using its .ilean position information to update the positions in the
response to a more up-to-date state. This is necessary because the file
worker obtains its position information from the .oleans, which need to
be rebuilt in order to be up-to-date, while the watchdog always receives
.ilean update notifications from each active file worker with the
current position information in the editor.
Finally, all of the 'Go to Definition' code is refactored to be easier
to maintain.
### Breaking changes
`InfoTree.hoverableInfoAt?` has been generalized to
`InfoTree.hoverableInfoAtM?` and now takes a general `filter` argument
instead of several boolean flags, as was the case before.
This PR removes uses of `Lean.RBMap` in Lean itself.
Furthermore some massaging of the import graph is done in order to avoid
having `Std.Data.TreeMap.AdditionalOperations` (which is quite
expensive) be the critical path for a large chunk of Lean. In particular
we can build `Lean.Meta.Simp` and `Lean.Meta.Grind` without it thanks to
these changes.
We did previously not conduct this change as `Std.TreeMap` was not
outperforming `Lean.RBMap` yet, however this has changed with the new
code generator.