When the elaborator doesn't provide us with any `CompletionInfo`, we
currently provide no completions whatsoever. But in many cases, we can
still provide some helpful identifier completions without elaborator
information. This PR adds a fallback mode for this situation.
There is more potential here, but this should be a good start.
In principle, this issue alleviates #5172 (since we now provide
completions in these contexts). I'll leave it up to an elaboration
maintainer whether we also want to ensure that the completion infos are
provided correctly in these cases.
Fixes#4455, fixes#4705, fixes#5219
Also fixes a minor bug where a dot in brackets would report incorrect
completions instead of no completions.
---------
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
This PR enables the use of incrementality for completion in tactic
blocks. Consider the following example:
```lean
example : True := by
have : True := T
sleep 10000
```
Before this PR, in order to respond to a completion request after `T`,
`sleep 10000` has to complete first since the command must be fully
elaborated. After this PR, the completion request is responded to
immediately.
This PR roughly halves the time needed to load the .ilean files by
optimizing the JSON parser and the conversion from JSON to Lean data
structures.
The code is optimized roughly as follows:
- String operations are inlined more aggressively
- Parsers are changed to use new `String.Iterator` functions `curr'` and
`next'` that receive a proof and hence do not need to perform an
additional check
- The `RefIdent` of .ilean files now uses a `String` instead of a `Name`
to avoid the expensive parse step from `String` to `Name` (despite the
fact that we only very rarely actually need a `Name` in downstream code)
- Instead of `List`s and `Subarray`s, the JSON to Lean conversion now
directly passes around arrays and array indices to avoid redundant
boxing
- Parsec's `peek?` sometimes generates redundant `Option` wrappers
because the generation of basic blocks interferes with the ctor-match
optimization, so it is changed to use an `isEof` check where possible
- Early returns and inline-do-blocks cause the code generator to
generate new functions, which then interfere with optimizations, so they
are now avoided
- Mutual defs are used instead of unspecialized passing of higher-order
functions to generate faster code
- The object parser is made tail-recursive
This PR also fixes a stack overflow in `Lean.Json.compress` that would
occur with long lists and adds a benchmark for the .ilean roundtrip
(compressed pretty-printing -> parsing).
This PR fixes a small bug where over time, "import out of data" messages
would accumulate in files when their size changed before restarting its
file worker.
This PR resolves two language server bugs that especially affect Windows
users:
1. Editing the header could result in the watchdog not correctly
restarting the file worker (#3786, #3787), which would lead to the file
seemingly being processed forever.
- The cause of this issue was a race condition in the watchdog that was
accidentally introduced as far back as #1884: In specific circumstances,
the watchdog will attempt forwarding a message to the file worker after
the process has exited due to a changed header, but before the file
worker exiting has been noticed by the watchdog (which will then restart
the file worker). In this case, the watchdog would mark the file worker
as having crashed and not look at its exit code to restart the file
worker, but instead treat it like a crashed file worker that will only
be restarted when editing the file again. Not inspecting the exit code
of the file worker when it crashed from forwarding a message from the
file worker is necessary since we do not restart the file worker until
another notification from the client arrives, and so we would read the
same crash exit code over and over again in the main loop of the
watchdog if we did not remove it from our list of file workers that we
listen to.
- This PR resolves this issue by distinguishing between "crashes when
forwarding messages to the file worker" and "crashes when forwarding
messages from the file worker". In the former case, we still inspect the
exit code of the file worker and potentially restart it if the imports
changed, whereas in the latter case, we stop inspecting the exit code of
the file worker. This is correct because the latter case is exactly the
one where we need to stop inspecting the exit code but where a crash
cannot occur as a result of a changed header, whereas the former case is
exactly the one where we still need to inspect the exit code after a
crash to ensure that we restart the file worker in case it exited
because the header changed.
- At some point in the future, it would be nice to revamp the
concurrency model of the watchdog entirely now that we have all those
fancy concurrency primitives that were not available four years ago when
the watchdog was first written.
2. On an especially slow Windows machine, we found that starting the
language server would sometimes not succeed at all because reading from
the stdin pipe in the watchdog produced an EINVAL error, which was in
turn caused by an NT "pipe empty" error.
- After lots of debugging, @Kha found that Lake accidentally passes its
stdin to Git because it does not explicitly set the `stdin` field to
`null` when spawning the process.
- Changing this fixes the issue, which suggests that Git may mutate the
pipe we pass to it to be non-blocking, which then causes a "pipe empty"
error in the watchdog when we also attempt to read from that same pipe.
- I'm still very uncertain why we only saw this issue on one
particularly slow machine and not across the whole eco system.
This PR also resolves an issue where we would not correctly emit
messages that we received while the file worker is being restarted to
the corresponding file worker after the restart.
Closes#3786, closes#3787.
---------
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
The function `locationLinksFromDecl` could throw an error if the name it
is provided doesn't exist in the environment, which is possible if for
example an elaborator is a builtin.
Closes#3789
After each tactic step, we save the info tree created by it together
with an appropriate info tree context that makes it stand-alone (which
we already did before to some degree, see `Info.updateContext?`). Then,
in the adjusted request handlers, we first search for a snapshot task
containing the required position, if so wait on it, and if it yielded an
info tree, use it to answer the request, or else continue searching and
waiting, falling back to the full info tree, which should be unchanged
by this PR.
The definition header does *not* report info trees early as in general
it is not stand-alone in the tactic sense but may contain e.g.
metavariables solved by the body in which case we do want to show the
ultimate state as before. This could be refined in the future in case
there are no unsolved mvars.
The adjusted request handlers are exactly the ones waited on together by
the info view, so they all have to be adjusted to have any effect on the
UX. Further request handlers may be adjusted in the future.
No new tests as "replies early" is not something we can test with our
current framework but the existing test suite did help in uncovering
functional regressions.
Before, `pp.instantiateMVars` generally had no effect because most call
sites for the pretty printer instantiated metavariables first, but now
this functionality is entrusted upon the `pp.instantiateMVars` option.
This also has an effect in hovers, where metavariables can be unfolded
one assignment at a time. However, the goal state still sees all
metavariables instantiated due to the fact that the algorithm relies on
expression equality post-instantiation (see
`Lean.Widget.goalToInteractive`).
Closes#4406
This is the groundwork for a tactic index in generated documentation, as
there was in Lean 3. There are a few challenges to getting this to work
well in Lean 4:
* There's no natural notion of *tactic identity* - a tactic may be
specified by multiple syntax rules (e.g. the pattern-matching version of
`intro` is specified apart from the default version, but both are the
same from a user perspective)
* There's no natural notion of *tactic name* - here, we take the
pragmatic choice of using the first keyword atom in the tactic's syntax
specification, but this may need to be overridable someday.
* Tactics are extensible, but we don't want to allow arbitrary imports
to clobber existing tactic docstrings, which could become unpredictable
in practice.
For tactic identity, this PR introduces the notion of a *tactic
alternative*, which is a `syntax` specification that is really "the same
as" an existing tactic, but needs to be separate for technical reasons.
This provides a notion of tactic identity, which we can use as the basis
of a tactic index in generated documentation. Alternative forms of
tactics are specified using a new `@[tactic_alt IDENT]` attribute,
applied to the new tactic syntax. It is an error to declare a tactic
syntax rule to be an alternative of another one that is itself an
alternative. Documentation hovers now take alternatives into account,
and display the docs for the canonical name.
*Tactic tags*, created with the `register_tactic_tag` command, specify
tags that may be applied to tactics. This is intended to be used by
doc-gen and Verso. Tags may be applied using the `@[tactic_tag TAG1 TAG2
...]` attribute on a canonical tactic parser, which may be used in any
module to facilitate downstream projects introducing tags that apply to
pre-existing tactics. Tags may not be removed, but it's fine to
redundantly add them. The collection of tags, and the tactics to which
they're applied, can be seen using the `#print tactic tags` command.
*Extension documentation* provides a structured way to document
extensions to tactics. The resulting documentation is gathered into a
bulleted list at the bottom of the tactic's docstring. Extensions are
added using the `tactic_extension TAC` command. This can be used when
adding new interpretations of a tactic via `macro_rules`, when extending
some table or search index used by the tactic, or in any other way. It
is a command to facilitate its flexible use with various extension
mechanisms.
This came up when watching new Lean users in a class situation. A number
of them were confused when they omitted a namespace on a constructor
name, and Lean treated the variable as a pattern that matches anything.
For example, this program is accepted but may not do what the user
thinks:
```
inductive Tree (α : Type) where
| leaf
| branch (left : Tree α) (val : α) (right : Tree α)
def depth : Tree α → Nat
| leaf => 0
```
Adding a `branch` case to `depth` results in a confusing message.
With this linter, Lean marks `leaf` with:
```
Local variable 'leaf' resembles constructor 'Tree.leaf' - write '.leaf' (with a dot) or 'Tree.leaf' to use the constructor.
note: this linter can be disabled with `set_option linter.constructorNameAsVariable false`
```
Additionally, the error message that occurs when invalid names are
applied in patterns now suggests similar names. This means that:
```
def length (list : List α) : Nat :=
match list with
| nil => 0
| cons x xs => length xs + 1
```
now results in the following warning on `nil`:
```
warning: Local variable 'nil' resembles constructor 'List.nil' - write '.nil' (with a dot) or 'List.nil' to use the constructor.
note: this linter can be disabled with `set_option linter.constructorNameAsVariable false`
```
and error on `cons`:
```
invalid pattern, constructor or constant marked with '[match_pattern]' expected
Suggestion: 'List.cons' is similar
```
The list of suggested constructors is generated before the type of the
pattern is known, so it's less accurate, but it truncates the list to
ten elements to avoid being overwhelming. This mostly comes up with
`mk`.
This PR addresses some non-critical but annoying issues that sometimes
cause the language server to report an error:
- When using global search and replace in VS Code, the language client
sends `textDocument/didChange` notifications for documents that it never
told the server to open first. Instead of emitting an error and crashing
the language server when this occurs, we now instead ignore the
notification. Fixes#4435.
- When terminating the language server, VS Code sometimes still sends
request to the language server even after emitting a `shutdown` request.
The LSP spec explicitly forbids this, but instead of emitting an error
when this occurs, we now error requests and ignore all other messages
until receiving the final `exit` notification. Reported on Zulip several
times over the years but never materialized as an issue, e.g.
https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Got.20.60shutdown.60.20request.2C.20expected.20an.20.60exit.60.20notification/near/441914289.
- Some language clients attempt to reply to the file watcher
registration request before completing the LSP initialization dance. To
fix this, we now only send this request after the initialization dance
has completed. Fixes#3904.
---------
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
Extends Lean's incremental reporting and reuse between commands into
various steps inside declarations:
* headers and bodies of each (mutual) definition/theorem
* `theorem ... := by` for each contained tactic step, including
recursively inside supported combinators currently consisting of
* `·` (cdot), `case`, `next`
* `induction`, `cases`
* macros such as `next` unfolding to the above

*Incremental reuse* means not recomputing any such steps if they are not
affected by a document change. *Incremental reporting* includes the
parts seen in the recording above: the progress bar and messages. Other
language server features such as hover etc. are *not yet* supported
incrementally, i.e. they are shown only when the declaration has been
fully processed as before.
---------
Co-authored-by: Scott Morrison <scott.morrison@gmail.com>
To eliminate parsing differences between Windows and other platforms,
the frontend now normalizes all CRLF line endings to LF, like [in
Rust](https://github.com/rust-lang/rust/issues/62865).
Effects:
- This makes Lake hashes be faithful to what Lean sees (Lake already
normalizes line endings before computing hashes).
- Docstrings now have normalized line endings. In particular, this fixes
`#guard_msgs` failing multiline tests for Windows users using CRLF.
- Now strings don't have different lengths depending on the platform.
Before this PR, the following theorem is true for LF and false for CRLF
files.
```lean
example : "
".length = 1 := rfl
```
Note: the normalization will take `\r\r\n` and turn it into `\r\n`. In
the elaborator, we reject loose `\r`'s that appear in whitespace. Rust
instead takes the approach of making the normalization routine fail.
They do this so that there's no downstream confusion about any `\r\n`
that appears.
Implementation note: the LSP maintains its own copy of a source file
that it updates when edit operations are applied. We are assuming that
edit operations never split or join CRLFs. If this assumption is not
correct, then the LSP copy of a source file can become slightly out of
sync. If this is an issue, there is some discussion
[here](https://github.com/leanprover/lean4/pull/3903#discussion_r1592930085).
As [reported on
Zulip](https://leanprover.zulipchat.com/#narrow/stream/341532-lean4-dev/topic/Find.20references.20broken.20in.20lean.20core/near/437051935).
The `mainModuleName` was being set incorrectly when browsing lean core
sources, resulting in failure of cross-file server requests like "Find
References". Because the `srcSearchPath` is generated asynchronously, we
store it as a `Task Name` which is resolved some time before the header
is finished parsing. (I don't think the `.get` here will ever block,
because the srcSearchPath will be ready by the time the initial command
snap is requested.)
---------
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
While implementing #3925, I noticed that the performance of the
`textDocument/semanticTokens/full` request is *extremely* bad due to a
quadratic implementation. Specifically, on my machine, computing the
full semantic tokens for `Lean/Elab/Do.lean` took a full 5s. In
practice, this means that while elaborating the file, one core is
entirely busy with computing the semantic tokens for the file.
This PR fixes this performance bug by re-implementing the semantic token
handling, reducing the latency for `Lean/Elab/Do.lean` from 5s to 60ms.
As a result, the overly cautious refresh latency of 5s in #3925 can
easily be reduced to 2s again.
Since the previous semantic tokens implementation used a very brittle
hack to identify projections, this PR also changes the projection
notation elaboration to augment the `InfoTree` syntax for the field of a
projection with a special syntax node of kind
`Lean.Parser.Term.identProjKind`. With this syntax kind, projection
fields can now easily be identified in the `InfoTree`.
This fixes an issue where the completion would use info nodes before the
cursor for computing completions.
Fixes https://github.com/leanprover/lean4/issues/3462.
ToDo:
- [x] Fix test failures for completions that previously worked by
accident (cc: @Kha)
- [x] stage0 update
---------
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
This avoids printing the entire docstring for `⋯` when hovering over it,
which is rather long, and instead it gives a brief reason for omission
and what option to set to pretty print the omitted term.
The concrete dependency that is stale isn't really actionable
information for users (ideally we'd like something like "amount of
dependencies that will be rebuilt when you restart file"). This also
makes the diagnostic an "information" diagnostic so that non-infoview
users can still see it.
Since we are moving away from using notifications for stale dependency
information, we don't need to provide an ID anymore, either.
This coercion caused difficult-to-diagnose bugs sometimes. Because there
are some situations where converting a string to a name should be done
by parsing the string, and others where it should not, an explicit
choice seems better here.
---------
Co-authored-by: Mac Malone <tydeu@hatpress.net>
I forgot to use the sticky diagnostics in `getInteractiveDiagnostics` in
#3247, leading to them not consistently showing up in the "Messages"
panel of the InfoView.
This lead to incorrect diagnostic spans in the editor and resulted in
header errors that did not show up under "Messages" everywhere in the
file because the `fullRange?` property was missing.
Also changes the "Import out of date" warning diagnostic severity to
"Hint" so that it doesn't show up in the "Problems" view.
Sends a diagnostic informing the user to run Restart File when a file
dependency is saved.
Based on #3014 because this feature was easier to implement with the new
architecture.
ToDo:
- [x] Adjust vscode-lean4 to display a notification when this diagnostic
appears in a non-annoying way
(https://github.com/leanprover/vscode-lean4/pull/393)
- [x] Use a file watcher to identify changes to files not tracked by VS
Code
- [x] Rebase onto master when #3014 is merged
This reverts commit 4e3a8468c3 for PR
#3619. It looks like the CI in that commit didn't inform me that a test
was broken by the PR, so I managed to commit it despite the broken test.
- Add support for reserved declaration names. We use them for theorems
generated on demand.
- Equation theorems are not private declarations anymore.
- Generate equation theorems on demand when resolving symbols.
- Prevent users from creating declarations using reserved names. Users
can bypass it using meta-programming.
See next test for examples.
This PR fixes an issue where the file worker would not provide the
client with semantic tokens until the file had been elaborated
completely. The file worker now also tells the client to refresh its
semantic tokens after running "Restart File". This PR is based on #3271.
Fixes#1170.
This PR adds the module name to `RefIdent` in order to distinguish
conflicting names from different files. This also fixes related issues
in find-references or the call hierarchy feature.
It also adds some docstrings and stylistically refactors a bunch of
code.
This PR adds support for requests from the server to the client in the
language server. It is based on #3014 and was developed during an
experiment for #3247 that unfortunately did not go anywhere.
This bug is the real cause of leanprover/vscode-lean4#392.
At the end of a tactic state, the client calls
`getInteractiveDiagnostics` with a range `[last line of proof, last line
of proof + 1)`. The `fullRange` span of the `unresolved goals` error
however is something like `[(first line of proof, start character),
(last line of proof, nonzero end character)).
Since it operates on line numbers, `getInteractiveDiagnostics` would
then check whether `[last line of proof, last line of proof + 1)` and
`[first line of proof, last line of proof)` intersect, which is false
because of the excluded upper bound on the latter interval, despite the
fact that the end character in the last line may be nonzero.
This fix adjusts the intersection logic to use `[first line of proof,
last line of proof]` if the end character is nonzero.
Closesleanprover/vscode-lean4#392.
This PR enables import auto-completion to complete partial words in
imports.
Other inconsistencies that I've found in import completion already seem
to be fixed by #3014. Since it will be merged soon, there is no need to
invest time to fix these issues on master.
In v4.6.0, there was a significant regression in initial server startup
performance because the .ilean files got bigger in #3082 and we load the
information stored in all .ilean files synchronously when the server
starts up.
This PR makes this loading asynchronous. The trade-off is that requests
that are issued right after the initial server start when the references
are not fully loaded yet may yield incomplete results.
Benchmark for this in a separate PR soon after this one.
---------
Co-authored-by: Sebastian Ullrich <sebasti@nullri.ch>
This PR addresses several performance issues in the auto-completion
implementation. It also fixes a number of smaller bugs related to
auto-completion.
In a file with `import Mathlib`, the performance of various kinds of
completions has improved as follows:
- Completing `C`: 49000ms -> 1400ms
- Completing `Cat`: 14300ms -> 1000ms
- Completing `x.` for `x : Nat`: 3700ms -> 220ms
- Completing `.` for an expected type of `Nat`: 11000ms -> 180ms
The following bugs have been fixed as well:
- VS Code never used our custom completion order. Now, the server fuzzy
completion score decides the order that completions appear in.
- Dot auto-completion for private types did not work at all. It does
now.
- Completing `.<identifier>` (where the expected type is used to infer
the namespace) did not filter by the expected type and instead displayed
all matching constants in the respective namespace. Now, it uses the
expected type for filtering. Note that this is not perfect because
sub-namespaces are technically correct completions as well (e.g.
`.Foo.foobar`). Implementing this is future work.
- Completing `.` was often not possible at all. Now, as long as the `.`
is not used in a bracket (where it may be used for the anonymous lambda
feature, e.g. `(. + 1)`), it triggers the correct completion.
- Fixes#3228.
- The auto-completion in `#check` commands would always try to complete
identifiers using the full declaration name (including namespaces) if it
could be resolved. Now it simply uses the identifier itself in case
users want to complete this identifier to another identifier.
## Details
Regarding completion performance, I have more ideas on how to improve it
further in the future.
Other changes:
- The feature that completions with a matching expected type are sorted
to the top of the server-side ordering was removed. This was never
enabled in VS Code because it would use its own completion item order
and when testing it I found it to be more confusing than useful.
- In the server-side ordering, we would always display keywords at the
top of the list. They are now displayed according to their fuzzy match
score as well.
The following approaches have been used to improve performance:
- Pretty-printing the type for every single completion made up a
significant amount of the time needed to compute the completions. We now
do not pretty-print the type for every single completion that is offered
to the user anymore. Instead, the language server now supports
`completionItem/resolve` requests to compute the type lazily when the
user selects a completion item.
- Note that we need to keep the amount of properties that we compute in
a resolve request to a minimum. When the server receives the resolve
request, the document state may have changed from the state it was in
when the initial auto-completion request was received. LSP doesn't tell
us when it will stop sending resolve requests, so we cannot keep this
state around, as we would have to keep it around forever.
LSP's solution for this dilemma is to have servers send all the state
they need to compute a response to a resolve request to the client as
part of the initial auto completion response (which then sends it back
as part of the resolve request), but this is clearly infeasible for all
real language servers where the amount of state needed to resolve a
request is massive.
This means that the only practical solution is to use the current state
to compute a response to the resolve request, which may yield an
incorrect result. This scenario can especially occur when using
LiveShare where the document is edited by another person while cycling
through available completions.
- Request handlers can now specify a "header caching handler" that is
called after elaborating the header of a file. Request handlers can use
this caching handler to compute caches for information stored in the
header. The auto-completion uses this to pre-compute non-blacklisted
imported declarations, which in turn allow us to iterate only over
non-blacklisted imported declarations where we would before iterate over
all declarations in the environment. This is significant because
blacklisted declarations make up about 4/5 of all declarations.
- Dot completion now looks up names modulo private prefixes to figure
out whether a declaration is in the namespace of the type to the left of
the dot instead of first stripping the private prefix from the name and
then comparing it. This has the benefit that we do not need to scan the
full name in most cases.
This PR also adds a couple of regression tests for fixed bugs, but *no
benchmarks*. We will add these in the future when we add proper support
for benchmarking server interaction sessions to our benchmarking
architecture.
All tests that were broken by producing different completion output
(empty `detail` field, added `sortText?` and `data?` fields) have been
manually checked by me to be still correct before replacing their
expected output.