Commit graph

1694 commits

Author SHA1 Message Date
Joachim Breitner
2e06fb5008
perf: fuse fvar substitution into instantiateMVars (#12233)
This PR replaces the default `instantiateMVars` implementation with a
two-pass variant that fuses fvar substitution into the traversal,
avoiding separate `replace_fvars` calls for delayed-assigned MVars and
preserving sharing. The old single-pass implementation is removed
entirely.

The previous implementation had quadratic complexity when instantiating
expressions with long chains of nested delayed-assigned MVars. Such
chains arise naturally from repeated `intro`/`apply` tactic sequences,
where each step creates a new delayed assignment wrapping the previous
one. The new two-pass approach resolves the entire chain in a single
traversal with a fused fvar substitution, reducing this to linear
complexity.

### Terminology (used in this PR and in the source)

* **Direct MVar**: an MVar that is not delayed-assigned.
* **Pending MVar**: the direct MVar stored in a
`DelayedMetavarAssignment`.
* **Assigned MVar**: a direct MVar with an assignment, or a
delayed-assigned MVar with an assigned pending MVar.
* **MVar DAG**: the directed acyclic graph of MVars reachable from the
expression.
* **Resolvable MVar**: an MVar where all MVars reachable from it
(including itself) are assigned.
* **Updateable MVar**: an assigned direct MVar, or a delayed-assigned
MVar that is resolvable but not reachable from any other resolvable
delayed-assigned MVar.

In the MVar DAG, the updateable delayed-assigned MVars form a cut (the
**updateable-MVar cut**) with only assigned MVars behind it and no
resolvable delayed-assigned MVars before it.

### Two-pass architecture

**Pass 1** (`instantiate_direct_fn`): Traverses all MVars and
expressions reachable from the initial expression and instantiates all
updateable direct MVars (updating their assignment with the result),
instantiates all level MVars, and determines if there are any updateable
delayed-assigned MVars.

**Pass 2** (`instantiate_delayed_fn`): Only run if pass 1 found
updateable delayed-assigned MVars. Has an **outer** and an **inner**
mode, depending on whether it has crossed the updateable-MVar cut.

In outer mode (empty fvar substitution), all MVars are either unassigned
direct MVars (left alone), non-updateable delayed-assigned MVars
(pending MVar traversed in outer mode and updated with the result), or
updateable delayed-assigned MVars. When a delayed-assigned MVar is
encountered, its MVar DAG is explored (via `is_resolvable_pending`) to
determine if it is resolvable (and thus updateable). Results are cached
across invocations.

If it is updateable, the substitution is initialized from its arguments
and traversal continues with the value of its pending MVar in inner
mode. In inner mode (non-empty substitution), all encountered
delayed-assigned MVars are, by construction, resolvable but not
updateable. The substitution is carried along and extended as we cross
such MVars. Pending MVars of these delayed-assigned MVars are NOT
updated with the result (as the result is valid only for this
substitution, not in general).

Applying the substitution in one go, rather than instantiating each
delayed-assigned MVar on its own from inside out, avoids the quadratic
overhead of that approach when there are long chains of delayed-assigned
MVars.

**Write-back behavior**: Pass 2 writes back the normalized pending MVar
values of delayed-assigned MVars above the updateable-MVar cut (the
non-resolvable ones whose children may have been resolved). This is
exactly the right set: these MVars are visited in outer mode, so their
normalized values are suitable for storing in the mctx. MVars below the
cut are visited in inner mode, so their intermediate values cannot be
written back.

### Pass 2 scope-tracked caching

A `scope_cache` data structure ensures that sharing is preserved even
across different delayed-assigned MVars (and hence with different
substitutions), when possible. Each `visit_delayed` call pushes a new
scope with fresh fvar bindings. The cache correctly handles cross-scope
reuse, fvar shadowing, and late-binding via generation counters and
scope-level tracking.

The `scope_cache` has been formally verified:
`tests/elab/scopeCacheProofs.lean` contains a complete Lean proof that
the lazy generation-based implementation refines the eager
specification, covering all operations (push, pop, lookup, insert)
including the rewind lazy cleanup with scope re-entry and degradation.
The key correctness invariant is inter-entry gen list consistency
(GensConsistent), which, unlike per-entry alignment with `currentGens`,
survives pop+push cycles.

### Behavioral differences from original `instantiateMVars`

The implementation matches the original single-pass `instantiateMVars`
behavior with one cosmetic difference: the new implementation
substitutes fvars inline during traversal rather than constructing
intermediate beta-redexes, producing more beta-reduced terms in some
edge cases. This changes the pretty-printed output for two elab tests
(`1179b`, `depElim1`) but all terms remain definitionally equal.

### Tests

Correctness and performance tests for the new implementation were added
in #12808.

### Files

- `src/library/instantiate_mvars.cpp` — C++ implementation of both
passes (replaces `src/kernel/instantiate_mvars.cpp`)
- `src/library/scope_cache.h` — scope-aware cache data structure
- `src/Lean/MetavarContext.lean` — exported accessors for
`DelayedMetavarAssignment` fields
- `tests/elab/scopeCacheProofs.lean` — formal verification of
`scope_cache` correctness
- `tests/elab/1179b.lean.out.expected`,
`tests/elab/depElim1.lean.out.expected` — updated expected output

Co-authored-by: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 17:05:21 +00:00
Kyle Miller
d8accf47b3
chore: use terminology "non-recursive structure" instead of "struct-like" (#12749)
This PR changes "structure-like" terminology to "non-recursive
structure" across internal documentation, error messages, the
metaprogramming API, and the kernel, to clarify Lean's type theory. A
*structure* is a one-constructor inductive type with no indices — these
can be created by either the `structure` or `inductive` commands — and
are supported by the primitive `Expr.proj` projections. Only
*non-recursive* structures have an eta conversion rule. The PR
description contains the APIs that were renamed.

Addresses RFC #5891, which proposed this rename. The change is motivated
by the need to distinguish between `structure`-defined structures,
structures, and non-recursive structures. Especially since #5783, which
enabled the `structure` command to define recursive structures,
"structure-like" has been easy to misunderstand.

Changes:
- Kernel: `is_structure_like()` -> `is_non_rec_structure()`
- `Lean.isStructureLike` -> `Lean.isNonRecStructure`
- `Lean.matchConstStructLike` -> `Lean.matchConstNonRecStructure`
- `Lean.getStructureLikeCtor?` -> `Lean.getNonRecStructureCtor?`
- `Lean.getStructureLikeNumFields` -> `Lean.getNonRecStructureNumFields`
- `Lean.Expr.proj`: extended and corrected documentation (note: despite
the fact that not every projection can be written as a recursor
application, I left in this claim since it seems good to document a
more-restrictive specification, and some users have requested the kernel
be more restrictive in this way)

Closes #5891
2026-03-09 03:44:38 +00:00
Joachim Breitner
6ebe573c19
fix: kernel: move level parameter count and thm-is-prop checks for robustness (#12817)
This PR moves the universe-level-count check from
`unfold_definition_core` into `is_delta`, establishing the invariant
that if `is_delta` succeeds then `unfold_definition` also succeeds. This
prevents a crash (SIGSEGV or garbled error) that occurred when call
sites in `lazy_delta_reduction_step` unconditionally dereferenced the
result of `unfold_definition` even on a level-parameter-count mismatch.

Additionally, moves the `is_prop` check for theorem types in
`add_theorem` to occur after `check_constant_val`, so the type is
verified to be well-formed before `is_prop` evaluates it. This prevents
`is_prop` from being called on an ill-typed term when a malformed
theorem declaration is supplied.

Fixes #10577.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nomeata <148037+nomeata@users.noreply.github.com>
2026-03-05 17:03:01 +00:00
Leonardo de Moura
d1514f3cec
perf: cache unfold_definition in the kernel (#12259)
This PR ensures we cache the result of `unfold_definition` definition in
the kernel type checker. We used to cache this information in a thread
local storage, but it was deleted during the Lean 3 to Lean 4
transition.
2026-01-31 03:44:50 +00:00
Garmelon
6dcd6c8f08
chore: reformat all cmake files (#12218)
The script to run for reformatting is `script/fmt`.
2026-01-28 18:23:08 +00:00
Sebastian Ullrich
f47dfe9e7f
perf: Options.hasTrace (#12001)
Drastically speeds up `isTracingEnabledFor` in the common case, which
has evolved from "no options set" to "`Elab.async` and probably some
linter options set but no `trace`".

## Breaking changes

`Lean.Options` is now an opaque type. The basic but not all of the
`KVMap` API has been redefined on top of it.
2026-01-16 09:03:40 +00:00
Sebastian Ullrich
b81608d0d9
perf: use lean::unordered_map/set everywhere (#11957) 2026-01-12 17:14:09 +00:00
Henrik Böving
34d619bf93
perf: use lean::unordered_set for expr_eq_fn (#11731)
This PR makes the cache in expr_eq_fn use mimalloc for a small
performance win across the board.
2025-12-18 14:24:50 +00:00
Sebastian Ullrich
6eeb215e8f
chore: CI: enable leak sanitizer again (#11339) 2025-11-27 18:32:35 +00:00
Markus Himmel
cf871a892c
chore: use String.ofList instead of String.mk in elaborator+kernel (#11048)
This PR is a follow-up to #11017, preparing for the eventual removal of
the `String.mk` function.
2025-11-01 14:44:16 +00:00
Eric Wieser
08bc333705
perf: mark move constructors and assignment operators as noexcept (#10784)
Detected by
https://clang.llvm.org/extra/clang-tidy/checks/performance/noexcept-move-constructor.html.
This ensures constructions like `std::vector<object_ref>` call these
operators instead of the copy ones, and do not do extra refcounting.

Note that `optional` and `atomic` need something more complex using
`noexcept()`, as they are templated.
2025-10-22 14:21:51 +00:00
Henrik Böving
52b1b342ab
feat: zero cost BaseIO (#10625)
This PR implements zero cost `BaseIO` by erasing the `IO.RealWorld`
parameter from argument lists and structures. This is a **major breaking
change for FFI**.

Concretely:
- `BaseIO` is defined in terms of `ST IO.RealWorld`
- `EIO` (and thus `IO`) is defined in terms of `EST IO.RealWorld`
- The opaque `Void` type is introduced and the trivial structure
optimization updated to account for it. Furthermore, arguments of type
`Void s` are removed from the argument lists of the C functions.
- `ST` is redefined as `Void s -> ST.Out s a` where `ST.Out` is a pair
of `Void s` and `a`

This together has the following major effects on our generated code:
- Functions that return `BaseIO`/`ST`/`EIO`/`IO`/`EST` now do not take
the dummy world parameter anymore. To account for this FFI code needs to
delete the dummy world parameter from the argument lists.
- Functions that return `BaseIO`/`ST` now return their wrapped value
directly. In particular `BaseIO UInt32` now returns a `uint32_t` instead
of a `lean_object*`. To account for this FFI code might have to change
the return type and does not need to call `lean_io_result_mk_ok` anymore
but can instead just `return` values right away (same with extracting
values from `BaseIO` computations.
- Functions that return `EIO`/`IO`/`EST` now only return the equivalent
of an `Except` node which reduces the allocation size. The
`lean_io_result_mk_ok`/`lean_io_result_mk_error` functions were updated
to account for this already so no change is required.

Besides improving performance by dropping allocation (sizes) we can now
also do fun new things such as:
```lean
@[extern "malloc"]
opaque malloc (size : USize) : BaseIO USize
```
2025-10-22 10:55:12 +02:00
Henrik Böving
bd0b91de07
perf: reduce amount of symbols in DLLs (#10864)
This PR reduces the amount of symbols in our DLLs by cutting open a
linking cycle of the shape:

`Environment -> Compiler -> Meta -> Environment`

This is achieved by introducing a dynamic call to the compiler hidden
behind a `Ref` as previously
done in the pretty printer.
2025-10-21 09:00:56 +00:00
Markus Himmel
68409ef6fd
chore: turn some crashes into errors (#8402)
This PR prevents some nonsensical code from crashing the server.

Specifically, the kernel is changed to
- properly check that passed expressions do not contain loose bvars,
which could lead to a segmentation fault on a well-crafted input
(discovered through fuzzing), and
- check that constants generated when creating a new inductive type do
not overwrite each other, which could lead to the kernel taking
something out of the environment and then casting it to something it
isn't.

Partially addresses #8258, but let's keep that one open until the error
message is a little better.

Fixes #10492.
2025-09-24 13:04:18 +00:00
Leonardo de Moura
72bb7cf364
fix: infer_let in the kernel (#10476)
This PR fixes the dead `let` elimination code in the kernel's
`infer_let` function.

Closes #10475
2025-09-20 16:26:46 +00:00
Markus Himmel
cf8ffc28d3
chore: kernel changes ahead of String redefinition (#10330)
This PR changes the defeq algorithm to perform `whnf` on the `String.mk`
expression it creates for string literals.

This is currently a no-op, but will no longer be one once `String` is
redefined so that `String.mk` is a regular function instead of a
constructor.
2025-09-17 09:12:07 +00:00
Leonardo de Moura
6e18afac8c
feat: kernel hint for proof-by-reflection (#9865)
This PR adds improved support for proof-by-reflection to the kernel type
checker. It addresses the performance issue exposed by #9854. With this
PR, whenever the kernel type-checks an argument of the form `eagerReduce
_`, it enters "eager-reduction" mode. In this mode, the kernel is more
eager to reduce terms. The new `eagerReduce _` hint is often used to
wrap `Eq.refl true`. The new hint should not negatively impact any
existing Lean package.

---------

Co-authored-by: Joachim Breitner <mail@joachim-breitner.de>
2025-08-12 19:24:47 +00:00
Eric Wieser
232443371b
perf: add missing std::moves (#9107)
Continues from #4700.

This will save a handful of refcounts here and there.
2025-07-01 12:39:12 +00:00
Kyle Miller
cdc923167e
feat: add the nondep field of Expr.letE to the C++ data model (#8751)
This PR adds the `nondep` field of `Expr.letE` to the C++ data model.
Previously this field has been unused, and in followup PRs the
elaborator will use it to encode `have` expressions (non-dependent
`let`s). The kernel does not verify that `nondep` is correctly applied
during typechecking. The `letE` delaborator now prints `have`s when
`nondep` is true, though `have` still elaborates as `letFun` for now.
Breaking change: `Expr.updateLet!` is renamed to `Expr.updateLetE!`.

This PR also fixes a bug in `Expr.letFun?` and `Expr.letFunAppArgs?`
when the body is not a lambda. In any case, these functions will be
removed once the `Expr.letE (nondep := true)` encoding of `have`
expressions is complete.
2025-06-14 23:10:27 +00:00
Leonardo de Moura
837193b5ec
fix: block potential adversarial exploit of non-aborting assert! (#8560)
This PR is similar to #8559 but for `Expr.mkData`. This vulnerability
has not been exploited yet, but adversarial users may find a way.
2025-05-31 03:14:01 +00:00
Leonardo de Moura
6940d2c4ff
fix: block adversarial exploit of non-aborting assert! (#8559)
This PR fixes an adversarial soundness attack described in #8554. The
attack exploits the fact that `assert!` no longer aborts execution, and
that users can redirect error messages.
Another PR will implement the same fix for `Expr.Data`.
2025-05-31 00:08:30 +00:00
Arthur Adjedj
1138062a70
fix: normalize imax 1 u to u (#7631)
This PR fixes `Lean.Level.mkIMaxAux` (`mk_imax` in the kernel) such that
`imax 1 u` reduces to `u`.

Closes #7096
2025-05-21 20:27:53 +00:00
Paul Reichert
57915af218
fix: reducing Nat.pow, kernel interprets constant as Nat literal (#8060)
This PR fixes a bug in the Lean kernel. During reduction of `Nat.pow`,
the kernel did not validate that the WHNF of the first argument is a
`Nat` literal before interpreting it as an `mpz` number. This PR adds
the missing check.

### Explanation

In `type_checker::reduce_pow`, an expression was interpreted as a `Nat`
literal without previously validating that it actually was a `Nat`
literal.

We (@TwoFX and me) noticed this while fuzzing the Lean kernel with GMP
and Mimalloc disabled. Until now, the fuzzer found one crash, leading us
to this issue.

What are the consequences? If GMP is disabled, the Lean kernel will
crash on some inputs after the memory allocator returns `null`. (MPZ
tries to clone the `.const` expression in disguise of a `Nat` literal
which accidentally has a size field indicating that the number has 88
trillion `mpz` digits. This is too much for every allocator.) If GMP is
enabled, it is possible to [prove
`False`](https://live.lean-lang.org/#codez=JYWwDg9gTgLgBAGQKYEMB2AoDATJAzOAcwDoBvAVwF84AuOABQFU1gYyq5AkwjgDkV4aAXjh5yaOAH04ggHxwIYJOIDCAGxQBnDcADGKVXGDjgBACoBPRdLgWrMABZK4ABjhJVGpC6wBaH3AApcg14AHcoViNCOAADPjZIULgACnjiRLgARlcADgBKOABWZwAmGLhQiHJVbDh9DQgK6ABrAG5YtIzU/nSIJOy4fKLnAGZyyursNAByGBx8G1pefi4GKAVaYVFxAA9pORM4PeFXBycAMXqvd08bKHIkX39TRzhmpCg0dzgoJGxyHRIDRwBzAYEwRooOBocggABGHzgCJgoSQTmyAD0cqMSnU0LVMdiACw5eYEegAeQA6gBCTbLBJ9FIkUjOaiAC/JAJfkBUyWHcKDhcAARABiIwAKyQOhgEjhKGwEjA6wgeCFSx0EBAIHQtVkcGwEAwcDgqiQ8FwOgMdGQ6GIABEpeooPxgBBxEI4MRcHg0A7LXBSEbjcG0CgQF4PTEQOYHCAADRB4NwexGGDAj3EX6EaooKAuBNJ40aFB4M3menEYulguFmCWCPCZLEFBgMApYgatAhWKmOAAbQAugUm53uzFKbT+0O8jWk6aAG7uei5sPp4SD2fB+f6B70kduseme5IYip9ZTvKJuCUIM2tDEACi6jhxGUmu1+OIqhMMDfvwAsikd7Ntg2B+gYFqqJeGB+HAyjOhojjAocADi/70IYwLYGCAqmtgGBimgkrSrK8qKsqeBYGc0BICARASEgACOEgAF4fI0dAshw4gnPScLmEGYh4BANREEGGhgN+8AADytHIUB4KoVGODRdGIX0Eh4FcSyXB4DZIgJxo6PY6CEF4vbdIySSuJkl7GlASR9oACYT0UxrHsQOQZIDsKDSnA0axhgQA)
because the kernel doesn't crash on a memory allocation and instead just
happily interprets the `.const` expression as a GMP number.

Importantly, this is _not_ a flaw in Lean's type theory. It is an
implementation bug in the built-in kernel, related to the efficient
reduction of `Nat.pow`, that will be fixed with this PR; see the test
file. Because Lean's kernel is relatively small, there are third-party
kernel implementations such as `lean4lean` and `nanoda`. `lean4lean`
catches the bogus proof, and looking at its code `nanoda` will, too, but
I haven't tried it yet.
2025-04-23 13:55:20 +00:00
Sebastian Ullrich
5cd352588c
perf: use mimalloc with important C++ hash maps (#7868)
`unordered_map`/`unordered_set` does an allocation per insert, use
mimalloc for them for important hash maps
2025-04-11 16:23:33 +00:00
Joachim Breitner
dd91d7e2e2
fix: bv_omega to use -implicitDefEqProofs (#7387)
This PR uses `-implicitDefEqProofs` in `bv_omega` to ensure it is not
affected by the change in #7386.

---------

Co-authored-by: Leonardo de Moura <leomoura@amazon.com>
2025-03-09 00:13:14 +00:00
Sebastian Ullrich
087f0b4a69
perf: optimize sorry detection in unused variables linter (#7129)
This PR optimizes the performance of the unused variables linter in the
case of a definition with a huge `Expr` representation
2025-02-22 16:43:39 +00:00
Sebastian Ullrich
3770808b58
feat: split Lean.Kernel.Environment from Lean.Environment (#5145)
This PR splits the environment used by the kernel from that used by the
elaborator, providing the foundation for tracking of asynchronously
elaborated declarations, which will exist as a concept only in the
latter.

Minor changes:
* kernel diagnostics are moved from an environment extension to a direct
environment as they are the only extension used directly by the kernel
* `initQuot` is moved from an environment header field to a direct
environment as it is the only header field used by the kernel; this also
makes the remaining header immutable after import
2025-01-18 18:42:57 +00:00
Joachim Breitner
bf13b24692
doc: refine kernel code comments (#6150)
I just spent too much time being confused about the kernel type checker
until I noticed that `lazy_delta_reduction` modifies its arguments.
2024-11-23 17:13:51 +00:00
Sebastian Ullrich
748f0d6c15
fix: instantiateMVars slowdown in the language server (#5805)
Fixes #5614
2024-10-25 09:35:41 +00:00
Leonardo de Moura
82d31a1793
perf: has_univ_mvar, has_univ_mvar, and has_fvar in C++ (#5793)
`instantiate_mvars` is now implemented in C/C++, and makes many calls to
`has_fvar`, `has_mvar`. The new C/C++ implementations are inlined and
avoid unnecessary RC inc/decs.
2024-10-21 16:56:30 +00:00
Joachim Breitner
76164b284b
fix: RecursorVal.getInduct to return name of major argument’s type (#5679)
Previously `RecursorVal.getInduct` would return the prefix of the
recursor’s name, which is unlikely the right value for the “derived”
recursors in nested recursion. The code using `RecursorVal.getInduct`
seems to expect the name of the inductive type of major argument here.

If we return that name, this fixes #5661.

This bug becomes more visible now that we have structural mutual
recursion.

Also, to avoid confusion, renames the function to ``getMajorInduct`.
2024-10-21 08:45:18 +00:00
Joachim Breitner
e417a2331c
feat: expose Kernel.check for debugging purposes (#5412)
along `Kernel.isDefEq` and `Kernel.whnf`.
2024-10-01 21:28:02 +00:00
euprunin
cda6733f97
chore: fix spelling mistakes in non-Lean files (#5430)
Co-authored-by: euprunin <euprunin@users.noreply.github.com>
2024-09-23 21:11:20 +00:00
Joachim Breitner
8f899bf5bd
doc: code comments about reflection support (#5235)
I found that the kernel has special support for `e =?= true`, and will
in this case aggressively whnf `e`. This explains the following behavior
(for a `sqrt` function with fuel):

```lean
theorem foo : sqrt 100000000000000000002 == 10000000000 := rfl       -- fast
theorem foo : sqrt 100000000000000000002 =  10000000000 := rfl       -- slow
theorem foo : sqrt 100000000000000000002 =  10000000000 := by decide -- fast
```

The special support in the kernel only applies for closed `e` and `true`
on the RHS. It could be generlized (also open terms, also `false`, other
data type's constructors, different orientation). But maybe I should
wait for evidence that this generaziation really matters, or whether
all applications (proof by reflection) can be made to have this form.
2024-09-10 06:36:38 +00:00
Leonardo de Moura
e9e858a448
chore: use Expr.numObjs instead of lean_expr_size_shared (#5239)
Remark: declarations like `sizeWithSharing` must be in `IO` since they
are not functions.

The commit also uses the more efficient `ShareCommon.shareCommon'`.
2024-09-02 21:26:00 +00:00
Sebastian Ullrich
2117b89cd5
feat: pp.exprSizes debugging option (#5218) 2024-09-02 07:29:23 +00:00
Sebastian Ullrich
c5114c971a fix: Windows needs more LEAN_EXPORTs 2024-08-12 14:14:42 +02:00
Leonardo de Moura
89c3079072
chore: fix typo in hash code for Expr equality test (#4990)
We observed a small performance improvement at
https://github.com/leanprover/LNSym/blob/proof_size_expt/Proofs/SHA512/Experiments/Sym30.lean
Before: 2.65s
After: 2.60s
2024-08-12 00:47:08 +00:00
Leonardo de Moura
2436562d57
fix: regular mvar assignments take precedence over delayed ones (#4987)
closes #4920
2024-08-12 00:14:38 +00:00
Leonardo de Moura
3c5ac9496f
perf: expr_eq_fn (#4934)
This PR contains a collection of small optimizations that improve the
equality test for `Expr`.
2024-08-07 00:02:14 +00:00
Leonardo de Moura
a27d4a9519
chore: reduce stack space usage at instantiate_mvars_fn (#4931) 2024-08-06 17:38:59 +00:00
Leonardo de Moura
4a2fb6e922
chore: fix typo (#4929) 2024-08-06 15:27:20 +00:00
Leonardo de Moura
8e476e9d22
perf: instantiateExprMVars (#4915)
TODO: 
- Support for `zeta := true` at `apply_beta`.
- Investigate test failure. 
- Break PR in pieces because of bootstrapping issues. The current PR
updates a stage0 file to workaround the issue.

Motivation: significant performance improvement at
https://github.com/leanprover/LNSym/blob/proof_size_expt/Proofs/SHA512/Experiments/Sym30.lean

With M1 Pro:
- Before: 4.56 secs
- After: 3.16 secs

Successfully built stage2 using this PR
2024-08-05 17:15:22 +00:00
Leonardo de Moura
1e9d96be22
perf: add lean_instantiate_level_mvars (#4910)
The new code is not active yet because of bootstrapping issues.
It requires an `update_stage0`.
2024-08-04 18:31:44 +00:00
Clement Courbet
9c4028aab4
perf: avoid expr copies in replace_rec_fn::apply (#4702)
Those represent ~13% of the time spent in `save_result`,
even though `r` is a temporary in all cases but one.

See #4698 for details.

---------

Co-authored-by: Leonardo de Moura <leomoura@amazon.com>
2024-08-02 18:32:36 +00:00
Clement Courbet
2c002718e0
perf: fix implementation of move constructors and move assignment ope… (#4700)
…rators

Right now those constructors result in a copy instead of the desired
move. We've measured that expr copying and assignment by itself uses
around 10% of total runtime on our workloads.

See #4698 for details.
2024-08-02 17:55:03 +00:00
Leonardo de Moura
a856016b9d
perf: precise cache for expr_eq_fn (#4890)
This performance issue was exposed by the benchmarks at
https://github.com/leanprover/LNSym/tree/proof_size_expt/Proofs/SHA512/Experiments
2024-08-01 02:56:41 +00:00
Leonardo de Moura
d0bc4e4245
fix: replace_fn.cpp (#4798) 2024-07-19 21:20:43 -07:00
Leonardo de Moura
3477b0e7f6
fix: for_each_fn.cpp (#4797) 2024-07-20 03:22:56 +00:00
Leonardo de Moura
726e162527
perf: kernel replace with precise cache (#4796)
Changes:

- We avoid the thread local storage.
- We use a hash map to ensure that cached values are not lost.
- We remove `check_system`. If this becomes an issue in the future we
should precompute the remaining amount of stack space, and use a cheaper
check.
- We add a `Expr.replaceImpl`, and will use it to implement
`Expr.replace` after update-stage0
2024-07-20 02:00:29 +00:00