Commit graph

1184 commits

Author SHA1 Message Date
Henrik Böving
57afb23c5c
fix: compilation of projections on non trivial structures (#11340)
This PR fixes a miscompilation when encountering projections of non
trivial structure types.

Closes: #11322
2025-11-24 19:25:03 +00:00
Markus Himmel
fa67f300f6
chore: rename String.ValidPos to String.Pos (#11240)
This PR renames `String.ValidPos` to `String.Pos`, `String.endValidPos`
to `String.endPos` and `String.startValidPos` to `String.startPos`.

Accordingly, the deprecations of `String.Pos` to `String.Pos.Raw` and
`String.endPos` to `String.rawEndPos` are removed early, after an
abbreviated deprecation cycle of two releases.
2025-11-24 16:40:21 +00:00
Henrik Böving
80224c72c9
perf: improve specializer cache keys (#11310)
This PR makes the specializer (correctly) share more cache keys across
invocations, causing us to produce less code bloat.

We observed that in functions with lots of specialization, sometimes
cache keys are defeq but not BEq because one has unused let decls
(introduced by specialization) that the other doesn't. This PR resolves
this conflict by erasing unused let decls from specializer cache keys.
2025-11-21 23:21:40 +00:00
Joachim Breitner
4288aa71e0
chore: do not set unused Option.Decl.group (#11307)
This PR removes all code that sets the `Option.Decl.group` field, which
is unused and has no clearly documented meaning.

The actual removal of the field would be #11305.
2025-11-21 16:44:38 +00:00
Joachim Breitner
63bd0b5e77
refactor: introduce Match.altInfos (#11256)
This PR replaces `MatcherInfo.numAltParams` with a more detailed data
structure that allows us, in particular, to distinguish between an
alternative for a constructor with a `Unit` field and the alternative
for a nullary constructor, where an artificial `Unit` argument is
introduced.
2025-11-19 15:09:17 +00:00
Markus Himmel
52d05b6972
refactor: use String.split instead of String.splitOn or String.splitToList (#11250)
This PR introduces a function `String.split` which is based on
`String.Slice.split` and therefore supports all pattern types and
returns a `Std.Iter String.Slice`.

This supersedes the functions `String.splitOn` and `String.splitToList`,
and we remove all all uses of these functions from core. They will be
deprecated in a future PR.

Migrating from `String.splitOn` and `String.splitToList` is easy: we
introduce functions `Iter.toStringList` and `Iter.toStringArray` that
can be used to conveniently go from `Std.Iter String.Slice` to `List
String` and `Array String`, so for example `s.splitOn "foo"` can be
replaced by `s.split "foo" |>.toStringList`.
2025-11-19 09:35:19 +00:00
Mac Malone
5bb9839887
fix: symbol clashes between packages (#11082)
This PR prevents symbol clashes between (non-`@[export]`) definitions
from different Lean packages.

Previously, if two modules define a function with the same name and were
transitively imported (even privately) by some downstream module,
linking would fail due to a symbol clash. Similarly, if a user defined a
symbol with the same name as one in the `Lean` library, Lean would use
the core symbol even if one did not import `Lean`.

This is solved by changing Lean's name mangling algorithm to include an
optional package identifier. This identifier is provided by Lake via
`--setup` when building a module. This information is weaved through the
elaborator, interpreter, and compiler via a persistent environment
extension that associates modules with their package identifier.

With a package identifier, standard symbols have the form
`lp_<pkg-id>_<mangled-def>`. Without one, the old scheme is used (i.e.,
`l_<mangled-def>`). Module initializers are also prefixed with package
identifier (if any). For example, the initializer for a module `Foo` in
a package `test` is now `initialize_test_Foo` (instead of
`initialize_Foo`). Lake's default for native library names has also been
adjusted accordingly, so that libraries can still, by default, be used
as plugins. Thus, the default library name of the `lean_lib Foo` in
`package test` will now be `libtest_Foo`.

When using Lake to build the Lean core (i.e., `bootstrap = true`), no
package identifier will be used. Thus, definitions in user packages can
never have symbol clashes with core.

Closes #222.
2025-11-19 02:24:44 +00:00
Markus Himmel
fa5d08b7de
refactor: use String.Slice in String.take and variants (#11180)
This PR redefines `String.take` and variants to operate on
`String.Slice`. While previously functions returning a substring of the
input sometimes returned `String` and sometimes returned
`Substring.Raw`, they now uniformly return `String.Slice`.

This is a BREAKING change, because many functions now have a different
return type. So for example, if `s` is a string and `f` is a function
accepting a string, `f (s.drop 1)` will no longer compile because
`s.drop 1` is a `String.Slice`. To fix this, insert a call to `copy` to
restore the old behavior: `f (s.drop 1).copy`.

Of course, in many cases, there will be more efficient options. For
example, don't write `f <| s.drop 1 |>.copy |>.dropEnd 1 |>.copy`, write
`f <| s.drop 1 |>.dropEnd 1 |>.copy` instead. Also, instead of `(s.drop
1).copy = "Hello"`, write `s.drop 1 == "Hello".toSlice` instead.
2025-11-18 16:13:48 +00:00
Henrik Böving
bef8574b93
fix: be more careful when recording cases in the compiler (#11210)
This PR fixes a bug in the LCNF simplifier unearthed while working on
#11078. In some situations caused by `unsafeCast`, the simplifier would
record incorrect information about `cases`, leading to further bugs down
the line.

Suppose we have `v : NonScalar` due to an `unsafeCast` and we run
`cases` on it, expecting `Prod.mk fst snd`. The current code attempts to
record both the arguments from the constructor application in the case
arm `fst`, `snd` and the parameters for the type by inspecting the discr
`v`. However, `NonScalar` does of course not have any parameters,
causing the simplifier to record wrong information. This patch makes the
`cases` infrastructure more cautious when extracting information from
the type of `v`.
2025-11-17 11:34:16 +00:00
Rob23oba
eba5a5a6ef
fix: consider over-applications in reduceArity compiler pass (#11185)
This PR fixes the `reduceArity` compiler pass to consider
over-applications to functions that have their arity reduced.
Previously, this pass assumed that the amount of arguments to
applications was always the same as the number of parameters in the
signature. This is usually true, since the compiler eagerly introduces
parameters as long as the return type is a function type, resulting in a
function with a return type that isn't a function type. However, for
dependent types that sometimes are function types and sometimes not,
this assumption is broken, resulting in the additional parameters to be
dropped.

Closes #11131
2025-11-17 07:51:37 +00:00
Sebastian Ullrich
ed34ee0cd5
chore: make declMetaExt persistent for shake (#11201) 2025-11-16 20:11:56 +00:00
Markus Himmel
bf60550ce5
chore: rename Substring to Substring.Raw (#11154)
This PR renames `Substring`  to `Substring.Raw`.

This is to signify its status as a second-class citizen (not deprecated,
but no real plans for verification, like `String.Pos.Raw`) and to free
up the name `Substring` for a possible future type `String.Substring :
String -> Type` so that `s.Substring` is the type of substrings of `s`.

The functions `String.toSubstring` and `String.toSubstring'` will remain
for now for bootstrapping reasons.
2025-11-16 09:30:04 +00:00
Sebastian Ullrich
5011b7bd89
chore: make compilation type mismatch error message from non-exposed defs a lot less mysterious (#11177) 2025-11-14 10:50:43 +00:00
Sebastian Ullrich
4602586b6a
chore: suggest public meta import on phase check failure, which is more likely to be the correct variant (#11173) 2025-11-14 10:10:04 +00:00
Markus Himmel
2c2fcff4f8
refactor: do not use String.Iterator (#11127)
This PR removes all uses of `String.Iterator` from core, preferring
`String.ValidPos` instead.

In an upcoming PR, `String.Iterator` will be renamed to
`String.Legacy.Iterator`.
2025-11-11 11:46:58 +00:00
Joachim Breitner
d41f39fb10
perf: sparse case splitting in match compilation (#10823)
This PR lets the match compilation procedure use sparse case analysis
when the patterns only match on some but not all constructors of an
inductive type. This way, less code is produce. Before, code handling
each of the other cases was then optimized and commoned-up by later
compilation pipeline, but that is wasteful to do.

In some cases this will prevent Lean from noticing that a match
statement is complete
because it performs less case-splitting for the unreachable case. In
this case, give explicit
patterns to perform the deeper split with `by contradiction` as the
right-hand side.

At least temporarily, there is also the option to disable this behaviour
with
```
set_option backwards.match.sparseCases false
```
2025-11-06 13:46:35 +00:00
Joachim Breitner
0cb79868f4
feat: sparse casesOn constructions (#11072)
This PR adds “sparse casesOn” constructions. They are similar to
`.casesOn`, but have arms only for some constructors and a catch-all
(providing `t.ctorIdx ≠ 42` assumptions). The compiler has native
support for these constructors and now (because of the similarity) also
the per-constructor elimination principles.
2025-11-05 15:49:11 +00:00
Sebastian Ullrich
18131de438
fix: evalConst meta check and auxiliary IR decls (#11079)
Uncovered in Mathlib through new boxed decls from `BaseIO` changes
2025-11-04 21:29:49 +00:00
Sebastian Ullrich
e4fb780f8a
perf: remove unused argument to ExternEntry.opaque (#11066)
This used to create quite a few unique objects in public .olean
2025-11-03 17:26:32 +00:00
Henrik Böving
3d307925b7
refactor: make constant folding more robust for future bugs (#11044)
This PR enforces users of the constant folder API to provide proofs of
their algebraic properties,
thus hopefully avoiding bugs such as #11042 and #11043 in the future.
2025-11-01 11:07:20 +00:00
Rob23oba
1fa67d0d47
fix: overeager Nat.sub constant folding (#11043)
This PR fixes a case of overeager constant folding on Nat where the
compiler would mistakenly assume `0 - x = x` (see also #11042 for the
same bug on UInts).
2025-11-01 10:14:20 +00:00
Henrik Böving
51ef1dcc5e
fix: overeager uint constant folding (#11042)
This PR fixes a case of overeager constant folding on UInts where the
compiler would mistakenly
assume `0 - x = x`.
2025-11-01 02:42:43 +00:00
Henrik Böving
8b28467655
perf: better detection of repeated branching on same value (#11020)
This PR improves the detection of situations where we branch multiple
times on the same value in the
code generator. Previously this would only consider repeated branching
on function arguments, now on
arbitrary values.


Closes: #11018
2025-10-30 16:02:45 +00:00
Henrik Böving
cc046e0c18
perf: improve join point finding (#10999)
This PR improves join point finding in the compiler through two means:
1. We now handle situations where a function `f` can only become a join
point when a function `g`
   becomes a join point as well correctly.
2. We introduce a second join point finding pass after specialisation
and before the following
simplification pass, as the specialiser might have introduced new join
point opportunities for
   the simplifier to exploit.

Notably in the code from #10995 we now correctly detect the missing join
point which required both
of these changes to be made.

Closes: #10995
2025-10-30 15:05:11 +00:00
Henrik Böving
1587d02dfb
fix: more stable eager lambda lifting heuristic (#11010)
This PR makes the eager lambda lifting heuristic more predictable by
blocking it from lifting from
any kind of inlineable function, not just `@[inline]`. It also adapts
the doc-string to describe
what is actually going on.
2025-10-29 13:58:23 +00:00
Markus Himmel
167429501b
refactor: redefine String.replace (#10986)
This PR defines `String.Slice.replace` and redefines `String.replace` to
use the `Slice` version.

The new implementation is generic in the pattern, so it supports things
like `"education".replace isVowel "☃!" = "☃!d☃!c☃!t☃!☃!n"`. Since it
uses the `ForwardSearcher` infrastructure, `String` patterns are
searched using KMP, unlike the previous implementation which had
quadratic runtime. As a side effect, the behavior when replacing an
empty string now matches that of most other programming languages,
namely `"abc".replace "" "k" = "kakbkck"`.
2025-10-29 07:48:33 +00:00
Markus Himmel
8fe260de55
feat: termination arguments for String.ValidPos and String.Slice.Pos (#10933)
This PR adds the basic infrastructure to perform termination proofs
about `String.ValidPos` and `String.Slice.Pos`.

We choose approach where the intended way to do termination arguments is
to argue about the position itself rather than some projection of it
like `remainingBytes`.

The types `String.ValidPos` and `String.Slice.Pos` are equipped with a
`WellFoundedRelation` instance given by the greater-than relation. This
means that if a function takes a position `p` and performs a recursive
call on `q`, then the decreasing obligation will be `p < q`. This works
well in the common case where `q` is `p.next h`, in which case the goal
`p < p.next h` is solved by the simplifier.

For stepping through a string backwards, we introduce a type synonym
with a `WellFoundedRelation` instance given by the less-than relation.
This means that if a function takes a position `p` and performs a
recursive call on `q` and specifies `termination_by p.down`, then the
decreasing obligation will be `q < p`. This works well in the case where
`q` is `p.prev h`, in which case the goal `p.prev h < p` is solved by
the simplifier.

For termination arguments invoving multiple strings, the lower-level
primitive `p.remainingBytes` (landing in `Nat`) is also available.

In a future PR, we will additionally provide the necessary typeclasses
instances to register `String.ValidPos` and `String.Slice.Pos` with
`grind` to make complex termination arguments more convenient in user
code.
2025-10-27 10:05:44 +00:00
Henrik Böving
7e1be20317
perf: widen more in ElimDeadBranches (#10856)
This PR performs more widening in ElimDeadBranches in an attempt to
improve performance in situations with a lot of local precision.

While this is not enough to make the compilation instant it pushes
compilation time from 12s to 3s for the example in #10857 and barely
introduces regressions so it seems like a good first step in this
direction.

Closes: #10857
2025-10-27 09:12:16 +00:00
Sebastian Ullrich
77ddfd49e6
chore: further shake improvements (#10947) 2025-10-26 11:27:19 +00:00
Markus Himmel
ba7798b389
chore: more reorganization of strings (#10928)
This PR splits more material out of `Init.Data.String.Basic`.
2025-10-23 11:56:11 +00:00
Rob23oba
fad0e69cc7
fix: make name mangling unambiguous (#10727)
This PR fixes name mangling to be unambiguous / injective by adding `00`
for disambiguation where necessary. Additionally, the inverse function,
`Lean.Name.unmangle` has been added which can be used to unmangle a
mangled identifier. This unmangler has been added to demonstrate the
injectivity but also to allow unmangling identifiers e.g. for debugging
purposes.

Closes #10724
2025-10-23 07:18:07 +00:00
Markus Himmel
b5dc11e8d3
chore: move some material out of Init.Data.String.Basic (#10893)
This PR splits some low-hanging fruit out of `Init.Data.String.Basic`:
basic material about `String.Pos.Raw`, `String.Substrig`, and
`String.Iterator`.

More splitting required and the remaining material is quite unorganized,
but it's a start.
2025-10-22 16:31:08 +00:00
Henrik Böving
52b1b342ab
feat: zero cost BaseIO (#10625)
This PR implements zero cost `BaseIO` by erasing the `IO.RealWorld`
parameter from argument lists and structures. This is a **major breaking
change for FFI**.

Concretely:
- `BaseIO` is defined in terms of `ST IO.RealWorld`
- `EIO` (and thus `IO`) is defined in terms of `EST IO.RealWorld`
- The opaque `Void` type is introduced and the trivial structure
optimization updated to account for it. Furthermore, arguments of type
`Void s` are removed from the argument lists of the C functions.
- `ST` is redefined as `Void s -> ST.Out s a` where `ST.Out` is a pair
of `Void s` and `a`

This together has the following major effects on our generated code:
- Functions that return `BaseIO`/`ST`/`EIO`/`IO`/`EST` now do not take
the dummy world parameter anymore. To account for this FFI code needs to
delete the dummy world parameter from the argument lists.
- Functions that return `BaseIO`/`ST` now return their wrapped value
directly. In particular `BaseIO UInt32` now returns a `uint32_t` instead
of a `lean_object*`. To account for this FFI code might have to change
the return type and does not need to call `lean_io_result_mk_ok` anymore
but can instead just `return` values right away (same with extracting
values from `BaseIO` computations.
- Functions that return `EIO`/`IO`/`EST` now only return the equivalent
of an `Except` node which reduces the allocation size. The
`lean_io_result_mk_ok`/`lean_io_result_mk_error` functions were updated
to account for this already so no change is required.

Besides improving performance by dropping allocation (sizes) we can now
also do fun new things such as:
```lean
@[extern "malloc"]
opaque malloc (size : USize) : BaseIO USize
```
2025-10-22 10:55:12 +02:00
Henrik Böving
bd0b91de07
perf: reduce amount of symbols in DLLs (#10864)
This PR reduces the amount of symbols in our DLLs by cutting open a
linking cycle of the shape:

`Environment -> Compiler -> Meta -> Environment`

This is achieved by introducing a dynamic call to the compiler hidden
behind a `Ref` as previously
done in the pretty printer.
2025-10-21 09:00:56 +00:00
Sebastian Ullrich
37b78bd53d
chore: more module system fixes and refinements for finishing batteries port (#10819) 2025-10-21 08:19:50 +00:00
Markus Himmel
dad541265c
refactor: move operations on String.Pos.Raw to the String.Pos.Raw namespace (#10735)
This PR moves many operations involving `String.Pos.Raw` to a the
`String.Pos.Raw` namespace with the eventual aim of freeing up the
`String` namespace to contain operations using `String.ValidPos` (to be
renamed to `String.Pos`) instead.

This PR adds the `String.ValidPos.set` and `String.ValidPos.modify`
functions.

After this PR, `String.pos_lt_eq` is no longer a `simp` lemma. Add
`String.Pos.Raw.lt_iff` as a `simp` lemma if your proofs break.
2025-10-18 12:12:55 +00:00
Sebastian Ullrich
428355cf02
chore: remove redundant imports in core (#10750) 2025-10-16 20:27:46 +00:00
Sebastian Ullrich
419982bd42
chore: even more module system fixes and refinements from Mathlib porting (#10726) 2025-10-15 14:59:09 +00:00
Sebastian Ullrich
3b061a0996
chore: more module system fixes and improvements from Mathlib porting (#10655) 2025-10-08 11:30:09 +00:00
Henrik Böving
1f7374a5d6
fix: RC dec insertion for unused variables (#10689)
This PR fixes an oversight in the RC insertion phase in the code
generator.

If the code generator encounters a `let` that is unused (which is
perfectly reasonable as at this
phase we are in an impure IR and as such allow for side effects to
happen so we cannot remove all
unused `let`) it didn't insert a `dec` instruction for this variable.
This has previously gone
unnoticed because at this point in the compiler basically all unused
lets are removed already
anyways. However with the `IO`/`ST` token erasure coming up they will be
very frequent.
2025-10-06 22:05:17 +00:00
Joachim Breitner
232a0495b0
chore: remove public section from end of files (#10684)
This PR removes `public section` lines from end of files; they look a
bit silly there.
2025-10-06 13:30:48 +00:00
Sebastian Ullrich
d17160518c
chore: module system fixes and refinements from Mathlib porting (#10643) 2025-10-02 08:28:08 +00:00
Paul Reichert
89686fcd02
refactor: replace PRange shape α with Rcc α and eight other types (#10319)
This PR "monomorphizes" the structure `Std.PRange shape α`, replacing it
with nine distinct structures `Std.Rcc`, `Std.Rco`, `Std.Rci` etc., one
for each possible shape of a range's bounds. This change was necessary
because the shape polymorphism is detrimental to attempts of automation.

**BREAKING CHANGE:** While range/slice notation itself is unchanged,
this essentially breaks the entire remaining (polymorphic) range and
slice API except for the dot-notation(`toList`, `iter`, ...). It is not
possible to deprecate old declarations that were formulated in a
shape-polymorphic way that is not available anymore.
2025-10-02 06:45:11 +00:00
Henrik Böving
d88e417cda
refactor: tame down dead let eliminator in lambda RC (#10626)
This PR reduces the aggressiveness of the dead let eliminator from
lambda RC.

The motivation for this is that all other passes in lambda RC respect
impurity but the dead let eliminator still operates under the assumption
of purity. There is a couple of motivations for the elim dead let
elaborator:
- unused projections introduced by the ToIR translation
- the elim dead branch pass introducing new opportunities
- closed term extraction introducing new opportunities
2025-09-30 19:51:16 +00:00
Markus Himmel
c039e29a3f
perf: shorten critical build path around String.Basic (#10614)
This PR cuts some edges from the import graph.

Specifically:
- `TreeMap` and `HashMap` no longer depend on `String`, so now the
expensive things are all in parallel instead of partially in sequence
- `Omega` no longer relies on `List` lemmas
- The section of the import graph between `Init.Omega` and
`Init.Data.Bitvec.Lemmas` is cleaned up a bit
2025-09-29 19:45:21 +00:00
Henrik Böving
b82303e9b3
feat: consistent type ABI regardless of transparency (#10610)
This PR ensures that even if a type is marked as `irreducible` the
compiler can see through it in
order to discover functions hidden behind type aliases.
2025-09-29 13:31:41 +00:00
Sebastian Ullrich
fd3f51012f
feat: shake import minimizer aware of the module system and arbitrary elaboration dependencies (#10575)
This PR adds the necessary infrastructure for recording elaboration
dependencies that may not be apparent from the resulting environment
such as notations and other metaprograms. An adapted version of `shake`
from Mathlib is added to `script/` but may be moved to another location
or repo in the future.
2025-09-28 16:00:00 +00:00
Sebastian Ullrich
8b2fea1ec7
perf: avoid blocking wait on kernel env on some interpreter entries (#10591) 2025-09-28 12:52:24 +00:00
Sebastian Ullrich
646f2fabbf
fix: allow meta decls in #eval (#10545) 2025-09-26 15:10:33 +00:00
Sebastian Ullrich
a164ae5073
chore: overhaul meta error messages (#10569) 2025-09-26 12:56:46 +00:00