lean4-htt

Author	SHA1	Message	Date
Henrik Böving	31e4eb62b7	perf: speed up compiler recompilation (#12196 )	2026-01-27 18:50:58 +00:00
Henrik Böving	1b8dd80ed1	chore: don't extract standalone constants as closed terms (#12027 )	2026-01-16 14:52:14 +00:00
Sebastian Ullrich	f47dfe9e7f	perf: `Options.hasTrace` (#12001 ) Drastically speeds up `isTracingEnabledFor` in the common case, which has evolved from "no options set" to "`Elab.async` and probably some linter options set but no `trace`". ## Breaking changes `Lean.Options` is now an opaque type. The basic parts of the `KVMap` API, but not all of it, have been redefined on top of it.	2026-01-16 09:03:40 +00:00
Henrik Böving	2d87d50e34	perf: avoid superlinear overhead in closed term extraction (#12010 ) This PR fixes a superlinear behavior in the closed subterm extractor. Consider an LCNF of the shape: ``` let x1 := f arg let x2 := f x1 let x3 := f x2 let x4 := f x3 ... ``` In this case the previous closed term extraction algorithm would visit `x1`, then `x2` and `x1`, then `x3`, `x2`, `x1` and so on, failing each time. We now introduce a cache to avoid this behavior.	2026-01-14 21:50:35 +00:00
Henrik Böving	4b63048825	perf: simplify decision procedures in LCNF base already (#12008 ) This PR ensures that the LCNF simplifier already constant folds decision procedures (`Decidable` operations) in the base phase.	2026-01-14 21:11:23 +00:00
Henrik Böving	2f7f63243f	perf: fast path for SCC decomposition (#12009 )	2026-01-14 20:05:02 +00:00
Henrik Böving	dc70d0cc43	feat: split up the compiler SCC after lambda lifting (#12003 ) This PR splits up the SCC that the compiler manages into (potentially) multiple ones after performing lambda lifting. This aids both the closed term extractor and the elimDeadBranches pass as they are both negatively influenced when more declarations than required are within one SCC.	2026-01-14 18:36:25 +00:00
Rob23oba	e2353689f2	fix: ensure linearity in floatLetIn (#11983 ) This PR fixes the `floatLetIn` pass to not move variables in case it could break linearity (owned variables being passed with RC 1). This mostly improves the situation in the parser which previously had many functions that were supposed to be linear in terms of `ParserState` but the compiler made them non-linear. For an example of how this affected parsers: ```lean-4 def optionalFn (p : ParserFn) : ParserFn := fun c s => let iniSz := s.stackSize let iniPos := s.pos let s := p c s let s := if s.hasError && s.pos == iniPos then s.restore iniSz iniPos else s s.mkNode nullKind iniSz ``` previously moved the `let iniSz := ...` declaration into the `hasError` branch. However, this means that at the point of calling the inner parser (`p c s`), the original state `s` needs to have RC>1 because it is used later in the `hasError` branch, breaking linearity. This fix prevents such moves, keeping `iniSz` before the `p c s` call.	2026-01-12 22:26:18 +00:00
Henrik Böving	c91a2c63c2	perf: fast paths for forEachWhere Expr.isFVar (#11973 ) Add a fast path for the pattern `forEachWhere Expr.isFVar` to avoid setting up the expression traversal etc. Pattern initially noticed by @Rob23oba	2026-01-11 22:38:16 +00:00
Henrik Böving	7e6365567f	refactor: preparatory change from structure to inductive on LCNF (#11934 )	2026-01-08 09:56:41 +00:00
Leonardo de Moura	514a5fddc6	refactor: `DiscrTree` (#11875 ) This PR adds the directory `Meta/DiscrTree` and reorganizes the code into different files. Motivation: we are going to have new functions for retrieving simplification theorems for the new structural simplifier.	2026-01-02 19:53:45 +00:00
Henrik Böving	2db0a98b7c	fix: internalize all arguments to Quot.lift during LCNF conversion (#11729 ) This PR internalizes all arguments of Quot.lift during LCNF conversion, preventing panics in certain non trivial programs that use quotients. Fixes #11719.	2025-12-18 09:31:48 +00:00
Henrik Böving	fe96911368	feat: proper recursive specialization (#11479 ) This PR enables the specializer to also recursively specialize in some non trivial higher order situations. The main motivation for this change is the upcoming changes to do notation by sgraf. In there he uses combinators such as ```lean @[specialize, expose] def List.newForIn {α β γ} (l : List α) (b : β) (kcons : α → (β → γ) → β → γ) (knil : β → γ) : γ := match l with | [] => knil b | a :: l => kcons a (l.newForIn · kcons knil) b ``` in programs such as ```lean def testing := let x := 42; List.newForIn (β := Nat) (γ := Id Nat) [1,2,3] x (fun i kcontinue s => let x := s; List.newForIn [i:10].toList x (fun j kcontinue s => let x := s; let x := x + i + j; kcontinue x) kcontinue) pure ``` Inspecting this IR right before we get to the specializer in the current compiler we get: ``` [Compiler.eagerLambdaLifting] size: 22 def testing : Nat := fun _f.1 _y.2 : Nat := return _y.2; let x := 42; let _x.3 := 1; fun _f.4 i kcontinue s : Nat := fun _f.5 j kcontinue s : Nat := let _x.6 := Nat.add s i; let x := Nat.add _x.6 j; let _x.7 := kcontinue x; return _x.7; let _x.8 := 10; let _x.9 := Nat.sub _x.8 i; let _x.10 := Nat.add _x.9 _x.3; let _x.11 := 1; let _x.12 := Nat.sub _x.10 _x.11; let _x.13 := Nat.mul _x.3 _x.12; let _x.14 := Nat.add i _x.13; let _x.15 := @List.nil _; let _x.16 := List.range'TR.go _x.3 _x.12 _x.14 _x.15; let _x.17 := @List.newForIn _ _ _ _x.16 s _f.5 kcontinue; return _x.17; let _x.18 := 2; let _x.19 := 3; let _x.20 := @List.nil _; let _x.21 := @List.cons _ _x.19 _x.20; let _x.22 := @List.cons _ _x.18 _x.21; let _x.23 := @List.cons _ _x.3 _x.22; let _x.24 := @List.newForIn _ _ _ _x.23 x _f.4 _f.1; return _x.24 ``` Here the `kcontinue` higher order functions pose a special challenge because they delay the discovery of new specialization opportunities. Inspecting the IR after the current specializer (and a cleanup simp step) we get functions that look as follows: ``` [simp] size: 7 def List.newForIn._at_.testing.spec_0 i kcontinue l b : Nat := cases l : Nat | List.nil => let _x.1 := kcontinue b; return _x.1 | List.cons head.2 tail.3 => let _x.4 := Nat.add b i; let x := Nat.add _x.4 head.2; let _x.5 := List.newForIn._at_.testing.spec_0 i kcontinue tail.3 x; return _x.5 [simp] size: 14 def List.newForIn._at_.List.newForIn._at_.testing.spec_1.spec_1 _x.1 l b : Nat := cases l : Nat | List.nil => return b | List.cons head.2 tail.3 => fun _f.4 x.5 : Nat := let _x.6 := List.newForIn._at_.List.newForIn._at_.testing.spec_1.spec_1 _x.1 tail.3 x.5; return _x.6; let _x.7 := 10; let _x.8 := Nat.sub _x.7 head.2; let _x.9 := Nat.add _x.8 _x.1; let _x.10 := 1; let _x.11 := Nat.sub _x.9 _x.10; let _x.12 := Nat.mul _x.1 _x.11; let _x.13 := Nat.add head.2 _x.12; let _x.14 := @List.nil _; let _x.15 := List.range'TR.go _x.1 _x.11 _x.13 _x.14; let _x.16 := List.newForIn._at_.testing.spec_0 head.2 _f.4 _x.15 b; return _x.16 ``` Observe that the specializer decided to abstract over `kcontinue` instead of specializing further recursively. Thus this tight loop is now going through an indirect call. This PR now changes the specializer somewhat fundamentally to handle situations like this. The most notable change is going to a fixpoint loop of: 1. Specialize all current declarations in the worklist 2. If a declaration - succeeded in specializing run the simplifier on it and put it back onto the worklist - if it didn't don't put it back onto the worklist anymore 3. Put all newly generated specialisations on the worklist 4. Recompute fixed parameters for the current SCC 5. Repeat until the worklist is empty. Furthermore, declarations that were already specialized: - only consider `fixedHO` parameters for specialization, in order to avoid termination issues with repeated specialization and abstraction of type class parameters under binders - recursively specialized declarations only allow specialization if at least one of their fixedHO arguments is not a parameter itself. The reason for allowing this in first generation specialization is that we refrain from specializing inside the body of a declaration marked as `@[specialize]`. Thus we need to specialize them even if their arguments don't actually contain anything of interest in order to ensure that type classes etc. are correctly cleaned up within their bodies. There is one last trade-off to consider. When specializing code generated by the new do elaborator we sometimes generate intermediate specializations that are not actually part of any call graph after we are done specializing. We could in principle detect these functions and delete them but having them in cache is potentially helpful for further specializations later. Once the new do elaborator lands we plan to test this trade-off. Closes #10924	2025-12-17 11:05:24 +00:00
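The fixpoint loop described in #11479 can be pictured with a small Lean sketch. This is only an illustrative model, not the compiler's code: `Outcome`, `specializeLoop`, and the `step` callback are hypothetical names, `step` is assumed to already run the simplifier on a successfully specialized declaration, and the recomputation of fixed parameters (step 4) is left out.

```lean
/-- Hypothetical result of one specialization attempt on a declaration. -/
structure Outcome (Decl : Type) where
  /-- `some d'` if specialization changed the declaration (assumed already simplified). -/
  respecialized : Option Decl
  /-- Specializations that were newly generated during the attempt. -/
  fresh : List Decl

/-- Toy worklist loop: run rounds until no declaration is left to revisit. -/
partial def specializeLoop {Decl : Type} (step : Decl → Outcome Decl)
    (worklist : List Decl) (finished : List Decl := []) : List Decl :=
  match worklist with
  | [] => finished
  | _ =>
    -- 1. Specialize every declaration currently on the worklist.
    let outcomes := worklist.map fun d => (d, step d)
    -- 2. Changed declarations go back onto the worklist, unchanged ones are done.
    let requeued := outcomes.filterMap fun (_, o) => o.respecialized
    let settled := outcomes.filterMap fun (d, o) =>
      if o.respecialized.isSome then none else some d
    -- 3. Newly generated specializations also join the worklist.
    let fresh := outcomes.foldl (fun acc (_, o) => acc ++ o.fresh) []
    -- 4. (Recomputing fixed parameters for the SCC is omitted in this sketch.)
    -- 5. Repeat until the worklist is empty.
    specializeLoop step (requeued ++ fresh) (finished ++ settled)
```

The function is marked `partial` because nothing in this model guarantees termination; the commit message explains that the restrictions on `fixedHO` parameters are what avoid termination issues in the real specializer.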
Henrik Böving	b8c53b1d29	chore: remove IR elim dead branches (#11576 ) This PR removes the old ElimDeadBranches pass and shifts the new one past lambda lifting. The reason for dropping the old one is its general unsoundness and the fact that we want to do refactorings on the IR part. The reason for shifting the current pass past lambda lifting is that its analysis is imprecise in the presence of local function symbols. I experimented with the exact placement for a while and it seems like it is optimal here. Overall we observe a slight regression in the amount of C code generated, likely because we don't propagate information into lambdas before lifting them anymore, but we generally measure a slight performance improvement.	2025-12-11 10:39:02 +00:00
Joachim Breitner	3b40682b22	perf: handle per-constructor noConfusion in toLCNF (#11566 ) This PR lets the compiler treat per-constructor `noConfusion` like the general one, and moves some more logic closer to no confusion generation.	2025-12-10 09:03:55 +00:00
Henrik Böving	c5e04176b8	perf: eliminate cases with all branches unreachable (#11525 ) This PR makes the LCNF simplifier eliminate cases where all alts are `.unreach` to just an `.unreach`. We considered dropping a cases in a situation like this but decided against it because it might hinder reuse. ``` def test x : Bool := cases x : Bool | Except.error a.1 => ⊥ | Except.ok a.2 => let _x.3 := true; return _x.3 ```	2025-12-05 20:30:20 +00:00
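As a rough illustration of the rewrite in #11525, here is a toy model in Lean. The `Code` type below is a hypothetical, drastically simplified stand-in for LCNF code (binary `cases`, no binders), so it only shows the shape of the rule: a `cases` whose alternatives are all unreachable collapses to a single `unreach`.

```lean
/-- Toy code type, assumed for illustration only (not the LCNF `Code` type). -/
inductive Code where
  | ret (v : Nat)
  | unreach
  | cases (alt₁ alt₂ : Code)
  deriving Repr

/-- Simplify bottom-up, collapsing a `cases` whose alternatives are all `unreach`. -/
def Code.simp : Code → Code
  | .cases a b =>
    match a.simp, b.simp with
    | .unreach, .unreach => .unreach
    | a', b' => .cases a' b'
  | c => c

-- Every branch is unreachable, so this evaluates to `Code.unreach`.
#eval (Code.cases .unreach (.cases .unreach .unreach)).simp
```

A `cases` with at least one reachable alternative is kept as it is in this model.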
Henrik Böving	6ca57a74ed	feat: constant folding for Nat.mul (#11517 ) This PR implements constant folding for Nat.mul	2025-12-04 23:38:56 +00:00
Joachim Breitner	edf804c70f	feat: heterogeneous noConfusion (#11474 ) This PR generalizes the `noConfusion` constructions to heterogeneous equalities (assuming propositional equalities between the indices). This lays ground work for better support for applying injection to heterogeneous equalities in grind. The `Meta.mkNoConfusion` app builder shields most of the code from these changes. Since the per-constructor noConfusion principles are now more expressive, `Meta.mkNoConfusion` no longer uses the general one. In `Init.Prelude` some proofs are more pedestrian because `injection` now needs a bit more machinery. This is a breaking change for whoever uses the `noConfusion` principle manually and explicitly for a type with indices. Fixes #11450.	2025-12-02 15:19:47 +00:00
Henrik Böving	3dd99fc29c	perf: eta contract instead of lambda lifting if possible (#11451 ) This PR adapts the lambda lifter in LCNF to eta contract instead of lambda lift if possible. This prevents the creation of a few hundred unnecessary lambdas across the code base.	2025-12-02 08:39:24 +00:00
Henrik Böving	b21cef37e4	perf: sort before elim dead branches (#11366 ) This PR sorts the declarations fed into ElimDeadBranches in increasing size. This can improve performance when we are dealing with a lot of iterations. The motivation for this change is as follows. Currently the algorithm for doing one step of abstract interpretation is: ``` for decl in scc do interpDecl if summaryChanged decl then return true return false ``` Whenever we return true we run another step. Now suppose we are in a situation where we have an SCC with one big decl in the front and then `n` small ones afterwards. For each time that the small ones change their summary, we will re-run analysis of the big one in the front. Currently the ordering is basically "random", based on how other compiler passes inject things into the SCC. This change ensures the behavior is consistent and at least somewhat intelligent. By putting the small declarations first, whenever we trigger a rerun of the loop we bias analyzing the small declarations first, thus decreasing run time. Note that this change does not have much effect on the current pipeline because we usually construct the SCCs in a way such that small ones happen to be in front anyways. However, with upcoming changes on specialization this is about to change.	2025-11-27 22:21:06 +00:00
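The ordering idea from #11366 can be sketched in a few lines of Lean. `Decl` and `sortBySize` are hypothetical names for this illustration; the real pass works on compiler declarations and its own size measure.

```lean
/-- Hypothetical stand-in for a declaration in an SCC, with a precomputed size. -/
structure Decl where
  name : String
  size : Nat
  deriving Repr

/-- Order an SCC so that small declarations are analyzed first. -/
def sortBySize (scc : Array Decl) : Array Decl :=
  scc.qsort (fun a b => a.size < b.size)

-- The big declaration moves to the back, so a rerun of the loop
-- re-analyzes the cheap declarations before reaching it.
#eval (sortBySize #[⟨"big", 500⟩, ⟨"small1", 3⟩, ⟨"small2", 7⟩]).map (·.name)
```

With the early `return true` in the loop quoted above, a summary change in a small declaration then restarts the step before the expensive declaration is interpreted again.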
Henrik Böving	586ea55c0d	fix: enforce choice invariant in ElimDeadBranches (#11398 ) This PR fixes a broken invariant in the choice nodes of ElimDeadBranches. Closes: #11389 and #11393	2025-11-27 11:41:43 +00:00
Henrik Böving	5dde403ec0	fix: toposort declarations to ensure proper constant initialization (#11388 ) This PR is a followup of #11381 and enforces the invariants on ordering of closed terms and constants required by the EmitC pass properly by toposorting before saving the declarations into the Environment.	2025-11-26 18:17:17 +00:00
Henrik Böving	e8da78adda	fix: enforce implicit invariants in EmitC stronger (#11381 ) This PR fixes a bug where the closed term extraction does not respect the implicit invariant of the c emitter to have closed term decls first, other decls second, within an SCC. This bug has not yet been triggered in the wild but was unearthed during work on upcoming modifications of the specializer.	2025-11-26 12:24:03 +00:00
Henrik Böving	cef200fda6	perf: speed up termination of ElimDeadBranches compiler pass (#11362 ) This PR accelerates termination of the ElimDeadBranches compiler pass. The implementation addresses situations such as `choice [none, some top]` which can be summarized to `top` because `Option` has only two constructors and all constructor arguments are `top`.	2025-11-25 22:52:43 +00:00
Henrik Böving	b6e6094f85	chore: beta reduce in specialization keys (#11353 ) This PR applies beta reduction to specialization keys, allowing us to reuse specializations in more situations.	2025-11-25 12:14:36 +00:00
Henrik Böving	57afb23c5c	fix: compilation of projections on non trivial structures (#11340 ) This PR fixes a miscompilation when encountering projections of non trivial structure types. Closes: #11322	2025-11-24 19:25:03 +00:00
Henrik Böving	80224c72c9	perf: improve specializer cache keys (#11310 ) This PR makes the specializer (correctly) share more cache keys across invocations, causing us to produce less code bloat. We observed that in functions with lots of specialization, sometimes cache keys are defeq but not BEq because one has unused let decls (introduced by specialization) that the other doesn't. This PR resolves this conflict by erasing unused let decls from specializer cache keys.	2025-11-21 23:21:40 +00:00
Joachim Breitner	4288aa71e0	chore: do not set unused Option.Decl.group (#11307 ) This PR removes all code that sets the `Option.Decl.group` field, which is unused and has no clearly documented meaning. The actual removal of the field would be #11305.	2025-11-21 16:44:38 +00:00
Joachim Breitner	63bd0b5e77	refactor: introduce Match.altInfos (#11256 ) This PR replaces `MatcherInfo.numAltParams` with a more detailed data structure that allows us, in particular, to distinguish between an alternative for a constructor with a `Unit` field and the alternative for a nullary constructor, where an artificial `Unit` argument is introduced.	2025-11-19 15:09:17 +00:00
Henrik Böving	bef8574b93	fix: be more careful when recording cases in the compiler (#11210 ) This PR fixes a bug in the LCNF simplifier unearthed while working on #11078. In some situations caused by `unsafeCast`, the simplifier would record incorrect information about `cases`, leading to further bugs down the line. Suppose we have `v : NonScalar` due to an `unsafeCast` and we run `cases` on it, expecting `Prod.mk fst snd`. The current code attempts to record both the arguments from the constructor application in the case arm `fst`, `snd` and the parameters for the type by inspecting the discr `v`. However, `NonScalar` does of course not have any parameters, causing the simplifier to record wrong information. This patch makes the `cases` infrastructure more cautious when extracting information from the type of `v`.	2025-11-17 11:34:16 +00:00
Rob23oba	eba5a5a6ef	fix: consider over-applications in `reduceArity` compiler pass (#11185 ) This PR fixes the `reduceArity` compiler pass to consider over-applications to functions that have their arity reduced. Previously, this pass assumed that the amount of arguments to applications was always the same as the number of parameters in the signature. This is usually true, since the compiler eagerly introduces parameters as long as the return type is a function type, resulting in a function with a return type that isn't a function type. However, for dependent types that sometimes are function types and sometimes not, this assumption is broken, resulting in the additional parameters to be dropped. Closes #11131	2025-11-17 07:51:37 +00:00
Sebastian Ullrich	ed34ee0cd5	chore: make `declMetaExt` persistent for `shake` (#11201 )	2025-11-16 20:11:56 +00:00
Sebastian Ullrich	5011b7bd89	chore: make compilation type mismatch error message from non-exposed defs a lot less mysterious (#11177 )	2025-11-14 10:50:43 +00:00
Sebastian Ullrich	4602586b6a	chore: suggest public `meta import` on phase check failure, which is more likely to be the correct variant (#11173 )	2025-11-14 10:10:04 +00:00
Joachim Breitner	d41f39fb10	perf: sparse case splitting in match compilation (#10823 ) This PR lets the match compilation procedure use sparse case analysis when the patterns only match on some but not all constructors of an inductive type. This way, less code is produced. Before, code handling each of the other cases was then optimized and commoned-up by the later compilation pipeline, but that is wasteful to do. In some cases this will prevent Lean from noticing that a match statement is complete because it performs less case-splitting for the unreachable case. In this case, give explicit patterns to perform the deeper split with `by contradiction` as the right-hand side. At least temporarily, there is also the option to disable this behaviour with ``` set_option backwards.match.sparseCases false ```	2025-11-06 13:46:35 +00:00
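For a concrete picture of the kind of match that #10823 targets, consider the following Lean sketch. `Token` and the two functions are made up for this example; the `set_option` line uses the option named in the commit message and is only expected to be available in toolchains that include this change.

```lean
/-- Hypothetical inductive type with several constructors. -/
inductive Token where
  | plus | minus | times | lparen | rparen | eof

/-- Only two constructors are matched explicitly; the rest share the catch-all,
    which is the situation where sparse case analysis avoids splitting on
    every constructor. -/
def isAdditive : Token → Bool
  | .plus | .minus => true
  | _ => false

-- Assumed usage of the opt-out from the commit message, scoped to one definition.
set_option backwards.match.sparseCases false in
def isAdditiveDense : Token → Bool
  | .plus | .minus => true
  | _ => false
```

Both definitions compute the same function; according to the description, the difference lies in how much case-splitting code match compilation produces.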
Joachim Breitner	0cb79868f4	feat: sparse casesOn constructions (#11072 ) This PR adds “sparse casesOn” constructions. They are similar to `.casesOn`, but have arms only for some constructors and a catch-all (providing `t.ctorIdx ≠ 42` assumptions). The compiler has native support for these constructors and now (because of the similarity) also the per-constructor elimination principles.	2025-11-05 15:49:11 +00:00
Sebastian Ullrich	e4fb780f8a	perf: remove unused argument to `ExternEntry.opaque` (#11066 ) This used to create quite a few unique objects in public .olean	2025-11-03 17:26:32 +00:00
Henrik Böving	3d307925b7	refactor: make constant folding more robust for future bugs (#11044 ) This PR enforces users of the constant folder API to provide proofs of their algebraic properties, thus hopefully avoiding bugs such as #11042 and #11043 in the future.	2025-11-01 11:07:20 +00:00
Rob23oba	1fa67d0d47	fix: overeager `Nat.sub` constant folding (#11043 ) This PR fixes a case of overeager constant folding on Nat where the compiler would mistakenly assume `0 - x = x` (see also #11042 for the same bug on UInts).	2025-11-01 10:14:20 +00:00
Henrik Böving	51ef1dcc5e	fix: overeager uint constant folding (#11042 ) This PR fixes a case of overeager constant folding on UInts where the compiler would mistakenly assume `0 - x = x`.	2025-11-01 02:42:43 +00:00
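The faulty rewrite behind #11042 and #11043 is easy to check against the actual semantics of subtraction. The snippet below is a small sanity check written for this listing, not code from either PR:

```lean
-- Truncated subtraction on `Nat`: `0 - x` is `0`, not `x`.
example (x : Nat) : 0 - x = 0 := Nat.zero_sub x

-- The sound identity has the zero on the right-hand side.
example (x : Nat) : x - 0 = x := Nat.sub_zero x

#eval (0 : Nat) - 5    -- 0
#eval (0 : UInt8) - 5  -- 251, since fixed-width subtraction wraps around
```

In both cases the result differs from `5`, which is what folding `0 - x` to `x` would have produced.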
Henrik Böving	8b28467655	perf: better detection of repeated branching on same value (#11020 ) This PR improves the detection of situations where we branch multiple times on the same value in the code generator. Previously this would only consider repeated branching on function arguments, now on arbitrary values. Closes: #11018	2025-10-30 16:02:45 +00:00
Henrik Böving	cc046e0c18	perf: improve join point finding (#10999 ) This PR improves join point finding in the compiler through two means: 1. We now handle situations where a function `f` can only become a join point when a function `g` becomes a join point as well correctly. 2. We introduce a second join point finding pass after specialisation and before the following simplification pass, as the specialiser might have introduced new join point opportunities for the simplifier to exploit. Notably in the code from #10995 we now correctly detect the missing join point which required both of these changes to be made. Closes: #10995	2025-10-30 15:05:11 +00:00
Henrik Böving	1587d02dfb	fix: more stable eager lambda lifting heuristic (#11010 ) This PR makes the eager lambda lifting heuristic more predictable by blocking it from lifting from any kind of inlineable function, not just `@[inline]`. It also adapts the doc-string to describe what is actually going on.	2025-10-29 13:58:23 +00:00
Henrik Böving	7e1be20317	perf: widen more in ElimDeadBranches (#10856 ) This PR performs more widening in ElimDeadBranches in an attempt to improve performance in situations with a lot of local precision. While this is not enough to make the compilation instant it pushes compilation time from 12s to 3s for the example in #10857 and barely introduces regressions so it seems like a good first step in this direction. Closes: #10857	2025-10-27 09:12:16 +00:00
Sebastian Ullrich	77ddfd49e6	chore: further `shake` improvements (#10947 )	2025-10-26 11:27:19 +00:00
Henrik Böving	52b1b342ab	feat: zero cost BaseIO (#10625 ) This PR implements zero cost `BaseIO` by erasing the `IO.RealWorld` parameter from argument lists and structures. This is a major breaking change for FFI. Concretely: - `BaseIO` is defined in terms of `ST IO.RealWorld` - `EIO` (and thus `IO`) is defined in terms of `EST IO.RealWorld` - The opaque `Void` type is introduced and the trivial structure optimization updated to account for it. Furthermore, arguments of type `Void s` are removed from the argument lists of the C functions. - `ST` is redefined as `Void s -> ST.Out s a` where `ST.Out` is a pair of `Void s` and `a` This together has the following major effects on our generated code: - Functions that return `BaseIO`/`ST`/`EIO`/`IO`/`EST` now do not take the dummy world parameter anymore. To account for this FFI code needs to delete the dummy world parameter from the argument lists. - Functions that return `BaseIO`/`ST` now return their wrapped value directly. In particular `BaseIO UInt32` now returns a `uint32_t` instead of a `lean_object*`. To account for this FFI code might have to change the return type and does not need to call `lean_io_result_mk_ok` anymore but can instead just `return` values right away (same with extracting values from `BaseIO` computations). - Functions that return `EIO`/`IO`/`EST` now only return the equivalent of an `Except` node which reduces the allocation size. The `lean_io_result_mk_ok`/`lean_io_result_mk_error` functions were updated to account for this already so no change is required. Besides improving performance by dropping allocation (sizes) we can now also do fun new things such as: ```lean @[extern "malloc"] opaque malloc (size : USize) : BaseIO USize ```	2025-10-22 10:55:12 +02:00
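To make the FFI impact of #10625 more tangible, here is a hedged sketch of a binding and the C signatures it would correspond to before and after the change. The names `addOne` and `my_add_one` are hypothetical, and the C signatures in the comments are inferred from the description above rather than taken from the PR.

```lean
/-- Hypothetical binding; `my_add_one` is an assumed C symbol, not a real library function. -/
@[extern "my_add_one"]
opaque addOne (x : UInt32) : BaseIO UInt32

-- Assumed C side before the change (dummy world argument, boxed IO result):
--   lean_obj_res my_add_one(uint32_t x, lean_obj_arg w) {
--     return lean_io_result_mk_ok(lean_box_uint32(x + 1));
--   }
--
-- Assumed C side after the change (no world argument, value returned directly):
--   uint32_t my_add_one(uint32_t x) { return x + 1; }
```

Per the description, bindings that return `EIO`/`IO`/`EST` only need the world parameter removed, since `lean_io_result_mk_ok` and `lean_io_result_mk_error` were already updated.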
Henrik Böving	bd0b91de07	perf: reduce amount of symbols in DLLs (#10864 ) This PR reduces the amount of symbols in our DLLs by cutting open a linking cycle of the shape: `Environment -> Compiler -> Meta -> Environment` This is achieved by introducing a dynamic call to the compiler hidden behind a `Ref` as previously done in the pretty printer.	2025-10-21 09:00:56 +00:00
Sebastian Ullrich	37b78bd53d	chore: more module system fixes and refinements for finishing batteries port (#10819 )	2025-10-21 08:19:50 +00:00
Sebastian Ullrich	428355cf02	chore: remove redundant imports in core (#10750 )	2025-10-16 20:27:46 +00:00
Sebastian Ullrich	3b061a0996	chore: more module system fixes and improvements from Mathlib porting (#10655 )	2025-10-08 11:30:09 +00:00

663 commits