This PR adds a Python script that helps find which commit introduced a
behavior change in Lean. It supports multiple bisection modes and
automatically downloads CI artifacts when available.
- [x] depends on: #11735
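The core idea can be sketched as a binary search over an ordered list of toolchains. This is a minimal illustration rather than the script's actual code: `find_first_change` and the `behaves_like_start` callback are hypothetical names, and the sketch assumes the behavior changes exactly once within the range.

```python
from typing import Callable, Sequence


def find_first_change(
    versions: Sequence[str],
    behaves_like_start: Callable[[str], bool],
) -> str:
    """Return the first version whose behavior differs from versions[0].

    Assumes versions[0] shows the old behavior, versions[-1] shows the new
    behavior, and the behavior changes exactly once in between.
    """
    lo, hi = 0, len(versions) - 1  # invariant: lo is "old", hi is "new"
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if behaves_like_start(versions[mid]):
            lo = mid  # change happened after mid
        else:
            hi = mid  # change happened at or before mid
    return versions[hi]


# Hypothetical usage: a stand-in predicate instead of actually running Lean.
nightlies = [f"nightly-2024-06-{day:02d}" for day in range(1, 11)]
first_bad = find_first_change(nightlies, lambda v: v < "nightly-2024-06-07")
print(first_bad)  # nightly-2024-06-07
```

Each step halves the candidate range, so ten nightlies need only three or four test runs.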
## Usage
```
usage: lean-bisect [-h] [--timeout SEC] [--ignore-messages] [--verbose]
                   [--selftest] [--clear-cache] [--nightly-only]
                   [file] [RANGE]

Bisect Lean toolchain versions to find where behavior changes.

positional arguments:
  file               Lean file to test (must only import Lean.* or Std.*)
  RANGE              Range to bisect: FROM..TO, FROM, or ..TO

options:
  -h, --help         show this help message and exit
  --timeout SEC      Timeout in seconds for each test run
  --ignore-messages  Compare only exit codes, ignore stdout/stderr differences
  --verbose, -v      Show stdout/stderr from each test
  --selftest         Run built-in selftest to verify lean-bisect works
  --clear-cache      Clear CI artifact cache (~600MB per commit) and exit
  --nightly-only     Stop after finding nightly range (don't bisect individual
                     commits)

Range Syntax:

  FROM..TO    Bisect between FROM and TO
  FROM        Start from FROM, bisect to latest nightly
  ..TO        Bisect to TO, search backwards for regression start

  If no range given, searches backwards from latest nightly to find regression.

Identifier Formats:

  nightly-YYYY-MM-DD    Nightly build date (e.g., nightly-2024-06-15)
                        Uses pre-built toolchains from leanprover/lean4-nightly.
                        Fast: downloads via elan (~30s each).

  v4.X.Y or v4.X.Y-rcN  Version tag (e.g., v4.8.0, v4.9.0-rc1)
                        Converts to equivalent nightly range.

  Commit SHA            Git commit hash (short or full, e.g., abc123def)
                        Bisects individual commits between two points.
                        Tries CI artifacts first (~30s), falls back to building (~2-5min).
                        Commits with failed CI builds are automatically skipped.
                        Artifacts cached in ~/.cache/lean-bisect/artifacts/

Bisection Modes:

  Nightly mode:   Both endpoints are nightly dates.
                  Binary search through nightlies to find the day behavior changed.
                  Then automatically continues to bisect individual commits.
                  Use --nightly-only to stop after finding the nightly range.

  Version mode:   Either endpoint is a version tag.
                  Converts to equivalent nightly range and bisects.

  Commit mode:    Both endpoints are commit SHAs.
                  Binary search through individual commits on master.
                  Output: "Behavior change introduced in commit abc123"

Examples:

  # Simplest: just provide the file, finds the regression automatically
  lean-bisect test.lean

  # Specify an endpoint if you know roughly when it broke
  lean-bisect test.lean ..nightly-2024-06-01

  # Full manual control over the range
  lean-bisect test.lean nightly-2024-01-01..nightly-2024-06-01

  # Only find the nightly range, don't continue to commit bisection
  lean-bisect test.lean nightly-2024-01-01..nightly-2024-06-01 --nightly-only

  # Add a timeout (kills slow/hanging tests)
  lean-bisect test.lean --timeout 30

  # Bisect commits directly (if you already know the commit range)
  lean-bisect test.lean abc1234..def5678

  # Only compare exit codes, ignore output differences
  lean-bisect test.lean --ignore-messages

  # Clear downloaded CI artifacts to free disk space
  lean-bisect --clear-cache
```
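As an illustration of the range syntax and identifier formats described in the help text, here is a minimal parsing sketch. The function names `parse_range` and `classify` and the regular expressions are assumptions made for this example (in particular the 4-character minimum for short SHAs); the script's real parsing may differ.

```python
import re
from typing import Optional, Tuple

# Patterns are assumptions derived from the help text, not the script's own.
NIGHTLY = re.compile(r"nightly-\d{4}-\d{2}-\d{2}")
VERSION = re.compile(r"v4\.\d+\.\d+(-rc\d+)?")
SHA = re.compile(r"[0-9a-f]{4,40}")  # assumed minimum length for short hashes


def classify(identifier: str) -> str:
    """Classify an endpoint as 'nightly', 'version', or 'commit'."""
    if NIGHTLY.fullmatch(identifier):
        return "nightly"
    if VERSION.fullmatch(identifier):
        return "version"
    if SHA.fullmatch(identifier):
        return "commit"
    raise ValueError(f"unrecognized identifier: {identifier!r}")


def parse_range(spec: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Split RANGE into (FROM, TO); None marks an endpoint left to be searched."""
    if not spec:
        return None, None
    if ".." in spec:
        start, end = spec.split("..", 1)
        return start or None, end or None
    return spec, None  # bare FROM: bisect up to the latest nightly


print(parse_range("..nightly-2024-06-01"))  # (None, 'nightly-2024-06-01')
print(classify("v4.9.0-rc1"))               # version
```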
🤖 Prepared with Claude Code
---------
Co-authored-by: Claude <noreply@anthropic.com>
This PR adds a standalone script to download pre-built CI artifacts from
GitHub Actions. This allows us to quickly switch commits without
rebuilding.
**Features:**
- Downloads artifacts for current HEAD or specified commit (`--sha`)
- Caches in `~/.cache/lean_build_artifact/` for reuse
- Platform detection (Linux/macOS, x86_64/aarch64)
**Usage:**
```
build_artifact.py # Download for current HEAD
build_artifact.py --sha abc1234 # Download for specific commit
build_artifact.py --clear-cache # Clear cache
```
This is extracted to be shared with `lean-bisect`.
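The platform detection could look roughly like the sketch below. The function name `normalize_platform` and the returned labels are hypothetical; the actual script may use different names and map them onto CI artifact names in its own way.

```python
import platform
from typing import Tuple

# Hypothetical normalization tables; the labels are illustrative only.
_SYSTEMS = {"Linux": "linux", "Darwin": "macos"}
_MACHINES = {
    "x86_64": "x86_64",
    "AMD64": "x86_64",
    "aarch64": "aarch64",
    "arm64": "aarch64",  # macOS reports Apple Silicon as arm64
}


def normalize_platform(system: str, machine: str) -> Tuple[str, str]:
    """Map raw `platform` module values to an (os, arch) pair."""
    try:
        return _SYSTEMS[system], _MACHINES[machine]
    except KeyError:
        raise ValueError(f"unsupported platform: {system}/{machine}") from None


def detect_platform() -> Tuple[str, str]:
    """Detect the (os, arch) pair of the running machine."""
    return normalize_platform(platform.system(), platform.machine())


print(normalize_platform("Darwin", "arm64"))  # ('macos', 'aarch64')
```

Keeping the normalization separate from the `platform` calls makes the mapping testable on any machine.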
Co-authored-by: Claude <noreply@anthropic.com>
This PR adds the following repositories to the release configuration:
- lean4-unicode-basic
- BibtexQuery (depends on lean4-unicode-basic)
- verso-web-components (depends on verso)
It also updates dependencies:
- doc-gen4 now depends on BibtexQuery
- lean-fro.org now depends on verso-web-components
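To see why the ordering of repositories matters, the dependencies listed above can be fed to a topological sort. This is only a sketch using Python's standard `graphlib` (Python 3.9+): the dictionary contains just the edges mentioned in this description, not the full release configuration, and the release scripts may compute the order differently.

```python
from graphlib import TopologicalSorter

# Each repository maps to the set of repositories it depends on.
# Only the edges stated in this PR description are included.
dependencies = {
    "lean4-unicode-basic": set(),
    "BibtexQuery": {"lean4-unicode-basic"},
    "verso-web-components": {"verso"},
    "doc-gen4": {"BibtexQuery"},
    "lean-fro.org": {"verso-web-components"},
}

# static_order() yields every repository after all of its dependencies.
release_order = list(TopologicalSorter(dependencies).static_order())
print(release_order)
```

Any order produced this way releases `lean4-unicode-basic` before `BibtexQuery`, and `BibtexQuery` before `doc-gen4`.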
This PR updates the release checklist script. The cslib repository no
longer has a docs subdirectory, so the release script was failing when
trying to update lakefile.toml and lean-toolchain in that nonexistent
directory.
This PR implements new flags and annotations for `shake` for use in
Mathlib:
> Options:
> --keep-implied
> Preserves existing imports that are implied by other imports and thus not technically needed
> anymore
>
> --keep-prefix
> If an import `X` would be replaced in favor of a more specific import `X.Y...` it implies,
> preserves the original import instead. More generally, prefers inserting `import X` even if it
> was not part of the original imports as long as it was in the original transitive import closure
> of the current module.
>
> --keep-public
> Preserves all `public` imports to avoid breaking changes for external downstream modules
>
> --add-public
> Adds new imports as `public` if they have been in the original public closure of that module.
> In other words, public imports will not be removed from a module unless they are unused even
> in the private scope, and those that are removed will be re-added as `public` in downstream
> modules even if only needed in the private scope there. Unlike `--keep-public`, this may
> introduce breaking changes but will still limit the number of inserted imports.
>
> Annotations:
> The following annotations can be added to Lean files in order to configure the behavior of
> `shake`. Only the substring `shake: ` directly followed by a directive is checked for, so multiple
> directives can be mixed in one line such as `-- shake: keep-downstream, shake: keep-all`, and they
> can be surrounded by arbitrary comments such as `-- shake: keep (metaprogram output dependency)`.
>
> * `module -- shake: keep-downstream`:
> Preserves this module in all (current) downstream modules, adding new imports of it if needed.
>
> * `module -- shake: keep-all`:
> Preserves all existing imports in this module as is. New imports now needed because of upstream
> changes may still be added.
>
> * `import X -- shake: keep`:
> Preserves this specific import in the current module. The most common use case is to preserve a
> public import that will be needed in downstream modules to make sense of the output of a
> metaprogram defined in this module. For example, if a tactic is defined that may synthesize a
> reference to a theorem when run, there is no way for `shake` to detect this by itself and the
> module of that theorem should be publicly imported and annotated with `keep` in the tactic's
> module.
> ```
> public import X -- shake: keep (metaprogram output dependency)
>
> ...
>
> elab "my_tactic" : tactic => do
>   ... mkConst ``f -- `f`, defined in `X`, may appear in the output of this tactic
> ```
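The substring rule for annotations can be illustrated with a short sketch. It is written in Python purely for illustration and is not how `shake` itself is implemented; the function name `shake_directives` and the exact regular expression are assumptions.

```python
import re
from typing import List

# Matches `shake: ` directly followed by one of the three directives named
# above. The lookahead keeps `keep` from matching a longer hyphenated word.
DIRECTIVE = re.compile(r"shake: (keep-downstream|keep-all|keep)(?![\w-])")


def shake_directives(line: str) -> List[str]:
    """Return all shake directives found anywhere in a source line."""
    return DIRECTIVE.findall(line)


print(shake_directives("module -- shake: keep-downstream, shake: keep-all"))
# ['keep-downstream', 'keep-all']
print(shake_directives("public import X -- shake: keep (metaprogram output dependency)"))
# ['keep']
```

Because only the substring is checked, surrounding comment text such as the parenthesized reason is ignored.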
Not tested carefully: I will shake out any problems during the next
release. This script would have detected the mistakes I made in recent
releases of `v4.24.1` / `v4.25.1` and `v4.25.2`. (And #11374 would have
prevented these mistakes.)
This PR adds a new [radar]-based [temci]-less bench suite that replaces
the `stdlib` benchmarks from the old suite and also measures per-module
instruction counts. All other benchmarks from the old suite are
unaffected.
The readme at `tests/bench-radar/README.md` explains in more detail how
the bench suite is structured and how it works. The readmes in the
benchmark subdirectories explain what each benchmark does and which
metrics it collects.
All metrics except `stdlib//max dynamic symbols` were ported to the new
suite, though most have been renamed.
[radar]: https://github.com/leanprover/radar
[temci]: https://github.com/parttimenerd/temci
This PR renames `String.ValidPos` to `String.Pos`, `String.endValidPos`
to `String.endPos` and `String.startValidPos` to `String.startPos`.
Accordingly, the deprecations of `String.Pos` to `String.Pos.Raw` and
`String.endPos` to `String.rawEndPos` are removed early, after an
abbreviated deprecation cycle of two releases.
This PR fixes fallout of the closure allocator changes in #10982. As far
as we know, this bug only meaningfully manifests in non-default build
configurations without mimalloc, such as:
`cmake --preset release -DUSE_MIMALLOC=OFF`
The issue is that I forgot to update the deallocation functions for
closures. However, this only seems to matter if we disable mimalloc,
which is why this slipped through testing.
This PR implements the `#grind_lint` command, a diagnostic tool for
analyzing the behavior of theorems annotated for theorem instantiation.
The command helps identify problematic theorems that produce excessive
or unbounded instance generation during E-matching, which can lead to
performance issues.
The main entry point is:
```
#grind_lint check
```
which analyzes all theorems marked with the `@[grind]` attribute.
For each theorem, it creates an artificial goal and runs `grind`,
collecting statistics about the number of instances produced.
Results are summarized using info messages, and detailed breakdowns are
shown for lemmas exceeding a configurable threshold.
Additional subcommands are provided for targeted inspection and control:
* `#grind_lint inspect thm`: analyzes one or more specific theorems in
detail
* `#grind_lint mute thm`: excludes a theorem from instantiation during
analysis
* `#grind_lint skip thm`: omits a theorem from being analyzed by
`#grind_lint check`
This PR renames `String.endPos` to `String.rawEndPos`, as in a future
release the name `String.endPos` will be taken by the function that is
currently called `String.endValidPos`.
This PR improves the release automation. We link to CI output for
building the release tag, don't give instructions for bumping downstream
repositories until the release is ready, and improve documentation and
prompts.
This PR improves the scripts assisting with cutting Lean releases (by
reporting CI status of open PRs, and adding documentation), and adds a
`.claude/commands/release.md` prompt file so Claude can assist.
This PR adds the necessary infrastructure for recording elaboration
dependencies that may not be apparent from the resulting environment
such as notations and other metaprograms. An adapted version of `shake`
from Mathlib is added to `script/` but may be moved to another location
or repo in the future.
This PR introduces a simple script that adjusts module headers in a
package for use of the module system, without further minimizing import
or annotation use.
---------
Co-authored-by: Kim Morrison <477956+kim-em@users.noreply.github.com>
This PR includes some improvements to the release process, making the
updating of `stable` branches more robust, and including `cslib` in the
release checklist.
This PR reviews the expected-to-fail-right-now tests for `grind`, moving
some (now passing) tests to the main test suite, updating some tests,
and adding some tests about normalisation of exponents.
This PR makes some minor changes to the grind annotation analysis script,
including sorting results and handling errors. Still need to add an
external UI.
This PR adds a script for analyzing `grind` E-matching annotations. The
script is useful for detecting matching loops. We plan to add
user-facing commands for running the script in the future.
This PR updates `release_repos.yml` to reflect that `import-graph` no
longer depends on `batteries`, and reorders the repositories to better
reflect dependencies.
This PR further improves release automation, automatically incorporating
material from `nightly-testing` and `bump/v4.X.0` branches in the bump
PRs to downstream repositories.
This PR further updates release automation. The per-repository update
script `script/release_steps.py` now actually performs the tests,
rather than outputting a script for the release manager to run line by
line. It's been tested on `v4.21.0` (i.e. the easy case of a stable
release), and we'll debug its behaviour on `v4.22.0-rc1` tonight.
This PR changes the CI setup to generate `lean-pr-testing-NNNN` branches
for Mathlib on the `leanprover-community/mathlib4-nightly-testing` fork,
rather than on the main repo.
This PR removes an old workaround for unimplemented C++11 features in
the thread finalization.
This `ifdef` dates back to approximately 2015, as can be seen
[here](https://github.com/leanprover/lean3/blame/master/src/util/thread.cpp#L177).
The comments mention that it was originally implemented because not all
compilers at the time supported the C++11 `thread_local` keyword. Ten
years later, hopefully all compilers support it, so we can remove this
workaround.
There is an additional motivation for doing this:
`lean::initialize_thread` contains the following allocation:
```cpp
g_thread_finalizers_mgr = new thread_finalizers_manager;
```
This is supposed to be freed at some point, but:
```cpp
// TODO(gabriel): race condition with thread finalizers
void delete_thread_finalizer_manager() {
    // delete g_thread_finalizers_mgr;
    // g_thread_finalizers_mgr = nullptr;
}
```
so `g_thread_finalizers_mgr` leaks upon repeated invocation of
`lean::initialize_thread`.
Note that Windows has already been using this alternative implementation
for a while, so it has (hopefully) not rotted away in the meantime.