The problem here was that in Mathlib's `lean-pr-testing-NNNN` branches,
we were setting Batteries to a `nightly-testing-YYYY-MM-DD` branch. This
meant that when we merged or rebased a new `nightly-with-mathlib` into a
Lean PR, the corresponding Mathlib testing branch would keep using an
old version of Batteries.
We also make sure to bump Batteries if Mathlib's `lean-pr-testing-NNNN`
branch already exists.
Fixes a workflow bug where the `check-level` was not always set
correctly. Arguments to a `gh` call used to determine the `check_level`
were accidentally outside of the relevant command substitution (`$(gh
...)`).
-----
This can be observed in [these
logs](https://github.com/leanprover/lean4/actions/runs/10859763037/job/30139540920),
where the check level (shown first under "configure build matrix") is
`2`, but the PR does not have the `release-ci` tag. As a "test", run the
script for "set check level" printed in those logs (with some lines
omitted):
```
check_level=0
labels="$(gh api repos/leanprover/lean4/pulls/5343) --jq '.labels'"
if echo "$labels" | grep -q "release-ci"; then
check_level=2
elif echo "$labels" | grep -q "merge-ci"; then
check_level=1
fi
echo "check_level=$check_level"
```
Note that this prints `check_level=2`, but changing `labels` to
`labels="$(gh api repos/leanprover/lean4/pulls/5343 --jq '.labels')"`
prints `check_level=0`.
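To see why the misplaced parenthesis matters without calling the GitHub API, here is a minimal sketch that uses a stand-in for `gh`. The functions `fake_gh_api` and `compute_check_level` and the canned JSON are hypothetical; the sketch assumes that the string `release-ci` appears somewhere in the unfiltered response outside the PR's own labels.

```shell
# Hypothetical stand-in for `gh api .../pulls/N`: with `--jq '.labels'` it
# prints only the labels; without it, it prints the whole (canned) response.
fake_gh_api() {
  if [ "$2" = "--jq" ]; then
    echo '[]'
  else
    echo '{"labels": [], "body": "mentions release-ci in passing"}'
  fi
}

# Same label check as in the workflow script; $1 is the captured text.
compute_check_level() {
  level=0
  if echo "$1" | grep -q "release-ci"; then
    level=2
  elif echo "$1" | grep -q "merge-ci"; then
    level=1
  fi
  echo "$level"
}

# Buggy: the substitution closes early, so `--jq '.labels'` is just literal
# text appended to the unfiltered response.
buggy_labels="$(fake_gh_api pulls/5343) --jq '.labels'"
# Fixed: the filter is passed to the command inside the substitution.
fixed_labels="$(fake_gh_api pulls/5343 --jq '.labels')"

echo "buggy: check_level=$(compute_check_level "$buggy_labels")"
echo "fixed: check_level=$(compute_check_level "$fixed_labels")"
```

Under these assumptions the buggy form reports `check_level=2` and the fixed form reports `check_level=0`, matching what the logs show.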
curl has adhered to the `Retry-After` header since
https://github.com/curl/curl/pull/4465, so maybe this fixes the issues with
```
jq: error (at <stdin>:5): Cannot index string with string "body"
```
that sometimes make this workflow fail.
This job sometimes fails, maybe due to a race condition where
`gh run cancel` does not take effect quickly enough. Maybe more verbose
output will help us understand this better.
Previously, the CI would run upon every label addition, including things
like `builds-mathlib` or `will-merge-soon`, possibly triggering a new PR
release, new Mathlib builds, etc. Very wasteful!
Unfortunately (but not surprisingly), GitHub does not offer a nice way of
saying “this workflow depends on that label, please re-run if changed”.
Not enough functional programmers or Nix enthusiasts there, I guess…
So here is the next iteration, trying to work with what we have from
GitHub:
A new workflow watches for (only) `full-ci` label addition or deletion,
and then re-runs the CI job for the current PR.
Sounds simple? But remember, this is GitHub!
* `github.event.pull_request.labels.*.name` is *not* updated when a job
  is re-run.
  (This is actually a reasonable step towards determinism, but it doesn't
  help us construct this workaround.)
  OK, so let’s use the API to fetch the current state of the label.
* There is no good way to say “find the latest run of workflow `"CI"` on
  PR `$n`”.
  The best approximation seems to be to search by branch and triggering
  event. This can probably go wrong if there are multiple PRs from
  different repos with the same head ref name (`patch-1` anyone?). Let’s
  hope that it doesn’t happen too often.
* You cannot just rerun a workflow. You can only rerun a finished
  workflow. So cancel it first. And `sleep` a bit…
So let’s see how well this will work. It’s plausibly an improvement.
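The steps in the list above can be sketched with the `gh` CLI roughly as follows. This is an illustration, not the actual workflow: the function names `pr_has_label` and `rerun_ci_for_pr`, the `OWNER/REPO` argument, and the 5-second pause are assumptions.

```shell
# Ask the API for the current labels instead of trusting the event payload.
# Usage: pr_has_label OWNER/REPO PR_NUMBER LABEL
pr_has_label() {
  gh api "repos/$1/issues/$2/labels" --jq '.[].name' | grep -qx "$3"
}

# Usage: rerun_ci_for_pr OWNER/REPO HEAD_REF
rerun_ci_for_pr() {
  repo="$1"
  head_ref="$2"

  # Approximate "latest CI run for this PR" by branch and triggering event.
  # This can pick the wrong run if several PRs share the same head ref name.
  run_id="$(gh run list --repo "$repo" --workflow CI \
    --branch "$head_ref" --event pull_request \
    --limit 1 --json databaseId --jq '.[0].databaseId')"

  if [ -z "$run_id" ] || [ "$run_id" = "null" ]; then
    echo "no CI run found for $head_ref" >&2
    return 1
  fi

  # Only finished runs can be rerun, so cancel first; ignore the error if
  # the run has already completed.
  gh run cancel "$run_id" --repo "$repo" || true
  # Assumed pause to give the cancellation time to take effect.
  sleep 5
  gh run rerun "$run_id" --repo "$repo"
}
```

A label-watching workflow could then call something like `rerun_ci_for_pr OWNER/REPO "$head_ref"` whenever `full-ci` is added or removed, and the CI job itself could use `pr_has_label` to decide how much to run. The fixed `sleep` is the fragile part: if the cancellation has not completed by then, the rerun fails.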