151 lines
8.4 KiB
Markdown
151 lines
8.4 KiB
Markdown
# Lean Build Bootstrapping
|
|
|
|
Since version 4, Lean is a partially bootstrapped program: most parts of the
|
|
frontend and compiler are written in Lean itself and thus need to be built before
|
|
building Lean itself - which is needed to again build those parts. This cycle is
|
|
broken by using pre-built C files checked into the repository (which ultimately
|
|
go back to a point where the Lean compiler was not written in Lean) in place of
|
|
these Lean inputs and then compiling everything in multiple stages up to a fixed
|
|
point. The build directory is organized in these stages:
|
|
|
|
```bash
|
|
stage0/
|
|
# Bootstrap binary built from stage0/src/.
|
|
# We do not use any other files from this directory in further stages.
|
|
bin/lean
|
|
stage1/
|
|
include/
|
|
config.h # config variables used to build `lean` such as used allocator
|
|
runtime/lean.h # runtime header, used by extracted C code, uses `config.h`
|
|
share/lean/
|
|
lean.mk # used by `leanmake`
|
|
lib/
|
|
lean/**/*.olean # the Lean library (incl. the compiler) compiled by the previous stage's `lean`
|
|
temp/**/*.{c,o} # the library extracted to C and compiled by `leanc`
|
|
libInit.a libLean.a # static libraries of the Lean library
|
|
libleancpp.a # a static library of the C++ sources of Lean
|
|
libleanshared.so # a dynamic library including the static libraries above
|
|
bin/
|
|
lean # the Lean compiler & server, a small executable that calls directly into libleanshared.so
|
|
leanc # a wrapper around a C compiler supplying search paths etc
|
|
leanmake # a wrapper around `make` supplying the Makefile above
|
|
stage2/...
|
|
stage3/...
|
|
```
|
|
|
|
Stage 0 can be viewed as a blackbox since it does not depend on any local
|
|
changes and is equivalent to downloading a bootstrapping binary as done in other
|
|
compilers. The build for any other stage starts by building the runtime and
|
|
standard library from `src/`, using the `lean` binary from the previous stage in
|
|
the latter case, which are then assembled into a new `bin/lean` binary.
|
|
|
|
Each stage can be built by calling `make stageN` in the root build folder.
|
|
Running just `make` will default to stage 1, which is usually sufficient for
|
|
testing changes on the test suite or other files outside of the stdlib. However,
|
|
it might happen that the stage 1 compiler is not able to load its own stdlib,
|
|
e.g. when changing the .olean format: the stage 1 stdlib will use the format
|
|
generated by the stage 0 compiler, but the stage 1 compiler will expect the new
|
|
format. In this case, we should continue with building and testing stage 2
|
|
instead, which will both build and expect the new format. Note that this is only
|
|
possible because when building a stage's stdlib, we use the previous compiler
|
|
but never load the previous stdlib (since everything is `prelude`). We can also
|
|
use stage 2 to test changes in the compiler or other "meta" parts, i.e. changes
|
|
that affect the produced (.olean or .c) code, on the stdlib and compiler itself.
|
|
We are not aware of any "meta-meta" parts that influence more than two stages of
|
|
compilation, so stage 3 should always be identical to stage 2 and only exists as
|
|
a sanity check.
|
|
|
|
In summary, doing a standard build via `make` internally involves these steps:
|
|
|
|
1. compile the `stage0/src` archived sources into `stage0/bin/lean`
|
|
1. use it to compile the current library (*including* your changes) into `stage1/lib`
|
|
1. link that and the current C++ code from `src/` into `stage1/bin/lean`
|
|
|
|
You now have a Lean binary and library that include your changes, though their
|
|
own compilation was not influenced by them, that you can use to test your
|
|
changes on test programs whose compilation *will* be influenced by the changes.
|
|
|
|
## Updating stage0
|
|
|
|
Finally, when we want to use new language features in the library, we need to
|
|
update the archived C source code of the stage 0 compiler in `stage0/src`.
|
|
|
|
The github repository will automatically update stage0 on `master` once
|
|
`src/stdlib_flags.h` and `stage0/src/stdlib_flags.h` are out of sync.
|
|
|
|
If you have write access to the lean4 repository, you can also also manually
|
|
trigger that process, for example to be able to use new features in the compiler itself.
|
|
You can do that on <https://github.com/leanprover/lean4/actions/workflows/update-stage0.yml>
|
|
or using Github CLI with
|
|
```
|
|
gh workflow run update-stage0.yml
|
|
```
|
|
|
|
Leaving stage0 updates to the CI automation is preferable, but should you need
|
|
to do it locally, you can use `make update-stage0-commit` in `build/release` to
|
|
update `stage0` from `stage1` or `make -C stageN update-stage0-commit` to
|
|
update from another stage. This command will automatically stage the updated files
|
|
and introduce a commit,so make sure to commit your work before that.
|
|
|
|
If you rebased the branch (either onto a newer version of `master`, or fixing
|
|
up some commits prior to the stage0 update, recreate the stage0 update commits.
|
|
The script `script/rebase-stage0.sh` can be used for that.
|
|
|
|
The CI should prevent PRs with changes to stage0 (besides `stdlib_flags.h`)
|
|
from entering `master` through the (squashing!) merge queue, and label such PRs
|
|
with the `changes-stage0` label. Such PRs should have a cleaned up history,
|
|
with separate stage0 update commits; then coordinate with the admins to merge
|
|
your PR using rebase merge, bypassing the merge queue.
|
|
|
|
|
|
## Further Bootstrapping Complications
|
|
|
|
As written above, changes in meta code in the current stage usually will only
|
|
affect later stages. This is an issue in two specific cases.
|
|
|
|
* For *non-builtin* meta code such as `notation`s or `macro`s in
|
|
`Notation.lean`, we expect changes to affect the current file and all later
|
|
files of the same stage immediately, just like outside the stdlib. To ensure
|
|
this, we need to build the stage using `-Dinterpreter.prefer_native=false` -
|
|
otherwise, when executing a macro, the interpreter would notice that there is
|
|
already a native symbol available for this function and run it instead of the
|
|
new IR, but the symbol is from the previous stage!
|
|
|
|
To make matters more complicated, while `false` is a reasonable default
|
|
incurring only minor overhead (`ParserDescr`s and simple macros are cheap to
|
|
interpret), there are situations where we *need* to set the option to `true`:
|
|
when the interpreter is executed from the native code of the previous stage,
|
|
the type of the value it computes must be identical to/ABI-compatible with the
|
|
type in the previous stage. For example, if we add a new parameter to `Macro`
|
|
or reorder constructors in `ParserDescr`, building the stage with the
|
|
interpreter will likely fail. Thus we need to set `interpreter.prefer_native`
|
|
to `true` in such cases to "freeze" meta code at their versions in the
|
|
previous stage; no new meta code should be introduced in this stage. Any
|
|
further stages (e.g. after an `update-stage0`) will then need to be compiled
|
|
with the flag set to `false` again since they will expect the new signature.
|
|
|
|
For an example, see https://github.com/leanprover/lean4/commit/da4c46370d85add64ef7ca5e7cc4638b62823fbb.
|
|
|
|
* For the special case of *quotations*, it is desirable to have changes in
|
|
built-in parsers affect them immediately: when the changes in the parser
|
|
become active in the next stage, macros implemented via quotations should
|
|
generate syntax trees compatible with the new parser, and quotation patterns
|
|
in macro and elaborators should be able to match syntax created by the new
|
|
parser and macros. Since quotations capture the syntax tree structure during
|
|
execution of the current stage and turn it into code for the next stage, we
|
|
need to run the current stage's built-in parsers in quotation via the
|
|
interpreter for this to work. Caveats:
|
|
* Since interpreting full parsers is not nearly as cheap and we rarely change
|
|
built-in syntax, this needs to be opted in using `-Dinternal.parseQuotWithCurrentStage=true`.
|
|
* The parser needs to be reachable via an `import` statement, otherwise the
|
|
version of the previous stage will silently be used.
|
|
* Only the parser code (`Parser.fn`) is affected; all metadata such as leading
|
|
tokens is taken from the previous stage.
|
|
|
|
For an example, see https://github.com/leanprover/lean4/commit/f9dcbbddc48ccab22c7674ba20c5f409823b4cc1#diff-371387aed38bb02bf7761084fd9460e4168ae16d1ffe5de041b47d3ad2d22422
|
|
(from before the flag defaulted to `false`).
|
|
|
|
To modify either of these flags both for building and editing the stdlib, adjust
|
|
the code in `stage0/src/stdlib_flags.h`. The flags will automatically be reset
|
|
on the next `update-stage0` when the file is overwritten with the original
|
|
version in `src/`.
|