122 lines
7 KiB
Markdown
122 lines
7 KiB
Markdown
# Lean Build Bootstrapping
|
|
|
|
Since version 4, Lean is a partially bootstrapped program: most parts of the
|
|
frontend and compiler are written in Lean itself and thus need to be built before
|
|
building Lean itself - which is needed to again build those parts. This cycle is
|
|
broken by using pre-built C files checked into the repository (which ultimately
|
|
go back to a point where the Lean compiler was not written in Lean) in place of
|
|
these Lean inputs and then compiling everything in multiple stages up to a fixed
|
|
point. The build directory is organized in these stages:
|
|
|
|
```bash
|
|
stage0/
|
|
# Bootstrap binary built from stage0/src/.
|
|
# We do not use any other files from this directory in further stages.
|
|
bin/lean
|
|
stage1/
|
|
include/
|
|
config.h # config variables used to build `lean` such as used allocator
|
|
runtime/lean.h # runtime header, used by extracted C code, uses `config.h`
|
|
share/lean/
|
|
lean.mk # used by `leanmake`
|
|
lib/
|
|
lean/**/*.olean # the Lean library (incl. the compiler) compiled by the previous stage's `lean`
|
|
temp/**/*.{c,o} # the library extracted to C and compiled by `leanc`
|
|
libInit.a libStd.a libLean.a # static libraries of the Lean library
|
|
libleancpp.a # a static library of the C++ sources of Lean
|
|
libleanshared.so # a dynamic library including the static libraries above
|
|
bin/
|
|
lean # the Lean compiler & server, a small executable that calls directly into libleanshared.so
|
|
leanc # a wrapper around a C compiler supplying search paths etc
|
|
leanmake # a wrapper around `make` supplying the Makefile above
|
|
stage2/...
|
|
stage3/...
|
|
```
|
|
|
|
Stage 0 can be viewed as a blackbox since it does not depend on any local
|
|
changes and is equivalent to downloading a bootstrapping binary as done in other
|
|
compilers. The build for any other stage starts by building the runtime and
|
|
standard library from `src/`, using the `lean` binary from the previous stage in
|
|
the latter case, which are then assembled into a new `bin/lean` binary.
|
|
|
|
Each stage can be built by calling `make stageN` in the root build folder.
|
|
Running just `make` will default to stage 1, which is usually sufficient for
|
|
testing changes on the test suite or other files outside of the stdlib. However,
|
|
it might happen that the stage 1 compiler is not able to load its own stdlib,
|
|
e.g. when changing the .olean format: the stage 1 stdlib will use the format
|
|
generated by the stage 0 compiler, but the stage 1 compiler will expect the new
|
|
format. In this case, we should continue with building and testing stage 2
|
|
instead, which will both build and expect the new format. Note that this is only
|
|
possible because when building a stage's stdlib, we use the previous compiler
|
|
but never load the previous stdlib (since everything is `prelude`). We can also
|
|
use stage 2 to test changes in the compiler or other "meta" parts, i.e. changes
|
|
that affect the produced (.olean or .c) code, on the stdlib and compiler itself.
|
|
We are not aware of any "meta-meta" parts that influence more than two stages of
|
|
compilation, so stage 3 should always be identical to stage 2 and only exists as
|
|
a sanity check.
|
|
|
|
In summary, doing a standard build via `make` internally involves these steps:
|
|
|
|
1. compile the `stage0/src` archived sources into `stage0/bin/lean`
|
|
1. use it to compile the current library (*including* your changes) into `stage1/lib`
|
|
1. link that and the current C++ code from `src/` into `stage1/bin/lean`
|
|
|
|
You now have a Lean binary and library that include your changes, though their
|
|
own compilation was not influenced by them, that you can use to test your
|
|
changes on test programs whose compilation *will* be influenced by the changes.
|
|
|
|
Finally, when we want to use new language features in the library, we need to
|
|
update the stage 0 compiler, which can be done via `make -C stageN update-stage0`.
|
|
`make update-stage0` without `-C` defaults to stage1.
|
|
|
|
## Further Bootstrapping Complications
|
|
|
|
As written above, changes in meta code in the current stage usually will only
|
|
affect later stages. This is an issue in two specific cases.
|
|
|
|
* For *non-builtin* meta code such as `notation`s or `macro`s in
|
|
`Notation.lean`, we expect changes to affect the current file and all later
|
|
files of the same stage immediately, just like outside the stdlib. To ensure
|
|
this, we need to build the stage using `-Dinterpreter.prefer_native=false` -
|
|
otherwise, when executing a macro, the interpreter would notice that there is
|
|
already a native symbol available for this function and run it instead of the
|
|
new IR, but the symbol is from the previous stage!
|
|
|
|
To make matters more complicated, while `false` is a reasonable default
|
|
incurring only minor overhead (`ParserDescr`s and simple macros are cheap to
|
|
interpret), there are situations where we *need* to set the option to `true`:
|
|
when the interpreter is executed from the native code of the previous stage,
|
|
the type of the value it computes must be identical to/ABI-compatible with the
|
|
type in the previous stage. For example, if we add a new parameter to `Macro`
|
|
or reorder constructors in `ParserDescr`, building the stage with the
|
|
interpreter will likely fail. Thus we need to set `interpreter.prefer_native`
|
|
to `true` in such cases to "freeze" meta code at their versions in the
|
|
previous stage; no new meta code should be introduced in this stage. Any
|
|
further stages (e.g. after an `update-stage0`) will then need to be compiled
|
|
with the flag set to `false` again since they will expect the new signature.
|
|
|
|
For an example, see https://github.com/leanprover/lean4/commit/da4c46370d85add64ef7ca5e7cc4638b62823fbb.
|
|
|
|
* For the special case of *quotations*, it is desirable to have changes in
|
|
built-in parsers affect them immediately: when the changes in the parser
|
|
become active in the next stage, macros implemented via quotations should
|
|
generate syntax trees compatible with the new parser, and quotation patterns
|
|
in macro and elaborators should be able to match syntax created by the new
|
|
parser and macros. Since quotations capture the syntax tree structure during
|
|
execution of the current stage and turn it into code for the next stage, we
|
|
need to run the current stage's built-in parsers in quotation via the
|
|
interpreter for this to work. Caveats:
|
|
* Since interpreting full parsers is not nearly as cheap and we rarely change
|
|
built-in syntax, this needs to be opted in using `-Dinternal.parseQuotWithCurrentStage=true`.
|
|
* The parser needs to be reachable via an `import` statement, otherwise the
|
|
version of the previous stage will silently be used.
|
|
* Only the parser code (`Parser.fn`) is affected; all metadata such as leading
|
|
tokens is taken from the previous stage.
|
|
|
|
For an example, see https://github.com/leanprover/lean4/commit/f9dcbbddc48ccab22c7674ba20c5f409823b4cc1#diff-371387aed38bb02bf7761084fd9460e4168ae16d1ffe5de041b47d3ad2d22422
|
|
(from before the flag defaulted to `false`).
|
|
|
|
To modify either of these flags both for building and editing the stdlib, adjust
|
|
the code in `stage0/src/stdlib_flags.h`. The flags will automatically be reset
|
|
on the next `update-stage0` when the file is overwritten with the original
|
|
version in `src/`.
|