Motivation: explicit control flow graph
TODO: disabled `split_entries` for now.
I believe the new feature exposed a bug at `move_to_entries`.
I will fix this new issue in another commit.
We were getting assertion violations when compiling corelib in debug
mode. There were two problems:
1- We were not capturing free variables occurring in types.
2- A paremeter could depend on a free variable associated with a let-declaration.
This feature was originally added to implement the state monad.
It is not needed anymore since we have a better encoding that doesn't
require this feature and avoids memory allocations.
This transformation is useful for caching the construction of closures
at runtime. For example, consider the following piece of code
```
λ α a,
@tree.cases_on a visit._main._lambda_1
(λ a_a a_a_1 a_a_2,
let _x_1 := visit._main ◾ a_a,
_x_2 := visit._main._lambda_5 a_a_1 a_a_2
in bind._main ◾◾◾◾ _x_1 _x_2)
```
where `visit._main._lambda_1` is of the form
```
visit._main._lambda_1 :=
λ _x, ...
```
At runtime, we will create a closure object for `visit._main._lambda_1`
since it has arity 1, but no arguments have been provided.
This commit implements a new transformation that creates an auxiliary
declaration with arity 0.
```
visit._main._closed_1 :=
visit._main._lambda_1
```
Its value is cached by the runtime. That is, the closure is created only
once.
@kha This optimizations reduces the number of closures by another 200k
at `parser1.lean`. We are now under 2million :)
`uint32` is a definition, and `type_checker::whnf` unfolds it.
To preserve the information at `erase_irrelevant`, we use a custom
`whnf_type` method that stops reduction as soon as a builtin runtime
type is found.
A few weeks ago, it was not feasible to inline `bind_mk_res` and
`orelse_mk_res` because the compilation time would increase a lot.
Since then I have improved the heuristics for deciding whether to float
`cases_on` or not.
So, I tried today to mark them with `@[inline]` again.
The corelib build time increased only 1.2 secs, but the `parser1.lean` runtime improved:
Before:
```
num. allocated objects: 18025367
num. allocated closures: 2988476
```
After:
```
num. allocated objects: 15774515
num. allocated closures: 2488695
```
I used my desktop to collect the numbers above.
We do not try to check whether code generation will succeed or not for
some declaration. In the future, we should probably rename it to
`nocode` or something similar.
cc @kha
We must make sure we do not accidentally change the arity of a join
point. The arity is the number of nested lambda expressions.
For example, suppose we have
```
let jp := fun (x : unit), t
```
where `jp` is a join point of arity 1, i.e., `t` is not a lambda.
All "jumps" will be of the form: `jp ()`.
Now, suppose we simplify `t` and it becomes a lambda `fun (y : nat), y`.
We should to represent `jp` as
```
let jp := fun (x : unit) (y : nat), y
```
Because we would be changing `jp`'s arity.
We have the same problem with `cases_on` applications in LCNF.
So, we fix the problem using the same approach: an auxiliary
`let`-declaration. The simplified join point above is encoded as
```
let jp := fun (x : unit),
let _z := fun (y : nat), y
in _z
```
cc @kha This is the bug that I mentioned on slack :)