This transformation is useful for caching the construction of closures
at runtime. For example, consider the following piece of code
```
λ α a,
@tree.cases_on a visit._main._lambda_1
(λ a_a a_a_1 a_a_2,
let _x_1 := visit._main ◾ a_a,
_x_2 := visit._main._lambda_5 a_a_1 a_a_2
in bind._main ◾◾◾◾ _x_1 _x_2)
```
where `visit._main._lambda_1` is of the form
```
visit._main._lambda_1 :=
λ _x, ...
```
At runtime, we will create a closure object for `visit._main._lambda_1`
since it has arity 1, but no arguments have been provided.
This commit implements a new transformation that creates an auxiliary
declaration with arity 0.
```
visit._main._closed_1 :=
visit._main._lambda_1
```
Its value is cached by the runtime. That is, the closure is created only
once.
@kha This optimizations reduces the number of closures by another 200k
at `parser1.lean`. We are now under 2million :)
`uint32` is a definition, and `type_checker::whnf` unfolds it.
To preserve the information at `erase_irrelevant`, we use a custom
`whnf_type` method that stops reduction as soon as a builtin runtime
type is found.
We do not try to check whether code generation will succeed or not for
some declaration. In the future, we should probably rename it to
`nocode` or something similar.
cc @kha
We must make sure we do not accidentally change the arity of a join
point. The arity is the number of nested lambda expressions.
For example, suppose we have
```
let jp := fun (x : unit), t
```
where `jp` is a join point of arity 1, i.e., `t` is not a lambda.
All "jumps" will be of the form: `jp ()`.
Now, suppose we simplify `t` and it becomes a lambda `fun (y : nat), y`.
We should to represent `jp` as
```
let jp := fun (x : unit) (y : nat), y
```
Because we would be changing `jp`'s arity.
We have the same problem with `cases_on` applications in LCNF.
So, we fix the problem using the same approach: an auxiliary
`let`-declaration. The simplified join point above is encoded as
```
let jp := fun (x : unit),
let _z := fun (y : nat), y
in _z
```
cc @kha This is the bug that I mentioned on slack :)
The datastructures at `local_context` used to manage used user_names
introduce a lot of overhead. They do guarantee that `get_unused_name` is
`O(log(n))`, but they slowdown much more common operations such as:
local declaration creation/deletion. We create/delete local declarations
much more often than we use `get_unused_name`.
The corelib build time is now 34.18 secs on my desktop. It was 39.5 secs.
@kha: I changed the specialization candidate selection procedure.
Now, instances are not considered for specializations unless we mark
them with `[specialize]`. The idea is that an instance application is
morally just creating the "dictionary" for invoking a polymorphic
function.
@kha I had to add this attribute because the specializer was generated
many specialization candidates for functions that take `[has_tokens ...]`
as an argument. Moreover, these candidates had a lot of
dependencies. I am trying to workaround this issue by marking the
instances with the new attribute `[nospecialize]`.
I did not mark instances created by `[derive]`. It is quite tedious to
do it.
BTW, when I was investigating the problem I stumbled at `node.view`.
Its type is:
```
node.view :
Π {α : Type} {m : Type → Type} [_inst_1 : monad m] [_inst_2 : monad_except (parsec.message syntax) m]
[_inst_3 : monad_parsec syntax m] [_inst_4 : alternative m] (k : syntax_node_kind) (rs : list (m syntax))
[i : @has_view syntax_node_kind k α], @has_view (m syntax) (@node m _inst_1 _inst_2 _inst_3 _inst_4 k rs) α
```
This looks wrong: the view depends on `[monad_parsec syntax m]`
We should also make sure definitions do not have unnecessary type
class instances. Otherwise, we will put additional stress on the code
specializer. One option is to change the frontend and filter unused
instances.