The compiler frontend implemented in C++ is eagerly inlining local
functions. The new test would take an absurd amount of time without
the new hack.
We remove this hack when we re-implement the compiler frontend in Lean.
@Kha @dselsam:
This hack was preventing us from making `Expr` a "real" Lean type.
This was bad for a few reasons:
- It was hard to extend/modify `Expr` in Lean since we would also have
to modify the C++ code that creates the `Expr` objects with the hidden
fields.
- `Expr.lam` and `Expr.forallE` were not following the Lean layout
standard where we sort fields by size. @Kha: recall we used that to
avoid a UB. The issue with `Expr.lam` and `Expr.forallE` is that they
have a "visible" field (`BinderInfo`), which is smaller than
hidden fields such as hash code.
- `Expr.fvar` had only one field at `Expr.lean,` but four behind the
scenes.
I added a new constructor `Local` that is only accessible from C++.
It is only used in legacy code we inherited from Lean2.
We will eventually delete it.
This refactoring was quite painful since many parts of the codebase
were mixing the new `Expr.fvar` with the old `Expr.local`.
I doubt I would be able to do it without the new staging framework
@Kha built.
BTW, some of the patches are horrible. I didn't care much since we
are going to deleted the super ugly files. That being said,
you should expect new weird bevaior due to `Expr.fvar` vs `Expr.local`.
Next step: use the new `ExprCachedData` to make all `Expr` hidden visibles
accessible from Lean.
checkpoint
@kha @dselsam:
The main motivation for this change are functions such as `panic`.
I marked `panic` with the attribute `@[effectful]`.
Here is the summary of the changes. If `f` is marked as `@[effectful]`
1- Compiler will not perform common subexpression elimination on terms of the form `f ...`.
2- Compiler will not extract closed terms of the form `f ...`.
3- Compiler will throw an error if `f` is partially applied.
The compiler applies transformations `t ==> s` where `t` and `s` are
only provably equal, but not definitionally equal. Moreover, in
dependent type theory `C[t]` may be type correct, but `C[s]` is not
when `t` is provably equal to `s`, but not definitionally equal.
Consider the following example:
```
fun n : nat,
let v : nat := 0+n in
let b : bv (0+n) := bv.mk v in
b
```
If we replace the first `0+n` with `n`, we produce the type incorrect
term
```
fun n : nat,
let v : nat := n in
let b : bv (0+n) := bv.mk v in
b
```
This is incorrect because `b : bv (0+n)` and `bv.mk v : bv n`, and
`0+n` is not definitionally equal to `n`.
We considered two approaches to address this problem:
1- A relaxed type checker (see deleted file `ctype_checker.cpp`).
This approach does not solve the problem, see example in the end
of the commit.
2- Introduce `lc_cast`-applications to "fix" type problems. However,
it seems it is too expensive to detect all places that require a
`lc_ast` application. We did that in a few places, but there is major
omission: when we simplify `x : t := v` to `x : t := w` (like in the
example above), we did not find an efficient way to repair all
affected places. We have implemented the function `replace_fvar_with`,
but we would have to execute after each `x : t := v` ==> `x : t := w`
where `v` is not definitionally equal to `w`. We considered grouping
many `replace_fvar_with` together, but it would still be very
expensive. Basically, for any occurrence of `x` we would have to check
whether the resulting type changed or not. Another issue is that the
restrictions considered at `replace_fvar_with` seem to restrictive:
for example `x` cannot occur in lambda/let-decl types. This happens
very frequently in code that uses types such as `decidable p`.
Remark: note that the Lean3 simplifier does not have support for
simplyfing let-decls for this reason. It offers only two options:
expand the `let`, or skip it.
So, this commit gives up the idea of ensuring that each compiler step
produces a type correct Lean term. That is, we preserve type
information, but we do not guarantee the terms are type correct after
we apply compiler transformations.
Here are examples that demonstrate the `ctype_checker` approach does
not work.
```
def C : bool → Type → Type
| tt α := nat × α
| ff α := string × α
def ex (α β : Type) (x : nat × α) (h₂ : α = β) : nat × β :=
let f : Π (b : bool) (h₁ : b = tt) (α β : Type) (x : C b α) (h₂ : α = β), nat × β
:= λ (b : bool) (h₁ : b = tt) (α β : Type) (x : C b α) (h₂ : α = β),
let x₁ : C b β := @eq.ndrec Type α (λ z : Type, C b z) x β h₂ in
@eq.ndrec bool b (λ z : bool, C z β) x₁ tt h₁
in f tt rfl α β x h₂
```
Now, suppose we want to put in LCNF. Then, we have to decide whether
each `eq.ndrec` will become a cast or not. Now, suppose we use the
compiler type checker `is_def_eq`.
Then, the first `eq.ndrec` application is `@eq.ndrec Type α (λ z :
Type, C b z) x β h₂`, and we have that `x : C b α` and the expected
type is `C b β`, both types are stuck since they reduce to
`bool.cases_on b (string × α) (nat × α)` and `bool.cases_on b (string
× β) (nat × β)` respectively. Thus, they are considered definitionally
equal, and we do not introduce a cast.
The second `eq.ndrec` application is
`@eq.ndrec bool b (λ z : bool, C z β) x₁ tt h₁`, and we have
`x₁ : C b β` and the expected type is `nat × β`, again they are definitionally equal since `C b β`.
Thus, in LCNF, the example above would be:
```
def ex (α β : Type) (x : nat × α) (h₂ : α = β) : nat × β :=
let f : Π (b : bool) (h₁ : b = tt) (α β : Type) (x : C b α) (h₂ : α = β), nat × β
:= λ (b : bool) (h₁ : b = tt) (α β : Type) (x : C b α) (h₂ : α = β), x
in f tt rfl α β x h₂
```
The compiler type checker (`ctype_checker`) would say this is a type correct term.
Now, suppose we reduce `f tt rfl α β x h₂` using `csimp`, this term
would reduce to `x`.
Then, `f` is eliminated since it is not used anymore and we have the new simplified term
```
def ex (α β : Type) (x : nat × α) (h₂ : α = β) : nat × β := x
```
`ctype_checker` will report it to be type incorrect. Since, `x : nat × α`
and the result type is `nat × β`, and both are not stuck.
This example shows that by applying simple reduction rules (zeta/beta
in the example above) we can transform a type correct term into a type
incorrect one.
The type `any` is also problematic. Consider the following example.
```
def ex (x : nat) : real :=
let f : any -> real := fun z, z in
f x
```
It is type correct. If we apply `csimp`, we will get.
```
def ex (x : nat) : real := x
```
which is type incorrect. Again, reduction may transform a type correct term into a type incorrect one.
Note that, I am not using anything fancy here. We are just treating
`any` as definitionally equal to any type.
The LCNF format contained may `let`-declarations of the form
```
x : (fun y, c) a := t
```
where `c` does not depend on `y`.
We reduce them to
```
x : c := t
```
We also move `environment` and `name_generator` to
`type_checker::context`. Reason: the cache assumes the environment did
not change. The (cache) correctness relies on the fact that we don't
reuse free variable identifiers.