The `induction h` tactic tries to clear hypothesis `h` after it is
applied. Before this commit, however, `cases h` would only try to clear `h`
when performing non-dependent elimination. This inconsistency was problematic
when writing tactic scripts for automating proofs.
The `no_confusion` construction is only generated for inductive
datatypes supported in the kernel.
Before this commit, given `h : T`, `cases h` could leak the internal encoding
used by the inductive compiler when a nested and/or mutual inductive
datatype is used to index the inductive datatype `T`.
The new test exposes the problem.
The solution implemented in this commit uses the `inj_arrow` lemmas
generated by the inductive compiler. We only use the lemmas
if the target is a proposition. If it is not, we signal an error.
The reason for this limitation is documented in the source code.
cc @jroesch @dselsam
Jared: the information leakage has been fixed. So, students will not be
confused by the internal encoding used in the inductive compiler.
I added the example I posted on slack as a new test.
Note that the workaround I used has been removed.
`{s with ...}` is now `{..., ..s}`, which more clearly expresses that the
result type is not necessarily equal to the type of `s` (in absence of an
expected type and a structure name, we still default to the type of `s`).
Multiple fallback sources can be given: `{..., ..s, ..t}` will fall back to
searching for a field in `s`, then in `t`. The last component can also be `..`,
which will replace any missing fields with a placeholder.
The old notation will be removed in the future.
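
A minimal sketch of the new notation, assuming Lean 3 syntax. The structures
`point2`, `zcoord` and `point3`, and all the values, are hypothetical and are
not taken from this commit.

```lean
structure point2 := (x : nat) (y : nat)
structure zcoord := (z : nat)
structure point3 := (x : nat) (y : nat) (z : nat)

def p : point2 := {x := 1, y := 2}
def d : zcoord := {z := 3}

-- Old notation: {p with x := 10}.
-- New notation: explicit fields first, then the fallback source.
def q : point2 := {x := 10, ..p}

-- Two fallback sources: `x` and `y` are found in `p`, `z` is found in `d`.
-- The result type (`point3`) differs from the type of either source.
def r : point3 := {..p, ..d}

example : q.y = 2 := rfl
example : r.z = 3 := rfl
```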
Function applications `(f ...)` were not being elaborated correctly when
`f` has implicit parameters occurring after auto_params.
The new test exposes the problem.
This bug was found when developing the red black tree module.
This commit also fixes the following bugs:
- Invoke type class resolution again after tactic execution in the
`synthesize` method. Reason: metavariables occurring in type
class instances may have been synthesized by tactics.
- The `mctx.assign` optimization in `invoke_tactic` was incorrect
when the metavariable was assigned by typing rules.
Now, `cmp` is just a fixed helper function.
In the future, we will be able to use (more efficient) specialized
versions during code generation by defining simp rules.
The new test exposes the problem. Before this commit, the common
subexpressions in
```
def tst : tree → nat
| (tree.leaf v) := v
| (tree.node v l r) :=
  match f v with
  | tt := tst l + tst l - tst l -- <<< HERE
  | ff := tst r
  end
```
were not converted into let-expressions.
We use the auxiliary procedure `pull_nested_rec_fn` to pull recursive
applications out of nested match expressions. This is needed because the
nested match expression is compiled before we process the recursive
procedure that contains it. This transformation may cause
performance problems if the recursive application does not depend on
the data being matched. Here is an example from the new test:
```
def tst : tree → nat
| (tree.leaf v) := v
| (tree.node v l r) :=
  match f v with
  | tt := tst l
  | ff := tst r
  end
```
`pull_nested_rec_fn` will convert it into
```
def tst : tree → nat
| (tree.leaf v) := v
| (tree.node v l r) := tst._match_1 (f v) (tst l) (tst r)
```
Since our interpreter uses eager evaluation, both `(tst l)` and `(tst r)`
are executed. This commit fixes this issue by expanding `tst._match_1`
during code generation.
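
The cost of passing both branches as arguments can be sketched in isolation,
assuming Lean 3 syntax. Every name below (`choose_branch`, `slow`,
`via_helper`, `via_match`) is hypothetical: `choose_branch` merely stands in
for an auxiliary function such as `tst._match_1`, and none of this code is
taken from the commit or its test.

```lean
-- Hypothetical helper standing in for `tst._match_1`.
def choose_branch (b : bool) (x y : nat) : nat :=
match b with
| tt := x
| ff := y
end

-- Hypothetical recursive function, used only to make each argument costly.
def slow : nat → nat
| 0     := 0
| (n+1) := slow n + 1

-- Both `slow` applications are arguments of an ordinary call, so an eager
-- evaluator computes both before `choose_branch` inspects `b`.
def via_helper (b : bool) : nat :=
choose_branch b (slow 1000) (slow 2000)

-- With the match written in place, only the selected branch is evaluated.
def via_match (b : bool) : nat :=
match b with
| tt := slow 1000
| ff := slow 2000
end
```

Both definitions return the same value for every `b`. They differ only in how
much work an eager evaluator performs to get there.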
@kha I added the `d_array` type that we discussed today.
However, the VM implementation is still using persistent arrays.
If we remove the persistent array support, then code using
hash_map will only be efficient if the hash_map is used linearly.
This is not the case in the reader module because we are planning
to support backtracking.
On the other hand, it is awkward that we currently don't have a vanilla
array implementation in the VM. I suspect this will be a problem in
the future.
So, I see the following possibilities:
1- We implement a map data-structure using red-black trees in Lean.
We use this new data-structure to implement all maps in the new reader and
macro expander.
2- We implement a very simple map as a list of pairs.
Then, we replace it in the VM with an efficient implementation.
The VM implementation may use our internal red-black trees.
We may also use a persistent hash table implemented in C++,
but it would be awkward to ask the user to provide a hash function in the reference
implementation (i.e., the one using list of pairs), but not use it
anywhere :)
In contrast, if we use the red-black tree implementation we
would have to ask the user to provide a total order.
It is overkill for the list-of-pairs reference implementation because
we just need an equality test, but, at least, the comparison function
will be used in the implementation.
3- Add types `d_parray` (dependent persistent array) and
`parray` (persistent array). In Lean, they would just wrap the
`d_array` and `array` types. In the VM, `d_array` and `array` would
be implemented using vanilla arrays and `d_parray` and `parray` would
be implemented using persistent arrays. Then, we could have
`d_hash_map`, `hash_map`, `d_phash_map` and `phash_map`. Argh, so many
versions :(
We would use `phash_map` to implement our reader and macro expander.
4- Add a `(persistent : bool := ff)` parameter to `d_array` and
`array` types. The disadvantage of this approach is that it has
a performance impact. The VM implementation would have to check
the `persistent` flag at runtime. The value of this flag is known
at compilation time, but we currently don't have a mechanism
for specializing native builtin C++ implementations for VM functions.
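
For reference, the list-of-pairs map from option 2 could look roughly like
this, assuming Lean 3 syntax. The names `pair_map`, `empty`, `add` and
`lookup` are hypothetical and do not refer to any existing library
definitions. Lookup needs only a decidable equality test on keys, which is
the point made above about not needing a hash function or a total order.

```lean
-- Hypothetical reference implementation: a map as a list of pairs.
@[reducible] def pair_map (κ : Type) (α : Type) : Type := list (κ × α)

namespace pair_map

def empty {κ α : Type} : pair_map κ α := []

-- New bindings are consed at the front, so they shadow older ones.
def add {κ α : Type} (m : pair_map κ α) (k : κ) (v : α) : pair_map κ α :=
(k, v) :: m

-- Return the most recently added value bound to `k`, if any.
def lookup {κ α : Type} [decidable_eq κ] : pair_map κ α → κ → option α
| []             k := none
| ((k', v) :: m) k := if k' = k then some v else lookup m k

end pair_map

#eval pair_map.lookup (pair_map.add (pair_map.empty : pair_map nat nat) 1 10) 1
```

Because `add` never mutates the existing list, older versions of the map
remain valid, which is what backtracking in the reader would rely on. The
price is a linear-time `lookup`, hence the plan to replace it in the VM.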