@kha I created a variant of the `rbmap` example where we create a tree
but also save "checkpoints". The idea is to simulate the idiom frequently
used in a backtracking search where we "save" the "context" before each
case split in a "trail stack".
The benchmark has two parameters: the number of nodes to be inserted, and a "frequency" (how often we create a "checkpoint").
The command `rbmap_checkpoint.lean.out n n` behaves like the original
`rbmap` benchmark, and `rbmap_checkpoint.lean.out n 1` creates a
checkpoint after each insertion. The frequency provides a simple way to
control the amount of sharing. We can provide performance numbers for
different frequencies, and show the impact on the `reset/reuse` optimization.
BTW, the performance numbers are much better than I expected. For
example,
`./rbmap_checkpoint.lean.out 1000000 10` is only 30% slower than
`./rbmap_checkpoint.lean.out 1000000 1000000`
although 100k checkpoints were created.
Another good news is that we are faster than Haskell even for
`./rbmap_checkpoint.lean.out 1000000 1`
For lists of size 0, 1 and 2, it avoids the overhead of creating
temporary lists of closures. I measure the overhead with `test1.lean`
and there is no overhead in this case.
`test1.lean` has a test for length = 4, and the overhead is 7%.
We only use longestMatch to implement the Pratt Parser.
The lists should be small. So, the overhead is acceptable.
If it is not. We can add back the `longestMatch` specific for `TermParser`.
cc @kha
The `mforeachAux` function was keeping two references to the array
because it was implemented using `miterate a ⟨a, rfl⟩ ...`
Thus, we would have to allocate a new array even if `a` was not shared.
Another issue is that when invoking `x ← f i v`, the array would still
have a reference to `v`, and consequently `RC(v) > 1`, and `f` would not
be able to perform destructive updates to `v` or reuse its memory cell.
Thus, I removed `mforeach` (we only used it to implement `hmap`: the
homogeneous map), and implemented a new `hmap` which makes sure
destructive updates can be performed modulo the issue with float `let`
inwards I described in the previous commit.
@kha I found the problem described in the previous commit when I was
using `Array.hmap`. If we use `Array`s to implement `Syntax` as we discussed,
then a `hmap` that does not prevent destructive updates from happening is
a must-have. Otherwise, any benefit we get from using `Array`s instead
of `List`s is gone.
@kha I keep finding problems with the float `let` inwards
transformation. It is always a nasty interaction between this
transformation and the `reset/reuse` insertion procedure.
The example I used in the new comment can be modified to a
`casesOn` with more than one branch (e.g., `Option.casesOn`).
Suppose we wrote
```
let o : Option Nat := Array.index a i in
let a := Array.update a i none in
Option.casesOn o
none
(fun n, some (Array.update a i (some (n + 1))))
```
In the example above, the compiler will float
`a := Array.update a i none` inwards.
```
let o : Option Nat := Array.index a i in
Option.casesOn o
none
(fun n,
let a := Array.update a i none in
some (Array.update a i (some (n + 1))))
```
Then, adding reset/reuse:
```
let o : Option Nat := Array.index a i in
Option.casesOn o
none
(fun n,
let o := reset o in
let a := Array.update a i none in
let n := n + 1 in
let o := reuse o (some n)
some (Array.update a i o))
```
Similarly to the example in the new comment, the `reset o` will fail since
the array `a` would still have a reference to `o`.
Remarks:
- Haskell also implements float `let` inwards.
- I am not sure how important the float `let` inward transformation is.
- I can see other nasty interactions after we implement user-defined
simplification rules. For example, I guess many users would find the
following lemma to be a good rewriting rule:
```
(Array.update (Array.update a i v) i w) = (Array.update a i w)
```
However, if we use this lemma in the example above, then `Array.update a i none` will be eliminated,
and `reset o` will fail.