A C++ vtable in `external_object` is bad because it prevents users
from implementing external objects in other programming languages.
Another problem was memory leaks caused by the vtable at the
beginning of the object.
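A minimal sketch of the idea, not the actual runtime layout: the object header is plain data and dispatch goes through an explicit table of function pointers. The names `external_class`, `finalize_fn`, `alloc_external` and `dec_ref` are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <type_traits>

// Hypothetical names throughout; this is not the real runtime layout.
// Finalizer supplied by the code that owns the payload.
using finalize_fn = void (*)(void * data);

// Explicit per-class table of plain function pointers. Anything that can
// export a C-callable function can fill one in.
struct external_class {
    finalize_fn m_finalize;
};

// Plain-data header: no virtual functions, so the compiler stores no
// hidden vptr at the start of the object.
struct external_object {
    unsigned         m_rc;     // reference counter
    external_class * m_class;  // explicit dispatch table
    void *           m_data;   // payload owned by the foreign code
};

static_assert(std::is_standard_layout<external_object>::value,
              "external_object must stay plain data");

external_object * alloc_external(external_class * cls, void * data) {
    assert(cls != nullptr && cls->m_finalize != nullptr);
    void * mem = std::malloc(sizeof(external_object));
    assert(mem != nullptr);
    external_object * o = static_cast<external_object *>(mem);
    o->m_rc    = 1;
    o->m_class = cls;
    o->m_data  = data;
    return o;
}

// Drop one reference; the last one runs the finalizer and frees the header.
void dec_ref(external_object * o) {
    assert(o != nullptr && o->m_rc > 0);
    if (--o->m_rc == 0) {
        o->m_class->m_finalize(o->m_data);
        std::free(o);
    }
}
```

With this shape, the only thing a foreign implementation has to provide is the function stored in `m_finalize`.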
cc @kha
The "quick" filter `&s1 != &s2` was incorrect.
It was actually always false, since it was just comparing the stack
addresses of `s1` and `s2`.
I incorporated the quick filter into `string_eq`.
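A rough sketch of what a filter inside `string_eq` can look like, assuming a simplified `string_object` (hypothetical, not the real runtime type). The point is to compare the pointers themselves, i.e. the objects being referred to, rather than the addresses of the local variables that hold them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the runtime string object.
struct string_object {
    std::size_t  m_size;  // length in bytes
    const char * m_data;  // not necessarily null-terminated
};

// Writing `&s1` here would take the address of the parameter slot,
// not of the string object, so the filter must use `s1` and `s2` directly.
bool string_eq(string_object const * s1, string_object const * s2) {
    assert(s1 != nullptr && s2 != nullptr);
    if (s1 == s2) return true;                    // quick filter: same object
    if (s1->m_size != s2->m_size) return false;   // cheap size check
    return std::memcmp(s1->m_data, s2->m_data, s1->m_size) == 0;
}
```

The identity check only short-circuits the positive case; two distinct objects still go through the size and byte comparison.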
I measured the impact using `lean --new-frontend core.lean` and checking
the number of instructions executed reported by Valgrind.
Before: 5,210,225,530
After: 4,891,642,264
@kha
In this commit, we replace the option `LEAN_DEFERRED_FREE` with
`LEAN_LAZY_RC`. The idea is to match the nomenclature used in the
literature. See paper:
https://dl.acm.org/citation.cfm?id=964019
The following slide deck summarizes the paper:
http://www.hboehm.info/popl04/refcnt.pdf
We also implement the very simple approach described in this paper,
where a `del(o)` just puts `o` in the "to free" list, and each
allocation frees at most one object. As pointed out in the paper
above, lazy RC may prevent a lot of memory from being reclaimed.
For now, I am keeping the new option disabled.
That being said, the test `arith_eval_nat.lean` is 29% faster when
using lazy RC, and beats the OCaml version.
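A minimal sketch of the scheme described above, assuming a toy object with a reference count and up to two children. The names `g_to_free`, `g_live`, `free_one` and `alloc` are hypothetical, and the real allocator is not `malloc`-based like this one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical object layout: a reference count and up to two children.
struct object {
    unsigned m_rc;
    object * m_children[2];
};

static std::vector<object *> g_to_free;  // the "to free" list
static std::size_t g_live = 0;           // allocated and not yet reclaimed

// `del(o)` does no recursive work: it only queues `o`.
void del(object * o) { g_to_free.push_back(o); }

void dec_ref(object * o) {
    if (o == nullptr) return;
    assert(o->m_rc > 0);
    if (--o->m_rc == 0) del(o);
}

// Reclaim at most one queued object. Its children are only dec_ref'ed,
// so children that die here are queued, not freed.
void free_one() {
    if (g_to_free.empty()) return;
    object * o = g_to_free.back();
    g_to_free.pop_back();
    for (object * c : o->m_children) dec_ref(c);
    std::free(o);
    --g_live;
}

// Takes ownership of the references to `c0` and `c1`.
object * alloc(object * c0 = nullptr, object * c1 = nullptr) {
    free_one();  // each allocation frees at most one object
    void * mem = std::malloc(sizeof(object));
    assert(mem != nullptr);
    object * o = static_cast<object *>(mem);
    o->m_rc = 1;
    o->m_children[0] = c0;
    o->m_children[1] = c1;
    ++g_live;
    return o;
}
```

In this sketch a dead tree with n nodes needs n later allocations before it is fully reclaimed, which is the space overhead the paper warns about.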
In the following paper
https://www.microsoft.com/en-us/research/wp-content/uploads/2017/01/tm567-1.pdf
a separate thread keeps processing the "to free" list. However, I
think this approach is not compatible with our
`object_memory_kind::STHeap` trick.
Tomorrow, I will measure the space overhead of lazy RC when compiling
the Lean corelib on my Linux desktop.
cc @kha
Some of the primitives do not have an optimal implementation.
@Kha Could you please check if everything we use in the parser has a
reasonable implementation?
The modification introduces an overhead of 1.5% on the
execution time. Here are the times for compiling the corelib:
Before: 8.61 secs (avg of 3 runs)
After: 8.74 secs (avg of 3 runs)
On the other hand, the size of the compacted region for the command
`#compact_tst 10` is smaller.
Before: 176687728
After: 153794704
The size before this change was 14.9% bigger.
For reference, using the old serializer we generate a buffer of size 105291117.
cc @kha