When used with `COMPRESSED_OBJECT_HEADER`, Lean uses a compressed
object header where only 32-bits are reserved for the RC.
The motivation is performance, in our experiments, it is faster to
access a 32-bit counter than a 45-bit one.
With a smaller RC, we can use 8-bits for the memory kind information,
and speedup its access.
The implementation is good enough for implementing extensible parsers,
elaborators and tactics, but there are a few TODOs
1- We should have a better story for standalone applications.
Most of them don't need `evalConst`, and the global table is
just initialization overhead.
2- The global table introduces a dependency on the `Lean.Name`
implementation. So, all standalone applications will depend on it.
3- We are not storing arity 0 constants in the table.
This one should be easy to fix in the future.
The Scala/Clojure approach for persistent arrays works great with our
`reset/reuse`. We seem to be much more efficient than their
implementations because of `reset/reuse`. The new approach also seems
better than the old one implemented in the runtime, and has a few
advantages:
1- The reroot procedure used in the old approach required
synchronization for multi-threaded code, or we would need to perform
deep copies when sending `parray` objects between threads.
2- We don't need any runtime extension for the new approach.
3- The old approach used "trail lists" for undoing array updates.
This works well for bactracking search use cases, but it is bad
in use cases where we are simultaneously updating the persistent
arrays that have shared nodes.
The new `partial def`s allow us to define `fix` in Lean, but the Lean
implementation is not as efficient as the native one. The native one
in C++ use weak pointers to prevent a closure allocation at every
recursive invocation.
This commit also fixes the `fixCore` helper functions that were broken
after we switched to camelCase.
We have updated the test `fix1.lean` to demonstrate the native
implementation is faster. Here are the numbers on my desktop.
```
./run.sh fix1.lean 24
721420279
Time for 'native fix': 816ms
721420279
Time for 'fix in lean': 1.34s
```
In 64-bit machines, the max small nat value should now be (2^63 - 1), and on 32-bit
machines (2^32 - 1).
The main motivation for this modification are the array indexing
operations. With the new representation, if a Nat index is not small,
then it must not be a valid index. This was not true in 64-bit
machines. Example: an array of size 2^33 would fit in memory, and but
an index `i` > 2^32 - 1 would not be a small nat value.
The idea is to avoid allocating tuples when creating the fixpoint of
nary functions. For example, consider the new tests:
- tests/playground/fix.lean
- tests/playground/fix_with_tuples.lean
The second one (`fix_with_tuples`) uses the `fix` operator and tuples. For input = 20,
it creates more than 1 million extra objects. The first implementation
(`fix.lean`) using `fix₂` avoids this overhead.
TODO: Add support for pattern #N with N > 9 at
```
def expand_extern_pattern_aux (args : list string) : nat → string.iterator → string → string
```