@kha I have added a few performance counters.
I collect their values at each snapshot.
Right now, I am printing only the values in the last snapshot, but if we want
we can even display their progress over time.
Right now, I track the following information
- number of allocated closures
- number of allocated constructors/objects
- number of allocated big numbers
@aqjune @nunoplopes: See new tests at tests/lean/run/random.lean
We have two actions in `io`. By default, `io` uses the C++
random number generator, but we can force it to use a `std_gen` with
the action `set_rand_gen`.
def rand (lo : nat := std_range.1) (hi : nat := std_range.2) : io nat
def set_rand_gen : std_gen → io unit
@kha I've added
iterator.extract : iterator -> iterator -> option string
It returns `none` if the iterators are "incompatible".
If this function is inconvenient to use, we can change it and return the
empty string in these cases.
Given iterators `it1` and `it2`, if they are sharing the same string
object in memory, then the cost is O(pos(it2) - pos(it1)).
If not, we have an extra O(N) step where we check whether the strings
being iterated by it1 and it2 are equal (`N` is the size of the strings).
In most applications, I believe the iterators will share the string
object.
I didn't test the code much. BTW, I found an unrelated bug at
vm_string.cpp. So, I'm not very confident this code is rock solid.
It fixes the issue by propagating the correct information to the
equation compiler.
The fix may be a little bit hackish, but it is comapatible with
the approach we are already using: store `m_is_meta` flag in the equation
macro.
Disclaimer: we may still have other instances of this bug, since
the information may still be propagated incorrectly in other places.
I will not refactor this code right now nor accept any PR that
changes the current design. I am busy in other parts of the code
base and do not have time to do the context switch required for
implementing this kind of change and/or review the PR and make sure I'm
happy with it.
`{s with ...}` is now `{..., ..s}`, which more clearly expresses that the
result type is not necessarily equal to the type of `s` (in absence of an
expected type and a structure name, we still default to the type of `s`).
Multiple fallback sources can be given: `{..., ..s, ..t}` will fall back to
searching a field in `s`, then in `t`. The last component can also be `..`,
which will replace any missing fields with a placeholder.
The old notation will be removed in the future.
Now, `cmp` is just a fixed helper function.
In the future, we will be able to use (more efficient) specialized
versions during code generation by defining simp rules.
@kha I added the `d_array` type that we discussed today.
However, the VM implemantion is still using persistent arrays.
If we remove the persistent array support, then code using
hash_map will only be efficient if the hash_map is used linearly.
This is not the case in the reader module because we are planning
to support backtracking.
On the other hand, it is awkward we currently don't have a vanilla
array implementation in the VM. I suspect this will be a problem in
the future.
So, I see the following possibilities:
1- We implement a map data-structure using red-black trees in Lean.
We use this new data-structure to implement all maps in the new reader and
macro expander.
2- We implement a very simple map as a list of pairs.
Then, we replace it in the VM with an efficient implementation.
The VM implementation may use our internal red-black trees.
We may also use a persistent hash table implemented in C++,
but it would be awkward to ask the user to provide a hash function in the reference
implementation (i.e., the one using list of pairs), but not use it
anywhere :)
In contrast, if we use the red-black tree implementation we
would have to ask the user to provide a total order.
It is overkill for the list of pair reference implementation because
we just need an equality test, but, at least, the comparison function
will be used in the implementation.
3- Add types `d_parray` (dependent persistent array) and
`parray` (persistent array). In Lean, they would just wrap the
`d_array` and `array` types. In the VM, `d_array` and `array` would
be implemented using vanilla arrays and `d_parray` and `parray` would
be implemented using persistent arrays. Then, we could have
`d_hash_map`, `hash_map`, `d_phash_map` and `phash_map`. Argh, so many
versions :(
We would use `phash_map` to implement our reader and macro expander.
4- Add a `(persistent : bool := ff)` parameter to `d_array` and
`array` types. The disadvantage of this approach is that it has
a performance impact. The VM implementation would have to check
the `persisent` flag at runtime. The value of this flag is known
at compilation time, but we currently don't have a mechanism
for specializing native builtin C++ implementations for VM functions.
The equation compiler uses different strategies for processing
recursive equations. Some of them may produce unclear runtime cost
model. For example, the following fibonacci functions was running in
linear time instead of exponential time because the equation compiler
used the brec_on recursor.
def fib : nat → nat
| 0 := 1
| 1 := 1
| (n+2) := fib (n+1) + fib n
@dselsam and @jroesch have reported examples were the equation compiler
produces a negative performance impact. The new test (`eval` function)
captures the problem reported by @jroesch. In this example, the runtime
should not depend on the "amount of fuel".
This commit addresses this issue.
closes#1175
The types `string_imp` and `string.iterator_imp` were supposed to be
marked private, but we cannot do it because we need to provide
`string_imp.mk`, `string_imp.cases_on`, `string.iterator_imp.mk` and
`string.iterator_imp.cases_on` in the VM since we use a different
internal representation. Note that marking them as private does not
work since users can still access `string_imp.cases_on` using
meta-programming.
So, we need better support for private declarations.
Missing feature, char literals do not support non ASCII values.
That is, in the current implementation, we cannot write 'α'.
This will be implemented in the future.
The VM native implementation does not behave correctly for huge
strings (i.e., strings with more than 4G characters).
The problem is that the current implementation relies on
```
size_t force_to_size_t(vm_obj const & o, size_t def)
```
We may also have overflow problems in the string.iterator implementation
code. This is not a big deal right now, since I doubt we will try
to process string with more than 2^32 characters.
@Kha the `core_lib` and tests seem to be working correctly, but
we need more tests.
We need this feature for:
1) Defining nonlinear search patterns. Example: (?m <= ?m + 1)
2) Preprocessing recursive equations and support the pattern
refinement approach used in Agda. Example: in Agda, they accept
```
def append {A : Type} : Π (m n : nat), Vec A m -> Vec A n -> Vec A (m + n)
| m n nil ys := ys
| m n (cons m' x xs) ys := cons x (append m' n xs ys)
```
These equations have to be refined. For example, `m` has to be
replaced with `0` (in the first equation), and `succ m'` in the
second. To implement this kind of refinement, we need to convert
the pattern variables (local constants) into metavariables during
elaboration. Then, the unassigned metavariables become local constants
again. This preprocessing step will fix some of the issues on #1594.
To completely fix#1594, we will need yet another preprocessing step
which will implement "complete transition" used in the equation
compiler before we start elim_match.cpp