This commit adds the auxiliary function `expand` to break a nasty
interaction between code specialization, erasure and memory reuse.
Suppose we have
```lean
let new_array := array.update array i v in
have pr : <some property that mentions array>, from <proof>,
f new_array pr
```
Suppose `f` is not marked with `[specialize]`. Then, `pr` is erased and we get:
```lean
let new_array := array.update array i v in
f new_array box(0)
```
If `RC(array) == 1`, then the update performs a destructive update, and
we are happy.
Now, assume that `f` is marked with `[specialize]`.
Moreover, function specialization occurs before erasure since we want to
be able to use the to-be-implemented user-defined rewriting rules after specialization.
When we specialize `f` we compute a closure of its dependencies, and one of them is array because of `pr`.
Thus, we get
```lean
let new_array := array.update array i v in
f_spec new_array array
```
after specialization and erasure, but now, we don't perfom the destructive update.
BTW, I assumed we should be able to reduce the arity of `f_spec` after
erasure since the parameter `array` be dead after it. However, I found the following TODO:
https://github.com/leanprover/lean4/blob/master/src/library/compiler/reduce_arity.cpp#L50
A few commits ago, a few `RBMap` and `RBTree` functions had
the parameters `(lt : a -> a -> Prop) [DecidableRel lt]` instead of
`(lt : a -> a -> Bool)`. Recall that the compiler automatically specializes
functions with arguments of the form `[ ... ]`. Thus, after we moved to
`(lt : a -> a -> Bool)` we have to include `@[specialize]` to force the
compiler to specialize.
Old `Nat.repeat` => `Nat.for`
Old `Nat.mrepeat` => `Nat.mfor`
New `Nat.repeat` has type
```
def repeat {α : Type u} (f : α → α) (n : Nat) (a : α) : α :=
``
`List.repeat` => `List.replicate` (like in Haskell)
Avoid weird `ℕ` in List library
@kha: I initially planned to use the UTF8 API only in very special
cases, but I found them to be super useful. They allow us to implement
an efficient String library mostly in Lean.
However, the there was a problem: `abbrev String.Pos := USize`.
This definition is fine for a low level API, but this is not the case
anymore. By having `String.Pos := USize`, we will not be able to
prove natural theorems for the `String` API. For example,
`String.map id s = s` did not hold. We would have to include the
artificial antecedent `s.length <= usizeMax` (or something like this).
I suspect it would be very painful.
So, this commit defines `String.Pos` as `Nat`. The performance
overhead seems to be very small.
The `offset` field is problematic because it prevents us from having an
efficient way of moving back and forth between `String.Pos` and `String.Iterator`.
@kha I temporarily added `String.OldIterator` for making sure the
parser doesn't break. This is a temporary fix that will be eliminated
after we replace `parsec`.
We can use this primitive to process command line arguments of the form
`-D <key> = <value>`
TODO: allow users to attach `[init]` to definitions of the form
```
@[init] def foo : IO Unit := ...
```
and avoid the awkward auxiliary constant.
@kha I am finding the UTF8 API super useful. So, I am giving nice names
to it. The API is safe for users and the runtime implementation should
match the reference one.
The array read and write operations are now called:
- "Comfortable" version (with runtime bound checks):
`Array.get` and `Array.set` like OCaml.
It is also consistent with `Ref.get` and `Ref.put`,
and `get` and `set` for `MonadState`.
- `Fin` version (without runtime bound checks):
`Array.index` and `Array.update` like in F*.
- `USize` version (without runtime bound checks and unboxing):
`Array.idx` and `Array.updt`.
cc @kha