@kha: I initially planned to use the UTF8 API only in very special
cases, but I found them to be super useful. They allow us to implement
an efficient String library mostly in Lean.
However, the there was a problem: `abbrev String.Pos := USize`.
This definition is fine for a low level API, but this is not the case
anymore. By having `String.Pos := USize`, we will not be able to
prove natural theorems for the `String` API. For example,
`String.map id s = s` did not hold. We would have to include the
artificial antecedent `s.length <= usizeMax` (or something like this).
I suspect it would be very painful.
So, this commit defines `String.Pos` as `Nat`. The performance
overhead seems to be very small.
After we erase types and proofs, `Decidable.toBool` can be replaced with
the identity function since `Decidable A` and `Bool` have the same
runtime representation. By eagerly expanding `toBool`, we introduce
unnecessary `cases` expressions.
`elim_jp1_fn` was incorrectly expanding join points that were used more
than once. The issue is that the `foreach` combinator "may" skip nodes
that have already been visited.
The `offset` field is problematic because it prevents us from having an
efficient way of moving back and forth between `String.Pos` and `String.Iterator`.
@kha I temporarily added `String.OldIterator` for making sure the
parser doesn't break. This is a temporary fix that will be eliminated
after we replace `parsec`.
@kha It will be awesome to automate the following idiom with a macro :)
```
def defIndent := 4
def getIndent (o : Options) : Nat := o.get `format.indent defIndent
@[init] def indentOption : IO Unit :=
registerOption `format.indent { defValue := defIndent, group := "format", descr := "indentation" }
```
We can use this primitive to process command line arguments of the form
`-D <key> = <value>`
TODO: allow users to attach `[init]` to definitions of the form
```
@[init] def foo : IO Unit := ...
```
and avoid the awkward auxiliary constant.
@kha I am finding the UTF8 API super useful. So, I am giving nice names
to it. The API is safe for users and the runtime implementation should
match the reference one.
The array read and write operations are now called:
- "Comfortable" version (with runtime bound checks):
`Array.get` and `Array.set` like OCaml.
It is also consistent with `Ref.get` and `Ref.put`,
and `get` and `set` for `MonadState`.
- `Fin` version (without runtime bound checks):
`Array.index` and `Array.update` like in F*.
- `USize` version (without runtime bound checks and unboxing):
`Array.idx` and `Array.updt`.
cc @kha