@kha: I initially planned to use the UTF8 API only in very special
cases, but I found them to be super useful. They allow us to implement
an efficient String library mostly in Lean.
However, the there was a problem: `abbrev String.Pos := USize`.
This definition is fine for a low level API, but this is not the case
anymore. By having `String.Pos := USize`, we will not be able to
prove natural theorems for the `String` API. For example,
`String.map id s = s` did not hold. We would have to include the
artificial antecedent `s.length <= usizeMax` (or something like this).
I suspect it would be very painful.
So, this commit defines `String.Pos` as `Nat`. The performance
overhead seems to be very small.
The `offset` field is problematic because it prevents us from having an
efficient way of moving back and forth between `String.Pos` and `String.Iterator`.
@kha I temporarily added `String.OldIterator` for making sure the
parser doesn't break. This is a temporary fix that will be eliminated
after we replace `parsec`.
@kha It will be awesome to automate the following idiom with a macro :)
```
def defIndent := 4
def getIndent (o : Options) : Nat := o.get `format.indent defIndent
@[init] def indentOption : IO Unit :=
registerOption `format.indent { defValue := defIndent, group := "format", descr := "indentation" }
```
Both `str` amd `raw_str` are used with string literals. This commit
makes sure we don't need to recompute the nested term
`dlist.singleton (repr s)`. This modification saves .2 secs when
parsing `core.lean` on my MacBook.
cc @kha
A few weeks ago, it was not feasible to inline `bind_mk_res` and
`orelse_mk_res` because the compilation time would increase a lot.
Since then I have improved the heuristics for deciding whether to float
`cases_on` or not.
So, I tried today to mark them with `@[inline]` again.
The corelib build time increased only 1.2 secs, but the `parser1.lean` runtime improved:
Before:
```
num. allocated objects: 18025367
num. allocated closures: 2988476
```
After:
```
num. allocated objects: 15774515
num. allocated closures: 2488695
```
I used my desktop to collect the numbers above.
We want them to be specialized for a given monad stack, but not
inlined. If we inline them, then every occurrence of `whitespace` and
`num` will specialize the nested `take_while?` application.
This is bad since we don't cache them.
Both `alternative` and `monad` implement `applicative`. However,
their default implementations for `seq_right` and `seq_left` are
different. The `alternative` implementation uses the inefficient default
version for `seq_right` available at `applicative`:
```
(seq_right := λ α β a b, const α id <$> a <*> b)
```
instead of the more efficient
```
(seq_right := λ α β x y, x >>= λ _, y)
```
defined at `monad` using the `bind` operator.
This commit makes sure the `applicative` instances for `reader_t`,
`state_t`, `option` and `parsec_t` use the efficient version.
I found the problem when inspecting the generated code for:
```
def symbol (s : string) : parsec' unit :=
(str s *> whitespace) <?> ("'" ++ s ++ "'")
```
@kha I was working in the new declaration type and using tasks there.
Since we don't have tasks yet in Lean, I decided to start refactoring
the `thunk` type. I defined it as:
```
-- TODO(Leo): mark as opaque, it is implemented by the new runtime
structure thunk (α : Type u) : Type u :=
(fn : unit → α)
def thunk.pure {α : Type u} (a : α) : thunk α :=
⟨λ _, a⟩
def thunk.get {α : Type u} (t : thunk α) : α :=
t.fn ()
```
The idea is to use the runtime primitives to implement them.
Then, I realized the support for `thunk`s in the elaborator are quite
hacky. Given `f x`, if `f`'s domain has type `thunk A`, we elaborate
`f x` as `f (fun _, x)` even if `x` has type `thunk A`.
This is quite bad, for example, suppose we have
```
def f (x : thunk A) := ...
```
Then, the following definition is type incorrect.
```
def g (x : thunk A) := f x
```
and we are forced to write
```
def g (x : thunk A) := f (x ())
```
The term `f (x ())` will be elaborated as `f (fun _, x ())` and an
unnecessary closure is created at runtime.
This mechanism inherited from Lean 3 is also incompatible with the
new thunk definition. Given `x : thunk A`, I want to write `x.get`
to retrieve the value instead of `x ()` as in Lean 3.
However, `x.get` expands into the nonsensical `(fun _, x).get`.
So, I decided to view the mapping `A` to `thunk A` as a "coercion".
I used double quotes, because it is a macro instead of a function.
If it were a coercion, then we would be using `thunk.pure` to coerce
values but this is not we want most of the time.
For example, given `f : thunk A -> B` and a term `t : A`, when we write
`f t`, we want it to be converted into `f (fun _, t)` instead of
`f (thunk.pure t)` which would eagerly compute `t`. The transformation
`t` into `fun _, t` is syntactic.
We cannot implement it using type classes. I implemented it as
a hard-coded extra case like the one from `Prop` to `bool`.
We can also add a coercion from `thunk A` to `A` to avoid the `.get`.
That being said, I had a few breakages in the code base since we only
use coercions when the given and expected type do not contain
metavariables.