This PR renames the function `ForwardPattern.dropPrefix?` to
`ForwardPattern.skipPrefix`?
This function `(s : String.Slice) -> Option s.Pos` is not to be confused
with `String.Slice.dropPrefix? : (s : String.Slice) -> Option
String.Slice`.
This PR verifies all of the `String` iterators except for the bytes
iterator by relating them to `String.toList`.
Along the way we define `String.posLE` and `String.posLT` analogously to
`String.posGE` and `String.posGT` and redefine `String.prev` to go
through `String.posLT`.
We also define and verify `String.positionsFrom` and
`String.revPositionsFrom`, which are the obvious generaliziations of
`String.positions` and `String.revPositions` starting at a positions
other than the start/end.
Finally, we get various lemmas about strings and positions, including
some nice induction principles `String.Pos.next_induction` and
`String.Pos.prev_induction`.
Of course, we also have all of the analogous results for `String.Slice`.
This PR reverses the relationship between the `ForwardPattern` and
`ToForwardSearcher` classes.
Previously, it was possible to derive `ForwardPattern` (i.e.,
`dropPrefix?`) from `ToForwardSearcher` (i.e., get an iterator of
`SearchStep (s)`). Now, we give the default instance in the other
direction: it is now possible to derive `ToForwardSearcher` from
`ForwardPattern`. Since it is usually much easier to provide
`ForwardPattern` than `ToForwardSearcher`, this means more shared code,
which pays off double since we will give a correctness proof for the
default implementation in an upcoming PR.
This PR also adds some string lemmas.
This adds `set_option debug.byAsSorry true` and `decreasing_by sorry` to
various files to allow bootstrapping with Config structure changes. These
changes will be restored after the bootstrap dance is complete.
This PR adds the function `Std.Iter.first?` and proves the specification
lemma `Std.Iter.first?_eq_match_step` if the iterator is productive.
The monadic variant on `Std.IterM` is also provided.
We use this new function to fix the default implementation for
`startsWith` and `dropPrefix` on `String` patterns, which used to fail
if the searcher returned a `skip` at the beginning. None of the patterns
we ship out of the box were affected by this, but user-defined patterns
were vulnerable.
---------
Co-authored-by: Paul Reichert <6992158+datokrat@users.noreply.github.com>
This PR moves many constants of the iterator API from `Std.Iterators` to
the `Std` namespace in order to make them more convenient to use. These
constants include, but are not limited to, `Iter`, `IterM` and
`IteratorLoop`. This is a breaking change. If something breaks, try
adding `open Std` in order to make these constants available again. If
some constants in the `Std.Iterators` namespace cannot be found, they
can be found directly in `Std` now.
This PR renames `String.ValidPos` to `String.Pos`, `String.endValidPos`
to `String.endPos` and `String.startValidPos` to `String.startPos`.
Accordingly, the deprecations of `String.Pos` to `String.Pos.Raw` and
`String.endPos` to `String.rawEndPos` are removed early, after an
abbreviated deprecation cycle of two releases.
This PR adds `Std.Slice.Pattern` instances for `p : Char -> Prop` as
long as `DecidablePred p`, to allow things like `"hello".dropWhile (· =
'h')`.
To achieve this, we refactor `ForwardPattern` and friends to be
"non-uniform", i.e., the class is now `ForwardPattern pat`, not
`ForwardPattern ρ` (where `pat : ρ`).
This PR ensures that searching for an empty string returns the expected
pattern of alternating size-zero matches and size-one rejects.
In particular, splitting by an empty string returns an array formed of
the empty string, all of the string's characters as singleton strings,
followed by another empty string. This matches the [Rust
behavior](https://doc.rust-lang.org/std/primitive.str.html#method.split),
for example.
This PR fixes a bug in `String.Slice.takeWhile` which caused it to get
its bookkeeping wrong and panic. The new version only uses safe
operations on `String.Slice.Pos`.
This PR shows that the iterators returned by `String.Slice.split` and
`String.Slice.splitInclusive` are finite as long as the forward matcher
iterator for the pattern is finite (which we already know for all of our
patterns).
At actually also completely redefines the iterators to avoid the inner
loop in `Internal.nextMatch` which generates inefficient code. Instead,
when encountering a mismach from the matcher, we `skip` the split
iterator.
This PR moves many operations involving `String.Pos.Raw` to a the
`String.Pos.Raw` namespace with the eventual aim of freeing up the
`String` namespace to contain operations using `String.ValidPos` (to be
renamed to `String.Pos`) instead.
This PR adds the `String.ValidPos.set` and `String.ValidPos.modify`
functions.
After this PR, `String.pos_lt_eq` is no longer a `simp` lemma. Add
`String.Pos.Raw.lt_iff` as a `simp` lemma if your proofs break.
This PR enforces rules around arithmetic of `String.Pos.Raw`.
Specifically, it adopts the following conventions:
- Byte indices ("ordinals") in strings should be represented using
`String.Pos.Raw`
- Amounts of bytes ("cardinals") in strings should be represented using
`Nat`.
For example, `String.Slice.utf8ByteSize` now returns `Nat` instead of
`String.Pos.Raw`, and there is a new function `String.Slice.rawEndPos`.
Finally, the `HAdd` and `HSub` instances for `String.Pos.Raw` are
reorganized. This is a **breaking change**.
The `HAdd/HSub String.Pos.Raw String.Pos.Raw String.Pos.Raw` instances
have been removed. For the use case of tracking positions relative to
some other position, we instead provide `offsetBy` and `unoffsetBy`
functions. For the use case of advancing/unadvancing a position by an
arbitrary number of bytes, we instead provide `increaseBy` and
`decreaseBy` functions. For
offsetting/unoffsetting/advancing/unadvancing a position `p` by the size
of a string `s` (resp. character `c`), use `s + p`/`p - s`/`p + s`/`p -
s` (resp. `c + p`/`p - c`/`p + c`/`p - c`).