Hi, these are just some spelling corrections.
There is one I wasn't completely sure about in
src/Init/Data/List/Lemmas.lean:
> See also
> ...
> Also
> \* \`Init.Data.List.Monadic\` for **addiation** _(additional?)_ lemmas
about \`List.mapM\` and \`List.forM\`
This PR performs minor maintenance on the String API:
- Rename `String.Pos.toCopy` to `String.Pos.copy` to adhere to the
naming convention
- Rename `String.Pos.extract` to `String.extract` to get sane dot
notation again
- Add `String.Slice.Pos.extract`
This PR adds missing docstrings for constants that occur in the
reference manual.
---------
Co-authored-by: Johannes Tantow <44068763+jt0202@users.noreply.github.com>
This PR renames `String.Slice.Pos.ofSlice` to `String.Pos.ofToSlice` to
adhere to the (yet-to-be documented) naming convention for mapping
positions to positions. It then adds several new functions so that for
every way to construct a slice from a string or slice, there are now
functions for mapping positions forwards and backwards along this
construction.
This PR updates the `foldr`, `all`, `any` and `contains` functions on
`String` to be defined in terms of their `String.Slice` counterparts.
This is the last one in a long series of PRs. After this, all `String`
operations are polymorphic in the pattern, and no `String` operation
falls back to `String.Pos.Raw` internally (except those in the
`String.Pos.Raw` and `String.Substring.Raw` namespaces of course, which
still play a role in metaprogramming and will stay for the foreseeable
future).
This PR renames `String.bytes` to `String.toByteArray`.
This is for two reasons: first, `toByteArray` is a better name, and
second, we have something else that wants to use the name `bytes`,
namely the function that returns an iterator over the string's bytes.
This PR renames `String.ValidPos` to `String.Pos`, `String.endValidPos`
to `String.endPos` and `String.startValidPos` to `String.startPos`.
Accordingly, the deprecations of `String.Pos` to `String.Pos.Raw` and
`String.endPos` to `String.rawEndPos` are removed early, after an
abbreviated deprecation cycle of two releases.
This PR cleans up the API around `String.find` and moves it uniformly to
the new position types `String.ValidPos` and `String.Slice.Pos`.
Overview:
- To search for a character, character predicate, string or slice in a
string or slice `s`, use `s.find?` or `s.find`.
- To do the same, but starting at a position `p` of a string or slice,
use `p.find?` or `p.find`.
- To do the same but between two positions `p` and `q`, construct the
slice from `p` to `q` and then use `find?` or `find` on that.
- To search backwards, all of the above applies, except that the
function is called `revFind?`, there is no non-question-mark version
(use `getD` if there is a sane default return value in your specific
application), and that you can only search for characters and character
predicates, not strings or slices.
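
As a rough illustration of the overview above, the following sketch uses only `find?` and `revFind?` as named in this PR. The definition names are hypothetical, the return types are deliberately left for Lean to infer, and the snippet assumes a toolchain that includes this change.

```lean
-- Hypothetical helper names; only `find?` and `revFind?` come from the PR text.

-- Search forwards for a character.
def firstComma (s : String) := s.find? ','

-- Search forwards for the first character satisfying a predicate.
def firstDigit (s : String) := s.find? Char.isDigit

-- Search forwards for a string pattern.
def firstArrow (s : String) := s.find? "->"

-- Search backwards; only characters and character predicates are supported.
def lastComma (s : String) := s.revFind? ','
```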
This PR redefines `front` and `back` on `String` to go through
`String.Slice` and adds the new `String` functions `front?`, `back?`,
`positions`, `chars`, `revPositions`, `revChars`, `byteIterator`,
`revBytes`, `lines`.
This PR renames `String.replaceStartEnd` to `String.slice`,
`String.replaceStart` to `String.sliceFrom`, and `String.replaceEnd` to
`String.sliceTo`, and similar for the corresponding functions on
`String.Slice`.
This PR fixes several memory leaks in the new `String` API.
These leaks are mostly situations where we forgot to put borrowing
annotations. The single
exception is the new `String` constructor `ofByteArray`. It cannot take
the `ByteArray` as
a borrowed argument anymore and must thus free it on its own.
This PR introduces a function `String.split` which is based on
`String.Slice.split` and therefore supports all pattern types and
returns a `Std.Iter String.Slice`.
This supersedes the functions `String.splitOn` and `String.splitToList`,
and we remove all uses of these functions from core. They will be
deprecated in a future PR.
Migrating from `String.splitOn` and `String.splitToList` is easy: we
introduce functions `Iter.toStringList` and `Iter.toStringArray` that
can be used to conveniently go from `Std.Iter String.Slice` to `List
String` and `Array String`, so for example `s.splitOn "foo"` can be
replaced by `s.split "foo" |>.toStringList`.
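
A minimal sketch of the migration described above, assuming a toolchain that includes `String.split`, `Iter.toStringList` and `Iter.toStringArray` as introduced here. The helper names are hypothetical.

```lean
-- Previously: `s.splitOn ","`.
def commaFields (s : String) : List String :=
  s.split "," |>.toStringList

-- The same split, collected into an array instead of a list.
def commaFieldsArray (s : String) : Array String :=
  s.split "," |>.toStringArray
```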
This PR redefines `String.take` and variants to operate on
`String.Slice`. While previously functions returning a substring of the
input sometimes returned `String` and sometimes returned
`Substring.Raw`, they now uniformly return `String.Slice`.
This is a BREAKING change, because many functions now have a different
return type. So for example, if `s` is a string and `f` is a function
accepting a string, `f (s.drop 1)` will no longer compile because
`s.drop 1` is a `String.Slice`. To fix this, insert a call to `copy` to
restore the old behavior: `f (s.drop 1).copy`.
Of course, in many cases, there will be more efficient options. For
example, don't write `f <| s.drop 1 |>.copy |>.dropEnd 1 |>.copy`, write
`f <| s.drop 1 |>.dropEnd 1 |>.copy` instead. Also, instead of `(s.drop
1).copy = "Hello"`, write `s.drop 1 == "Hello".toSlice`.
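
The migration patterns above can be sketched as follows. Here `shout` is a hypothetical stand-in for any function that expects a `String`, and the snippet assumes a toolchain where `drop` and `dropEnd` already return `String.Slice`.

```lean
-- Hypothetical consumer that requires a `String`.
def shout (s : String) : String := s ++ "!"

-- `s.drop 1` is a `String.Slice`, so copy it before passing it on.
def dropFirst (s : String) : String :=
  shout (s.drop 1).copy

-- Chain the slice operations and copy once at the end.
def dropBothEnds (s : String) : String :=
  shout <| s.drop 1 |>.dropEnd 1 |>.copy

-- Compare slices directly instead of copying first.
def restIsHello (s : String) : Bool :=
  s.drop 1 == "Hello".toSlice
```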
This PR is split from a future PR and adds the function
`String.Pos.next`, an alias (and soon to be correct name) of
`String.ValidPos.next`.
This is for boring bootstrapping reasons.
This PR renames `Substring` to `Substring.Raw`.
This is to signify its status as a second-class citizen (not deprecated,
but no real plans for verification, like `String.Pos.Raw`) and to free
up the name `Substring` for a possible future type `String.Substring :
String -> Type` so that `s.Substring` is the type of substrings of `s`.
The functions `String.toSubstring` and `String.toSubstring'` will remain
for now for bootstrapping reasons.
This PR aims to bring the performance of `String.ValidPos` closer to
that of `String.Pos.Raw` by adding/correcting `extern` annotations as
needed.
This is in response to a regression observed after #11127. The changes
to the `String` `Parsec` module led to different compiler behavior for
functions like `strCore` and `natCore`. The new IR *looks* better than
the old IR, but the
[numbers](1e438647ba)
are a bit mixed.
This PR removes all uses of `String.Iterator` from core, preferring
`String.ValidPos` instead.
In an upcoming PR, `String.Iterator` will be renamed to
`String.Legacy.Iterator`.
This PR establishes `String.ofList` and `String.toList` as the preferred
method for converting between strings and lists of characters and
deprecates the alternatives `String.mk`, `List.asString` and
`String.data`.
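
A small sketch of the preferred conversions; the definition names are hypothetical.

```lean
def exampleChars : List Char := ['L', 'e', 'a', 'n']

-- Preferred over `String.mk` and `List.asString`.
def exampleString : String := String.ofList exampleChars

-- Preferred over `String.data`.
def exampleRoundTrip : List Char := exampleString.toList
```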
This PR adds the basic infrastructure to perform termination proofs
about `String.ValidPos` and `String.Slice.Pos`.
We choose an approach where the intended way to do termination arguments is
to argue about the position itself rather than some projection of it
like `remainingBytes`.
The types `String.ValidPos` and `String.Slice.Pos` are equipped with a
`WellFoundedRelation` instance given by the greater-than relation. This
means that if a function takes a position `p` and performs a recursive
call on `q`, then the decreasing obligation will be `p < q`. This works
well in the common case where `q` is `p.next h`, in which case the goal
`p < p.next h` is solved by the simplifier.
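
A minimal sketch of the forward case, using the names in use at the time of this PR. It assumes that `p.next h` expects a proof `h : p ≠ s.endValidPos` and that equality of positions is decidable; `countFrom` is a hypothetical function.

```lean
-- Counts the positions from `p` up to the end of `s`.
def countFrom (s : String) (p : s.ValidPos) : Nat :=
  if h : p = s.endValidPos then
    0
  else
    -- Recursive call on `p.next h`; the obligation is `p < p.next h`.
    1 + countFrom s (p.next h)
termination_by p
```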
For stepping through a string backwards, we introduce a type synonym
with a `WellFoundedRelation` instance given by the less-than relation.
This means that if a function takes a position `p` and performs a
recursive call on `q` and specifies `termination_by p.down`, then the
decreasing obligation will be `q < p`. This works well in the case where
`q` is `p.prev h`, in which case the goal `p.prev h < p` is solved by
the simplifier.
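
The backward case can be sketched in the same way. This assumes that `p.prev h` expects a proof `h : p ≠ s.startValidPos` and that equality of positions is decidable; `countBack` is a hypothetical function.

```lean
-- Counts the positions from `p` back to the start of `s`.
def countBack (s : String) (p : s.ValidPos) : Nat :=
  if h : p = s.startValidPos then
    0
  else
    -- Recursive call on `p.prev h`; the obligation is `p.prev h < p`.
    1 + countBack s (p.prev h)
termination_by p.down
```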
For termination arguments involving multiple strings, the lower-level
primitive `p.remainingBytes` (landing in `Nat`) is also available.
In a future PR, we will additionally provide the necessary typeclass
instances to register `String.ValidPos` and `String.Slice.Pos` with
`grind` to make complex termination arguments more convenient in user
code.
This PR fixes name mangling to be unambiguous / injective by adding `00`
for disambiguation where necessary. Additionally, the inverse function,
`Lean.Name.unmangle` has been added which can be used to unmangle a
mangled identifier. This unmangler has been added to demonstrate the
injectivity but also to allow unmangling identifiers e.g. for debugging
purposes.
Closes #10724
This PR optimizes two `String` proofs and makes sure that
`MkIffOfInductiveProp` does not import `Lean.Elab.Tactic`, which
previously pushed it to the very end of the import graph.
This PR splits some low-hanging fruit out of `Init.Data.String.Basic`:
basic material about `String.Pos.Raw`, `String.Substring`, and
`String.Iterator`.
More splitting is required and the remaining material is quite unorganized,
but it's a start.
This PR renames the cast functions on `String.ValidPos` for `set` and
`modify` to adhere to the established naming convention.
It also fixes two typos and very slightly tweaks the import graph,
shortening the critical path by a negligible amount.
This PR renames `String.endPos` to `String.rawEndPos`, as in a future
release the name `String.endPos` will be taken by the function that is
currently called `String.endValidPos`.
This PR fixes a bug in `String.Slice.takeWhile` which caused it to get
its bookkeeping wrong and panic. The new version only uses safe
operations on `String.Slice.Pos`.
This PR moves many operations involving `String.Pos.Raw` to the
`String.Pos.Raw` namespace with the eventual aim of freeing up the
`String` namespace to contain operations using `String.ValidPos` (to be
renamed to `String.Pos`) instead.
This PR adds the `String.ValidPos.set` and `String.ValidPos.modify`
functions.
After this PR, `String.pos_lt_eq` is no longer a `simp` lemma. Add
`String.Pos.Raw.lt_iff` as a `simp` lemma if your proofs break.
This PR renames `String.split` to `String.splitToList`, because soon the
name `String.split` will be used by a new implementation which is
superior because it is polymorphic over the pattern kind and it returns
an iterator of slices instead of a list of strings.
This PR enforces rules around arithmetic of `String.Pos.Raw`.
Specifically, it adopts the following conventions:
- Byte indices ("ordinals") in strings should be represented using
`String.Pos.Raw`
- Amounts of bytes ("cardinals") in strings should be represented using
`Nat`.
For example, `String.Slice.utf8ByteSize` now returns `Nat` instead of
`String.Pos.Raw`, and there is a new function `String.Slice.rawEndPos`.
Finally, the `HAdd` and `HSub` instances for `String.Pos.Raw` are
reorganized. This is a **breaking change**.
The `HAdd/HSub String.Pos.Raw String.Pos.Raw String.Pos.Raw` instances
have been removed. For the use case of tracking positions relative to
some other position, we instead provide `offsetBy` and `unoffsetBy`
functions. For the use case of advancing/unadvancing a position by an
arbitrary number of bytes, we instead provide `increaseBy` and
`decreaseBy` functions. For
offsetting/unoffsetting/advancing/unadvancing a position `p` by the size
of a string `s` (resp. character `c`), use `s + p`/`p - s`/`p + s`/`p -
s` (resp. `c + p`/`p - c`/`p + c`/`p - c`).
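
A hedged sketch of the conventions above. It assumes that `increaseBy` takes the position first and a `Nat` byte count second; the other expressions are taken from the description.

```lean
-- Advance a position by an arbitrary number of bytes (a `Nat` cardinal).
example (p : String.Pos.Raw) (n : Nat) : String.Pos.Raw := p.increaseBy n

-- Advance a position by the size of a character or of a string.
example (p : String.Pos.Raw) (c : Char) : String.Pos.Raw := p + c
example (p : String.Pos.Raw) (s : String) : String.Pos.Raw := p + s

-- Amounts of bytes are plain `Nat`s.
example (s : String) : Nat := s.toSlice.utf8ByteSize
```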
This PR defines `ByteArray.validateUTF8`, uses it to show that
`ByteArray.IsValidUtf8` is decidable and redefines `String.fromUTF8` and
friends to use it.
The functions `String.validateUTF8` and `String.utf8DecodeChar?` are
deprecated in favor of the identically named functions in the
`ByteArray` namespace.
This PR ensures that `Substring.beq` is reflexive, and in particular
satisfies the equivalence `ss1 == ss2 <-> ss1.toString = ss2.toString`.
Closes #10511.
Note: I also fixed a strange line in the `String.extract` documentation
which looks like it may have been a copypasta, and added another example
to show how invalid UTF8 positions work, but the doc also makes a point
of saying that it is unspecified so maybe it would be better not to have
the example? 🤷
This PR introduces safe alternatives to `String.Pos` and `Substring`
that can only represent valid positions/slices.
Specifically, the PR
- introduces the predicate `String.Pos.IsValid`;
- proves several nontrivial equivalent conditions for
`String.Pos.IsValid`;
- introduces `String.ValidPos`, which is a `String.Pos` with an
`IsValid` proof;
- introduces `String.Slice`, which is like `Substring` but made from
`String.ValidPos` instead of `Pos`;
- introduces `String.Pos.IsValidForSlice`, which is like
`String.Pos.IsValid` but for slices;
- introduces `String.Slice.Pos`, which is like `String.ValidPos` but for
slices;
- introduces various functions for converting between the two types of
positions.
The API added in this PR is not complete. It will be expanded in future
PRs with additional operations and verification.