This PR verifies all of the `String` iterators except for the bytes
iterator by relating them to `String.toList`.
Along the way we define `String.posLE` and `String.posLT` analogously to
`String.posGE` and `String.posGT` and redefine `String.prev` to go
through `String.posLT`.
We also define and verify `String.positionsFrom` and
`String.revPositionsFrom`, which are the obvious generaliziations of
`String.positions` and `String.revPositions` starting at a positions
other than the start/end.
Finally, we get various lemmas about strings and positions, including
some nice induction principles `String.Pos.next_induction` and
`String.Pos.prev_induction`.
Of course, we also have all of the analogous results for `String.Slice`.
This PR verifies the `String.Slice.splitToSubslice` function by relating
it to a model implementation `Model.split` based on a
`ForwardPatternModel`.
The proof is generic, so it works for splitting by characters, strings
etc.
From this, we will be able to give user-facing API lemmas for
`String.split` and friends in future PRs.
We also move the verification of string patterns from
`String.Slice.Pattern` to `String.Slice.Pattern.Model` to achieve better
separation between code that users run in their programs and code that
only supports the theory.
This PR gives a proof of `LawfulToForwardSearcherModel` for `Slice`
patterns, which amounts to proving that our implementation of KMP is
correct.
Note that this PR also changes the KMP implementation to make it
slightly more efficient and easier to verify. I also have a correctness
proof for the old implementation, so there were no bugs in the old
implementation.
This PR provides a `LawfulForwardPatternModel` instance for string
patterns, i.e., it proves correctness of the `dropPrefix?` and
`startsWith` functions for string patterns.
Note that this is "just" the correctness proof; there isn't a way to
actually use it yet. API lemmas will follow.
This PR adds `String.Slice.Subslice`, which is an unbundled version of
`String.Slice`.
This type is of interest because it is the correct type for string
searching and splitting operations to land in.
This PR just adds the type with minimal API. Additional API and
subsequent refactoring of the searching and splitting API is left for
future PRs.
This PR reverses the relationship between the `ForwardPattern` and
`ToForwardSearcher` classes.
Previously, it was possible to derive `ForwardPattern` (i.e.,
`dropPrefix?`) from `ToForwardSearcher` (i.e., get an iterator of
`SearchStep (s)`). Now, we give the default instance in the other
direction: it is now possible to derive `ToForwardSearcher` from
`ForwardPattern`. Since it is usually much easier to provide
`ForwardPattern` than `ToForwardSearcher`, this means more shared code,
which pays off double since we will give a correctness proof for the
default implementation in an upcoming PR.
This PR also adds some string lemmas.
This adds `set_option debug.byAsSorry true` and `decreasing_by sorry` to
various files to allow bootstrapping with Config structure changes. These
changes will be restored after the bootstrap dance is complete.
This PR introduces the functions `(String|Slice).posGE` and
`(String|Slice).posGT` will full verification and deprecates
`Slice.findNextPos` in favor of `Slice.posGT`.
The KMP implementation is adapted to use these two new functions.
Various useful string and order lemmas are added along the way.
Also add a `simp` attribute to `Std.le_refl` and fix the resulting
fallout (yes, this would have been better as a separate PR).
This PR adds the function `Std.Iter.first?` and proves the specification
lemma `Std.Iter.first?_eq_match_step` if the iterator is productive.
The monadic variant on `Std.IterM` is also provided.
We use this new function to fix the default implementation for
`startsWith` and `dropPrefix` on `String` patterns, which used to fail
if the searcher returned a `skip` at the beginning. None of the patterns
we ship out of the box were affected by this, but user-defined patterns
were vulnerable.
---------
Co-authored-by: Paul Reichert <6992158+datokrat@users.noreply.github.com>
This PR moves `String.ofList` to `Init.Prelude`. It is a function that
the Lean kernel expects to be present and has special support for (when
reducing string literals). By moving this to `Init.Prelude`, all
declarations that are special to the kernel are in that single module.
Typos in `Init/` and `Std/`.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
This PR makes the `FinitenessRelation` structure, which is helpful when
proving the finiteness of iterators, part of the public API. Previously,
it was marked internal and experimental.
This PR moves many constants of the iterator API from `Std.Iterators` to
the `Std` namespace in order to make them more convenient to use. These
constants include, but are not limited to, `Iter`, `IterM` and
`IteratorLoop`. This is a breaking change. If something breaks, try
adding `open Std` in order to make these constants available again. If
some constants in the `Std.Iterators` namespace cannot be found, they
can be found directly in `Std` now.
This PR adds `@[suggest_for]` annotations to Lean, allowing lean to
provide corrections for `.every` or `.some` methods in place of `.all`
or `.any` methods for most default-imported types (arrays, lists,
strings, substrings, and subarrays, and vectors).
Due to the need for stage0 updates for new annotations, the
`suggest_for` annotation itself was introduced in previous PRs: #11367,
#11529, and #11590.
## Example
```
example := "abc".every (! ·.isWhitespace)
```
Error message:
```
Invalid field `every`: The environment does not contain `String.every`, so it is not possible to project the field `every` from an expression
"abc"
of type `String`
Hint: Perhaps you meant `String.all` in place of `String.every`:
.e̵v̵e̵r̵y̵a̲l̲l̲
```
(the hint is added by this PR)
## Additional changes
Adds suggestions that are not currently active but that can be used to
generate autocompletion suggestions in the reference manual:
- `Either` -> `Except` and `Sum`
- `Exception` -> `Except`
- `ℕ` -> `Nat`
- `Nullable` -> `Option`
- `Maybe` -> `Option`
- `Optional` -> `Option`
- `Result` -> `Except`
This PR introduces a new fixpoint combinator,
`WellFounded.extrinsicFix`. A termination proof, if provided at all, can
be given extrinsically, i.e., looking at the term from the outside, and
is only required if one intends to formally verify the behavior of the
fixpoint. The new combinator is then applied to the iterator API.
Consumers such as `toList` or `ForIn` no longer require a proof that the
underlying iterator is finite. If one wants to ensure the termination of
them intrinsically, there are strictly terminating variants available
as, for example, `it.ensureTermination.toList` instead of `it.toList`.
This PR adds support for underscores as digit separators in
String.toNat?, String.toInt?, and related parsing functions. This makes
the string parsing functions consistent with Lean's numeric literal
syntax, which already supports underscores for readability (e.g.,
100_000_000).
The implementation validates that underscores:
- Cannot appear at the start or end of the number
- Cannot appear consecutively
- Are ignored when calculating the numeric value
This resolves a common source of friction when parsing user input from
command-line arguments, environment variables, or configuration files,
where users naturally expect to use the same numeric syntax they use in
source code.
## Examples
Before:
```lean
#eval "100_000_000".toNat? -- none
```
After:
```lean
#eval "100_000_000".toNat? -- some 100000000
#eval "1_000".toInt? -- some 1000
#eval "-1_000_000".toInt? -- some (-1000000)
```
## Testing
Added comprehensive tests in
`tests/lean/run/string_toNat_underscores.lean` covering:
- Basic underscore support
- Edge cases (leading/trailing/consecutive underscores)
- Both `toNat?` and `toInt?` functions
- String, Slice, and Substring types
All existing tests continue to pass.
Closes#11538🤖 Prepared with Claude Code
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Hi, these are just some spelling corrections.
There is one I wasn't completely sure about in
src/Init/Data/List/Lemmas.lean:
> See also
> ...
> Also
> \* \`Init.Data.List.Monadic\` for **addiation** _(additional?)_ lemmas
about \`List.mapM\` and \`List.forM\`
This PR marks `Char -> Bool` patterns as default instances for string
search. This means that things like `" ".find (·.isWhitespace)` can now
be elaborated without error.
Previously, it was necessary to write `" ".find Char.isWhitespace`.
Thank you to David Christiansen for the idea of using a default
instance.
This PR performs minor maintenance on the String API
- Rename `String.Pos.toCopy` to `String.Pos.copy` to adhere to the
naming convention
- Rename `String.Pos.extract` to `String.extract` to get sane dot
notation again
- Add `String.Slice.Pos.extract`
This PR adds missing docstrings for constants that occur in the
reference manual.
---------
Co-authored-by: Johannes Tantow <44068763+jt0202@users.noreply.github.com>
This PR renames `String.Slice.Pos.ofSlice` to `String.Pos.ofToSlice` to
adhere with the (yet-to-be documented) naming convention for mapping
positions to positions. It then adds several new functions so that for
every way to construct a slice from a string and slice, there are now
functions for mapping positions forwards and backwards along this
construction.
This PR aims to improve the performance of `String.contains`,
`String.find`, etc. when using patterns of type `Char` or `Char -> Bool`
by moving the needle out of the iterator state and thus working around
missing unboxing in the compiler.
This PR updates the `foldr`, `all`, `any` and `contains` functions on
`String` to be defined in terms of their `String.Slice` counterparts.
This is the last one in a long series of PRs. After this, all `String`
operations are polymorphic in the pattern, and no `String` operation
falls back to `String.Pos.Raw` internally (except those in the
`String.Pos.Raw` and `String.Substring.Raw` namespaces of course, which
still play a role in metaprogramming and will stay for the foreseeable
future).
This PR renames `String.bytes` to `String.toByteArray`.
This is for two reasons: first, `toByteArray` is a better name, and
second, we have something else that wants to use the name `bytes`,
namely the function that returns in iterator over the string's bytes.
This PR adds a coercion from `String` to `String.Slice`.
In our envisioned future, most functions operating on strings will
accept `String.Slice` parameters by default (like `str` in Rust), and
this enables calling such functions with arguments of type `String`.
Closes#11298.
This PR renames `String.ValidPos` to `String.Pos`, `String.endValidPos`
to `String.endPos` and `String.startValidPos` to `String.startPos`.
Accordingly, the deprecations of `String.Pos` to `String.Pos.Raw` and
`String.endPos` to `String.rawEndPos` are removed early, after an
abbreviated deprecation cycle of two releases.
This PR cleans up the API around `String.find` and moves it uniformly to
the new position types `String.ValidPos` and `String.Slice.Pos`
Overview:
- To search for a character, character predicate, string or slice in a
string or slice `s`, use `s.find?` or `s.find`.
- To do the same, but starting at a position `p` of a string or slice,
use `p.find?` or `p.find`.
- To do the same but between two positions `p` and `q`, construct the
slice from `p` to `q` and then use `find?` or `find` on that.
- To search backwards, all of the above applies, except that the
function is called `revFind?`, there is no non-question-mark version
(use `getD` if there is a sane default return value in your specific
application), and that you can only search for characters and character
predicates, not strings or slices.
This PR redefines `front` and `back` on `String` to go through
`String.Slice` and adds the new `String` functions `front?`, `back?`,
`positions`, `chars`, `revPositions`, `revChars`, `byteIterator`,
`revBytes`, `lines`.
This PR renames `String.replaceStartEnd` to `String.slice`,
`String.replaceStart` to `String.sliceFrom`, and `String.replaceEnd` to
`String.sliceTo`, and similar for the corresponding functions on
`String.Slice`.
This PR adds `Std.Slice.Pattern` instances for `p : Char -> Prop` as
long as `DecidablePred p`, to allow things like `"hello".dropWhile (· =
'h')`.
To achieve this, we refactor `ForwardPattern` and friends to be
"non-uniform", i.e., the class is now `ForwardPattern pat`, not
`ForwardPattern ρ` (where `pat : ρ`).
This PR adds a function `String.Slice.length`, with the following
deprecation string: There is no constant-time length function on slices.
Use `s.positions.count` instead, or `isEmpty` if you only need to know
whether the slice is empty.
This PR adds the alias `String.Slice.any` for `String.Slice.contains`.
It would probably be even better to only have one, but we don't have a
good mechanism for pointing people looking for one towards the other, so
an alias it is for now.
This PR fixes several memory leaks in the new `String` API.
These leaks are mostly situations where we forgot to put borrowing
annotations. The single
exception is the new `String` constructor `ofByteArray`. It cannot take
the `ByteArray` as
a borrowed argument anymore and must thus free it on its own.