Commit graph

238 commits

Author SHA1 Message Date
Markus Himmel
8a9cb6def0
feat: Slice.posGE and Slice.posGT (#12301)
This PR introduces the functions `(String|Slice).posGE` and
`(String|Slice).posGT` will full verification and deprecates
`Slice.findNextPos` in favor of `Slice.posGT`.

The KMP implementation is adapted to use these two new functions.

Various useful string and order lemmas are added along the way.

Also add a `simp` attribute to `Std.le_refl` and fix the resulting
fallout (yes, this would have been better as a separate PR).
2026-02-04 09:45:44 +00:00
Paul Reichert
e7b6bd6734
refactor: rename Iter(M).count to Iter(M).length (#12210)
This PR renames `Iter(M).count` to `Iter(M).length` and updates lots of
lemmas, adding deprecations.
2026-01-29 07:26:13 +00:00
Markus Himmel
ba0e755adc
feat: Std.Iter.first? (#12162)
This PR adds the function `Std.Iter.first?` and proves the specification
lemma `Std.Iter.first?_eq_match_step` if the iterator is productive.

The monadic variant on `Std.IterM` is also provided.

We use this new function to fix the default implementation for
`startsWith` and `dropPrefix` on `String` patterns, which used to fail
if the searcher returned a `skip` at the beginning. None of the patterns
we ship out of the box were affected by this, but user-defined patterns
were vulnerable.

---------

Co-authored-by: Paul Reichert <6992158+datokrat@users.noreply.github.com>
2026-01-27 12:10:16 +00:00
Joachim Breitner
9167b13afa
refactor: move String.ofList to the Prelude (#12029)
This PR moves `String.ofList` to `Init.Prelude`. It is a function that
the Lean kernel expects to be present and has special support for (when
reducing string literals). By moving this to `Init.Prelude`, all
declarations that are special to the kernel are in that single module.
2026-01-19 08:22:13 +00:00
Alok Singh
4c360d50fa
style: fix typos in Init/ and Std/ docstrings (#11864)
Typos in `Init/` and `Std/`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 07:24:07 +00:00
Henrik Böving
4eb5b5776d
perf: inline IsUTF8FirstByte (#11872)
This PR marks IsUTF8FirstByte as inline.

I have a use case where it shows up significantly in the profile.
2026-01-02 11:21:54 +00:00
Paul Reichert
1590a72913
feat: make FinitenessRelation part of the public API (#11789)
This PR makes the `FinitenessRelation` structure, which is helpful when
proving the finiteness of iterators, part of the public API. Previously,
it was marked internal and experimental.
2025-12-29 20:45:41 +00:00
Paul Reichert
5ef0207a85
refactor: remove IteratorCollect (#11706)
This PR removes the `IteratorCollect` type class and hereby simplifies
the iterator API. Its limited advantages did not justify the complexity
cost.
2025-12-17 23:02:33 +00:00
Paul Reichert
c79d74d9a1
refactor: move Iter and others from Std.Iterators to Std (#11446)
This PR moves many constants of the iterator API from `Std.Iterators` to
the `Std` namespace in order to make them more convenient to use. These
constants include, but are not limited to, `Iter`, `IterM` and
`IteratorLoop`. This is a breaking change. If something breaks, try
adding `open Std` in order to make these constants available again. If
some constants in the `Std.Iterators` namespace cannot be found, they
can be found directly in `Std` now.
2025-12-15 08:24:12 +00:00
Robert J. Simmons
f88e503f3d
feat: @[suggest_for] annotations for prompting easy-to-miss names (#11554)
This PR adds `@[suggest_for]` annotations to Lean, allowing lean to
provide corrections for `.every` or `.some` methods in place of `.all`
or `.any` methods for most default-imported types (arrays, lists,
strings, substrings, and subarrays, and vectors).

Due to the need for stage0 updates for new annotations, the
`suggest_for` annotation itself was introduced in previous PRs: #11367,
#11529, and #11590.

## Example
```
example := "abc".every (! ·.isWhitespace)
```

Error message:
```
Invalid field `every`: The environment does not contain `String.every`, so it is not possible to project the field `every` from an expression
  "abc"
of type `String`

Hint: Perhaps you meant `String.all` in place of `String.every`:
  .e̵v̵e̵r̵y̵a̲l̲l̲
```

(the hint is added by this PR)

## Additional changes

Adds suggestions that are not currently active but that can be used to
generate autocompletion suggestions in the reference manual:
 - `Either` -> `Except` and `Sum`
 - `Exception` -> `Except`
 - `ℕ` -> `Nat`
 - `Nullable` -> `Option` 
 - `Maybe` -> `Option`
 - `Optional` -> `Option`
 - `Result` -> `Except`
2025-12-10 22:50:45 +00:00
Paul Reichert
383c0caa91
feat: remove Finite conditions from iterator consumers relying on a new fixpoint combinator (#11038)
This PR introduces a new fixpoint combinator,
`WellFounded.extrinsicFix`. A termination proof, if provided at all, can
be given extrinsically, i.e., looking at the term from the outside, and
is only required if one intends to formally verify the behavior of the
fixpoint. The new combinator is then applied to the iterator API.
Consumers such as `toList` or `ForIn` no longer require a proof that the
underlying iterator is finite. If one wants to ensure the termination of
them intrinsically, there are strictly terminating variants available
as, for example, `it.ensureTermination.toList` instead of `it.toList`.
2025-12-08 16:03:22 +00:00
Henrik Böving
e11800d3c8
perf: annotate built-in functions with tagged_return (#11549)
This PR annotates built-in `extern` functions with `tagged_return`.
2025-12-08 13:10:55 +00:00
Kim Morrison
6cbcbce750
feat: support underscores in String.toNat? and String.toInt? (#11541)
This PR adds support for underscores as digit separators in
String.toNat?, String.toInt?, and related parsing functions. This makes
the string parsing functions consistent with Lean's numeric literal
syntax, which already supports underscores for readability (e.g.,
100_000_000).

The implementation validates that underscores:
- Cannot appear at the start or end of the number
- Cannot appear consecutively
- Are ignored when calculating the numeric value

This resolves a common source of friction when parsing user input from
command-line arguments, environment variables, or configuration files,
where users naturally expect to use the same numeric syntax they use in
source code.

## Examples

Before:
```lean
#eval "100_000_000".toNat?  -- none
```

After:
```lean
#eval "100_000_000".toNat?  -- some 100000000
#eval "1_000".toInt?        -- some 1000
#eval "-1_000_000".toInt?   -- some (-1000000)
```

## Testing

Added comprehensive tests in
`tests/lean/run/string_toNat_underscores.lean` covering:
- Basic underscore support
- Edge cases (leading/trailing/consecutive underscores)
- Both `toNat?` and `toInt?` functions
- String, Slice, and Substring types

All existing tests continue to pass.

Closes #11538

🤖 Prepared with Claude Code

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-08 03:57:55 +00:00
Tom Levy
2ca3bc2859
chore: fix spelling (#11531)
Hi, these are just some spelling corrections.

There is one I wasn't completely sure about in
src/Init/Data/List/Lemmas.lean:

> See also
> ...
> Also
> \* \`Init.Data.List.Monadic\` for **addiation** _(additional?)_ lemmas
about \`List.mapM\` and \`List.forM\`
2025-12-06 13:54:27 +00:00
Joachim Breitner
d4c832ecb0
perf: de-fuel some recursive definitions in Core (#11416)
This PR follows up on #7965 and avoids manual fuel constructions in some
recursive definitions.
2025-12-05 16:16:31 +00:00
Markus Himmel
e548fa414c
fix: Char -> Bool as default instance for string search (#11503)
This PR marks `Char -> Bool` patterns as default instances for string
search. This means that things like `" ".find (·.isWhitespace)` can now
be elaborated without error.

Previously, it was necessary to write `" ".find Char.isWhitespace`.

Thank you to David Christiansen for the idea of using a default
instance.
2025-12-04 09:25:16 +00:00
Markus Himmel
1ae680c5e2
chore: minor String API improvements (#11439)
This PR performs minor maintenance on the String API

- Rename `String.Pos.toCopy` to `String.Pos.copy` to adhere to the
naming convention
- Rename `String.Pos.extract` to `String.extract` to get sane dot
notation again
- Add `String.Slice.Pos.extract`
2025-12-01 11:44:14 +00:00
David Thrane Christiansen
34adc4d941
doc: add missing docstrings (#11364)
This PR adds missing docstrings for constants that occur in the
reference manual.

---------

Co-authored-by: Johannes Tantow <44068763+jt0202@users.noreply.github.com>
2025-11-26 15:00:50 +00:00
Markus Himmel
5fb25fff06
feat: grind instances for String.Pos and variants (#11384)
This PR adds the necessary instances for `grind` to reason about
`String.Pos.Raw`, `String.Pos` and `String.Slice.Pos`.
2025-11-26 13:59:01 +00:00
Markus Himmel
d8913f88dc
feat: move String positions between slices (#11380)
This PR renames `String.Slice.Pos.ofSlice` to `String.Pos.ofToSlice` to
adhere with the (yet-to-be documented) naming convention for mapping
positions to positions. It then adds several new functions so that for
every way to construct a slice from a string and slice, there are now
functions for mapping positions forwards and backwards along this
construction.
2025-11-26 11:48:59 +00:00
Markus Himmel
5a5f8c4c2e
perf: unbundle needle from char/pred pattern (#11376)
This PR aims to improve the performance of `String.contains`,
`String.find`, etc. when using patterns of type `Char` or `Char -> Bool`
by moving the needle out of the iterator state and thus working around
missing unboxing in the compiler.
2025-11-26 07:30:29 +00:00
Markus Himmel
85d7f3321c
feat: String.Slice.toInt? (#11358)
This PR adds `String.Slice.toInt?` and variants.

Closes #11275.
2025-11-25 15:48:41 +00:00
Markus Himmel
d99c515b16
refactor: String functions foldr, all, any, contains to go trough String.Slice (#11357)
This PR updates the `foldr`, `all`, `any` and `contains` functions on
`String` to be defined in terms of their `String.Slice` counterparts.

This is the last one in a long series of PRs. After this, all `String`
operations are polymorphic in the pattern, and no `String` operation
falls back to `String.Pos.Raw` internally (except those in the
`String.Pos.Raw` and `String.Substring.Raw` namespaces of course, which
still play a role in metaprogramming and will stay for the foreseeable
future).
2025-11-25 15:42:43 +00:00
Markus Himmel
29ac158fcf
feat: String.Pos.le_find (#11354)
This PR adds simple lemmas that show that searching from a position in a
string returns something that is at least that position.
2025-11-25 11:05:58 +00:00
Markus Himmel
151c034f4f
refactor: rename String.bytes to String.toByteArray (#11343)
This PR renames `String.bytes` to `String.toByteArray`.

This is for two reasons: first, `toByteArray` is a better name, and
second, we have something else that wants to use the name `bytes`,
namely the function that returns in iterator over the string's bytes.
2025-11-24 18:59:49 +00:00
Markus Himmel
96c4b9ee4d
feat: coercion from String to String.Slice (#11341)
This PR adds a coercion from `String` to `String.Slice`.

In our envisioned future, most functions operating on strings will
accept `String.Slice` parameters by default (like `str` in Rust), and
this enables calling such functions with arguments of type `String`.

Closes #11298.
2025-11-24 16:50:08 +00:00
Markus Himmel
fa67f300f6
chore: rename String.ValidPos to String.Pos (#11240)
This PR renames `String.ValidPos` to `String.Pos`, `String.endValidPos`
to `String.endPos` and `String.startValidPos` to `String.startPos`.

Accordingly, the deprecations of `String.Pos` to `String.Pos.Raw` and
`String.endPos` to `String.rawEndPos` are removed early, after an
abbreviated deprecation cycle of two releases.
2025-11-24 16:40:21 +00:00
Markus Himmel
e6a07ca6b1
refactor: deprecate String.posOf and variants in favor of unified String.find (#11276)
This PR cleans up the API around `String.find` and moves it uniformly to
the new position types `String.ValidPos` and `String.Slice.Pos`

Overview:

- To search for a character, character predicate, string or slice in a
string or slice `s`, use `s.find?` or `s.find`.
- To do the same, but starting at a position `p` of a string or slice,
use `p.find?` or `p.find`.
- To do the same but between two positions `p` and `q`, construct the
slice from `p` to `q` and then use `find?` or `find` on that.
- To search backwards, all of the above applies, except that the
function is called `revFind?`, there is no non-question-mark version
(use `getD` if there is a sane default return value in your specific
application), and that you can only search for characters and character
predicates, not strings or slices.
2025-11-23 18:39:53 +00:00
Markus Himmel
fba166eea0
chore: expose more String.Slice functions on String (#11308)
This PR redefines `front` and `back` on `String` to go through
`String.Slice` and adds the new `String` functions `front?`, `back?`,
`positions`, `chars`, `revPositions`, `revChars`, `byteIterator`,
`revBytes`, `lines`.
2025-11-23 15:33:16 +00:00
Markus Himmel
dda6885eae
refactor: String.foldl and String.isNat go through String.Slice (#11289)
This PR redefines `String.foldl`, `String.isNat` to use their
`String.Slice` counterparts.
2025-11-21 11:17:50 +00:00
Markus Himmel
51b67385cc
refactor: better name for String.replaceStart and variants (#11290)
This PR renames `String.replaceStartEnd` to `String.slice`,
`String.replaceStart` to `String.sliceFrom`, and `String.replaceEnd` to
`String.sliceTo`, and similar for the corresponding functions on
`String.Slice`.
2025-11-20 16:42:27 +00:00
Markus Himmel
7267ed707a
feat: string patterns for decidable predicates on Char (#11285)
This PR adds `Std.Slice.Pattern` instances for `p : Char -> Prop` as
long as `DecidablePred p`, to allow things like `"hello".dropWhile (· =
'h')`.

To achieve this, we refactor `ForwardPattern` and friends to be
"non-uniform", i.e., the class is now `ForwardPattern pat`, not
`ForwardPattern ρ` (where `pat : ρ`).
2025-11-20 15:30:37 +00:00
Markus Himmel
f7ed158002
chore: introduce and immediately deprecate String.Slice.length (#11286)
This PR adds a function `String.Slice.length`, with the following
deprecation string: There is no constant-time length function on slices.
Use `s.positions.count` instead, or `isEmpty` if you only need to know
whether the slice is empty.
2025-11-20 14:31:46 +00:00
Markus Himmel
cf0e4441e8
chore: create alias String.Slice.any for String.Slice.contains (#11282)
This PR adds the alias `String.Slice.any` for `String.Slice.contains`.

It would probably be even better to only have one, but we don't have a
good mechanism for pointing people looking for one towards the other, so
an alias it is for now.
2025-11-20 13:21:30 +00:00
Markus Himmel
2c12bc9fdf
chore: more deprecations for string migration (#11281)
This PR adds a few deprecations for functions that never existed but
that are still helpful for people migrating their code post-#11180.
2025-11-20 13:09:52 +00:00
Henrik Böving
827a96ade3
fix: several memory leaks in the new String API (#11263)
This PR fixes several memory leaks in the new `String` API.

These leaks are mostly situations where we forgot to put borrowing
annotations. The single
exception is the new `String` constructor `ofByteArray`. It cannot take
the `ByteArray` as
a borrowed argument anymore and must thus free it on its own.
2025-11-19 18:23:35 +00:00
Henrik Böving
52b687cab4
perf: less allocations when using string patterns (#11255)
This PR reduces the allocations when using string patterns. In
particular
`startsWith`, `dropPrefix?`, `endsWith`, `dropSuffix?` are optimized.
2025-11-19 13:06:27 +00:00
Markus Himmel
52d05b6972
refactor: use String.split instead of String.splitOn or String.splitToList (#11250)
This PR introduces a function `String.split` which is based on
`String.Slice.split` and therefore supports all pattern types and
returns a `Std.Iter String.Slice`.

This supersedes the functions `String.splitOn` and `String.splitToList`,
and we remove all all uses of these functions from core. They will be
deprecated in a future PR.

Migrating from `String.splitOn` and `String.splitToList` is easy: we
introduce functions `Iter.toStringList` and `Iter.toStringArray` that
can be used to conveniently go from `Std.Iter String.Slice` to `List
String` and `Array String`, so for example `s.splitOn "foo"` can be
replaced by `s.split "foo" |>.toStringList`.
2025-11-19 09:35:19 +00:00
Markus Himmel
59949f89ee
chore: add function String.Pos.extract (#11251)
This PR is a preparatory bootstrapping PR for #11240.
2025-11-19 08:05:28 +00:00
Markus Himmel
fa5d08b7de
refactor: use String.Slice in String.take and variants (#11180)
This PR redefines `String.take` and variants to operate on
`String.Slice`. While previously functions returning a substring of the
input sometimes returned `String` and sometimes returned
`Substring.Raw`, they now uniformly return `String.Slice`.

This is a BREAKING change, because many functions now have a different
return type. So for example, if `s` is a string and `f` is a function
accepting a string, `f (s.drop 1)` will no longer compile because
`s.drop 1` is a `String.Slice`. To fix this, insert a call to `copy` to
restore the old behavior: `f (s.drop 1).copy`.

Of course, in many cases, there will be more efficient options. For
example, don't write `f <| s.drop 1 |>.copy |>.dropEnd 1 |>.copy`, write
`f <| s.drop 1 |>.dropEnd 1 |>.copy` instead. Also, instead of `(s.drop
1).copy = "Hello"`, write `s.drop 1 == "Hello".toSlice` instead.
2025-11-18 16:13:48 +00:00
Markus Himmel
03eb2f73ac
chore: deprecate String.toSubstring (#11232)
This PR deprecates `String.toSubstring` in favor of
`String.toRawSubstring` (cf. #11154).
2025-11-18 13:50:50 +00:00
Wrenna Robson
36a6844625
feat: add Std.Trichotomous (#10945)
This PR adds `Std.Tricho r`, a typeclass for relations which identifies
them as trichotomous. This is preferred to `Std.Antisymm (¬ r · ·)` in
all cases (which it is equivalent to).
2025-11-18 13:20:53 +00:00
Markus Himmel
e301f86c6c
chore: add String.Pos.next (#11238)
This PR is split from a future PR and adds the function
`String.Pos.next`, an alias (and soon to be correct name) of
`String.ValidPos.next`.

This is for boring bootstrapping reasons.
2025-11-18 10:41:22 +00:00
Markus Himmel
f6a9059709
chore: rename String.offsetOfPos to String.Pos.Raw.offsetOfPos (#11218)
This PR renames `String.offsetOfPos` to `String.Pos.Raw.offsetOfPos` to
align with the other `String.Pos.Raw` operations.
2025-11-18 07:24:06 +00:00
Markus Himmel
bf60550ce5
chore: rename Substring to Substring.Raw (#11154)
This PR renames `Substring`  to `Substring.Raw`.

This is to signify its status as a second-class citizen (not deprecated,
but no real plans for verification, like `String.Pos.Raw`) and to free
up the name `Substring` for a possible future type `String.Substring :
String -> Type` so that `s.Substring` is the type of substrings of `s`.

The functions `String.toSubstring` and `String.toSubstring'` will remain
for now for bootstrapping reasons.
2025-11-16 09:30:04 +00:00
Markus Himmel
aca297d1c5
chore: some String API cleanup in Lake.Util.Version (#11160)
This PR performs some cleanup in `Lake.Util.Version`.

---------

Co-authored-by: Mac Malone <tydeu@hatpress.net>
2025-11-14 08:56:56 +00:00
Markus Himmel
eb01aaeee4
chore: rename String.Iterator to String.Legacy.Iterator (#11152)
This PR renames `String.Iterator` to `String.Legacy.Iterator`.

From the docstring of `String.Legacy.Iterator`:

> This is a no-longer-supported legacy API that will be removed in a
future release. You should use
> `String.ValidPos` instead, which is similar, but safer. To iterate
over a string `s`, start with
> `p : s.startValidPos`, advance it using `p.next`, access the current
character using `p.get` and
> check if the position is at the end using `p = s.endValidPos` or
`p.IsAtEnd`.
2025-11-13 13:46:22 +00:00
Markus Himmel
f1224277e2
perf: improve performance of String.ValidPos (#11142)
This PR aims to bring the performance of `String.ValidPos` closer to
that of `String.Pos.Raw` by adding/correcting `extern` annotations as
needed.

This is in response to a regression observed after #11127. The changes
to the `String` `Parsec` module lead to different compiler behavior for
functions like `strCore` and `natCore`. The new IR *looks* better than
the old IR, but the
[numbers](1e438647ba)
are a bit mixed.
2025-11-11 15:30:47 +00:00
Markus Himmel
2c2fcff4f8
refactor: do not use String.Iterator (#11127)
This PR removes all uses of `String.Iterator` from core, preferring
`String.ValidPos` instead.

In an upcoming PR, `String.Iterator` will be renamed to
`String.Legacy.Iterator`.
2025-11-11 11:46:58 +00:00
Markus Himmel
d24ece1396
feat: String.toList_map (#11021)
This PR adds more theory about `Splits` for strings and deduces the
first user-facing `String` lemma, `String.toList_map`.
2025-11-01 13:54:39 +00:00