Commit graph

174 commits

Author SHA1 Message Date
Markus Himmel
6cbaada1bf
feat: verification of String.positions, String.chars, String.revPositions, String.revChars, ForIn m String Char (#12456)
This PR verifies all of the `String` iterators except for the bytes
iterator by relating them to `String.toList`.

Along the way we define `String.posLE` and `String.posLT` analogously to
`String.posGE` and `String.posGT` and redefine `String.prev` to go
through `String.posLT`.

We also define and verify `String.positionsFrom` and
`String.revPositionsFrom`, which are the obvious generaliziations of
`String.positions` and `String.revPositions` starting at a positions
other than the start/end.

Finally, we get various lemmas about strings and positions, including
some nice induction principles `String.Pos.next_induction` and
`String.Pos.prev_induction`.

Of course, we also have all of the analogous results for `String.Slice`.
2026-02-12 15:32:44 +00:00
Markus Himmel
7b29425361
chore: simplify Char.toString to String.singleton (#12449)
This PR marks `String.toString_eq_singleton` as a `simp` lemma.
2026-02-12 06:10:36 +00:00
Markus Himmel
5ba467d920
feat: LawfulForwardPatternModel for string patterns (#12360)
This PR provides a `LawfulForwardPatternModel` instance for string
patterns, i.e., it proves correctness of the `dropPrefix?` and
`startsWith` functions for string patterns.

Note that this is "just" the correctness proof; there isn't a way to
actually use it yet. API lemmas will follow.
2026-02-09 09:27:47 +00:00
Markus Himmel
52bc216351
feat: basic infrastructure for verification of string patterns (#12333)
This PR adds the basic typeclasses that will be used in the verification
of our string searching infrastructure.
2026-02-05 16:37:50 +00:00
Markus Himmel
f82c40857b
feat: String.Slice.Subslice (#12322)
This PR adds `String.Slice.Subslice`, which is an unbundled version of
`String.Slice`.

This type is of interest because it is the correct type for string
searching and splitting operations to land in.

This PR just adds the type with minimal API. Additional API and
subsequent refactoring of the searching and splitting API is left for
future PRs.
2026-02-05 10:09:04 +00:00
Sebastian Ullrich
b4d4e371d2
chore: shake core (#12276) 2026-02-05 09:10:32 +00:00
Markus Himmel
54cba90dc5
refactor: derive string searcher from string pattern (#12312)
This PR reverses the relationship between the `ForwardPattern` and
`ToForwardSearcher` classes.

Previously, it was possible to derive `ForwardPattern` (i.e.,
`dropPrefix?`) from `ToForwardSearcher` (i.e., get an iterator of
`SearchStep (s)`). Now, we give the default instance in the other
direction: it is now possible to derive `ToForwardSearcher` from
`ForwardPattern`. Since it is usually much easier to provide
`ForwardPattern` than `ToForwardSearcher`, this means more shared code,
which pays off double since we will give a correctness proof for the
default implementation in an upcoming PR.

This PR also adds some string lemmas.
2026-02-05 07:38:31 +00:00
Kim Morrison
d49e5d8a3d Revert "chore: temporarily disable proofs for bootstrap"
This reverts commit c56a5732a5a215f7b74d3f7a5cefd8612cf50474.
2026-02-05 13:41:34 +11:00
Kim Morrison
7b12b504df chore: temporarily disable proofs for bootstrap
This adds `set_option debug.byAsSorry true` and `decreasing_by sorry` to
various files to allow bootstrapping with Config structure changes. These
changes will be restored after the bootstrap dance is complete.
2026-02-05 13:41:34 +11:00
Markus Himmel
8a9cb6def0
feat: Slice.posGE and Slice.posGT (#12301)
This PR introduces the functions `(String|Slice).posGE` and
`(String|Slice).posGT` will full verification and deprecates
`Slice.findNextPos` in favor of `Slice.posGT`.

The KMP implementation is adapted to use these two new functions.

Various useful string and order lemmas are added along the way.

Also add a `simp` attribute to `Std.le_refl` and fix the resulting
fallout (yes, this would have been better as a separate PR).
2026-02-04 09:45:44 +00:00
Henrik Böving
e11800d3c8
perf: annotate built-in functions with tagged_return (#11549)
This PR annotates built-in `extern` functions with `tagged_return`.
2025-12-08 13:10:55 +00:00
Tom Levy
2ca3bc2859
chore: fix spelling (#11531)
Hi, these are just some spelling corrections.

There is one I wasn't completely sure about in
src/Init/Data/List/Lemmas.lean:

> See also
> ...
> Also
> \* \`Init.Data.List.Monadic\` for **addiation** _(additional?)_ lemmas
about \`List.mapM\` and \`List.forM\`
2025-12-06 13:54:27 +00:00
Joachim Breitner
d4c832ecb0
perf: de-fuel some recursive definitions in Core (#11416)
This PR follows up on #7965 and avoids manual fuel constructions in some
recursive definitions.
2025-12-05 16:16:31 +00:00
Markus Himmel
1ae680c5e2
chore: minor String API improvements (#11439)
This PR performs minor maintenance on the String API

- Rename `String.Pos.toCopy` to `String.Pos.copy` to adhere to the
naming convention
- Rename `String.Pos.extract` to `String.extract` to get sane dot
notation again
- Add `String.Slice.Pos.extract`
2025-12-01 11:44:14 +00:00
David Thrane Christiansen
34adc4d941
doc: add missing docstrings (#11364)
This PR adds missing docstrings for constants that occur in the
reference manual.

---------

Co-authored-by: Johannes Tantow <44068763+jt0202@users.noreply.github.com>
2025-11-26 15:00:50 +00:00
Markus Himmel
d8913f88dc
feat: move String positions between slices (#11380)
This PR renames `String.Slice.Pos.ofSlice` to `String.Pos.ofToSlice` to
adhere with the (yet-to-be documented) naming convention for mapping
positions to positions. It then adds several new functions so that for
every way to construct a slice from a string and slice, there are now
functions for mapping positions forwards and backwards along this
construction.
2025-11-26 11:48:59 +00:00
Markus Himmel
d99c515b16
refactor: String functions foldr, all, any, contains to go trough String.Slice (#11357)
This PR updates the `foldr`, `all`, `any` and `contains` functions on
`String` to be defined in terms of their `String.Slice` counterparts.

This is the last one in a long series of PRs. After this, all `String`
operations are polymorphic in the pattern, and no `String` operation
falls back to `String.Pos.Raw` internally (except those in the
`String.Pos.Raw` and `String.Substring.Raw` namespaces of course, which
still play a role in metaprogramming and will stay for the foreseeable
future).
2025-11-25 15:42:43 +00:00
Markus Himmel
29ac158fcf
feat: String.Pos.le_find (#11354)
This PR adds simple lemmas that show that searching from a position in a
string returns something that is at least that position.
2025-11-25 11:05:58 +00:00
Markus Himmel
151c034f4f
refactor: rename String.bytes to String.toByteArray (#11343)
This PR renames `String.bytes` to `String.toByteArray`.

This is for two reasons: first, `toByteArray` is a better name, and
second, we have something else that wants to use the name `bytes`,
namely the function that returns in iterator over the string's bytes.
2025-11-24 18:59:49 +00:00
Markus Himmel
fa67f300f6
chore: rename String.ValidPos to String.Pos (#11240)
This PR renames `String.ValidPos` to `String.Pos`, `String.endValidPos`
to `String.endPos` and `String.startValidPos` to `String.startPos`.

Accordingly, the deprecations of `String.Pos` to `String.Pos.Raw` and
`String.endPos` to `String.rawEndPos` are removed early, after an
abbreviated deprecation cycle of two releases.
2025-11-24 16:40:21 +00:00
Markus Himmel
e6a07ca6b1
refactor: deprecate String.posOf and variants in favor of unified String.find (#11276)
This PR cleans up the API around `String.find` and moves it uniformly to
the new position types `String.ValidPos` and `String.Slice.Pos`

Overview:

- To search for a character, character predicate, string or slice in a
string or slice `s`, use `s.find?` or `s.find`.
- To do the same, but starting at a position `p` of a string or slice,
use `p.find?` or `p.find`.
- To do the same but between two positions `p` and `q`, construct the
slice from `p` to `q` and then use `find?` or `find` on that.
- To search backwards, all of the above applies, except that the
function is called `revFind?`, there is no non-question-mark version
(use `getD` if there is a sane default return value in your specific
application), and that you can only search for characters and character
predicates, not strings or slices.
2025-11-23 18:39:53 +00:00
Markus Himmel
fba166eea0
chore: expose more String.Slice functions on String (#11308)
This PR redefines `front` and `back` on `String` to go through
`String.Slice` and adds the new `String` functions `front?`, `back?`,
`positions`, `chars`, `revPositions`, `revChars`, `byteIterator`,
`revBytes`, `lines`.
2025-11-23 15:33:16 +00:00
Markus Himmel
dda6885eae
refactor: String.foldl and String.isNat go through String.Slice (#11289)
This PR redefines `String.foldl`, `String.isNat` to use their
`String.Slice` counterparts.
2025-11-21 11:17:50 +00:00
Markus Himmel
51b67385cc
refactor: better name for String.replaceStart and variants (#11290)
This PR renames `String.replaceStartEnd` to `String.slice`,
`String.replaceStart` to `String.sliceFrom`, and `String.replaceEnd` to
`String.sliceTo`, and similar for the corresponding functions on
`String.Slice`.
2025-11-20 16:42:27 +00:00
Henrik Böving
827a96ade3
fix: several memory leaks in the new String API (#11263)
This PR fixes several memory leaks in the new `String` API.

These leaks are mostly situations where we forgot to put borrowing
annotations. The single
exception is the new `String` constructor `ofByteArray`. It cannot take
the `ByteArray` as
a borrowed argument anymore and must thus free it on its own.
2025-11-19 18:23:35 +00:00
Markus Himmel
52d05b6972
refactor: use String.split instead of String.splitOn or String.splitToList (#11250)
This PR introduces a function `String.split` which is based on
`String.Slice.split` and therefore supports all pattern types and
returns a `Std.Iter String.Slice`.

This supersedes the functions `String.splitOn` and `String.splitToList`,
and we remove all all uses of these functions from core. They will be
deprecated in a future PR.

Migrating from `String.splitOn` and `String.splitToList` is easy: we
introduce functions `Iter.toStringList` and `Iter.toStringArray` that
can be used to conveniently go from `Std.Iter String.Slice` to `List
String` and `Array String`, so for example `s.splitOn "foo"` can be
replaced by `s.split "foo" |>.toStringList`.
2025-11-19 09:35:19 +00:00
Markus Himmel
59949f89ee
chore: add function String.Pos.extract (#11251)
This PR is a preparatory bootstrapping PR for #11240.
2025-11-19 08:05:28 +00:00
Markus Himmel
fa5d08b7de
refactor: use String.Slice in String.take and variants (#11180)
This PR redefines `String.take` and variants to operate on
`String.Slice`. While previously functions returning a substring of the
input sometimes returned `String` and sometimes returned
`Substring.Raw`, they now uniformly return `String.Slice`.

This is a BREAKING change, because many functions now have a different
return type. So for example, if `s` is a string and `f` is a function
accepting a string, `f (s.drop 1)` will no longer compile because
`s.drop 1` is a `String.Slice`. To fix this, insert a call to `copy` to
restore the old behavior: `f (s.drop 1).copy`.

Of course, in many cases, there will be more efficient options. For
example, don't write `f <| s.drop 1 |>.copy |>.dropEnd 1 |>.copy`, write
`f <| s.drop 1 |>.dropEnd 1 |>.copy` instead. Also, instead of `(s.drop
1).copy = "Hello"`, write `s.drop 1 == "Hello".toSlice` instead.
2025-11-18 16:13:48 +00:00
Markus Himmel
e301f86c6c
chore: add String.Pos.next (#11238)
This PR is split from a future PR and adds the function
`String.Pos.next`, an alias (and soon to be correct name) of
`String.ValidPos.next`.

This is for boring bootstrapping reasons.
2025-11-18 10:41:22 +00:00
Markus Himmel
f6a9059709
chore: rename String.offsetOfPos to String.Pos.Raw.offsetOfPos (#11218)
This PR renames `String.offsetOfPos` to `String.Pos.Raw.offsetOfPos` to
align with the other `String.Pos.Raw` operations.
2025-11-18 07:24:06 +00:00
Markus Himmel
bf60550ce5
chore: rename Substring to Substring.Raw (#11154)
This PR renames `Substring`  to `Substring.Raw`.

This is to signify its status as a second-class citizen (not deprecated,
but no real plans for verification, like `String.Pos.Raw`) and to free
up the name `Substring` for a possible future type `String.Substring :
String -> Type` so that `s.Substring` is the type of substrings of `s`.

The functions `String.toSubstring` and `String.toSubstring'` will remain
for now for bootstrapping reasons.
2025-11-16 09:30:04 +00:00
Markus Himmel
f1224277e2
perf: improve performance of String.ValidPos (#11142)
This PR aims to bring the performance of `String.ValidPos` closer to
that of `String.Pos.Raw` by adding/correcting `extern` annotations as
needed.

This is in response to a regression observed after #11127. The changes
to the `String` `Parsec` module lead to different compiler behavior for
functions like `strCore` and `natCore`. The new IR *looks* better than
the old IR, but the
[numbers](1e438647ba)
are a bit mixed.
2025-11-11 15:30:47 +00:00
Markus Himmel
2c2fcff4f8
refactor: do not use String.Iterator (#11127)
This PR removes all uses of `String.Iterator` from core, preferring
`String.ValidPos` instead.

In an upcoming PR, `String.Iterator` will be renamed to
`String.Legacy.Iterator`.
2025-11-11 11:46:58 +00:00
Markus Himmel
d24ece1396
feat: String.toList_map (#11021)
This PR adds more theory about `Splits` for strings and deduces the
first user-facing `String` lemma, `String.toList_map`.
2025-11-01 13:54:39 +00:00
Markus Himmel
377f149862
refactor: use String.ofList and String.toList for String <-> List Char conversion (#11017)
This PR establishes `String.ofList` and `String.toList` as the preferred
method for converting between strings and lists of characters and
deprecates the alternatives `String.mk`, `List.asString` and
`String.data`.
2025-10-31 14:41:23 +00:00
Markus Himmel
5af12df54b
chore: add String.ofList redefine String.toList (#11016)
This PR ensures that `String.toList` and `String.ofList` exist and have
the right `extern` annotations.
2025-10-30 07:07:12 +00:00
Kim Morrison
335e34df19
chore: add deprecations for duplicated theorems (#10967) 2025-10-29 05:26:16 +00:00
Markus Himmel
8fe260de55
feat: termination arguments for String.ValidPos and String.Slice.Pos (#10933)
This PR adds the basic infrastructure to perform termination proofs
about `String.ValidPos` and `String.Slice.Pos`.

We choose approach where the intended way to do termination arguments is
to argue about the position itself rather than some projection of it
like `remainingBytes`.

The types `String.ValidPos` and `String.Slice.Pos` are equipped with a
`WellFoundedRelation` instance given by the greater-than relation. This
means that if a function takes a position `p` and performs a recursive
call on `q`, then the decreasing obligation will be `p < q`. This works
well in the common case where `q` is `p.next h`, in which case the goal
`p < p.next h` is solved by the simplifier.

For stepping through a string backwards, we introduce a type synonym
with a `WellFoundedRelation` instance given by the less-than relation.
This means that if a function takes a position `p` and performs a
recursive call on `q` and specifies `termination_by p.down`, then the
decreasing obligation will be `q < p`. This works well in the case where
`q` is `p.prev h`, in which case the goal `p.prev h < p` is solved by
the simplifier.

For termination arguments invoving multiple strings, the lower-level
primitive `p.remainingBytes` (landing in `Nat`) is also available.

In a future PR, we will additionally provide the necessary typeclasses
instances to register `String.ValidPos` and `String.Slice.Pos` with
`grind` to make complex termination arguments more convenient in user
code.
2025-10-27 10:05:44 +00:00
Markus Himmel
59573646c2
chore: more minor String improvements (#10930)
This PR moves some more material out of `Init.Data.String.Basic` and
fixes the incorrect name
`String.Pos.Raw.IsValidForSlice.le_utf8ByteSize`.
2025-10-23 13:57:23 +00:00
Markus Himmel
ba7798b389
chore: more reorganization of strings (#10928)
This PR splits more material out of `Init.Data.String.Basic`.
2025-10-23 11:56:11 +00:00
Rob23oba
fad0e69cc7
fix: make name mangling unambiguous (#10727)
This PR fixes name mangling to be unambiguous / injective by adding `00`
for disambiguation where necessary. Additionally, the inverse function,
`Lean.Name.unmangle` has been added which can be used to unmangle a
mangled identifier. This unmangler has been added to demonstrate the
injectivity but also to allow unmangling identifiers e.g. for debugging
purposes.

Closes #10724
2025-10-23 07:18:07 +00:00
Markus Himmel
3ce7d4ef5c
chore: minor optimizations on the critical path (#10900)
This PR optimizes two `String` proofs and makes sure that
`MkIffOfInductiveProp` does not import `Lean.Elab.Tactic`, which
previously pushed it to the very end of the import graph.
2025-10-22 19:32:26 +00:00
Markus Himmel
b5dc11e8d3
chore: move some material out of Init.Data.String.Basic (#10893)
This PR splits some low-hanging fruit out of `Init.Data.String.Basic`:
basic material about `String.Pos.Raw`, `String.Substrig`, and
`String.Iterator`.

More splitting required and the remaining material is quite unorganized,
but it's a start.
2025-10-22 16:31:08 +00:00
Markus Himmel
6a1cc7d6b8
chore: minor String improvements (#10891)
This PR renames the cast functions on `String.ValidPos` for `set` and
`modify` to adhere to the established naming convention.

It also fixes two typos and very slighly tweaks the import graph,
shortening the critical path by a negligible amount.
2025-10-22 06:35:51 +00:00
Markus Himmel
b28daa6d60
chore: rename String.endPos -> String.rawEndPos (#10853)
This PR renames `String.endPos` to `String.rawEndPos`, as in a future
release the name `String.endPos` will be taken by the function that is
currently called `String.endValidPos`.
2025-10-21 11:25:30 +00:00
Markus Himmel
196d50156a
fix: logic error in String.Slice.takeWhile (#10868)
This PR fixes a bug in `String.Slice.takeWhile` which caused it to get
its bookkeeping wrong and panic. The new version only uses safe
operations on `String.Slice.Pos`.
2025-10-21 09:52:11 +00:00
Markus Himmel
dad541265c
refactor: move operations on String.Pos.Raw to the String.Pos.Raw namespace (#10735)
This PR moves many operations involving `String.Pos.Raw` to a the
`String.Pos.Raw` namespace with the eventual aim of freeing up the
`String` namespace to contain operations using `String.ValidPos` (to be
renamed to `String.Pos`) instead.

This PR adds the `String.ValidPos.set` and `String.ValidPos.modify`
functions.

After this PR, `String.pos_lt_eq` is no longer a `simp` lemma. Add
`String.Pos.Raw.lt_iff` as a `simp` lemma if your proofs break.
2025-10-18 12:12:55 +00:00
Markus Himmel
ca7a8e18b7
refactor: rename String.split to String.splitToList (#10822)
This PR renames `String.split` to `String.splitToList`, because soon the
name `String.split` will be used by a new implementation which is
superior because it is polymorphic over the pattern kind and it returns
an iterator of slices instead of a list of strings.
2025-10-18 12:12:54 +00:00
Sebastian Ullrich
428355cf02
chore: remove redundant imports in core (#10750) 2025-10-16 20:27:46 +00:00
Markus Himmel
1dae353575
chore: duplicate some String functions ahead of deprecation (#10768)
This PR is split off from #10735 for boring bootstrapping reasons.
2025-10-14 07:36:05 +00:00