fix: make Substring.beq reflexive (#10552)

This PR ensures that `Substring.beq` is reflexive, and in particular
satisfies the equivalence `ss1 == ss2 <-> ss1.toString = ss2.toString`.

Closes #10511.

Note: I also fixed a strange line in the `String.extract` documentation
which looks like it may have been a copypasta, and added another example
to show how invalid UTF8 positions work, but the doc also makes a point
of saying that it is unspecified so maybe it would be better not to have
the example? 🤷
This commit is contained in:
Mario Carneiro 2025-09-25 07:08:41 +02:00 committed by GitHub
parent 055060990c
commit 9f41f3324a
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -2343,7 +2343,8 @@ Examples:
* `"red green blue".extract ⟨3⟩ ⟨0⟩ = ""`
* `"red green blue".extract ⟨0⟩ ⟨100⟩ = "red green blue"`
* `"red green blue".extract ⟨4⟩ ⟨100⟩ = "green blue"`
* `"L∃∀N".extract ⟨2⟩ ⟨100⟩ = "green blue"`
* `"L∃∀N".extract ⟨1⟩ ⟨2⟩ = "∃∀N"`
* `"L∃∀N".extract ⟨2⟩ ⟨100⟩ = ""`
-/
@[extern "lean_string_utf8_extract", expose]
def extract : (@& String) → (@& Pos) → (@& Pos) → String
@ -3410,6 +3411,14 @@ def toNat? (s : Substring) : Option Nat :=
else
none
/--
Given a `Substring`, returns another one which has valid endpoints
and represents the same substring according to `Substring.toString`.
(Note, the substring may still be inverted, i.e. beginning greater than end.)
-/
def repair : Substring → Substring
| ⟨s, b, e⟩ => ⟨s, if b.isValid s then b else s.endPos, if e.isValid s then e else s.endPos⟩
/--
Checks whether two substrings represent equal strings. Usually accessed via the `==` operator.
@ -3417,6 +3426,8 @@ Two substrings do not need to have the same underlying string or the same start
instead, they are equal if they contain the same sequence of characters.
-/
def beq (ss1 ss2 : Substring) : Bool :=
let ss1 := ss1.repair
let ss2 := ss2.repair
ss1.bsize == ss2.bsize && ss1.str.substrEq ss1.startPos ss2.str ss2.startPos ss1.bsize
@[export lean_substring_beq]