lean4-htt/tests/elab/string_neq_kernel_cost.lean
Joachim Breitner c2d4079193
perf: optimize string literal equality simprocs for kernel efficiency (#12887)
This PR optimizes the `String.reduceEq`, `String.reduceNe`, and
`Sym.Simp` string equality simprocs to produce kernel-efficient proofs.
Previously, these used `String.decEq` which forced the kernel to run
UTF-8 encoding/decoding and byte array comparison, causing 86+ kernel
unfoldings on short strings.

The new approach reduces string inequality to `List Char` via
`String.ofList_injective`, then uses two strategies depending on the
difference:

- **Different characters at position `i`**: Projects to `Nat` via
`congrArg (fun l => (List.get!Internal l i).toNat)`, then uses
`Nat.ne_of_beq_eq_false rfl`. This avoids `Decidable` instances entirely
— the kernel only evaluates `Nat.beq` on two concrete natural numbers.

- **One string is a prefix of the other**: Uses `congrArg (List.drop n
·)` with `List.cons_ne_nil`, which is a definitional proof requiring no
`decide` step at all.

For equal strings, `eq_true rfl` avoids kernel evaluation entirely.

The shared proof construction is in `Lean.Meta.mkStringLitNeProof`
(`Lean/Meta/StringLitProof.lean`), used by both the standard simprocs
and the `Sym.Simp` ground evaluator.

Kernel max unfolds for `"hello" ≠ "foo"`: 86+ → 6.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 10:30:31 +00:00

65 lines
1.9 KiB
Text

/-!
Tests that string equality/inequality simprocs produce kernel-efficient proofs.
Previously, `String.reduceNe` and `String.reduceEq` used `decide` with `String.decEq`, which
forced the kernel to run UTF-8 encoding/decoding and byte array comparison. This caused a high
number of kernel unfoldings (e.g. 86 for `List.rec` on short strings).
The fix uses `String.ofList_injective` to reduce to `List Char` comparison, which is much
cheaper in the kernel since string literals are `String.ofList [chars]` at the kernel level.
-/
-- String.reduceNe: different strings (the primary use case)
-- Kernel should NOT unfold ByteArray/utf8 functions
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "hello" ≠ "foo" := by simp
-- String.reduceNe: equal strings
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "hello" ≠ "hello" ↔ False := by simp
-- String.reduceEq: equal strings
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "hello" = "hello" := by simp
-- String.reduceEq: different strings
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "hello" = "foo" ↔ False := by simp
-- Corner case: empty string vs nonempty
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "" ≠ "a" := by simp
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "a" ≠ "" := by simp
-- Corner case: one string is a prefix of the other
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "foo" ≠ "foobar" := by simp
set_option diagnostics true in
set_option diagnostics.threshold 16 in
/-- -/
#guard_msgs in
example : "foobar" ≠ "foo" := by simp