Phase 3 Lean 4.30 String-internals analysis
Attempted to prove `readIdent_app` (the key distribution lemma)
but hit a wall on `String.push c ++ xs.asString = s ++ (c :: xs).asString`
which is not definitionally true in Lean 4.30 (String is
UTF-8 ByteArray-backed, not a List Char structure).
Documents two cleaner paths forward:
(a) Refactor `readIdent`/`readStrLit` to accumulate into
`List Char` instead of `String`, decoupling proofs from
String internals.
(b) Import Mathlib's richer String API which provides the
needed structural lemmas.
The committed state:
· Atomic Phase 3 witnesses via decide (5 theorems, kernel-rooted).
· 3 foundation tokenize lemmas (lparen/rparen/space).
· The full Phase 3 universal lemmas documented inline as
open work, with a proof sketch for both paths.
The token-level universal (already proven) plus closed-instance
decide tests cover the round-trip operationally. Adding Phase 3
proper is a focused multi-day refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
62b73bb176
commit
665046a353
1 changed files with 20 additions and 13 deletions
|
|
@ -821,21 +821,28 @@ theorem tokenize_render_artifact_empty :
|
|||
MetaArtifact.toTokens MetaArtifact.empty := by
|
||||
decide
|
||||
|
||||
-- The recursive arms (`.str`, `.app`, `.lam`, etc.) require the
|
||||
-- four substantial lemmas sketched below — multi-page Lean
|
||||
-- reasoning about `readIdent` / `readStrLit` distribution.
|
||||
-- Documented as future work; the token-level universal above
|
||||
-- plus `decide` on closed instances cover the round-trip
|
||||
-- operationally in the meantime.
|
||||
-- ── Phase 3 distribution lemmas — open question ────────────────────────────
|
||||
-- Proving `tokenize ∘ render = toTokens` requires four substantial
|
||||
-- distribution lemmas:
|
||||
--
|
||||
-- readIdent_app : readIdent on (ident-chars ++ rest) where
|
||||
-- rest starts cleanly returns (acc ++ ident, rest).
|
||||
-- readStrLit_app : readStrLit on (escapeStrLit body ++ '\"' ++ rest)
|
||||
-- returns (body, rest).
|
||||
-- readIdent_app : reading an ident sequence followed by a clean
|
||||
-- boundary returns the accumulated ident.
|
||||
-- readStrLit_app: reading an escapeStrLit-encoded body returns
|
||||
-- the original String.
|
||||
-- tokenize_app_clean: tokenize distributes over a concatenation
|
||||
-- where the prefix ends "cleanly".
|
||||
-- tokenize_render_X: induction over each meta-mirror type using
|
||||
-- the above plus IH on sub-values.
|
||||
-- where the prefix ends "cleanly".
|
||||
-- tokenize_render_X: induction over each meta-mirror type.
|
||||
--
|
||||
-- Each requires reasoning about Lean 4.30's `String` API (UTF-8
|
||||
-- ByteArray-backed) — specifically the equation
|
||||
-- `s.push c ++ xs.asString = s ++ (c :: xs).asString`, which is
|
||||
-- *not* definitionally true and needs a String-internals lemma.
|
||||
--
|
||||
-- Cleaner paths forward: (a) refactor `readIdent`/`readStrLit` to
|
||||
-- accumulate into a `List Char` rather than a `String`, then build
|
||||
-- the String at the end via `List.asString` — decouples the proof
|
||||
-- from String internals; or (b) import Mathlib's richer String
|
||||
-- API which provides the needed lemmas.
|
||||
|
||||
-- ── Round-trip — atomic kernel-reducible witnesses ─────────────────────────
|
||||
-- For non-recursive shapes the round-trip is closed by `rfl` (or
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue