Phase 3 foundation lemmas; full universal documented
Adds three foundation lemmas about `tokenize`:
· tokenize_lparen_cons : tokenize ('(' :: rest) = lparen :: tokenize rest
· tokenize_rparen_cons : tokenize (')' :: rest) = rparen :: tokenize rest
· tokenize_space_cons : tokenize (' ' :: rest) = tokenize rest
These are the trivial cases — pure unfolding of `tokenizeAux` on
single-char-token branches.
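As a sketch of how these three lemmas are meant to be used, a short token stream can be computed by rewriting alone. The example below is abstract over `Token` and `tokenize` and takes the three lemmas as hypotheses. `h_nil` (that `tokenize []` is empty) is an assumption not stated in this commit, and all hypothesis names are illustrative.

```lean
-- Sketch only: `Token`, `tokenize` and the hypotheses are parameters here,
-- standing in for the real definitions and the three lemmas of this commit.
example {Token : Type} (lparen rparen : Token)
    (tokenize : List Char → List Token)
    (h_nil : tokenize [] = [])  -- assumed, not part of this commit
    (h_l : ∀ rest, tokenize ('(' :: rest) = lparen :: tokenize rest)
    (h_r : ∀ rest, tokenize (')' :: rest) = rparen :: tokenize rest)
    (h_s : ∀ rest, tokenize (' ' :: rest) = tokenize rest) :
    tokenize ['(', ' ', ')'] = [lparen, rparen] := by
  -- Peel one character per rewrite: '(' emits a token, ' ' is skipped,
  -- ')' emits a token, then the empty input closes the goal.
  rw [h_l, h_s, h_r, h_nil]
```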
The full Phase 3 (`tokenize (toLeanSource v).toList = toTokens v`)
requires four further substantial lemmas:
· readIdent_split   : reading an ident sequence followed by a
                      non-ident-rest char (or end of input) yields
                      exactly the accumulated string.
· readStrLit_split  : reading an escapeStrLit-encoded body until
                      the closing `"` recovers the original string.
· tokenize_app_clean: tokenize distributes over a concatenation
                      where the prefix ends "cleanly" (rparen,
                      whitespace, or strLit close).
· tokenize_render_X : induction over each meta-mirror type using
                      the above plus IH on sub-values.
Each is multi-page Lean reasoning about `String`/`List Char`/
`readIdent`/`readStrLit` distribution. The proof sketches are
documented inline.
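To illustrate the shape of `readIdent_split`, here is a toy version in a `Toy` namespace. `Toy.isIdentChar` and the accumulator-passing `Toy.readIdent` are assumptions made for this example. The real `readIdent` may have a different signature (for instance, returning a `String`), so this sketches the induction and is not the deferred proof.

```lean
namespace Toy

-- Hypothetical identifier-character predicate (an assumption of this sketch).
def isIdentChar (c : Char) : Bool := c.isAlphanum || c == '_'

-- Hypothetical reader: consume identifier characters into a reversed
-- accumulator, stop at the first non-identifier character or end of input.
def readIdent : List Char → List Char → List Char × List Char
  | acc, [] => (acc.reverse, [])
  | acc, c :: rest =>
    if isIdentChar c then readIdent (c :: acc) rest
    else (acc.reverse, c :: rest)

-- If `ident` consists of identifier characters and `rest` is empty or starts
-- with a non-identifier character, reading `ident ++ rest` splits at the seam.
theorem readIdent_split (rest : List Char)
    (hrest : ∀ c, rest.head? = some c → isIdentChar c = false) :
    ∀ (ident acc : List Char), (∀ c ∈ ident, isIdentChar c = true) →
      readIdent acc (ident ++ rest) = (acc.reverse ++ ident, rest) := by
  intro ident
  induction ident with
  | nil =>
    intro acc _
    cases rest with
    | nil => simp [readIdent]
    | cons d ds =>
      -- The first character of `rest` stops the reader immediately.
      have h : isIdentChar d = false := hrest d rfl
      simp [readIdent, h]
  | cons c cs ih =>
    intro acc hid
    have hc : isIdentChar c = true := hid c (by simp)
    have hcs : ∀ d ∈ cs, isIdentChar d = true :=
      fun d hd => hid d (List.mem_cons_of_mem _ hd)
    -- One step of the reader, then the induction hypothesis on the tail.
    simp [readIdent, hc, ih (c :: acc) hcs]

end Toy
```

With `acc := []` the statement specialises to `readIdent [] (ident ++ rest) = (ident, rest)`, which is the "exactly the accumulated string" reading of the lemma.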
The token-level universal (already proven) plus closed-instance
`decide` tests cover the round-trip operationally. Adding the
Phase 3 universal would let us state
∀ t, fromLeanSource? (toLeanSource t) = some t
without any closed-instance restrictions.
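The composition itself is short once the Phase 3 universal is available. The sketch below is abstract over every function involved. `h_def` (that `fromLeanSource?` is `fromTokens?` after `tokenize`) is an assumption about how the parser is structured, and `fromTokens?`, `h_phase2` and `h_phase3` are illustrative names for the token-level parser, the already-proven token-level universal, and the deferred Phase 3 statement.

```lean
-- Sketch only: every function and hypothesis is a parameter, not the
-- repository's actual definition.
example {α Token : Type}
    (toLeanSource : α → String) (toTokens : α → List Token)
    (tokenize : List Char → List Token)
    (fromTokens? : List Token → Option α)
    (fromLeanSource? : String → Option α)
    -- assumed factoring of the String-level parser through the tokenizer
    (h_def : ∀ s, fromLeanSource? s = fromTokens? (tokenize s.toList))
    -- token-level round trip (Phase 2, already proven per this commit)
    (h_phase2 : ∀ v, fromTokens? (toTokens v) = some v)
    -- tokenize ∘ render = toTokens (Phase 3, deferred)
    (h_phase3 : ∀ v, tokenize (toLeanSource v).toList = toTokens v) :
    ∀ v, fromLeanSource? (toLeanSource v) = some v := by
  intro v
  rw [h_def, h_phase3, h_phase2]
```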
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 8733a6ff89
commit 6b9ac691cb
1 changed files with 42 additions and 0 deletions
@@ -767,6 +767,48 @@ theorem artifactFromTokens?_round_trip (a : MetaArtifact)
   rw [List.append_nil] at this
   rw [this]
 
+-- ── Phase 3: tokenize ∘ render = toTokens ─────────────────────────────────
+-- The String-level half. Foundation lemmas about tokenize's
+-- behaviour, then induction over each meta-mirror type.
+
+/-- `tokenize` on `(` :: rest reduces to `lparen :: tokenize rest`. -/
+theorem tokenize_lparen_cons (rest : List Char) :
+    tokenize ('(' :: rest) = Token.lparen :: tokenize rest := by
+  simp [tokenize, tokenizeAux]
+
+/-- `tokenize` on `)` :: rest reduces to `rparen :: tokenize rest`. -/
+theorem tokenize_rparen_cons (rest : List Char) :
+    tokenize (')' :: rest) = Token.rparen :: tokenize rest := by
+  simp [tokenize, tokenizeAux]
+
+/-- `tokenize` skips a leading space. -/
+theorem tokenize_space_cons (rest : List Char) :
+    tokenize (' ' :: rest) = tokenize rest := by
+  simp [tokenize, tokenizeAux, isWhitespace]
+
+-- Phase 3 deferred: the full `tokenize ∘ render = toTokens`
+-- universal theorem requires careful String/List reasoning about
+-- `readIdent` / `readStrLit` distribution. Proof sketch:
+--
+--   readIdent_split   : reading an ident sequence followed by a
+--                       non-ident-rest char (or end of input) yields
+--                       exactly the accumulated string.
+--   readStrLit_split  : reading an escapeStrLit-encoded body until
+--                       the closing `"` recovers the original string.
+--   tokenize_app_clean: tokenize distributes over a concatenation
+--                       where the prefix ends "cleanly" (rparen,
+--                       whitespace, or strLit close).
+--   tokenize_render_X : induction over each meta-mirror type using
+--                       the above plus IH on sub-values.
+--
+-- These compose to `∀ v, tokenize (toLeanSource v).toList = toTokens v`,
+-- which combined with the Phase 2 token-level universal gives the
+-- String-level universal `∀ v, fromLeanSource? (toLeanSource v) = some v`.
+--
+-- The kernel-rooted `decide`-based tests for closed instances below
+-- (and in `Infoductor/Test.lean`) provide empirical coverage in
+-- the meantime.
+
 -- ── Round-trip — atomic kernel-reducible witnesses ─────────────────────────
 -- For non-recursive shapes the round-trip is closed by `rfl` (or
 -- `decide`) directly: rendering produces a fixed string, tokenising