infoductor/Infoductor/Foundation/MetaParse.lean
Maximus Gorog c995d4b323 Foundation: structural Lean.Syntax bidirectional reflection
Replaced the fuel-bound MetaArtifact.declAt encoding with real
bidirectional reflection of Lean.Syntax through MetaArtifact.

## Why

The previous encoding capped Lean.Syntax rendering / depth via a
fuel parameter (syntaxFuelCap = 2^32) and a Sum-carrier scheme.
The .declAt round-trip lemma at MetaParse.lean depended on
syntaxFuelCap ≥ syntaxDepth s, which is mathematically false for
adversarial syntax trees (any tree whose name-depth on a node
kind exceeds 2^32 — uncommon but not impossible).  This left
the corresponding round-trip proofs as cheats that no longer
worked once dependent code matured.

Per the project discipline ("we are choosing correctness time and
time again"): fix the encoding rather than weaken the lemma.

## What landed

Foundation/Meta.lean:
  Replaced syntaxRenderAux / syntaxDepthFuel / syntaxFuelCap with:
    · syntaxToLeanSource / syntaxArrayToLeanSource — mutual
      structural rendering, total
    · syntaxDepth / syntaxArrayDepth — mutual structural depth,
      total

Foundation/MetaParse.lean:
  Refactored parseSyntax?Aux / parseSyntaxList?Aux into a joint
  parseSyntaxOrList?Aux : Nat → Bool → List Token →
                        Option ((Lean.Syntax ⊕ List Lean.Syntax) × ...)
  Mirrors the renderer's Sum-carrier; structurally recursive on
  fuel = tokens.length + 1; only the fuel parameter is bounded
  (since the parser doesn't know the syntax shape ahead of time).

  Added correctness round-trip lemmas:
    · parseStringPosRaw?Aux_correct
    · parseSubstringRaw?Aux_correct
    · parseBool?Aux_correct
    · parseSourceInfo?Aux_correct
    · parseStringList?Aux_correct
    · parsePreresolved?Aux_correct
    · parsePreresolvedList?Aux_correct
    · parseSyntaxOrList?Aux_correct  (the master joint round-trip)
    · parseSyntax?Aux_correct        (specialisation at .inl s)

  Added length bounds for the WF measure:
    · stringPosRawToTokens_length_bound (and 5 other helper bounds)
    · syntaxToTokens_length_bound / syntaxListToTokens_length_bound
      (mutual structural induction; chains all helper bounds)

  Replaced the cheat .declAt arms in parseArtifact?Aux_correct
  (line 957) and artifactFromTokens?_round_trip (line 1078) with
  real proofs derived from the new lemmas.

## Discipline

  · Zero sorry / admit (only Comonad/Convolution.lean's interpolated
    "... := sorry" string emissions remain — those are emitted Lean
    source for user-supplied implementations, not proofs).
  · Zero noncomputable / Classical.propDecidable.
  · Zero TODO / FIXME / placeholder comments in source-rendering code.
  · No tests deleted; the Test.lean #eval examples confirm the
    bidirectional round-trip on real Lean syntax inputs.

## Verification

  cd infoductor && lake build      # Build completed successfully (12 jobs)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 07:33:42 -06:00

1643 lines
72 KiB
Text
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

/-
Infoductor.Foundation.MetaParse — the inverse of `toLeanSource`
=================================================================
A hand-written tokenizer + recursive-descent parser that reads
the Lean source emitted by the `toLeanSource` family of renderers
and reconstructs the original `MetaCTerm` / `MetaClassifier` /
`MetaArtifact` / `Lean.Name`.
The bridge's loop closes at the *string* level, in Lean itself:
no external elaborator needed.
Implementation strategy:
· `Token` is a four-arm inductive (parens, ident chains, string
literals, num literals).
· `tokenize` is fuel-based, so it's structurally recursive on
the fuel parameter (no `partial` needed) — fuel is the input
list length, which is always sufficient.
· Parsers per type are likewise fuel-based. Each successful
parse consumes ≥ 1 token, so fuel = `tokens.length + 1` is
always enough for any well-formed input.
· String escape: only `"` and `\` are escaped (renderer-side
`escapeStrLit` mirrors this exactly).
-/
import Infoductor.Foundation.Meta
namespace Infoductor
-- ── Tokens ──────────────────────────────────────────────────────────────────
inductive Token where
| lparen : Token
| rparen : Token
| ident : String → Token
| strLit : String → Token
| numLit : Nat → Token
deriving Repr, DecidableEq
-- ── Character predicates ───────────────────────────────────────────────────
def isIdentStartChar (c : Char) : Bool :=
c.isAlpha || c == '_'
def isIdentRestChar (c : Char) : Bool :=
c.isAlpha || c.isDigit || c == '_' || c == '.'
def isWhitespace (c : Char) : Bool :=
c == ' ' || c == '\n' || c == '\t' || c == '\r'
-- ── Tokenizer helpers (fuel-based for kernel termination) ──────────────────
/-- Read identifier chars until a non-identifier char. Returns
(read string, remaining chars). Fuel-based for kernel
termination; the tokenizer feeds in `chars.length` fuel
which is always sufficient. -/
def readIdent : Nat → List Char → String → String × List Char
| 0, chars, acc => (acc, chars)
| _+1, [], acc => (acc, [])
| n+1, c :: rest, acc =>
if isIdentRestChar c then readIdent n rest (acc.push c)
else (acc, c :: rest)
/-- Read a decimal number. Fuel-based. -/
def readNum : Nat → List Char → Nat → Nat × List Char
| 0, chars, acc => (acc, chars)
| _+1, [], acc => (acc, [])
| n+1, c :: rest, acc =>
if c.isDigit then
readNum n rest (acc * 10 + (c.toNat - '0'.toNat))
else (acc, c :: rest)
/-- Read characters of a string literal until the closing `"`,
handling `\"` and `\\` escapes. Fuel-based. -/
def readStrLit : Nat → List Char → String → Option (String × List Char)
| 0, _, _ => none
| _+1, [], _ => none
| _+1, '"' :: rest, acc => some (acc, rest)
| n+1, '\\' :: '"' :: rest, acc => readStrLit n rest (acc.push '"')
| n+1, '\\' :: '\\' :: rest, acc => readStrLit n rest (acc.push '\\')
| n+1, c :: rest, acc => readStrLit n rest (acc.push c)
-- `escapeStrLit` lives in Foundation.Meta now (used by both
-- the renderer and the parser); see Meta.lean for the definition.
-- ── Tokenizer ──────────────────────────────────────────────────────────────
/-- Tokenize a `List Char` into `List Token`. Fuel-based for
kernel termination; the top-level `tokenize` provides
`chars.length + 1` fuel, which is always sufficient. -/
def tokenizeAux : Nat → List Char → List Token
| 0, _ => []
| _+1, [] => []
| n+1, '(' :: rest => Token.lparen :: tokenizeAux n rest
| n+1, ')' :: rest => Token.rparen :: tokenizeAux n rest
| n+1, '"' :: rest =>
match readStrLit (n+1) rest "" with
| some (s, rest') => Token.strLit s :: tokenizeAux n rest'
| none => []
| n+1, c :: rest =>
if isWhitespace c then tokenizeAux n rest
else if c.isDigit then
let (num, rest') := readNum (n+1) (c :: rest) 0
Token.numLit num :: tokenizeAux n rest'
else if isIdentStartChar c then
let (ident, rest') := readIdent (n+1) (c :: rest) ""
Token.ident ident :: tokenizeAux n rest'
else
[]
/-- Top-level tokenizer. Fuel = `chars.length + 1` is always
sufficient since each token-emitting branch peels ≥ 1 char. -/
def tokenize (chars : List Char) : List Token :=
tokenizeAux (chars.length + 1) chars
/-- Tokenize a `String` directly. -/
def tokenizeStr (s : String) : List Token :=
tokenize s.toList
-- ── Parsers ────────────────────────────────────────────────────────────────
-- All parsers fuel-based. Recursion structurally on the Nat
-- parameter, so they're total without `partial`.
/-- Parse a `Lean.Name`. Recognises:
· `Lean.Name.anonymous` (atomic)
· `(Lean.Name.str <name> "<str>")` (recursive)
· `(Lean.Name.num <name> <nat>)` (recursive)
The renderer no longer wraps recursive sub-name calls in
extra parens (post-fix), so the parser handles only the
canonical bare-atomic + constructor-app forms. -/
def parseName?Aux : Nat → List Token → Option (Lean.Name × List Token)
| 0, _ => none
| _+1, Token.ident "Lean.Name.anonymous" :: rest =>
some (Lean.Name.anonymous, rest)
| n+1, Token.lparen :: Token.ident "Lean.Name.str" :: rest =>
match parseName?Aux n rest with
| some (p, Token.strLit s :: Token.rparen :: rest') =>
some (Lean.Name.str p s, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Lean.Name.num" :: rest =>
match parseName?Aux n rest with
| some (p, Token.numLit k :: Token.rparen :: rest') =>
some (Lean.Name.num p k, rest')
| _ => none
| _, _ => none
/-- Parse a `MetaClassifier`. -/
def parseClassifier?Aux : Nat → List Token → Option (MetaClassifier × List Token)
| 0, _ => none
| _+1, Token.ident "Infoductor.MetaClassifier.always" :: rest =>
some (MetaClassifier.always, rest)
| _+1, Token.ident "Infoductor.MetaClassifier.never" :: rest =>
some (MetaClassifier.never, rest)
| n+1, Token.lparen :: Token.ident "Infoductor.MetaClassifier.atDecl" :: rest =>
match parseName?Aux n rest with
| some (nm, Token.rparen :: rest') => some (MetaClassifier.atDecl nm, rest')
| _ => none
| _+1, Token.lparen :: Token.ident "Infoductor.MetaClassifier.inFile" :: rest =>
match rest with
| Token.strLit s :: Token.rparen :: rest' =>
some (MetaClassifier.inFile s, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaClassifier.underAttribute" :: rest =>
match parseName?Aux n rest with
| some (nm, Token.rparen :: rest') =>
some (MetaClassifier.underAttribute nm, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaClassifier.dependencyOf" :: rest =>
match parseName?Aux n rest with
| some (nm, Token.rparen :: rest') =>
some (MetaClassifier.dependencyOf nm, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaClassifier.inNamespace" :: rest =>
match parseName?Aux n rest with
| some (nm, Token.rparen :: rest') =>
some (MetaClassifier.inNamespace nm, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaClassifier.meet" :: rest =>
match parseClassifier?Aux n rest with
| some (a, rest') =>
match parseClassifier?Aux n rest' with
| some (b, Token.rparen :: rest'') =>
some (MetaClassifier.meet a b, rest'')
| _ => none
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaClassifier.join" :: rest =>
match parseClassifier?Aux n rest with
| some (a, rest') =>
match parseClassifier?Aux n rest' with
| some (b, Token.rparen :: rest'') =>
some (MetaClassifier.join a b, rest'')
| _ => none
| _ => none
| _, _ => none
/-- Parse a `MetaCTerm`. -/
def parseMetaCTerm?Aux : Nat → List Token → Option (MetaCTerm × List Token)
| 0, _ => none
| _+1, Token.ident "Infoductor.MetaCTerm.empty" :: rest =>
some (MetaCTerm.empty, rest)
| n+1, Token.lparen :: Token.ident "Infoductor.MetaCTerm.ident" :: rest =>
match parseName?Aux n rest with
| some (nm, Token.rparen :: rest') => some (MetaCTerm.ident nm, rest')
| _ => none
| _+1, Token.lparen :: Token.ident "Infoductor.MetaCTerm.sym" :: rest =>
match rest with
| Token.strLit s :: Token.rparen :: rest' =>
some (MetaCTerm.sym s, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaCTerm.app" :: rest =>
match parseMetaCTerm?Aux n rest with
| some (f, rest') =>
match parseMetaCTerm?Aux n rest' with
| some (a, Token.rparen :: rest'') =>
some (MetaCTerm.app f a, rest'')
| _ => none
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaCTerm.lam" :: rest =>
match rest with
| Token.strLit x :: rest' =>
match parseMetaCTerm?Aux n rest' with
| some (t, Token.rparen :: rest'') => some (MetaCTerm.lam x t, rest'')
| _ => none
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaCTerm.plam" :: rest =>
match rest with
| Token.strLit i :: rest' =>
match parseMetaCTerm?Aux n rest' with
| some (t, Token.rparen :: rest'') => some (MetaCTerm.plam i t, rest'')
| _ => none
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaCTerm.comp" :: rest =>
match rest with
| Token.strLit s :: rest1 =>
match parseMetaCTerm?Aux n rest1 with
| some (A, rest2) =>
match parseClassifier?Aux n rest2 with
| some (φ, rest3) =>
match parseMetaCTerm?Aux n rest3 with
| some (u, rest4) =>
match parseMetaCTerm?Aux n rest4 with
| some (t, Token.rparen :: rest5) =>
some (MetaCTerm.comp s A φ u t, rest5)
| _ => none
| _ => none
| _ => none
| _ => none
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaCTerm.transp" :: rest =>
match rest with
| Token.strLit s :: rest1 =>
match parseMetaCTerm?Aux n rest1 with
| some (A, rest2) =>
match parseClassifier?Aux n rest2 with
| some (φ, rest3) =>
match parseMetaCTerm?Aux n rest3 with
| some (t, Token.rparen :: rest4) =>
some (MetaCTerm.transp s A φ t, rest4)
| _ => none
| _ => none
| _ => none
| _ => none
| _, _ => none
-- ── Parsers for Lean.Syntax mirror payloads ─────────────────────────────────
-- Mirror the renderer in `Foundation.Meta`: each constructor of
-- `Lean.Syntax`, `Lean.SourceInfo`, `Substring.Raw`, `String.Pos.Raw`,
-- and `Lean.Syntax.Preresolved` is recognised by a leading-ident
-- pattern. All parsers fuel-bounded for kernel termination.
/-- Parse a `String.Pos.Raw`. -/
def parseStringPosRaw?Aux :
Nat → List Token → Option (String.Pos.Raw × List Token)
| 0, _ => none
| _+1, Token.lparen :: Token.ident "String.Pos.Raw.mk" ::
Token.numLit k :: Token.rparen :: rest =>
some ({ byteIdx := k }, rest)
| _, _ => none
/-- Parse a `Substring.Raw`. -/
def parseSubstringRaw?Aux :
Nat → List Token → Option (Substring.Raw × List Token)
| 0, _ => none
| n+1, Token.lparen :: Token.ident "Substring.Raw.mk" ::
Token.strLit s :: rest =>
match parseStringPosRaw?Aux n rest with
| some (startPos, rest') =>
match parseStringPosRaw?Aux n rest' with
| some (stopPos, Token.rparen :: rest'') =>
some ({ str := s, startPos := startPos, stopPos := stopPos }, rest'')
| _ => none
| _ => none
| _, _ => none
/-- Parse a `Bool` literal — `true` or `false`. Used by the
`SourceInfo.synthetic` parser. -/
def parseBool?Aux : List Token → Option (Bool × List Token)
| Token.ident "true" :: rest => some (true, rest)
| Token.ident "false" :: rest => some (false, rest)
| _ => none
/-- Parse a `Lean.SourceInfo`. -/
def parseSourceInfo?Aux :
Nat → List Token → Option (Lean.SourceInfo × List Token)
| 0, _ => none
| _+1, Token.ident "Lean.SourceInfo.none" :: rest =>
some (Lean.SourceInfo.none, rest)
| n+1, Token.lparen :: Token.ident "Lean.SourceInfo.original" :: rest =>
match parseSubstringRaw?Aux n rest with
| some (lead, rest1) =>
match parseStringPosRaw?Aux n rest1 with
| some (pos, rest2) =>
match parseSubstringRaw?Aux n rest2 with
| some (trail, rest3) =>
match parseStringPosRaw?Aux n rest3 with
| some (endPos, Token.rparen :: rest4) =>
some (Lean.SourceInfo.original lead pos trail endPos, rest4)
| _ => none
| _ => none
| _ => none
| _ => none
| n+1, Token.lparen :: Token.ident "Lean.SourceInfo.synthetic" :: rest =>
match parseStringPosRaw?Aux n rest with
| some (pos, rest1) =>
match parseStringPosRaw?Aux n rest1 with
| some (endPos, rest2) =>
match parseBool?Aux rest2 with
| some (canonical, Token.rparen :: rest3) =>
some (Lean.SourceInfo.synthetic pos endPos canonical, rest3)
| _ => none
| _ => none
| _ => none
| _, _ => none
/-- Parse a `List String` rendered as a `List.cons`/`List.nil` chain. -/
def parseStringList?Aux :
Nat → List Token → Option (List String × List Token)
| 0, _ => none
| _+1, Token.ident "List.nil" :: rest => some ([], rest)
| n+1, Token.lparen :: Token.ident "List.cons" ::
Token.strLit s :: rest =>
match parseStringList?Aux n rest with
| some (tail, Token.rparen :: rest') => some (s :: tail, rest')
| _ => none
| _, _ => none
/-- Parse a `Lean.Syntax.Preresolved`. -/
def parsePreresolved?Aux :
Nat → List Token → Option (Lean.Syntax.Preresolved × List Token)
| 0, _ => none
| n+1, Token.lparen ::
Token.ident "Lean.Syntax.Preresolved.namespace" :: rest =>
match parseName?Aux n rest with
| some (ns, Token.rparen :: rest') =>
some (Lean.Syntax.Preresolved.namespace ns, rest')
| _ => none
| n+1, Token.lparen ::
Token.ident "Lean.Syntax.Preresolved.decl" :: rest =>
match parseName?Aux n rest with
| some (nm, rest1) =>
match parseStringList?Aux n rest1 with
| some (fields, Token.rparen :: rest2) =>
some (Lean.Syntax.Preresolved.decl nm fields, rest2)
| _ => none
| _ => none
| _, _ => none
/-- Parse a `List Lean.Syntax.Preresolved`. -/
def parsePreresolvedList?Aux :
Nat → List Token → Option (List Lean.Syntax.Preresolved × List Token)
| 0, _ => none
| _+1, Token.ident "List.nil" :: rest => some ([], rest)
| n+1, Token.lparen :: Token.ident "List.cons" :: rest =>
match parsePreresolved?Aux n rest with
| some (head, rest1) =>
match parsePreresolvedList?Aux n rest1 with
| some (tail, Token.rparen :: rest2) =>
some (head :: tail, rest2)
| _ => none
| _ => none
| _, _ => none
/-- Joint parser for `Lean.Syntax` / `List Lean.Syntax`. Mirrors the
renderer's `Sum` carrier idiom (`syntaxRenderAux`) so the parser
matches the encoding's recursive structure exactly. The carrier
`Lean.Syntax ⊕ List Lean.Syntax` is used at the *type* level to
select the parser arm; the result is the parsed value plus the
unconsumed token tail. Fuel-bounded for kernel termination. -/
def parseSyntaxOrList?Aux :
Nat → Bool → List Token →
Option ((Lean.Syntax ⊕ List Lean.Syntax) × List Token)
| 0, _, _ => none
-- Single-Syntax mode, `true`
| _+1, true, Token.ident "Lean.Syntax.missing" :: rest =>
some (.inl Lean.Syntax.missing, rest)
| n+1, true, Token.lparen :: Token.ident "Lean.Syntax.atom" :: rest =>
match parseSourceInfo?Aux n rest with
| some (info, Token.strLit val :: Token.rparen :: rest') =>
some (.inl (Lean.Syntax.atom info val), rest')
| _ => none
| n+1, true, Token.lparen :: Token.ident "Lean.Syntax.ident" :: rest =>
match parseSourceInfo?Aux n rest with
| some (info, rest1) =>
match parseSubstringRaw?Aux n rest1 with
| some (rawVal, rest2) =>
match parseName?Aux n rest2 with
| some (val, rest3) =>
match parsePreresolvedList?Aux n rest3 with
| some (preresolved, Token.rparen :: rest4) =>
some (.inl (Lean.Syntax.ident info rawVal val preresolved), rest4)
| _ => none
| _ => none
| _ => none
| _ => none
| n+1, true, Token.lparen :: Token.ident "Lean.Syntax.node" :: rest =>
match parseSourceInfo?Aux n rest with
| some (info, rest1) =>
match parseName?Aux n rest1 with
| some (kind, rest2) =>
match parseSyntaxOrList?Aux n false rest2 with
| some (.inr args, Token.rparen :: rest3) =>
some (.inl (Lean.Syntax.node info kind args.toArray), rest3)
| _ => none
| _ => none
| _ => none
-- List-Syntax mode, `false`
| _+1, false, Token.ident "List.nil" :: rest =>
some (.inr [], rest)
| n+1, false, Token.lparen :: Token.ident "List.cons" :: rest =>
match parseSyntaxOrList?Aux n true rest with
| some (.inl head, rest1) =>
match parseSyntaxOrList?Aux n false rest1 with
| some (.inr tail, Token.rparen :: rest2) =>
some (.inr (head :: tail), rest2)
| _ => none
| _ => none
| _, _, _ => none
/-- Parse a `Lean.Syntax`. Thin projection over `parseSyntaxOrList?Aux`
in single-Syntax mode (`true`). -/
def parseSyntax?Aux :
Nat → List Token → Option (Lean.Syntax × List Token)
| n, tokens =>
match parseSyntaxOrList?Aux n true tokens with
| some (.inl s, rest) => some (s, rest)
| _ => none
/-- Parse a `List Lean.Syntax`. Thin projection over `parseSyntaxOrList?Aux`
in list-mode (`false`). -/
def parseSyntaxList?Aux :
Nat → List Token → Option (List Lean.Syntax × List Token)
| n, tokens =>
match parseSyntaxOrList?Aux n false tokens with
| some (.inr l, rest) => some (l, rest)
| _ => none
/-- Parse a `MetaArtifact`. -/
def parseArtifact?Aux : Nat → List Token → Option (MetaArtifact × List Token)
| 0, _ => none
| _+1, Token.ident "Infoductor.MetaArtifact.empty" :: rest =>
some (MetaArtifact.empty, rest)
| _+1, Token.lparen :: Token.ident "Infoductor.MetaArtifact.source" :: rest =>
match rest with
| Token.strLit s :: Token.rparen :: rest' =>
some (MetaArtifact.source s, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaArtifact.refTo" :: rest =>
match parseName?Aux n rest with
| some (nm, Token.rparen :: rest') =>
some (MetaArtifact.refTo nm, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaArtifact.cterm" :: rest =>
match parseMetaCTerm?Aux n rest with
| some (m, Token.rparen :: rest') =>
some (MetaArtifact.cterm m, rest')
| _ => none
| n+1, Token.lparen :: Token.ident "Infoductor.MetaArtifact.declAt" :: rest =>
match parseSyntax?Aux n rest with
| some (s, Token.rparen :: rest') =>
some (MetaArtifact.declAt s, rest')
| _ => none
| _, _ => none
-- ── Top-level wrappers ─────────────────────────────────────────────────────
-- Fuel = `tokens.length + 1` is always sufficient: every successful
-- parse arm consumes ≥ 1 token, so depth is bounded by length.
def parseName? (tokens : List Token) : Option (Lean.Name × List Token) :=
parseName?Aux (tokens.length + 1) tokens
def parseClassifier? (tokens : List Token) : Option (MetaClassifier × List Token) :=
parseClassifier?Aux (tokens.length + 1) tokens
def parseMetaCTerm? (tokens : List Token) : Option (MetaCTerm × List Token) :=
parseMetaCTerm?Aux (tokens.length + 1) tokens
def parseArtifact? (tokens : List Token) : Option (MetaArtifact × List Token) :=
parseArtifact?Aux (tokens.length + 1) tokens
def MetaCTerm.fromLeanSource? (s : String) : Option MetaCTerm :=
match parseMetaCTerm? (tokenizeStr s) with
| some (t, []) => some t
| _ => none
def MetaClassifier.fromLeanSource? (s : String) : Option MetaClassifier :=
match parseClassifier? (tokenizeStr s) with
| some (φ, []) => some φ
| _ => none
def MetaArtifact.fromLeanSource? (s : String) : Option MetaArtifact :=
match parseArtifact? (tokenizeStr s) with
| some (a, []) => some a
| _ => none
-- ── Canonical token forms — direct meta-mirror → List Token ────────────────
-- Each function mirrors the corresponding `toLeanSource` renderer
-- but produces tokens directly, bypassing the String layer. These
-- are the canonical token forms against which parser correctness
-- is proven. The relationship `tokenize ∘ toLeanSource = toTokens`
-- is established separately as a String-level lemma.
def nameToTokens : Lean.Name → List Token
| .anonymous => [Token.ident "Lean.Name.anonymous"]
| .str p s =>
[Token.lparen, Token.ident "Lean.Name.str"] ++ nameToTokens p ++
[Token.strLit s, Token.rparen]
| .num p k =>
[Token.lparen, Token.ident "Lean.Name.num"] ++ nameToTokens p ++
[Token.numLit k, Token.rparen]
def MetaClassifier.toTokens : MetaClassifier → List Token
| .always => [Token.ident "Infoductor.MetaClassifier.always"]
| .never => [Token.ident "Infoductor.MetaClassifier.never"]
| .atDecl n =>
[Token.lparen, Token.ident "Infoductor.MetaClassifier.atDecl"] ++
nameToTokens n ++ [Token.rparen]
| .inFile s =>
[Token.lparen, Token.ident "Infoductor.MetaClassifier.inFile",
Token.strLit s, Token.rparen]
| .underAttribute n =>
[Token.lparen, Token.ident "Infoductor.MetaClassifier.underAttribute"] ++
nameToTokens n ++ [Token.rparen]
| .dependencyOf n =>
[Token.lparen, Token.ident "Infoductor.MetaClassifier.dependencyOf"] ++
nameToTokens n ++ [Token.rparen]
| .inNamespace n =>
[Token.lparen, Token.ident "Infoductor.MetaClassifier.inNamespace"] ++
nameToTokens n ++ [Token.rparen]
| .meet a b =>
[Token.lparen, Token.ident "Infoductor.MetaClassifier.meet"] ++
toTokens a ++ toTokens b ++ [Token.rparen]
| .join a b =>
[Token.lparen, Token.ident "Infoductor.MetaClassifier.join"] ++
toTokens a ++ toTokens b ++ [Token.rparen]
def MetaCTerm.toTokens : MetaCTerm → List Token
| .ident n =>
[Token.lparen, Token.ident "Infoductor.MetaCTerm.ident"] ++
nameToTokens n ++ [Token.rparen]
| .sym s =>
[Token.lparen, Token.ident "Infoductor.MetaCTerm.sym",
Token.strLit s, Token.rparen]
| .app f a =>
[Token.lparen, Token.ident "Infoductor.MetaCTerm.app"] ++
toTokens f ++ toTokens a ++ [Token.rparen]
| .lam x t =>
[Token.lparen, Token.ident "Infoductor.MetaCTerm.lam",
Token.strLit x] ++ toTokens t ++ [Token.rparen]
| .plam i t =>
[Token.lparen, Token.ident "Infoductor.MetaCTerm.plam",
Token.strLit i] ++ toTokens t ++ [Token.rparen]
| .comp s A φ u t =>
[Token.lparen, Token.ident "Infoductor.MetaCTerm.comp",
Token.strLit s] ++ toTokens A ++ φ.toTokens ++
toTokens u ++ toTokens t ++ [Token.rparen]
| .transp s A φ t =>
[Token.lparen, Token.ident "Infoductor.MetaCTerm.transp",
Token.strLit s] ++ toTokens A ++ φ.toTokens ++
toTokens t ++ [Token.rparen]
| .empty => [Token.ident "Infoductor.MetaCTerm.empty"]
/-- Canonical token form for `String.Pos.Raw`. -/
def stringPosRawToTokens (p : String.Pos.Raw) : List Token :=
[Token.lparen, Token.ident "String.Pos.Raw.mk",
Token.numLit p.byteIdx, Token.rparen]
/-- Canonical token form for `Substring.Raw`. -/
def substringRawToTokens (s : Substring.Raw) : List Token :=
[Token.lparen, Token.ident "Substring.Raw.mk", Token.strLit s.str] ++
stringPosRawToTokens s.startPos ++
stringPosRawToTokens s.stopPos ++ [Token.rparen]
/-- Canonical token form for `Bool`. -/
def boolToTokens : Bool → List Token
| true => [Token.ident "true"]
| false => [Token.ident "false"]
/-- Canonical token form for `Lean.SourceInfo`. -/
def sourceInfoToTokens : Lean.SourceInfo → List Token
| .original lead pos trail endPos =>
[Token.lparen, Token.ident "Lean.SourceInfo.original"] ++
substringRawToTokens lead ++
stringPosRawToTokens pos ++
substringRawToTokens trail ++
stringPosRawToTokens endPos ++ [Token.rparen]
| .synthetic pos endPos canonical =>
[Token.lparen, Token.ident "Lean.SourceInfo.synthetic"] ++
stringPosRawToTokens pos ++
stringPosRawToTokens endPos ++
boolToTokens canonical ++ [Token.rparen]
| .none => [Token.ident "Lean.SourceInfo.none"]
/-- Canonical token form for `List String` (used by `Preresolved.decl`). -/
def stringListToTokens : List String → List Token
| [] => [Token.ident "List.nil"]
| s :: rest =>
[Token.lparen, Token.ident "List.cons", Token.strLit s] ++
stringListToTokens rest ++ [Token.rparen]
/-- Canonical token form for `Lean.Syntax.Preresolved`. -/
def preresolvedToTokens : Lean.Syntax.Preresolved → List Token
| .namespace ns =>
[Token.lparen, Token.ident "Lean.Syntax.Preresolved.namespace"] ++
nameToTokens ns ++ [Token.rparen]
| .decl n fields =>
[Token.lparen, Token.ident "Lean.Syntax.Preresolved.decl"] ++
nameToTokens n ++ stringListToTokens fields ++ [Token.rparen]
/-- Canonical token form for `List Lean.Syntax.Preresolved`. -/
def preresolvedListToTokens : List Lean.Syntax.Preresolved → List Token
| [] => [Token.ident "List.nil"]
| p :: rest =>
[Token.lparen, Token.ident "List.cons"] ++
preresolvedToTokens p ++ preresolvedListToTokens rest ++ [Token.rparen]
-- Canonical token form for `Lean.Syntax`. Defined via `mutual` so
-- recursion is structural on the syntax tree (and on `List Lean.Syntax`
-- through `Array.toList`); no fuel parameter needed. Mirrors
-- `syntaxToLeanSource` from `Foundation.Meta`.
mutual
/-- Token form of a `Lean.Syntax`. -/
def syntaxToTokens : Lean.Syntax → List Token
| .missing => [Token.ident "Lean.Syntax.missing"]
| .atom info val =>
[Token.lparen, Token.ident "Lean.Syntax.atom"] ++
sourceInfoToTokens info ++ [Token.strLit val, Token.rparen]
| .ident info rawVal val preresolved =>
[Token.lparen, Token.ident "Lean.Syntax.ident"] ++
sourceInfoToTokens info ++
substringRawToTokens rawVal ++
nameToTokens val ++
preresolvedListToTokens preresolved ++ [Token.rparen]
| .node info kind args =>
[Token.lparen, Token.ident "Lean.Syntax.node"] ++
sourceInfoToTokens info ++
nameToTokens kind ++
syntaxListToTokens args.toList ++ [Token.rparen]
/-- Token form of a `List Lean.Syntax` — encodes as a `List.cons` /
`List.nil` chain. -/
def syntaxListToTokens : List Lean.Syntax → List Token
| [] => [Token.ident "List.nil"]
| s :: rest =>
[Token.lparen, Token.ident "List.cons"] ++
syntaxToTokens s ++ syntaxListToTokens rest ++ [Token.rparen]
end
def MetaArtifact.toTokens : MetaArtifact → List Token
| .source s => [Token.lparen, Token.ident "Infoductor.MetaArtifact.source",
Token.strLit s, Token.rparen]
| .declAt s =>
[Token.lparen, Token.ident "Infoductor.MetaArtifact.declAt"] ++
syntaxToTokens s ++ [Token.rparen]
| .cterm m => [Token.lparen, Token.ident "Infoductor.MetaArtifact.cterm"] ++
m.toTokens ++ [Token.rparen]
| .refTo n => [Token.lparen, Token.ident "Infoductor.MetaArtifact.refTo"] ++
nameToTokens n ++ [Token.rparen]
| .empty => [Token.ident "Infoductor.MetaArtifact.empty"]
-- ── Size measures (for fuel bookkeeping in proofs) ─────────────────────────
def nameDepth : Lean.Name → Nat
| .anonymous => 1
| .str p _ => nameDepth p + 1
| .num p _ => nameDepth p + 1
def classifierDepth : MetaClassifier → Nat
| .always | .never => 1
| .atDecl n => nameDepth n + 1
| .inFile _ => 2
| .underAttribute n => nameDepth n + 1
| .dependencyOf n => nameDepth n + 1
| .inNamespace n => nameDepth n + 1
| .meet a b => max (classifierDepth a) (classifierDepth b) + 1
| .join a b => max (classifierDepth a) (classifierDepth b) + 1
def cTermDepth : MetaCTerm → Nat
| .ident n => nameDepth n + 1
| .sym _ => 2
| .empty => 1
| .lam _ t | .plam _ t => cTermDepth t + 1
| .app f a => max (cTermDepth f) (cTermDepth a) + 1
| .comp _ A φ u t =>
max (max (cTermDepth A) (classifierDepth φ))
(max (cTermDepth u) (cTermDepth t)) + 1
| .transp _ A φ t =>
max (max (cTermDepth A) (classifierDepth φ)) (cTermDepth t) + 1
/-- Depth measure for `String.Pos.Raw` — atomic 4-token form. -/
def stringPosRawDepth (_ : String.Pos.Raw) : Nat := 1
/-- Depth measure for `Substring.Raw`. -/
def substringRawDepth (_ : Substring.Raw) : Nat := 2
/-- Depth measure for `Lean.SourceInfo`. -/
def sourceInfoDepth : Lean.SourceInfo → Nat
| .original _ _ _ _ => 3
| .synthetic _ _ _ => 3
| .none => 1
/-- Depth measure for `List String`. -/
def stringListDepth : List String → Nat
| [] => 1
| _ :: rest => stringListDepth rest + 1
/-- Depth measure for `Lean.Syntax.Preresolved`. -/
def preresolvedDepth : Lean.Syntax.Preresolved → Nat
| .namespace n => nameDepth n + 1
| .decl n fields => max (nameDepth n) (stringListDepth fields) + 1
/-- Depth measure for `List Lean.Syntax.Preresolved`. -/
def preresolvedListDepth : List Lean.Syntax.Preresolved → Nat
| [] => 1
| p :: rest => max (preresolvedDepth p) (preresolvedListDepth rest) + 1
-- Structural depth measure for `Lean.Syntax`. Defined via `mutual`
-- so recursion is structural on the syntax tree. No fuel parameter
-- needed — depth is a true structural function of the input. Includes
-- the depths of *all* sub-payloads (sourceInfo, substring, name,
-- preresolved-list) so the parser-correctness proof's bookkeeping is
-- direct.
mutual
/-- Structural depth on `Lean.Syntax`. -/
def syntaxParseDepth : Lean.Syntax → Nat
| .missing => 1
| .atom info _ => sourceInfoDepth info + 1
| .ident info rawVal n preresolved =>
max (sourceInfoDepth info)
(max (substringRawDepth rawVal)
(max (nameDepth n) (preresolvedListDepth preresolved))) + 1
| .node info kind args =>
max (sourceInfoDepth info)
(max (nameDepth kind) (syntaxListParseDepth args.toList)) + 1
/-- Structural depth on `List Lean.Syntax`. -/
def syntaxListParseDepth : List Lean.Syntax → Nat
| [] => 1
| s :: rest => max (syntaxParseDepth s) (syntaxListParseDepth rest) + 1
end
def artifactDepth : MetaArtifact → Nat
| .source _ => 2
| .declAt s => syntaxParseDepth s + 1
| .cterm m => cTermDepth m + 1
| .refTo n => nameDepth n + 1
| .empty => 1
/-- All `MetaArtifact` arms are now supported by the round-trip:
`.declAt` payloads are encoded constructively via the
`syntaxToTokens` / `parseSyntax?Aux` pair. Retained as a
Boolean predicate (constant `true`) for backward compatibility
with downstream code that depended on its existence. -/
def MetaArtifact.supported : MetaArtifact → Bool
| _ => true
-- ── Parser correctness on toTokens (Phase 2) ──────────────────────────────
-- Universal theorems witnessing that each parser, when fed the
-- canonical token form of a value plus arbitrary trailing tokens,
-- reconstructs the value exactly and leaves the trailing tokens
-- untouched. Provided fuel is at least the depth of the value.
-- Proofs are by structural induction on the meta-mirror; no
-- String reasoning is needed.
theorem parseName?Aux_correct :
∀ (n : Lean.Name) (rest : List Token) (fuel : Nat),
fuel ≥ nameDepth n →
parseName?Aux fuel (nameToTokens n ++ rest) = some (n, rest)
| .anonymous, rest, fuel, h => by
rcases fuel with _ | f
· simp [nameDepth] at h
· simp [nameToTokens, parseName?Aux]
| .str p s, rest, fuel, h => by
rcases fuel with _ | f
· simp [nameDepth] at h
· simp only [nameToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseName?Aux]
have hp : f ≥ nameDepth p := by simp [nameDepth] at h; omega
have ih := parseName?Aux_correct p
(Token.strLit s :: Token.rparen :: rest) f hp
rw [ih]
| .num p k, rest, fuel, h => by
rcases fuel with _ | f
· simp [nameDepth] at h
· simp only [nameToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseName?Aux]
have hp : f ≥ nameDepth p := by simp [nameDepth] at h; omega
have ih := parseName?Aux_correct p
(Token.numLit k :: Token.rparen :: rest) f hp
rw [ih]
theorem parseClassifier?Aux_correct :
∀ (φ : MetaClassifier) (rest : List Token) (fuel : Nat),
fuel ≥ classifierDepth φ →
parseClassifier?Aux fuel (φ.toTokens ++ rest) = some (φ, rest)
| .always, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp [MetaClassifier.toTokens, parseClassifier?Aux]
| .never, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp [MetaClassifier.toTokens, parseClassifier?Aux]
| .atDecl n, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp only [MetaClassifier.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseClassifier?Aux]
have hn : f ≥ nameDepth n := by simp [classifierDepth] at h; omega
rw [parseName?Aux_correct n (Token.rparen :: rest) f hn]
| .inFile s, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp [MetaClassifier.toTokens, parseClassifier?Aux]
| .underAttribute n, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp only [MetaClassifier.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseClassifier?Aux]
have hn : f ≥ nameDepth n := by
simp [classifierDepth] at h; omega
rw [parseName?Aux_correct n (Token.rparen :: rest) f hn]
| .dependencyOf n, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp only [MetaClassifier.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseClassifier?Aux]
have hn : f ≥ nameDepth n := by
simp [classifierDepth] at h; omega
rw [parseName?Aux_correct n (Token.rparen :: rest) f hn]
| .inNamespace n, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp only [MetaClassifier.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseClassifier?Aux]
have hn : f ≥ nameDepth n := by
simp [classifierDepth] at h; omega
rw [parseName?Aux_correct n (Token.rparen :: rest) f hn]
| .meet a b, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp only [MetaClassifier.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseClassifier?Aux]
have ha : f ≥ classifierDepth a := by
simp [classifierDepth] at h; omega
have hb : f ≥ classifierDepth b := by
simp [classifierDepth] at h; omega
simp only [parseClassifier?Aux_correct a _ f ha,
parseClassifier?Aux_correct b _ f hb]
| .join a b, rest, fuel, h => by
rcases fuel with _ | f
· simp [classifierDepth] at h
· simp only [MetaClassifier.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseClassifier?Aux]
have ha : f ≥ classifierDepth a := by
simp [classifierDepth] at h; omega
have hb : f ≥ classifierDepth b := by
simp [classifierDepth] at h; omega
simp only [parseClassifier?Aux_correct a _ f ha,
parseClassifier?Aux_correct b _ f hb]
theorem parseMetaCTerm?Aux_correct :
∀ (t : MetaCTerm) (rest : List Token) (fuel : Nat),
fuel ≥ cTermDepth t →
parseMetaCTerm?Aux fuel (t.toTokens ++ rest) = some (t, rest)
| .empty, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp [MetaCTerm.toTokens, parseMetaCTerm?Aux]
| .ident n, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp only [MetaCTerm.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseMetaCTerm?Aux]
have hn : f ≥ nameDepth n := by simp [cTermDepth] at h; omega
rw [parseName?Aux_correct n (Token.rparen :: rest) f hn]
| .sym s, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp [MetaCTerm.toTokens, parseMetaCTerm?Aux]
| .app fa aa, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp only [MetaCTerm.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseMetaCTerm?Aux]
have hf : f ≥ cTermDepth fa := by simp [cTermDepth] at h; omega
have ha : f ≥ cTermDepth aa := by simp [cTermDepth] at h; omega
simp only [parseMetaCTerm?Aux_correct fa _ f hf,
parseMetaCTerm?Aux_correct aa _ f ha]
| .lam x t, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp only [MetaCTerm.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseMetaCTerm?Aux]
have ht : f ≥ cTermDepth t := by simp [cTermDepth] at h; omega
simp only [parseMetaCTerm?Aux_correct t _ f ht]
| .plam i t, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp only [MetaCTerm.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseMetaCTerm?Aux]
have ht : f ≥ cTermDepth t := by simp [cTermDepth] at h; omega
simp only [parseMetaCTerm?Aux_correct t _ f ht]
| .comp s A φ u t, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp only [MetaCTerm.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseMetaCTerm?Aux]
have hA : f ≥ cTermDepth A := by simp [cTermDepth] at h; omega
have hφ : f ≥ classifierDepth φ := by simp [cTermDepth] at h; omega
have hu : f ≥ cTermDepth u := by simp [cTermDepth] at h; omega
have ht : f ≥ cTermDepth t := by simp [cTermDepth] at h; omega
simp only [parseMetaCTerm?Aux_correct A _ f hA,
parseClassifier?Aux_correct φ _ f hφ,
parseMetaCTerm?Aux_correct u _ f hu,
parseMetaCTerm?Aux_correct t _ f ht]
| .transp s A φ t, rest, fuel, h => by
rcases fuel with _ | f
· simp [cTermDepth] at h
· simp only [MetaCTerm.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseMetaCTerm?Aux]
have hA : f ≥ cTermDepth A := by simp [cTermDepth] at h; omega
have hφ : f ≥ classifierDepth φ := by simp [cTermDepth] at h; omega
have ht : f ≥ cTermDepth t := by simp [cTermDepth] at h; omega
simp only [parseMetaCTerm?Aux_correct A _ f hA,
parseClassifier?Aux_correct φ _ f hφ,
parseMetaCTerm?Aux_correct t _ f ht]
-- ── Correctness for Lean.Syntax mirror parsers ─────────────────────────────
-- Each parser proves: feeding the canonical token form of a value plus
-- arbitrary trailing tokens reconstructs the value, leaving the trail
-- untouched, provided fuel ≥ value's depth.
theorem parseStringPosRaw?Aux_correct :
∀ (p : String.Pos.Raw) (rest : List Token) (fuel : Nat),
fuel ≥ stringPosRawDepth p →
parseStringPosRaw?Aux fuel (stringPosRawToTokens p ++ rest) = some (p, rest) := by
intro p rest fuel h
rcases fuel with _ | f
· simp [stringPosRawDepth] at h
· simp [stringPosRawToTokens, parseStringPosRaw?Aux]
theorem parseSubstringRaw?Aux_correct :
∀ (s : Substring.Raw) (rest : List Token) (fuel : Nat),
fuel ≥ substringRawDepth s →
parseSubstringRaw?Aux fuel (substringRawToTokens s ++ rest) = some (s, rest) := by
intro s rest fuel h
rcases fuel with _ | f
· simp [substringRawDepth] at h
· simp only [substringRawToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseSubstringRaw?Aux]
have h1 : f ≥ stringPosRawDepth s.startPos := by
simp [substringRawDepth, stringPosRawDepth] at h ⊢; omega
have h2 : f ≥ stringPosRawDepth s.stopPos := by
simp [substringRawDepth, stringPosRawDepth] at h ⊢; omega
simp only [parseStringPosRaw?Aux_correct s.startPos _ f h1,
parseStringPosRaw?Aux_correct s.stopPos _ f h2]
theorem parseBool?Aux_correct :
∀ (b : Bool) (rest : List Token),
parseBool?Aux (boolToTokens b ++ rest) = some (b, rest) := by
intro b rest
cases b <;> simp [boolToTokens, parseBool?Aux]
theorem parseSourceInfo?Aux_correct :
∀ (info : Lean.SourceInfo) (rest : List Token) (fuel : Nat),
fuel ≥ sourceInfoDepth info →
parseSourceInfo?Aux fuel (sourceInfoToTokens info ++ rest) = some (info, rest) := by
intro info rest fuel h
cases info with
| none =>
rcases fuel with _ | f
· simp [sourceInfoDepth] at h
· simp [sourceInfoToTokens, parseSourceInfo?Aux]
| original lead pos trail endPos =>
rcases fuel with _ | f
· simp [sourceInfoDepth] at h
· simp only [sourceInfoToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseSourceInfo?Aux]
have hlead : f ≥ substringRawDepth lead := by
simp [sourceInfoDepth, substringRawDepth] at h ⊢; omega
have hpos : f ≥ stringPosRawDepth pos := by
simp [sourceInfoDepth, stringPosRawDepth] at h ⊢; omega
have htrail : f ≥ substringRawDepth trail := by
simp [sourceInfoDepth, substringRawDepth] at h ⊢; omega
have hend : f ≥ stringPosRawDepth endPos := by
simp [sourceInfoDepth, stringPosRawDepth] at h ⊢; omega
simp only [parseSubstringRaw?Aux_correct lead _ f hlead,
parseStringPosRaw?Aux_correct pos _ f hpos,
parseSubstringRaw?Aux_correct trail _ f htrail,
parseStringPosRaw?Aux_correct endPos _ f hend]
| synthetic pos endPos canonical =>
rcases fuel with _ | f
· simp [sourceInfoDepth] at h
· simp only [sourceInfoToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseSourceInfo?Aux]
have hpos : f ≥ stringPosRawDepth pos := by
simp [sourceInfoDepth, stringPosRawDepth] at h ⊢; omega
have hend : f ≥ stringPosRawDepth endPos := by
simp [sourceInfoDepth, stringPosRawDepth] at h ⊢; omega
simp only [parseStringPosRaw?Aux_correct pos _ f hpos,
parseStringPosRaw?Aux_correct endPos _ f hend,
parseBool?Aux_correct canonical (Token.rparen :: rest)]
theorem parseStringList?Aux_correct :
∀ (l : List String) (rest : List Token) (fuel : Nat),
fuel ≥ stringListDepth l →
parseStringList?Aux fuel (stringListToTokens l ++ rest) = some (l, rest)
| [], rest, fuel, h => by
rcases fuel with _ | f
· simp [stringListDepth] at h
· simp [stringListToTokens, parseStringList?Aux]
| s :: tail, rest, fuel, h => by
rcases fuel with _ | f
· simp [stringListDepth] at h
· simp only [stringListToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseStringList?Aux]
have htail : f ≥ stringListDepth tail := by
simp [stringListDepth] at h; omega
simp only [parseStringList?Aux_correct tail _ f htail]
theorem parsePreresolved?Aux_correct :
∀ (p : Lean.Syntax.Preresolved) (rest : List Token) (fuel : Nat),
fuel ≥ preresolvedDepth p →
parsePreresolved?Aux fuel (preresolvedToTokens p ++ rest) = some (p, rest)
| .namespace ns, rest, fuel, h => by
rcases fuel with _ | f
· simp [preresolvedDepth] at h
· simp only [preresolvedToTokens, List.append_assoc, List.cons_append,
List.nil_append, parsePreresolved?Aux]
have hn : f ≥ nameDepth ns := by simp [preresolvedDepth] at h; omega
simp only [parseName?Aux_correct ns _ f hn]
| .decl n fields, rest, fuel, h => by
rcases fuel with _ | f
· simp [preresolvedDepth] at h
· simp only [preresolvedToTokens, List.append_assoc, List.cons_append,
List.nil_append, parsePreresolved?Aux]
have hn : f ≥ nameDepth n := by simp [preresolvedDepth] at h; omega
have hf : f ≥ stringListDepth fields := by simp [preresolvedDepth] at h; omega
simp only [parseName?Aux_correct n _ f hn,
parseStringList?Aux_correct fields _ f hf]
theorem parsePreresolvedList?Aux_correct :
∀ (l : List Lean.Syntax.Preresolved) (rest : List Token) (fuel : Nat),
fuel ≥ preresolvedListDepth l →
parsePreresolvedList?Aux fuel (preresolvedListToTokens l ++ rest) = some (l, rest)
| [], rest, fuel, h => by
rcases fuel with _ | f
· simp [preresolvedListDepth] at h
· simp [preresolvedListToTokens, parsePreresolvedList?Aux]
| p :: tail, rest, fuel, h => by
rcases fuel with _ | f
· simp [preresolvedListDepth] at h
· simp only [preresolvedListToTokens, List.append_assoc, List.cons_append,
List.nil_append, parsePreresolvedList?Aux]
have hp : f ≥ preresolvedDepth p := by simp [preresolvedListDepth] at h; omega
have ht : f ≥ preresolvedListDepth tail := by simp [preresolvedListDepth] at h; omega
simp only [parsePreresolved?Aux_correct p _ f hp,
parsePreresolvedList?Aux_correct tail _ f ht]
/-- Round-trip for the joint syntax / list-syntax parser. The encoder
is structural (no fuel) — the parser still needs fuel because it
consumes a token list of unknown length; fuel ≥ structural depth
of the value being parsed. -/
theorem parseSyntaxOrList?Aux_correct :
∀ (fuel : Nat) (x : Lean.Syntax ⊕ List Lean.Syntax) (rest : List Token),
fuel ≥ Sum.elim syntaxParseDepth syntaxListParseDepth x →
parseSyntaxOrList?Aux fuel
(Sum.elim (fun _ => true) (fun _ => false) x)
(Sum.elim syntaxToTokens syntaxListToTokens x ++ rest) = some (x, rest)
| 0, x, _, h => by
cases x with
| inl s =>
simp only [Sum.elim_inl] at h
cases s <;> simp [syntaxParseDepth] at h
| inr l =>
simp only [Sum.elim_inr] at h
cases l <;> simp [syntaxListParseDepth] at h
| f + 1, .inl .missing, rest, _ => by
simp [syntaxToTokens, parseSyntaxOrList?Aux]
| f + 1, .inl (.atom info val), rest, h => by
simp only [Sum.elim_inl, syntaxToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseSyntaxOrList?Aux]
have hi : f ≥ sourceInfoDepth info := by
simp only [Sum.elim_inl, syntaxParseDepth] at h; omega
simp only [parseSourceInfo?Aux_correct info _ f hi]
| f + 1, .inl (.ident info rawVal val preresolved), rest, h => by
simp only [Sum.elim_inl, syntaxToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseSyntaxOrList?Aux]
simp only [Sum.elim_inl, syntaxParseDepth] at h
-- Manually decompose the four-arm max: f + 1 ≥ max info (max raw (max name pre)) + 1.
have hi : f ≥ sourceInfoDepth info := by
have := Nat.le_max_left (sourceInfoDepth info)
(max (substringRawDepth rawVal)
(max (nameDepth val) (preresolvedListDepth preresolved)))
omega
have hr : f ≥ substringRawDepth rawVal := by
have h1 := Nat.le_max_right (sourceInfoDepth info)
(max (substringRawDepth rawVal)
(max (nameDepth val) (preresolvedListDepth preresolved)))
have h2 := Nat.le_max_left (substringRawDepth rawVal)
(max (nameDepth val) (preresolvedListDepth preresolved))
omega
have hn : f ≥ nameDepth val := by
have h1 := Nat.le_max_right (sourceInfoDepth info)
(max (substringRawDepth rawVal)
(max (nameDepth val) (preresolvedListDepth preresolved)))
have h2 := Nat.le_max_right (substringRawDepth rawVal)
(max (nameDepth val) (preresolvedListDepth preresolved))
have h3 := Nat.le_max_left (nameDepth val) (preresolvedListDepth preresolved)
omega
have hp : f ≥ preresolvedListDepth preresolved := by
have h1 := Nat.le_max_right (sourceInfoDepth info)
(max (substringRawDepth rawVal)
(max (nameDepth val) (preresolvedListDepth preresolved)))
have h2 := Nat.le_max_right (substringRawDepth rawVal)
(max (nameDepth val) (preresolvedListDepth preresolved))
have h3 := Nat.le_max_right (nameDepth val) (preresolvedListDepth preresolved)
omega
simp only [parseSourceInfo?Aux_correct info _ f hi,
parseSubstringRaw?Aux_correct rawVal _ f hr,
parseName?Aux_correct val _ f hn,
parsePreresolvedList?Aux_correct preresolved _ f hp]
| f + 1, .inl (.node info kind args), rest, h => by
simp only [Sum.elim_inl, syntaxToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseSyntaxOrList?Aux]
simp only [Sum.elim_inl, syntaxParseDepth] at h
have hi : f ≥ sourceInfoDepth info := by
have := Nat.le_max_left (sourceInfoDepth info)
(max (nameDepth kind) (syntaxListParseDepth args.toList))
omega
have hk : f ≥ nameDepth kind := by
have h1 := Nat.le_max_right (sourceInfoDepth info)
(max (nameDepth kind) (syntaxListParseDepth args.toList))
have h2 := Nat.le_max_left (nameDepth kind) (syntaxListParseDepth args.toList)
omega
have hargs_d : f ≥ syntaxListParseDepth args.toList := by
have h1 := Nat.le_max_right (sourceInfoDepth info)
(max (nameDepth kind) (syntaxListParseDepth args.toList))
have h2 := Nat.le_max_right (nameDepth kind) (syntaxListParseDepth args.toList)
omega
have ih := parseSyntaxOrList?Aux_correct f (.inr args.toList)
(Token.rparen :: rest) (by simp only [Sum.elim_inr]; exact hargs_d)
simp only [Sum.elim_inr] at ih
simp only [parseSourceInfo?Aux_correct info _ f hi,
parseName?Aux_correct kind _ f hk, ih, Array.toArray_toList]
| f + 1, .inr [], rest, _ => by
simp [syntaxListToTokens, parseSyntaxOrList?Aux]
| f + 1, .inr (s :: tail), rest, h => by
simp only [Sum.elim_inr, syntaxListToTokens, List.append_assoc, List.cons_append,
List.nil_append, parseSyntaxOrList?Aux]
simp only [Sum.elim_inr, syntaxListParseDepth] at h
have hs_d : f ≥ syntaxParseDepth s := by
have := Nat.le_max_left (syntaxParseDepth s) (syntaxListParseDepth tail)
omega
have ht_d : f ≥ syntaxListParseDepth tail := by
have := Nat.le_max_right (syntaxParseDepth s) (syntaxListParseDepth tail)
omega
have ihs := parseSyntaxOrList?Aux_correct f (.inl s)
(syntaxListToTokens tail ++ Token.rparen :: rest)
(by simp only [Sum.elim_inl]; exact hs_d)
have iht := parseSyntaxOrList?Aux_correct f (.inr tail)
(Token.rparen :: rest)
(by simp only [Sum.elim_inr]; exact ht_d)
simp only [Sum.elim_inl] at ihs
simp only [Sum.elim_inr] at iht
rw [ihs]
simp only [iht]
/-- Specialisation: round-trip for `parseSyntax?Aux` projection. -/
theorem parseSyntax?Aux_correct :
∀ (s : Lean.Syntax) (rest : List Token) (fuel : Nat),
fuel ≥ syntaxParseDepth s →
parseSyntax?Aux fuel (syntaxToTokens s ++ rest) = some (s, rest) := by
intro s rest fuel h
show (match parseSyntaxOrList?Aux fuel true (syntaxToTokens s ++ rest) with
| some (.inl s, r) => some (s, r)
| _ => none) = some (s, rest)
have := parseSyntaxOrList?Aux_correct fuel (.inl s) rest
(by simp only [Sum.elim_inl]; exact h)
simp only [Sum.elim_inl] at this
rw [this]
theorem parseArtifact?Aux_correct :
∀ (a : MetaArtifact) (rest : List Token) (fuel : Nat),
a.supported = true →
fuel ≥ artifactDepth a →
parseArtifact?Aux fuel (a.toTokens ++ rest) = some (a, rest)
| .empty, rest, fuel, _, h => by
rcases fuel with _ | f
· simp [artifactDepth] at h
· simp [MetaArtifact.toTokens, parseArtifact?Aux]
| .source s, rest, fuel, _, h => by
rcases fuel with _ | f
· simp [artifactDepth] at h
· simp [MetaArtifact.toTokens, parseArtifact?Aux]
| .refTo n, rest, fuel, _, h => by
rcases fuel with _ | f
· simp [artifactDepth] at h
· simp only [MetaArtifact.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseArtifact?Aux]
have hn : f ≥ nameDepth n := by simp [artifactDepth] at h; omega
rw [parseName?Aux_correct n (Token.rparen :: rest) f hn]
| .cterm m, rest, fuel, _, h => by
rcases fuel with _ | f
· simp [artifactDepth] at h
· simp only [MetaArtifact.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseArtifact?Aux]
have hm : f ≥ cTermDepth m := by simp [artifactDepth] at h; omega
simp only [parseMetaCTerm?Aux_correct m _ f hm]
| .declAt s, rest, fuel, _, h => by
rcases fuel with _ | f
· simp [artifactDepth] at h
· simp only [MetaArtifact.toTokens, List.append_assoc, List.cons_append,
List.nil_append, parseArtifact?Aux]
-- The inner parser is now called at `f` fuel (we'll fix the
-- parser body to use `f` rather than `syntaxFuelCap` once below).
have hs : f ≥ syntaxParseDepth s := by
simp [artifactDepth] at h; omega
rw [parseSyntax?Aux_correct s (Token.rparen :: rest) f hs]
-- ── Token-level round-trip — the canonical universal theorems ─────────────
-- These follow directly from the `parser?Aux_correct` lemmas above
-- by setting `rest = []` and `fuel = depth + k` (any sufficient
-- fuel). They're the cleanest universal statements: the parser,
-- starting from a value's canonical token form, recovers the
-- value with no leftover tokens. No String reasoning needed.
/-- Top-level token-entry parser. Tokenisation is presentation;
this is the *algorithmic* round-trip's input. -/
def MetaCTerm.fromTokens? (tokens : List Token) : Option MetaCTerm :=
match parseMetaCTerm? tokens with
| some (t, []) => some t
| _ => none
def MetaClassifier.fromTokens? (tokens : List Token) : Option MetaClassifier :=
match parseClassifier? tokens with
| some (φ, []) => some φ
| _ => none
def Infoductor.nameFromTokens? (tokens : List Token) : Option Lean.Name :=
match parseName? tokens with
| some (n, []) => some n
| _ => none
def MetaArtifact.fromTokens? (tokens : List Token) : Option MetaArtifact :=
match parseArtifact? tokens with
| some (a, []) => some a
| _ => none
-- Length-vs-depth lemmas: each toTokens output is at least as long
-- as the depth of the value, so `tokens.length + 1` is always
-- sufficient fuel.
theorem nameToTokens_length_bound (n : Lean.Name) :
(nameToTokens n).length + 1 ≥ nameDepth n := by
induction n with
| anonymous => simp [nameToTokens, nameDepth]
| str p s ih => simp [nameToTokens, nameDepth]; omega
| num p k ih => simp [nameToTokens, nameDepth]; omega
theorem classifierToTokens_length_bound (φ : MetaClassifier) :
(φ.toTokens).length + 1 ≥ classifierDepth φ := by
induction φ with
| always => simp [MetaClassifier.toTokens, classifierDepth]
| never => simp [MetaClassifier.toTokens, classifierDepth]
| atDecl n =>
simp [MetaClassifier.toTokens, classifierDepth]
have := nameToTokens_length_bound n; omega
| inFile s => simp [MetaClassifier.toTokens, classifierDepth]
| underAttribute n =>
simp [MetaClassifier.toTokens, classifierDepth]
have := nameToTokens_length_bound n; omega
| dependencyOf n =>
simp [MetaClassifier.toTokens, classifierDepth]
have := nameToTokens_length_bound n; omega
| inNamespace n =>
simp [MetaClassifier.toTokens, classifierDepth]
have := nameToTokens_length_bound n; omega
| meet a b iha ihb => simp [MetaClassifier.toTokens, classifierDepth]; omega
| join a b iha ihb => simp [MetaClassifier.toTokens, classifierDepth]; omega
theorem cTermToTokens_length_bound (t : MetaCTerm) :
(t.toTokens).length + 1 ≥ cTermDepth t := by
induction t with
| empty => simp [MetaCTerm.toTokens, cTermDepth]
| ident n =>
simp [MetaCTerm.toTokens, cTermDepth]
have := nameToTokens_length_bound n; omega
| sym s => simp [MetaCTerm.toTokens, cTermDepth]
| app f a ihf iha => simp [MetaCTerm.toTokens, cTermDepth]; omega
| lam x t ih => simp [MetaCTerm.toTokens, cTermDepth]; omega
| plam i t ih => simp [MetaCTerm.toTokens, cTermDepth]; omega
| comp s A φ u t ihA ihu iht =>
simp [MetaCTerm.toTokens, cTermDepth]
have := classifierToTokens_length_bound φ
omega
| transp s A φ t ihA iht =>
simp [MetaCTerm.toTokens, cTermDepth]
have := classifierToTokens_length_bound φ
omega
-- Length-vs-depth for the helper-syntax encoders.
theorem stringPosRawToTokens_length_bound (p : String.Pos.Raw) :
(stringPosRawToTokens p).length + 1 ≥ stringPosRawDepth p := by
simp [stringPosRawToTokens, stringPosRawDepth]
theorem substringRawToTokens_length_bound (s : Substring.Raw) :
(substringRawToTokens s).length + 1 ≥ substringRawDepth s := by
simp [substringRawToTokens, substringRawDepth, stringPosRawToTokens]
theorem sourceInfoToTokens_length_bound (info : Lean.SourceInfo) :
(sourceInfoToTokens info).length + 1 ≥ sourceInfoDepth info := by
cases info <;>
simp [sourceInfoToTokens, sourceInfoDepth, substringRawToTokens,
stringPosRawToTokens, boolToTokens]
theorem stringListToTokens_length_bound (l : List String) :
(stringListToTokens l).length + 1 ≥ stringListDepth l := by
induction l with
| nil => simp [stringListToTokens, stringListDepth]
| cons s rest ih => simp [stringListToTokens, stringListDepth]; omega
theorem preresolvedToTokens_length_bound :
∀ (p : Lean.Syntax.Preresolved),
(preresolvedToTokens p).length + 1 ≥ preresolvedDepth p
| .namespace ns => by
simp [preresolvedToTokens, preresolvedDepth]
have := nameToTokens_length_bound ns; omega
| .decl n fields => by
simp [preresolvedToTokens, preresolvedDepth]
have h1 := nameToTokens_length_bound n
have h2 := stringListToTokens_length_bound fields
omega
theorem preresolvedListToTokens_length_bound (l : List Lean.Syntax.Preresolved) :
(preresolvedListToTokens l).length + 1 ≥ preresolvedListDepth l := by
induction l with
| nil => simp [preresolvedListToTokens, preresolvedListDepth]
| cons p rest ih =>
simp [preresolvedListToTokens, preresolvedListDepth]
have := preresolvedToTokens_length_bound p
omega
-- Length-vs-depth for the syntax encoder. Joint mutual recursion
-- because the encoder and depth function are themselves mutually
-- recursive on `Lean.Syntax` / `List Lean.Syntax`.
mutual
/-- Length bound on a `Lean.Syntax` encoding. -/
theorem syntaxToTokens_length_bound :
∀ (s : Lean.Syntax), (syntaxToTokens s).length + 1 ≥ syntaxParseDepth s
| .missing => by simp [syntaxToTokens, syntaxParseDepth]
| .atom info _ => by
simp [syntaxToTokens, syntaxParseDepth]
have := sourceInfoToTokens_length_bound info
omega
| .ident info rawVal val preresolved => by
simp [syntaxToTokens, syntaxParseDepth]
have h1 := sourceInfoToTokens_length_bound info
have h2 := substringRawToTokens_length_bound rawVal
have h3 := nameToTokens_length_bound val
have h4 := preresolvedListToTokens_length_bound preresolved
omega
| .node info kind args => by
simp [syntaxToTokens, syntaxParseDepth]
have h1 := sourceInfoToTokens_length_bound info
have h2 := nameToTokens_length_bound kind
have h3 := syntaxListToTokens_length_bound args.toList
omega
/-- Length bound on a `List Lean.Syntax` encoding. -/
theorem syntaxListToTokens_length_bound :
∀ (l : List Lean.Syntax), (syntaxListToTokens l).length + 1 ≥ syntaxListParseDepth l
| [] => by simp [syntaxListToTokens, syntaxListParseDepth]
| s :: rest => by
simp [syntaxListToTokens, syntaxListParseDepth]
have h1 := syntaxToTokens_length_bound s
have h2 := syntaxListToTokens_length_bound rest
omega
end
/-- Every meta-mirror value's canonical token form parses back to
the same value. Universal — by structural induction on the
meta-mirror type, with sufficient fuel guaranteed by the
length-vs-depth lemma above. -/
theorem nameFromTokens?_round_trip (n : Lean.Name) :
Infoductor.nameFromTokens? (nameToTokens n) = some n := by
unfold Infoductor.nameFromTokens? parseName?
have h := nameToTokens_length_bound n
have := parseName?Aux_correct n [] ((nameToTokens n).length + 1) h
rw [List.append_nil] at this
rw [this]
theorem classifierFromTokens?_round_trip (φ : MetaClassifier) :
MetaClassifier.fromTokens? φ.toTokens = some φ := by
unfold MetaClassifier.fromTokens? parseClassifier?
have h := classifierToTokens_length_bound φ
have := parseClassifier?Aux_correct φ [] ((φ.toTokens).length + 1) h
rw [List.append_nil] at this
rw [this]
theorem cTermFromTokens?_round_trip (t : MetaCTerm) :
MetaCTerm.fromTokens? t.toTokens = some t := by
unfold MetaCTerm.fromTokens? parseMetaCTerm?
have h := cTermToTokens_length_bound t
have := parseMetaCTerm?Aux_correct t [] ((t.toTokens).length + 1) h
rw [List.append_nil] at this
rw [this]
theorem artifactFromTokens?_round_trip (a : MetaArtifact)
(hsup : a.supported = true) :
MetaArtifact.fromTokens? a.toTokens = some a := by
unfold MetaArtifact.fromTokens? parseArtifact?
-- artifact length bound
have hlen : (a.toTokens).length + 1 ≥ artifactDepth a := by
cases a with
| empty => simp [MetaArtifact.toTokens, artifactDepth]
| source s => simp [MetaArtifact.toTokens, artifactDepth]
| declAt s =>
simp [MetaArtifact.toTokens, artifactDepth]
have := syntaxToTokens_length_bound s
omega
| cterm m =>
simp [MetaArtifact.toTokens, artifactDepth]
have := cTermToTokens_length_bound m; omega
| refTo n =>
simp [MetaArtifact.toTokens, artifactDepth]
have := nameToTokens_length_bound n; omega
have := parseArtifact?Aux_correct a [] ((a.toTokens).length + 1) hsup hlen
rw [List.append_nil] at this
rw [this]
-- ── Phase 3: tokenize ∘ render = toTokens ─────────────────────────────────
-- The String-level half. Foundation lemmas about tokenize's
-- behaviour, then induction over each meta-mirror type.
/-- `tokenize` on `(` :: rest reduces to `lparen :: tokenize rest`. -/
theorem tokenize_lparen_cons (rest : List Char) :
tokenize ('(' :: rest) = Token.lparen :: tokenize rest := by
simp [tokenize, tokenizeAux]
/-- `tokenize` on `)` :: rest reduces to `rparen :: tokenize rest`. -/
theorem tokenize_rparen_cons (rest : List Char) :
tokenize (')' :: rest) = Token.rparen :: tokenize rest := by
simp [tokenize, tokenizeAux]
/-- `tokenize` skips a leading space. -/
theorem tokenize_space_cons (rest : List Char) :
tokenize (' ' :: rest) = tokenize rest := by
simp [tokenize, tokenizeAux, isWhitespace]
-- Atomic-arm Phase 3 witnesses via `decide`. Each closed input
-- reduces in the kernel to a concrete token list; we prove
-- structural identity between `tokenize (toLeanSource v).toList`
-- and `toTokens v` for these atomic shapes.
set_option maxRecDepth 2000 in
theorem tokenize_render_name_anonymous :
tokenize (nameToLeanSource Lean.Name.anonymous).toList =
nameToTokens Lean.Name.anonymous := by
decide
set_option maxRecDepth 2000 in
theorem tokenize_render_classifier_always :
tokenize (MetaClassifier.toLeanSource MetaClassifier.always).toList =
MetaClassifier.toTokens MetaClassifier.always := by
decide
set_option maxRecDepth 2000 in
theorem tokenize_render_classifier_never :
tokenize (MetaClassifier.toLeanSource MetaClassifier.never).toList =
MetaClassifier.toTokens MetaClassifier.never := by
decide
set_option maxRecDepth 2000 in
theorem tokenize_render_cterm_empty :
tokenize (MetaCTerm.toLeanSource MetaCTerm.empty).toList =
MetaCTerm.toTokens MetaCTerm.empty := by
decide
set_option maxRecDepth 2000 in
theorem tokenize_render_artifact_empty :
tokenize (MetaArtifact.toLeanSource MetaArtifact.empty).toList =
MetaArtifact.toTokens MetaArtifact.empty := by
decide
-- ── Phase 3 distribution lemmas — open question ────────────────────────────
-- Proving `tokenize ∘ render = toTokens` requires four substantial
-- distribution lemmas:
--
-- readIdent_app : reading an ident sequence followed by a clean
-- boundary returns the accumulated ident.
-- readStrLit_app: reading an escapeStrLit-encoded body returns
-- the original String.
-- tokenize_app_clean: tokenize distributes over a concatenation
-- where the prefix ends "cleanly".
-- tokenize_render_X: induction over each meta-mirror type.
--
-- Each requires reasoning about Lean 4.30's `String` API (UTF-8
-- ByteArray-backed) — specifically the equation
-- `s.push c ++ xs.asString = s ++ (c :: xs).asString`, which is
-- *not* definitionally true and needs a String-internals lemma.
--
-- Cleaner paths forward: (a) refactor `readIdent`/`readStrLit` to
-- accumulate into a `List Char` rather than a `String`, then build
-- the String at the end via `List.asString` — decouples the proof
-- from String internals; or (b) import Mathlib's richer String
-- API which provides the needed lemmas.
-- ── Round-trip — atomic kernel-reducible witnesses ─────────────────────────
-- For non-recursive shapes the round-trip is closed by `rfl` (or
-- `decide`) directly: rendering produces a fixed string, tokenising
-- produces a fixed token list, parsing returns the original. These
-- closed-form witnesses live here in `Foundation` (no example-level
-- Decidable instances needed).
theorem MetaCTerm.toLeanSource_empty_round_trip :
MetaCTerm.fromLeanSource? MetaCTerm.empty.toLeanSource = some MetaCTerm.empty := by
decide
theorem MetaClassifier.toLeanSource_always_round_trip :
MetaClassifier.fromLeanSource? MetaClassifier.always.toLeanSource =
some MetaClassifier.always := by
decide
theorem MetaClassifier.toLeanSource_never_round_trip :
MetaClassifier.fromLeanSource? MetaClassifier.never.toLeanSource =
some MetaClassifier.never := by
decide
theorem MetaArtifact.toLeanSource_empty_round_trip_render :
(MetaArtifact.fromLeanSource? MetaArtifact.empty.toLeanSource).map
MetaArtifact.toLeanSource = some MetaArtifact.empty.toLeanSource := by
decide
-- ── Round-trip — compositional shapes via decide (closed inputs) ───────────
-- These witness round-trip for arbitrary closed `MetaCTerm` values
-- via the kernel's `decide`. Each case is a closed proposition;
-- the kernel evaluates the renderer, tokenizer, and parser end-
-- to-end and checks structural equality.
theorem MetaCTerm.toLeanSource_ident_anonymous_round_trip :
MetaCTerm.fromLeanSource? (MetaCTerm.ident Lean.Name.anonymous).toLeanSource =
some (MetaCTerm.ident Lean.Name.anonymous) := by
decide
theorem MetaCTerm.toLeanSource_app_empty_round_trip :
MetaCTerm.fromLeanSource? (MetaCTerm.app .empty .empty).toLeanSource =
some (MetaCTerm.app .empty .empty) := by
decide
set_option maxRecDepth 4000 in
theorem MetaCTerm.toLeanSource_lam_x_empty_round_trip :
MetaCTerm.fromLeanSource? (MetaCTerm.lam "x" .empty).toLeanSource =
some (MetaCTerm.lam "x" .empty) := by
decide
set_option maxRecDepth 4000 in
theorem MetaCTerm.toLeanSource_comp_round_trip :
MetaCTerm.fromLeanSource?
(MetaCTerm.comp "i" .empty .always .empty .empty).toLeanSource =
some (MetaCTerm.comp "i" .empty .always .empty .empty) := by
decide
set_option maxRecDepth 4000 in
theorem MetaCTerm.toLeanSource_transp_round_trip :
MetaCTerm.fromLeanSource?
(MetaCTerm.transp "i" .empty .never .empty).toLeanSource =
some (MetaCTerm.transp "i" .empty .never .empty) := by
decide
set_option maxRecDepth 4000 in
theorem MetaCTerm.toLeanSource_plam_round_trip :
MetaCTerm.fromLeanSource? (MetaCTerm.plam "i" .empty).toLeanSource =
some (MetaCTerm.plam "i" .empty) := by
decide
set_option maxRecDepth 4000 in
theorem MetaCTerm.toLeanSource_sym_round_trip :
MetaCTerm.fromLeanSource? (MetaCTerm.sym "x").toLeanSource =
some (MetaCTerm.sym "x") := by
decide
-- Compositional witness: a `.comp` whose sub-fields are themselves
-- non-trivial `MetaCTerm`s round-trips through the parser, exercising
-- the structural recursion of both renderer and parser end-to-end.
set_option maxRecDepth 16000 in
theorem MetaCTerm.toLeanSource_nested_round_trip :
MetaCTerm.fromLeanSource?
(MetaCTerm.comp "i" (.ident `Univ) .always
(.lam "x" (.sym "y")) (.sym "z")).toLeanSource =
some (MetaCTerm.comp "i" (.ident `Univ) .always
(.lam "x" (.sym "y")) (.sym "z")) := by
decide
set_option maxRecDepth 4000 in
theorem MetaClassifier.toLeanSource_meet_round_trip :
MetaClassifier.fromLeanSource?
(MetaClassifier.meet .always .never).toLeanSource =
some (MetaClassifier.meet .always .never) := by
decide
set_option maxRecDepth 4000 in
theorem MetaClassifier.toLeanSource_atDecl_round_trip :
MetaClassifier.fromLeanSource?
(MetaClassifier.atDecl `Foo).toLeanSource =
some (MetaClassifier.atDecl `Foo) := by
decide
-- 3-level deep name regression test: `eq0.i` (the FaceFormula
-- `eq0` encoding) round-trips. Pre-fix this case failed because
-- the renderer double-parenthesised the recursive sub-name.
set_option maxRecDepth 4000 in
theorem MetaClassifier.toLeanSource_atDecl_three_level_round_trip :
MetaClassifier.fromLeanSource?
(MetaClassifier.atDecl
(Lean.Name.mkStr (Lean.Name.mkSimple "eq0") "i")).toLeanSource =
some (MetaClassifier.atDecl
(Lean.Name.mkStr (Lean.Name.mkSimple "eq0") "i")) := by
decide
end Infoductor