Replaced the fuel-bound MetaArtifact.declAt encoding with real
bidirectional reflection of Lean.Syntax through MetaArtifact.
## Why
The previous encoding bounded Lean.Syntax rendering and depth with a
fuel parameter (syntaxFuelCap = 2^32) and a Sum-carrier scheme.
The .declAt round-trip lemma in MetaParse.lean depended on
syntaxFuelCap ≥ syntaxDepth s, which is false in general: it fails
for adversarial syntax trees (any tree whose name-depth on a node
kind exceeds 2^32; uncommon but not impossible). The corresponding
round-trip proofs were therefore cheats, and they stopped working
once dependent code matured.
Per the project discipline ("we are choosing correctness time and
time again"): fix the encoding rather than weaken the lemma.
## What landed
Foundation/Meta.lean:
Replaced syntaxRenderAux / syntaxDepthFuel / syntaxFuelCap with:
· syntaxToLeanSource / syntaxArrayToLeanSource — mutual
structural rendering, total
· syntaxDepth / syntaxArrayDepth — mutual structural depth,
total
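As a rough illustration of what "mutual structural depth, total" means, here is a sketch on a toy tree. `Tree`, `TreeList`, and both `depth` functions are hypothetical stand-ins invented for this example; the real syntaxDepth / syntaxArrayDepth work over Lean.Syntax and its Array children and may differ in detail.

```lean
mutual
  /-- Toy stand-in for a syntax tree: a leaf or a node with children. -/
  inductive Tree where
    | atom : String → Tree
    | node : String → TreeList → Tree
  /-- Explicit child list, so both recursions are plainly structural. -/
  inductive TreeList where
    | nil : TreeList
    | cons : Tree → TreeList → TreeList
end

mutual
  /-- Depth of a single tree; no fuel parameter is needed. -/
  def Tree.depth : Tree → Nat
    | .atom _ => 1
    | .node _ ts => TreeList.depth ts + 1
  /-- Maximum depth over a list of children. -/
  def TreeList.depth : TreeList → Nat
    | .nil => 0
    | .cons t ts => max (Tree.depth t) (TreeList.depth ts)
end
```

Because every recursive call is on a strict sub-term, no cap such as syntaxFuelCap appears anywhere in the definitions.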
Foundation/MetaParse.lean:
Refactored parseSyntax?Aux / parseSyntaxList?Aux into a joint
parseSyntaxOrList?Aux : Nat → Bool → List Token →
Option ((Lean.Syntax ⊕ List Lean.Syntax) × ...)
Mirrors the renderer's Sum-carrier; structurally recursive on
fuel = tokens.length + 1; only the fuel parameter is bounded
(since the parser doesn't know the syntax shape ahead of time).
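The joint-parser shape can be sketched on a toy S-expression grammar. Everything below (`Tok`, `SExp`, `parseOrList`) is hypothetical and far smaller than parseSyntaxOrList?Aux; it only shows the pattern of one fuel-recursive function whose Bool flag selects "single value" or "list of values" and whose result lives in a Sum.

```lean
inductive Tok where
  | lp | rp
  | atom (s : String)

inductive SExp where
  | atom : String → SExp
  | list : List SExp → SExp

/-- `false`: parse one `SExp`; `true`: parse items up to the closing paren.
    Structurally recursive on the fuel argument. -/
def parseOrList : Nat → Bool → List Tok → Option ((SExp ⊕ List SExp) × List Tok)
  | 0, _, _ => none                                   -- out of fuel
  | fuel + 1, false, toks =>
    match toks with
    | Tok.atom s :: rest => some (Sum.inl (SExp.atom s), rest)
    | Tok.lp :: rest =>
      match parseOrList fuel true rest with
      | some (Sum.inr xs, rest') => some (Sum.inl (SExp.list xs), rest')
      | _ => none
    | _ => none
  | fuel + 1, true, toks =>
    match toks with
    | Tok.rp :: rest => some (Sum.inr [], rest)
    | _ =>
      match parseOrList fuel false toks with
      | some (Sum.inl x, rest) =>
        match parseOrList fuel true rest with
        | some (Sum.inr xs, rest') => some (Sum.inr (x :: xs), rest')
        | _ => none
      | _ => none

-- Fuel is tokens.length + 1 = 4 for the three tokens of "(a)".
#eval (parseOrList 4 false [Tok.lp, Tok.atom "a", Tok.rp]).isSome  -- true
```

Merging both modes into one function keeps the recursion structural on a single Nat, which is one plausible reason to prefer it over two mutually recursive parsers.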
Added correctness round-trip lemmas:
· parseStringPosRaw?Aux_correct
· parseSubstringRaw?Aux_correct
· parseBool?Aux_correct
· parseSourceInfo?Aux_correct
· parseStringList?Aux_correct
· parsePreresolved?Aux_correct
· parsePreresolvedList?Aux_correct
· parseSyntaxOrList?Aux_correct (the master joint round-trip)
· parseSyntax?Aux_correct (specialisation at .inl s)
Added length bounds for the WF measure:
· stringPosRawToTokens_length_bound (and 5 other helper bounds)
· syntaxToTokens_length_bound / syntaxListToTokens_length_bound
(mutual structural induction; chains all helper bounds)
Replaced the cheat .declAt arms in parseArtifact?Aux_correct
(line 957) and artifactFromTokens?_round_trip (line 1078) with
real proofs derived from the new lemmas.
## Discipline
· Zero sorry / admit (only Comonad/Convolution.lean's interpolated
"... := sorry" string emissions remain — those are emitted Lean
source for user-supplied implementations, not proofs).
· Zero noncomputable / Classical.propDecidable.
· Zero TODO / FIXME / placeholder comments in source-rendering code.
· No tests deleted; the Test.lean #eval examples confirm the
bidirectional round-trip on real Lean syntax inputs.
## Verification
cd infoductor && lake build # Build completed successfully (12 jobs)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Attempted to prove `readIdent_app` (the key distribution lemma)
but hit a wall on `s.push c ++ xs.asString = s ++ (c :: xs).asString`,
which does not hold definitionally in Lean 4.30 (String is
UTF-8 ByteArray-backed, not a List Char structure).
Documents two cleaner paths forward:
(a) Refactor `readIdent`/`readStrLit` to accumulate into
`List Char` instead of `String`, decoupling proofs from
String internals.
(b) Import Mathlib's richer String API which provides the
needed structural lemmas.
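Path (a) might look roughly like the sketch below. `isIdentChar` and `readIdentChars` are hypothetical names, and the real `readIdent` may classify characters differently; the point is only that accumulator and result are both `List Char`, so lemmas about them never touch String internals.

```lean
/-- Assumed identifier alphabet; the real tokenizer may differ. -/
def isIdentChar (c : Char) : Bool := c.isAlphanum || c == '_'

/-- Reads a maximal run of identifier characters.
    `acc` holds the characters read so far, in reverse order. -/
def readIdentChars : List Char → List Char → List Char × List Char
  | acc, [] => (acc.reverse, [])
  | acc, c :: cs =>
    if isIdentChar c then readIdentChars (c :: acc) cs
    else (acc.reverse, c :: cs)

/-- The end-of-input case holds by unfolding alone. -/
theorem readIdentChars_nil (acc : List Char) :
    readIdentChars acc [] = (acc.reverse, []) := rfl

#eval readIdentChars [] "ab)".toList
```

Conversion to `String` would then happen once, at the boundary, instead of inside the recursion.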
The committed state:
· Atomic Phase 3 witnesses via decide (5 theorems, kernel-rooted).
· 3 foundation tokenize lemmas (lparen/rparen/space).
· The full Phase 3 universal lemmas documented inline as
open work, with a proof sketch for both paths.
The token-level universal (already proven) plus closed-instance
decide tests cover the round-trip operationally. Adding Phase 3
proper is a focused multi-day refactor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds three foundation lemmas about `tokenize`:
· tokenize_lparen_cons : tokenize ('(' :: rest) = lparen :: tokenize rest
· tokenize_rparen_cons : tokenize (')' :: rest) = rparen :: tokenize rest
· tokenize_space_cons : tokenize (' ' :: rest) = tokenize rest
These are the trivial cases — pure unfolding of `tokenizeAux` on
single-char-token branches.
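For a sense of why these cases are trivial, here is a toy tokenizer with the same three single-character branches. `Tok`, `tokenize`, and the `other` constructor are hypothetical; the real `tokenizeAux` also handles identifiers, string literals, and numerals.

```lean
inductive Tok where
  | lparen | rparen
  | other (c : Char)

/-- Toy tokenizer: parens become tokens, spaces are skipped. -/
def tokenize : List Char → List Tok
  | [] => []
  | '(' :: rest => Tok.lparen :: tokenize rest
  | ')' :: rest => Tok.rparen :: tokenize rest
  | ' ' :: rest => tokenize rest
  | c :: rest => Tok.other c :: tokenize rest

/-- Analogue of tokenize_lparen_cons for the toy tokenizer. -/
theorem tokenize_lparen_cons (rest : List Char) :
    tokenize ('(' :: rest) = Tok.lparen :: tokenize rest := by
  simp [tokenize]
```

The proof is a single unfolding step because the head character selects the branch without inspecting `rest`.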
The full Phase 3 (`tokenize (toLeanSource v).toList = toTokens v`)
requires four further substantial lemmas:
· readIdent_split : reading an ident sequence followed by a
non-ident-rest char (or end of input) yields
exactly the accumulated string.
· readStrLit_split : reading an escapeStrLit-encoded body until
the closing `"` recovers the original string.
· tokenize_app_clean: tokenize distributes over a concatenation
where the prefix ends "cleanly" (rparen,
whitespace, or strLit close).
· tokenize_render_X: induction over each meta-mirror type using
the above plus IH on sub-values.
Each is multi-page Lean reasoning about `String`/`List Char`/
`readIdent`/`readStrLit` distribution. The proof sketches are
documented inline.
The token-level universal (already proven) plus closed-instance
`decide` tests cover the round-trip operationally. Adding the
Phase 3 universal would let us state
∀ t, fromLeanSource? (toLeanSource t) = some t
without any closed-instance restrictions.
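How the Phase 3 lemma would combine with the existing token-level result can be stated abstractly. In this sketch all functions are parameters, so nothing is assumed about the real definitions beyond the two hypotheses; the names `h_tok` and `h_phase3` are made up for the example.

```lean
/-- If the token-level round-trip holds and tokenize ∘ render = toTokens,
    the source-level round-trip follows by rewriting. -/
theorem source_round_trip {α Tok : Type}
    (render : α → List Char) (toTokens : α → List Tok)
    (tokenize : List Char → List Tok) (fromTokens? : List Tok → Option α)
    (h_tok : ∀ a, fromTokens? (toTokens a) = some a)
    (h_phase3 : ∀ a, tokenize (render a) = toTokens a) :
    ∀ a, fromTokens? (tokenize (render a)) = some a := by
  intro a
  rw [h_phase3, h_tok]
```

This is why only the `tokenize`-level lemmas remain open: the final composition step is a two-rewrite argument.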
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Proves four ∀-quantified, structurally-inductive round-trip theorems:
· nameFromTokens?_round_trip : ∀ n, fromTokens? (toTokens n) = some n
· classifierFromTokens?_round_trip: ∀ φ, fromTokens? φ.toTokens = some φ
· cTermFromTokens?_round_trip : ∀ t, fromTokens? t.toTokens = some t
· artifactFromTokens?_round_trip : ∀ a, a.supported → fromTokens? a.toTokens = some a
These are the canonical universal round-trips — the parser
inverts the canonical token form on every meta-mirror value.
No `decide`, no `native_decide`, no kernel-depth tricks: pure
structural induction on the meta-mirror type, with sufficient
fuel guaranteed by the per-type length-vs-depth lemma.
Implementation:
(1) Fixed a latent double-paren bug in `nameToLeanSource`: dropped
extra parens around recursive sub-name calls (consistent
with classifier/cterm renderers). Before the fix, 3-level deep
names like `eq0.i` (FaceFormula.eq0 encoding) silently failed
to round-trip, and no test exercised them. Added a
`set_option maxRecDepth 4000 in theorem … decide`-based
regression test.
(2) Refactored the parsers to be fuel-based. `parseName?Aux`,
`parseClassifier?Aux`, `parseMetaCTerm?Aux`, `parseArtifact?Aux`
each take a Nat fuel that decreases on every recursive call,
so they're total without `partial`. Top-level wrappers pass
`tokens.length + 1`, always sufficient.
(3) Added canonical token forms `nameToTokens`,
`MetaClassifier.toTokens`, `MetaCTerm.toTokens`,
`MetaArtifact.toTokens` — direct value→[Token] mappings,
parallel to the renderers but at the token level.
(4) Phase 2 (parser correctness on toTokens): four mutual-induction
theorems, one per meta-mirror type. Each proves
`parser?Aux fuel (value.toTokens ++ rest) = some (value, rest)`
when fuel ≥ value.depth.
(5) Length-vs-depth lemmas: nameToTokens_length_bound,
classifierToTokens_length_bound, cTermToTokens_length_bound.
Each by induction.
(6) Token-level universal round-trip theorems: composed from (4)
and (5) by setting rest = []. These are the headline results.
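The composition in (6) can be sketched abstractly. The parser, `toTokens`, and `depth` are parameters here, and the exact shape of the length bound (`depth a ≤ (toTokens a).length + 1`) is an assumption made for the example; the lemmas in the source may state it differently.

```lean
/-- Token-level round-trip from parser correctness (4) and a
    length-vs-depth bound (5), instantiated at rest = []. -/
theorem round_trip_of_aux {α Tok : Type}
    (toTokens : α → List Tok) (depth : α → Nat)
    (parseAux : Nat → List Tok → Option (α × List Tok))
    (h_correct : ∀ fuel a rest, depth a ≤ fuel →
      parseAux fuel (toTokens a ++ rest) = some (a, rest))
    (h_bound : ∀ a, depth a ≤ (toTokens a).length + 1) :
    ∀ a, parseAux ((toTokens a).length + 1) (toTokens a) = some (a, []) := by
  intro a
  have h := h_correct ((toTokens a).length + 1) a [] (h_bound a)
  rw [List.append_nil] at h
  exact h
```

Stating correctness with an arbitrary `rest` is what makes the inductive proof go through; the universal theorem only needs the special case.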
Phase 3 (tokenize ∘ render = toTokens, the String-level extension)
is documented but unproven; it requires substantial String/List
reasoning. The kernel-rooted decide tests for closed instances
(MetaCTerm.empty, sym, app, etc.) provide empirical evidence.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Refactor MetaParse.lean to use explicit fuel parameters on every
parser, eliminating `partial def` entirely. Each parser is now
structurally recursive on the Nat fuel, so it's total and
kernel-evaluable. Top-level wrappers pass `tokens.length + 1`
as fuel — always sufficient since each successful parse consumes
≥ 1 token.
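A minimal sketch of the wrapper pattern, assuming the wrappers still demand full input consumption. `runParser` and `countAux` are hypothetical; `countAux` is a toy parser over Bool tokens, not one of the MetaParse parsers.

```lean
/-- Hypothetical top-level wrapper: supply `tokens.length + 1` fuel
    and accept only if every token was consumed. -/
def runParser {Tok α : Type}
    (p : Nat → List Tok → Option (α × List Tok)) (toks : List Tok) : Option α :=
  match p (toks.length + 1) toks with
  | some (a, []) => some a
  | _ => none

/-- Toy parser: counts `true` tokens up to a terminating `false`.
    Structurally recursive on the fuel, so no `partial` is needed. -/
def countAux : Nat → List Bool → Option (Nat × List Bool)
  | 0, _ => none                              -- out of fuel
  | _ + 1, [] => none                         -- missing terminator
  | _ + 1, false :: rest => some (0, rest)
  | fuel + 1, true :: rest =>
    match countAux fuel rest with
    | some (n, rest') => some (n + 1, rest')
    | none => none

example : runParser countAux [true, true, false] = some 2 := rfl
```

Since the definition reduces in the kernel, closed instances like the one above can be checked by `rfl` or `decide` rather than `native_decide`.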
Move `escapeStrLit` to Foundation/Meta.lean so the renderer uses
it (in place of `repr`) for kernel-reducible string-literal
escaping. This unblocks `decide`-based round-trip proofs at
the kernel level — `repr String` was previously the bottleneck.
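A kernel-reducible escaper could be as simple as the following sketch. `escapeChars` is a hypothetical name, and the escaping rules shown (only `"` and `\`) are an assumption; the real `escapeStrLit` may cover more characters and works on `String`.

```lean
/-- Escapes double quotes and backslashes; plain structural recursion,
    so it unfolds step by step during kernel reduction. -/
def escapeChars : List Char → List Char
  | [] => []
  | c :: cs =>
    if c = '"' then '\\' :: '"' :: escapeChars cs
    else if c = '\\' then '\\' :: '\\' :: escapeChars cs
    else c :: escapeChars cs

#eval escapeChars ['a', '"']
```

The design point is that every step is an ordinary `if` on decidable Char equality, with nothing opaque for the kernel to get stuck on.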
Round-trip witnesses (kernel-level via `decide`, set_option
maxRecDepth bumped where needed):
· MetaCTerm.empty / sym / ident / app / lam / plam / comp /
transp — atomic and compositional shapes.
· MetaClassifier.always / never / meet / atDecl.
· MetaArtifact.empty (rendering-equivalence for the .declAt-
containing inductive).
· A nested .comp witness exercising the full chain end-to-end
(renderer → tokenizer → parser → equality, all reducing in
the kernel).
The universal ∀-theorem is not yet proven by structural induction;
for now, a kernel-rooted witness per constructor covers the surface.
The existing `native_decide` round-trip tests in Infoductor/
Test.lean remain as additional empirical coverage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A hand-written tokenizer + recursive-descent parser that reads
the Lean source emitted by `toLeanSource` and reconstructs the
original meta-mirror value. Foundation/MetaParse.lean: 300
lines, faithful to the renderer's exact format.
Components:
· Token type (parens, ident chains, string literals, num literals).
· `tokenize : List Char → List Token` (a `partial def`; the input
shrinks only through helper readers, so Lean cannot see a
structural decrease).
· `parseName?`, `parseClassifier?`, `parseMetaCTerm?`,
`parseArtifact?` — recursive-descent, return Option (T × tail).
· `MetaCTerm.fromLeanSource?` / `MetaClassifier.fromLeanSource?`
/ `MetaArtifact.fromLeanSource?` — top-level wrappers
demanding full input consumption.
Foundation/Meta.lean: derive `DecidableEq` on `MetaCTerm` (its
field types — Lean.Name, String, MetaClassifier — all have
DecidableEq). Switch FaceFormula.eq0/eq1 encoding from
`Name.appendAfter "_eq_0"` (string suffix) to a 2-component
`Name.mkStr (.mkSimple "eq0") i.name` form so reflection
round-trips by rfl with no string-suffix munging.
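The two-component encoding can be illustrated as follows. `encodeEq0` and `decodeEq0?` are hypothetical helpers, and taking the dimension as a plain `String` is an assumption; the source only shows the `Name.mkStr (.mkSimple "eq0") i.name` form.

```lean
import Lean
open Lean

/-- Builds the name `eq0.<dim>` as two structured components. -/
def encodeEq0 (dim : String) : Name :=
  Name.mkStr (.mkSimple "eq0") dim

/-- Recovers the dimension by pattern matching on the components,
    with no string-suffix parsing. -/
def decodeEq0? : Name → Option String
  | .str (.str .anonymous "eq0") dim => some dim
  | _ => none

#eval decodeEq0? (encodeEq0 "i")  -- some "i"
```

With a suffix such as `"_eq_0"`, decoding would need to split a string; here the decoder is a single constructor match.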
Foundation/MetaParse.lean: parsers are `partial def` because
the recursive calls land on output tails of helper readers,
which Lean can't see as structurally smaller without auxiliary
"consumes input" lemmas. Kernel-reducible round-trip is
deferred — `native_decide`-based tests in Infoductor/Test.lean
witness round-trip operationally for every meta-mirror arm.
Tests: 11 native_decide examples covering empty/ident/sym/app/
lam/comp/transp on MetaCTerm, always/meet on MetaClassifier,
empty/cterm on MetaArtifact (artifact uses rendering-equivalence
since Lean.Syntax in `.declAt` lacks DecidableEq).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>