cubical-transport-hott-lean4/ZIGZAG_PORT.md
Maximus Gorog c2e3ecb3e3
Some checks are pending
Lean Action CI / build (push) Waiting to run
Initial commit: topolei — cubical-transport HoTT in Lean 4 + Rust FFI
Implements the cells-spec vision: a computation space that preserves
auditability, correctness, interactivity. Phase 1 (Lean kernel +
naga-IR Rust backend) is closed; foundation hypothesis stack
(Selection H1+H2, Subobject H3, Trace H5, Obs.Ctx C2, Cubical.Trace)
landed.

Highlights:
- Cubical-HoTT syntax + value/eval/readback in Lean
- naga-IR pipeline (no GLSL string crosses FFI; 17/17 probes pass)
- Honesty audit: every non-transport (sealed cells, vertex shader,
  Y-flip, presentation conventions) is documented as such
- Polymorphic Trace α as free monoid; Cubical.Trace gives
  CTerm → Trace CTerm by structural fold (homomorphism = definition)
- Selection as Huet zipper; Subobject as Boolean algebra over WCell
- All theorems proven; the proof IS the implementation

See STATUS.md for the resume guide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 20:40:45 -06:00

13 KiB
Raw Blame History

Zigzag Engine — Lean Port Plan

Parallel to TRANSPORT_PLAN.md (which guided Phase 1's cubical formalisation). This document plans the step-by-step port of the zigzag engine from its Rust reference implementation into Lean 4, as the combinatorial n-category backend for topolei's cell layer.


Decision (2026-04-22)

The zigzag engine will be reimplemented in Lean 4. The existing Rust implementation at zigzag-engine/zigzag-engine/ is reference material only — a structural template for the port, not a dependency. This matches topolei's Lean-as-host discipline and maximises the project's medium-term goal of Lean-native reasoning (see below).

The only Rust component in topolei is the cubical evaluator FFI backend — the module that discharges the Phase 1 axioms (step, eval, vApp, vPApp, vTransp, etc.) via @[extern] + @[implemented_by]. That one Rust crate exists to extend Lean 4 with computational cubical-transport HoTT; it is topolei's reason for any FFI whatsoever. Everything else lives in Lean.

Why Lean, not the Rust backend (Option A over Option B)

Axiomatic Rust backends give axiom normalise_idempotent : ... — statements we can use but cannot prove. Porting to Lean makes each such statement a theorem the kernel checks. The project's medium-term goal is to maximise what can be reasoned about in Lean; that forecloses FFI-backed hiding of mathematical content.

The Rust implementation was itself AI-assisted and is not hand-polished artefact we are throwing away; it is a test-oracle-quality scaffold that the Lean port can match against.

Reference materials

  • zigzag-engine/papers/zigzag-normalisation-2205.08952.pdf — Heidemann-Reutter-Vicary, LICS 2022. The algorithm (Construction 17) and correctness (Proposition 19).
  • zigzag-engine/papers/layout-algorithm-2305.06938.pdf — Tataru-Vicary, 2024. Explosion / k-points / layout.
  • zigzag-engine/papers/homotopy-io-2402.13179.pdf — Corbyn et al., FSCD 2024. The parent proof assistant.
  • zigzag-engine/zigzag-engine/src/*.rs — reference Rust implementation (11,003 lines across 13 modules).
  • zigzag-engine/zigzag-engine-spec/zigzag-engine-spec.md — original spec for the reference implementation.

Port destination

All Lean modules land under Topolei/Zigzag/:

Lean module Rust reference Approx. size
Zigzag/Monotone.lean src/monotone.rs (325 LOC) ~150 LOC + proofs
Zigzag/Core.lean src/zigzag.rs (291 LOC) ~150 LOC
Zigzag/Diagram.lean src/diagram.rs (1484 LOC) ~600 LOC
Zigzag/Signature.lean src/signature.rs (200 LOC) ~100 LOC
Zigzag/Degeneracy.lean src/degeneracy.rs (1284 LOC) ~500 LOC + proofs
Zigzag/Normalise.lean src/normalise.rs (849 LOC) ~400 LOC + proofs
Zigzag/Typecheck.lean src/typecheck.rs (597 LOC) ~250 LOC
Zigzag/Explosion.lean src/explosion.rs (1414 LOC) ~500 LOC
Zigzag/Tests.lean tests/ + examples/ ~200 LOC #eval regressions

Intentionally not ported:

  • src/import.rs (1491 LOC) — homotopy.io interop, not needed.
  • src/discover.rs (1981 LOC) — search over diagrams; decide later.
  • src/python.rs (716 LOC) — Python bindings, not needed.
  • src/layout.rs (320 LOC) — geometric layout; deferred to Phase 4 Interaction, may be a Lean module or may defer to a Rust @[implemented_by] optimisation later.

Core port size: roughly 2,5003,000 Lean lines to match the algorithmic core of the Rust implementation, with proofs adding perhaps another 1,0002,000 depending on how far the correctness theorems are pursued (Step 9 below).


Steps

Step 1 — Zigzag/Monotone.lean (foundation)

Content:

  • MonotoneMap (n m : Nat) structure with entries : List (Fin m) and is_monotone proof.
  • Composition, identity, face maps dᵢ.
  • Wraith's R equivalence Δ₊ → Δ₌ᵒᵖ as a pure function.
  • Preimage computation.

Proofs:

  • MonotoneMap.compose_assoc
  • MonotoneMap.wraith_r_involution (R² = id on the nose)
  • MonotoneMap.face_map_image — face maps omit exactly one element.

Deliverable test: #eval the inspect_half_braid example's monotone substructure; compare to the Rust engine's output on the same input.


Step 2 — Zigzag/Core.lean (zigzags themselves)

Content:

  • Zigzag (T : Type) : Type{ regular : Vec T, singular : Vec T, forward : Vec Morphism, backward : Vec Morphism }.
  • ZigzagMap — singular map fˢ : n → m in Δ₊ with regular/singular slices and the commutativity conditions as Prop-valued fields.
  • Composition of zigzag maps.

Proofs:

  • ZigzagMap.compose_respects_commutativity — composition preserves the commutativity predicates.
  • Zigzag.identity_is_length_zero — the identity zigzag is trivially a zero-length zigzag.

Step 3 — Zigzag/Diagram.lean (the main data structure)

Content:

  • Mutual inductive Diagram / DiagramN / Cospan / Rewrite / Cone. Same shape as Rust's pub enum Diagram { Diagram0(Generator) | DiagramN(DiagramN) }.
  • Smart constructors: Diagram.identity, Diagram.attach, Diagram.compose.
  • Dimension predicate Diagram.dimension : Diagram → Nat.
  • Source/target extractors.
  • Regular-slice / singular-slice computation (mirrors DiagramN.regular_slice in the Rust).

Proofs:

  • Diagram.dimension_of_attach — attaching a generator of dimension k produces a diagram of dimension k.
  • Diagram.source_source / Diagram.target_target boundary consistency.
  • Globularity predicate + decidability.

Step 4 — Zigzag/Signature.lean

Content:

  • Generator structure: id, dimension, invertibility.
  • GeneratorData with source / target diagrams.
  • Signature as a list/hashmap of GeneratorData.
  • Well-formedness: every GeneratorData's source / target dimension = generator.dimension - 1.

Proofs:

  • Signature.well_formed is decidable.

Step 5 — Zigzag/Degeneracy.lean

Content:

  • Predicates: IsSimpleDegeneracy, IsParallelDegeneracy, IsDegeneracy (closure under composition of the first two).
  • Constructors: Degeneracy.insert_identity_cospan (the basic simple degeneracy).
  • Factorisation: every degeneracy factors as simple ∘ parallel (Lemma 7 from the paper).

Proofs:

  • Degeneracy.isomorphisms_are_degeneracies (Lemma 6).
  • Degeneracy.factorisation_unique_up_to_iso (Lemma 7).
  • Degeneracy.is_monomorphism (Lemma 8).
  • Degeneracy.left_cancellation (Lemma 10).
  • Degeneracy.finite_subobjects (Lemma 14).

This step is where the bulk of the Phase 1-style proof work sits. Some of these may start as axiom and promote to theorem as the infrastructure firms up — same pattern as how T1/T2/C1/C2 worked in Cubical/TransportLaws.lean.


Step 6 — Zigzag/Pullback.lean (Proposition 13)

Content:

  • Pullback construction for degeneracy maps.
  • pullback_is_degeneracy statement.

Proofs:

  • Degeneracy.pullback_exists — the construction terminates.
  • Degeneracy.pullback_legs_are_degeneracies (Proposition 13).

Note: Proposition 13 is the most algorithmically dense piece. OK to start as an axiom. Pattern to follow: state the axiom, write the construction as a partial def with test-case regression, upgrade to a total def + theorem when the proof is clearer. Exactly how step/eval were handled in Phase 1.


Step 7 — Zigzag/Normalise.lean (Construction 17)

Content:

  • NormalisationResult structure: normal_form, degeneracy, factorisations.
  • Sink structure for relative normalisation.
  • normalise : Diagram → NormalisationResult (absolute case).
  • normalise_sink : Sink → NormalisationResult (relative case).
  • Termination: structural recursion on Diagram.dimension.

Proofs:

  • normalise_idempotent — the headline result (easy, structural).
  • normalise_preserves_globularity (Proposition 23).
  • normalise_correctness (Proposition 19) — relative to the axiom set from Steps 56.

Test: port Rust unit tests from tests/integration_tests.rs to Lean #eval regressions (Eckmann-Hilton dim 3, syllepsis dim 5, Figure 6 dim 4 essential-identity).


Step 8 — Zigzag/Typecheck.lean

Content:

  • SingularContent extraction.
  • Piece decomposition.
  • type_check : Diagram → Signature → Except TypeError Unit.

Proofs:

  • type_check_sound — if type_check D Σ returns ok, then all pieces' normalisations are in Σ.

Step 9 — Zigzag/Tests.lean (regression battery)

Port the Rust test cases:

  • tests/integration_tests.rs — normalisation regressions.
  • tests/nontrivial_constructors.rs — diagram construction.
  • examples/inspect_half_braid.rs — the Eckmann-Hilton braiding.
  • examples/render_braiding.rs — braiding as a 3-diagram.
  • examples/scaffold_analysis.rs / trace_scaffold.rs / trace_merge.rs — reduction traces.

Each becomes a Lean #eval or example proving the expected output. These are the correctness gradient that catches porting errors early.


Step 10 — Cell/Zigzag.lean (bridge to cubical core)

Content:

  • Translator: CType → Option Diagram for the dimensions where both make sense (0-cells, 1-cells, 2-cells-via-Path).
  • Translator: Diagram → Option CType for the inverse.
  • Identity / compose / whisker operations at the Cell layer that dispatch to the right backend: cubical for low dimensions (where univalence matters), zigzag for higher dimensions (where combinatorial composition dominates).

This is where the two formalisms meet. Cubical Phase 1 gives us equivalence and transport; Zigzag gives us higher-composition and normalisation; Cell/ combines them.


Explosion and layout (post-core)

Steps 11+ (not critical for the n-category reasoning goal):

  • Zigzag/Explosion.lean — k-points, poset structure. Lean-native port of src/explosion.rs (1414 Rust LOC).
  • Zigzag/Layout.lean — constraint system. May remain pure Lean or may defer the QP solver to a Rust @[implemented_by] optimisation. Decided later once performance requirements are known.

Axiom discipline (from Phase 1 experience)

The port follows the same axiom-first pattern established in Phase 1:

  1. First pass: data structures pure; algorithm as def (maybe partial def); key correctness statements as axiom.
  2. Second pass: tighten partial def into def with structural termination; promote axioms to theorems where the proof is mechanical.
  3. Third pass: prove the hard theorems (Proposition 13, correctness of Construction 17 relative to the degeneracy axioms).

At every stage, axioms are formal specs for what the algorithm must satisfy, not blanket assumptions. The Rust reference implementation tests each axiom via example; the Lean port must match those tests.


Relationship to topolei's other phases

  • Phase 1 (Cubical Core) — complete. Not touched by this port.
  • Phase 2 (Cells) — the zigzag Lean port is a prerequisite for cells-spec §6.3 "Higher Cells". Cell/Basic.lean can begin using cubical-only semantics for 0/1/2-cells; higher cells use the zigzag backend from Step 10 above.
  • Rust FFI (cubical evaluator) — independent work stream. The zigzag port does not depend on it. When the Rust FFI lands, it backs the cubical axioms; the zigzag Lean code becomes a consumer of the now-computational cubical layer.
  • Numerical layer (NUMERICAL.md) — independent. Schemes can use zigzag diagrams as structural source / target types once the port is complete.

Sizing

  • Steps 14: ~2 weeks (data structures + basic algorithms).
  • Steps 57: ~34 weeks (degeneracy + normalisation + proofs; this is the heart of the port).
  • Step 8: ~3 days.
  • Step 9: ~1 week (regression battery).
  • Step 10: ~1 week (bridge).
  • Total: 68 weeks for the core port with correctness theorems.

Comparable to Phase 1 in size; same single-developer feasibility.


Success criteria

The port is complete when:

  1. All regression tests from zigzag-engine/tests/ pass as Lean #evals or examples.
  2. normalise_idempotent is a theorem (not an axiom).
  3. The Eckmann-Hilton (dim 3), syllepsis (dim 5), and Figure 6 (dim 4) examples type-check and normalise to their documented results.
  4. Cell/Zigzag.lean (Step 10) compiles and bridges to the cubical core without circular dependencies.
  5. STATUS.md can claim "Phase 2 Higher-Cell backend: closed in Lean" with zero Rust dependency (beyond the cubical-evaluator FFI, which is a separate work stream).

At that point, topolei has a Lean-native combinatorial n-category engine, provably correct where proven, with the Rust zigzag engine demoted from reference to archive.