cubical-transport-hott-lean4/docs/KERNEL_BOUNDARY.md
Maximus Gorog d9f952fa6c
Some checks are pending
Lean Action CI / build (push) Waiting to run
Reorganize: move non-README docs into docs/ subfolder
Standard convention: README.md at root, everything else in docs/.

Engine docs/: FFI_DESIGN, FFI_COMPLETENESS, KERNEL_BOUNDARY, NUMERICAL,
PHASE1_HISTORY, ZIGZAG_PORT.  README.md links updated to docs/<name>.
Cross-repo reference in NUMERICAL.md (to topolei's STATUS.md) now
includes the relative path `../topolei/docs/STATUS.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:57:10 -06:00

28 KiB
Raw Permalink Blame History

KERNEL_BOUNDARY.md — What topolei can achieve in unmodified Lean 4, and what needs kernel work

Stage 4 / post-Phase B artifact, 2026-04-24. The precise boundary between what a deep cubical-HoTT embedding gives us within Lean's current kernel, and what would require a kernel-level intervention. This document is the scope contract: if a feature is below the boundary, topolei can deliver it as a library; if above, it requires upstream Lean work and therefore lies outside topolei's scope.


0. TL;DR

  • Achievable now (the ceiling in unmodified Lean 4): a full embedded cubical-transport HoTT with kernel-speed reductions (via FFI Rust). Univalence, Glue types, Σ/Π/Path, composition, transport, and a partial bridge to Eq on discrete types.
  • Requires kernel work (the floor of the restricted region): making Lean's native Eq cubical, HITs in the native inductive framework, universe-level univalence for Type, and any definitional (not just propositional) computation of cubical axioms.
  • The two regions are disjoint. Nothing topolei does can cross this boundary without forking Lean. Nothing in the restricted region is required for the cells-spec vision.

1. The fundamental obstruction

Lean 4's kernel has three design choices that, together, force UIP (Uniqueness of Identity Proofs):

  1. propext — propositional extensionality. (P ↔ Q) → P = Q for propositions.
  2. Quot.sound — quotient soundness. Relation-equal elements become equal after quotienting.
  3. Proof irrelevance in Prop — all inhabitants of P : Prop are definitionally equal.

By a classical argument (Hofmann-Streicher-style), propext + Quot.sound imply that Lean's Eq on any type is a proposition: ∀ x y : α, ∀ h₁ h₂ : x = y, h₁ = h₂ — i.e. UIP. Univalence contradicts UIP. If ua : A ≃ B → A = B in Lean's Eq world, then applying UIP to ua e vs ua e' for two genuinely distinct equivalences would collapse them, which collapses the equivalences themselves.

Therefore: Lean's native Eq cannot be cubical without rejecting at least one of propext / Quot.sound / Prop-irrelevance. Rejecting any of these is a fundamental change to the kernel — it's not a matter of adding features but of removing guaranteed properties.

This obstruction is what makes HoTT-in-Lean necessarily a deep embedding (as in topolei, cubical-Agda's approach) rather than a native feature (as in Cubical Agda or cubical tt proper).


2. Achievable in unmodified Lean 4 (topolei's delivery region)

Everything below can be done — and most is already done — using nothing but Lean's existing primitives plus FFI.

2.1 Deep-embedded cubical calculus

  • CType / CTerm / CVal / CNeu as mutual inductive Lean declarations. Done (Cubical/Syntax.lean, Value.lean).
  • Face lattice (FaceFormula), de Morgan interval (DimExpr), substitution, dim-absent analysis. Done.
  • Typing judgment HasType : Ctx → CTerm → CType → Prop. Done.

2.2 Computational content via FFI

  • Rust implementation of the evaluator, linked via @[extern] + @[implemented_by]. Done (Phase A+B). At runtime, any call to eval routes through Rust at native speed; kernel-level proof reasoning goes through axioms (unchanged).
  • The FFI boundary is trust-based: Lean's kernel does not verify that the Rust symbol matches the corresponding axiom. We pay for this with audit discipline (FFI_COMPLETENESS.md).

2.3 Reductions as axioms + NbE theorems

  • Every CCHM reduction rule stated as a Lean axiom in Eval.lean / Transport.lean / Readback.lean / Glue.lean / CompLaws.lean.
  • Kernel-level proofs use these axioms via rw / simp / explicit rewriting.
  • NbE theorems promote step-level claims from axioms to theorems (Stream B #2b/c/d, Stage 2.3).

2.4 Univalence as a theorem, not an axiom

  • transp_ua in Soundness.lean is a proved theorem, not posited. Exhibits the forward map of an equivalence as the computational content of Glue transport. Done.

2.5 Σ, Π, Path, Glue as type-formers

  • All four have CType constructors, eval arms, readback arms, and typing rules. Done (Stage 1.2 for Σ).

2.6 Partial bridges to Lean's Eq

  • Always-available direction: Eq a b → Path (embed α) (embed a) (embed b) (Bridge 1 in the cubical-to-eq companion). Forgets UIP information — proof.
  • Discrete-type direction: Path (embed α) (embed a) (embed b) → a = b when [DecidableEq α]. Proves by decidable-equality case analysis.
  • Prop-level bijection: P : Prop → paths and Eq coincide trivially via proof irrelevance.

These bridges let users transport discrete-math lemmas (Nat, Bool, decidable structures, Mathlib-style hypotheses) into cubical proofs and vice versa. An Eq ↔ Path bridge module is planned (not yet written). A different cubical bridge — Topolei/Cubical/Trace.lean in the sibling topolei repo — already exists, but it lifts CTerms into the polymorphic Trace for provenance, not for Eq interop.

2.7 Higher cells via Zigzag Lean port

  • Cell/Higher.lean (2-cells / n-cells) built atop a Lean-native port of the n-category engine at zigzag-engine/. Pure Lean, no kernel changes. Scheduled 68 weeks, not blocked by anything in this document.

2.8 Cell layer / reactive / GPU / shader / interaction

  • Cells-spec §6§10 entirely achievable within this boundary. Cells are Cell Lean structures whose line field is a CType function; composition is path concatenation; interaction is via IO + GPU FFI. No kernel changes needed.

2.9 Summary of "below the boundary"

topolei can deliver: cells-spec Phases 16 in their entirety, the full Rust evaluator, all of cells-spec §14's theorems, the zigzag higher-cell engine, cell graphs, rendering, direct manipulation, GPU shaders. The cells vision does not require kernel work.


3. Above the boundary (requires kernel work)

These items are desirable but cannot be delivered by any library (topolei or otherwise) within Lean 4's current kernel. Each item lists (a) what it is, (b) why it's blocked, (c) what upstream Lean change would unblock it, (d) its topolei consumer (if any).

3.1 Native Eq as cubical identity

What: (a = b) : Type being cubical — non-trivial paths, univalence, proof-relevant equality at the base-types level.

Blocked by: propext + Quot.sound + Prop-irrelevance forcing UIP (§1).

Upstream change: Lean would need to offer a separate universe in which one of the three axioms above is dropped — e.g. a HType universe or --cubical-native mode where Eq is the path type and propext no longer applies. This is a fundamental kernel extension, comparable in scope to adding Cubical Agda's --cubical flag to Agda.

topolei consumer: none critical. Deep embedding already provides cubical equality as Path within CType. Eq / Path interop handled via Bridge.lean.

3.2 Definitional computation of cubical axioms

What: Lean's kernel would compute transp (refl) x to x automatically (definitional equality), instead of requiring explicit rw [eval_transp_top, …] in proofs.

Blocked by: axioms are opaque to Lean's definitional-equality check. The kernel's reduction algorithm only unfolds defs and recursors, not axioms.

Upstream change: Lean would need a user-extensible definitional equality — e.g., a @[compute] attribute that lets axioms participate in reduction. No such mechanism exists. Adding one would require validating that the claimed reductions are confluent and type-preserving — a significant metatheoretic burden the kernel does not currently bear.

topolei workaround: all reductions are propositional; proofs must invoke them via rw / simp / dsimp. We ship a simp set for the common reductions (future work, Phase C+).

topolei consumer: UX. Propositional reductions make cubical proofs verbose. Worth-paying friction for the current scope.

3.3 HITs (Higher Inductive Types) in Lean's native inductive framework

What: inductive S¹ where | base : S¹ | loop : base = base — defining types by point and path constructors directly in Lean's inductive command.

Blocked by: Lean's inductive-type elaborator only generates point constructors; there is no mechanism for path constructors. Lean's recursor generator can't handle the higher-dimensional case-analysis needed for HITs.

Upstream change: A --hit extension to the inductive elaborator that (a) accepts path constructors, (b) generates higher-dimensional recursors with proof obligations, (c) ensures consistency with the remaining kernel axioms. Cubical Agda supports this; Lean does not.

topolei workaround: HITs can be encoded as Glue types + bespoke recursion principles inside CType. Awkward but expressible.

topolei consumer: Color/Space.lean (modular color spaces), future Cell/Higher.lean extensions — but all existing uses encode HITs explicitly via Glue.

3.4 Universe-level univalence for Type

What: Type itself being a univalent universe — A ≃ B → A = B at the Lean-universe level.

Blocked by: Lean's Type satisfies UIP per §1. A = B at the universe level is trivial (only rfl).

Upstream change: same as §3.1 — a separate cubical universe.

topolei has it: CType.univ is a universe inside the embedded cubical calculus, and transp_ua proves univalence there. Users reason about universe paths via uaLine inside the calculus; they don't reason about them on Lean's Type.

3.5 Proof-relevant Prop

What: Prop inhabitants being distinguishable — useful for modelling 2-cells, homotopies, truncation levels.

Blocked by: Prop has proof irrelevance by kernel design.

Upstream change: Remove proof irrelevance from Prop or add a second "relevant" Prop-like universe. Would break a large amount of existing Lean code that assumes Prop-irrelevance.

topolei has it: relevance lives in CTerm / CVal, which are regular Type-level Lean data with no proof irrelevance. Homotopies are CTerms inhabiting .path (.path …) types. Cells-spec's 2-cells and n-cells live entirely in this layer.

3.6 Native transport along Lean Eq

What: Given h : a = b : α in Lean's Eq and P : α → Type, the coercion h ▸ x : P b should be cubical transport on Path- structured types.

Blocked by: Lean's is Eq.ndrec — it computes only when h is rfl, and its one-step unfolding is UIP-dependent.

Upstream change: same kernel work as §3.1.

topolei workaround: users transport in the cubical world using .transp CTerm, then bridge back to Lean via the Bridge module for discrete types. Universe-level / HoTT-relevant transports live exclusively in the embedded calculus.

3.7 Cubical-aware simp/rewrite tactics

What: tactics that understand Path-equality and rewrite cubically — e.g., "rewrite this occurrence using a path, respecting path composition and transport."

Blocked by: tactic framework sees only Lean's Eq. Cubical paths are opaque terms it cannot pattern-match.

Upstream change: either adding cubical-aware tactic primitives (extending the tactic API) or — more cleanly — making cubical paths first-class via §3.1.

topolei workaround: users invoke cubical rewrites by explicit rw [eval_...] / rw [readback_...] calls. Less automation, more bookkeeping. Planned mitigation: a cubical_simp tactic as a pure-Lean extension in Phase 6 (cells-spec §19).

3.8 Kernel-verified FFI

What: Lean's kernel would check that the Rust symbol topolei_cubical_eval actually implements the eval_* axioms before trusting it.

Blocked by: @[extern] is trust-based by design. Verifying a Rust function against a Lean axiom is outside Lean's typechecker.

Upstream change: A certified-compilation story — Lean would either check via translation validation or require a certificate. Nontrivial research topic.

topolei workaround: differential fuzz testing (Phase D) + FFI_COMPLETENESS.md audit + review discipline. The Rust side matches the Lean spec by human inspection; we trust what we build.

3.9 Summary of "above the boundary"

These are long-term targets that would require forking Lean's kernel or submitting substantive upstream proposals. None of them is a blocker for cells-spec's core vision. Each is a UX enhancement on top of what topolei already delivers.


4. Dependency graph of kernel items

                ┌─────────────────────────────────────┐
                │  3.1 Native Eq as cubical identity  │
                │  (requires dropping UIP)            │
                └─────────────────────────────────────┘
                     ▲           ▲         ▲
                     │           │         │
                     │           │         │
        ┌────────────┘           │         └────────────┐
        │                        │                      │
        │                        │                      │
┌───────────────┐    ┌────────────────────┐    ┌─────────────────────┐
│ 3.4 Type-     │    │ 3.6 Native Eq ▸    │    │ 3.7 Cubical-aware    │
│ level univ.   │    │ as cubical transp │    │ tactics              │
└───────────────┘    └────────────────────┘    └─────────────────────┘

┌──────────────────────────────┐
│ 3.2 Definitional computation │  ← independent of 3.1, but benefits
│ of cubical axioms            │     users regardless of cubicality
└──────────────────────────────┘
                ▲
                │ enables
                ▼
┌──────────────────────────────┐
│ 3.7 Cubical-aware tactics    │  ← also benefits from 3.1
└──────────────────────────────┘

┌─────────────────────────────────────────┐
│ 3.3 HITs in native inductive framework  │  ← mostly independent;
│                                          │     better with 3.1 but
└─────────────────────────────────────────┘     not required

┌──────────────────────────────┐
│ 3.5 Proof-relevant Prop      │  ← independent; breaking change
└──────────────────────────────┘

┌──────────────────────────────┐
│ 3.8 Kernel-verified FFI      │  ← independent; cross-cutting
└──────────────────────────────┘

3.1 is the keystone: if Lean adopted a cubical universe, most other items follow or simplify. Until then, everything above is deliverable only via deep embedding (§2).


5. Boundary diagnostic checklist

Use this to quickly determine whether a feature request lies below or above the boundary:

Question If yes, below boundary (topolei can deliver)
Is the feature expressible in CTerm / CType? yes
Does the feature only require propositional (not definitional) reduction? yes
Does the feature live inside CType.univ (the embedded universe)? yes
Does the feature avoid depending on propext being false? yes
Does the feature avoid requiring Lean to pattern-match on paths? yes
Question If yes, above boundary (requires kernel work)
Does the feature make Lean's native Eq non-trivial on a type where DecidableEq fails? blocked
Does the feature require automatic (non-tactic) cubical reduction inside kernel type-checking? blocked
Does the feature need HIT path constructors in the inductive keyword? blocked
Does the feature need 2-cells or higher in Prop? blocked
Does the feature require Lean's typechecker to verify Rust FFI implementations? blocked

6. What topolei commits to delivering

Scope: everything below the boundary (§2), full Rust evaluator, all cells-spec phases 16. Bridges (§2.6) provide partial interop with Eq for discrete types.

Scope: NOT delivering. None of §3. Attempts to deliver them within topolei would either (a) require forking Lean (out of scope), (b) require abandoning soundness (not acceptable), or (c) duplicate effort that upstream Lean is better positioned to handle.

If upstream Lean ever adds a cubical universe (§3.1), topolei's deep embedding can be retired incrementally in favor of the native layer — most of the code (the axiom set, the reductions, the Bridge module) would become theorems rather than primitives. The Rust backend's role shrinks but doesn't disappear: performance wins for cubical reductions remain valuable regardless of kernel native support.


7. Recommendations for upstream Lean

(Not a project dependency — topolei ships without these. Recorded here as a standing list of things worth advocating for in Lean 4 developer discussions.)

  1. Highest value: Definitional-equality extensibility (§3.2). Unblocks cubical UX without breaking any existing invariants. Could be implemented as a restricted rewrite engine attached to kernel reduction.
  2. Medium value: Opt-in universe with reduced axioms (§3.1 via a CubeType universe sibling to Type). Would let cubical HoTT be a native Lean feature for users who opt in, while preserving Type for classical work.
  3. Research-grade: Kernel-verified FFI (§3.8). Out of scope for any near-term Lean release but valuable for production trust.
  4. Lower priority: HITs (§3.3), proof-relevant Prop (§3.5). Both handleable via embedding; native support is UX, not necessity.


8. PR readiness checklist — self-assessment for contributing to Lean 4

A standing checklist for when a kernel-level PR feels within reach. Not a gatekeeping list; a learning roadmap. Work through the items in rough order. The goal is not to pass every item before writing a single word, but to know where you are on each axis, and which axis is the current bottleneck.

8.1 Foundational type-theory literacy (the common base)

  • Can state the inference rules for intensional MLTT (Π, Σ, Eq, inductive types) without reference.
  • Can explain the difference between propositional and definitional equality, and what "reduction" means at each level.
  • Can sketch the Hofmann-Streicher argument that UIP follows from propext + Quot.sound + Prop-irrelevance. Bonus: identify which of the three is actually load-bearing in different presentations.
  • Can explain what makes a universe univalent and why univalence contradicts UIP.
  • Can describe higher inductive types via Cubical Agda's path constructors, and state their recursor correctness condition.
  • Can read a CCHM paper passage (Cohen-Coquand-Huber- Mörtberg 2016) and explain a transp or hcomp rule in your own words.
  • Know what canonicity and normalization mean, and why they are the metatheorems any new reduction rule must preserve.

8.2 Lean 4 kernel literacy

  • Understand how Lean's kernel checks definitional equality: reduction, type-directed elaboration, η, proof irrelevance.
  • Read the Lean 4 kernel source (src/kernel/) at least once: TypeChecker, Reducer, whnf, isDefEq. Know where axioms are stored and how they interact with reduction.
  • Understand Lean's universe hierarchy, cumulativity, and how Prop differs from Type u.
  • Understand the inductive elaborator: Construction, recursor generation, positivity check, nested/mutual inductives.
  • Know how @[extern] and @[implemented_by] work at the compiler level vs. the kernel level (only the former ships Rust/C code; the latter routes at codegen).
  • Have written at least one non-trivial Lean 4 kernel-adjacent piece of code — e.g., a custom elaborator, a tactic, a user-defined reducer extension, or a metaprogram that inspects the reduction behaviour.
  • Read existing Lean 4 RFCs and closed PRs on kernel changes to see the discussion style and rigor expected.

8.3 Item-specific preparation

For each kernel item in §3 that you want to propose, the cliff face gets smaller as you check these boxes.

8.3.1 For §3.2 — definitional-equality extensibility (highest-value)

  • Can articulate the difference between axioms and reduction rules at the kernel level. Why can't axioms currently participate in whnf?
  • Know how rewrite systems interact with type checking — confluence, termination, subject reduction. Read one paper on SMT-style or equality-saturation rewriting inside a proof assistant.
  • Can explain why a naive @[compute] on axioms breaks soundness (axioms can be inconsistent pairs; rewriting would let the kernel derive False).
  • Have a design for how the proposer verifies confluence — e.g., external tool, proof obligation at axiom-declaration time, trusted metatheorem.
  • Know Agda's REWRITE feature and can explain its tradeoffs (convenient, but trust-based; PR would likely propose something analogous or more conservative).
  • Have a concrete use case from topolei — e.g., "after this change, rw [eval_transp_top] becomes rfl in these contexts" — to motivate the proposal with lived experience.

8.3.2 For §3.1 — cubical universe (keystone, hardest)

  • Fluent in CCHM §§26 — interval, face lattice, reductions, Glue, univalence. Can present each from memory.
  • Understand at least one normalization proof for cubical type theory (e.g., Sterling-Angiuli NbE for CTT).
  • Can describe how Cubical Agda integrates cubical type theory into Agda's existing universe hierarchy. What did Agda's kernel change? What stayed the same?
  • Understand the sized-types / inductive-inductive / indexed family interaction with cubical — e.g., why W-types need special treatment.
  • Have a sketch of what coexistence looks like: two universes (Type + CubeType), with what coercions and what restrictions? Does propext apply in CubeType? How does Prop fit in?
  • Read at least one Lean kernel maintainer's statement (Leo's blog, the Lean Zulip, etc.) on their position on HoTT / cubical extensions. Know the political climate.
  • Implementation experience at the kernel level — have you extended some proof assistant's kernel (even a toy one, even just a new type former)? If not, this is the longest-running prerequisite.

8.3.3 For §3.3 — HITs in native inductive framework

  • Can define in Cubical Agda and step through its recursor and computation rules.
  • Understand how HIT elimination differs from ordinary inductive elimination — two-level: point rule + path rule.
  • Know the consistency proof sketch for a fragment (e.g., S¹ with propext-free intensional MLTT).
  • Have a design for how Lean's inductive elaborator would accept path constructors — syntax, positivity, recursor generation.
  • Understand the Lean Community HoTT library (if it exists) as a source of use cases and prior art.

8.3.4 For §3.5 — proof-relevant Prop

  • Can articulate why Prop-irrelevance is load-bearing in Lean 4's practice (tactic automation, decide, Mathlib patterns). This is the PR that most risks being rejected on pragmatic grounds.
  • Understand the squashing / truncation modality — propositional truncation ‖·‖ in HoTT — and whether providing it solves the problem without removing Prop irrelevance.
  • Have concrete use cases that require non-trivial Prop — topolei's 2-cells are handled via CTerm (Type-level), so this is low-priority for this project specifically.

8.3.5 For §3.8 — kernel-verified FFI

  • Have read about CompCert, CakeML, and Alectryon- style certified compilation.
  • Can distinguish translation validation from certified compilation and articulate the tradeoffs.
  • Understand Lean's @[extern] trust model — what exactly is trusted vs. what is verified?
  • Have a concrete attack surface analysis — how could a malicious Rust impl subvert kernel soundness?
  • This is research-grade; realistically a multi-year effort. Not a near-term PR candidate.

8.4 Drafting the PR — when the research has landed

Once you're 80%+ through the relevant §8.3 subsection:

  • Draft a Lean 4 RFC (Request For Comments) in the community repo BEFORE writing code. PRs without RFC discussion rarely land for kernel-level proposals.
  • Include a motivating example that uses topolei. Real lived pain from rw [eval_...] chains beats abstract appeals.
  • Include prior art — Cubical Agda, Coq's SProp, OCaml's extension mechanisms — with accurate descriptions of what they do and don't do.
  • Include a consistency argument — if not a full proof, then at least the proof sketch and citation of the relevant papers.
  • Include a "what existing code breaks?" section. For any PR touching propext / Quot.sound / Prop-irrelevance, the answer is "a lot" — honesty here is respected.
  • Include a backwards-compatibility story: opt-in flag, new universe, deprecation path, whatever applies.
  • Draft a minimal kernel patch — even a 50-line proof of concept helps reviewers understand the proposal concretely.
  • Be prepared for a long feedback cycle (months, not days) and for the outcome to be "close without merge" on the first attempt. Kernel PRs are rarely merged on first try.

8.5 Post-submission — learning from the process

  • Respond to all reviewer questions with precise, cited answers. Vague answers are the main reason kernel PRs stall.
  • Be willing to narrow the proposal: reviewer asks for scope reduction is common and not a rejection.
  • If the PR is closed, extract the specific objections and address them in a future revision. Kernel-level changes often take 23 attempts over 12 years.
  • Whether accepted or not, write up what you learned — the proposal itself becomes documentation, and future proposers benefit.

8.6 Current readiness estimate

Self-assess roughly at the start of each research phase:

Sub-section Your current estimated % ready
§8.1 Foundational type theory ____%
§8.2 Lean 4 kernel literacy ____%
§8.3.1 Def-eq extensibility (§3.2) ____%
§8.3.2 Cubical universe (§3.1) ____%
§8.3.3 HITs (§3.3) ____%
§8.3.4 Proof-relevant Prop (§3.5) ____%
§8.3.5 Verified FFI (§3.8) ____%

Update this table as research advances. When any §8.3 row hits 80%+ AND §8.1 and §8.2 are both above 70%, consider drafting the RFC for that specific item. The gating principle: research depth first, PR second.


End of KERNEL_BOUNDARY.md. Update this document when upstream Lean moves any item between regions, when topolei delivery plans change, or as your readiness estimates in §8.6 advance.