cubical-transport-hott-lean4/docs/FFI_DESIGN.md
Maximus Gorog d9f952fa6c
Some checks are pending
Lean Action CI / build (push) Waiting to run
Reorganize: move non-README docs into docs/ subfolder
Standard convention: README.md at root, everything else in docs/.

Engine docs/: FFI_DESIGN, FFI_COMPLETENESS, KERNEL_BOUNDARY, NUMERICAL,
PHASE1_HISTORY, ZIGZAG_PORT.  README.md links updated to docs/<name>.
Cross-repo reference in NUMERICAL.md (to topolei's STATUS.md) now
includes the relative path `../topolei/docs/STATUS.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:57:10 -06:00

403 lines
14 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# FFI_DESIGN.md — Lean ↔ Rust C ABI Contract for Cubical HoTT
*Stage 4.5 deliverable (2026-04-23). The design contract Rust
implements to extend Lean 4 with kernel-level cubical-transport HoTT
reduction.*
---
## 1. Purpose and framing
Topolei's Rust backend is **not** a separate application that happens
to use Lean as a library. It is a Lean kernel *extension*: Rust
functions are linked via `@[extern]` / `@[implemented_by]` so that
cubical terms reduce *in Lean's kernel* at native speed.
The contract is bidirectional:
- **Lean-side spec:** every reduction rule Rust must implement is
stated as an axiom (the Rust-discharge axioms catalogued in
STATUS.md's "Axiom provenance spectrum"). The axioms are the
Boolean-valued contract; Rust satisfies them structurally.
- **Rust-side implementation:** Rust functions expose a C ABI that
Lean's compiled code calls for reduction. The FFI boundary passes
`lean_object*` pointers and returns `lean_obj_res` results, following
Lean 4's runtime conventions.
---
## 2. FFI surface — the functions Rust implements
| Lean function | Rust symbol | Arity | Notes |
|----------------------------|------------------------------|-------|-------|
| `eval` | `topolei_cubical_eval` | `CEnv → CTerm → CVal` | Main evaluator entry |
| `vApp` | `topolei_cubical_vapp` | `CVal → CVal → CVal` | Function application |
| `vPApp` | `topolei_cubical_vpapp` | `CVal → DimExpr → CVal` | Dimension application |
| `vTransp` | `topolei_cubical_vtransp` | `DimVar → CType → FaceFormula → CVal → CVal` | Value-level transport |
| `vHCompValue` | `topolei_cubical_vhcomp` | `CType → FaceFormula → CVal → CVal → CVal` | Homogeneous composition |
| `vCompAtTerm` | `topolei_cubical_vcomp_term` | `CEnv → DimVar → CType → FaceFormula → CTerm → CTerm → CVal` | Heterogeneous comp at term |
| `vCompNAtTerm` | `topolei_cubical_vcompn_term`| `CEnv → DimVar → CType → List (FaceFormula × CTerm) → CTerm → CVal` | Multi-clause comp |
| `vFst` | `topolei_cubical_vfst` | `CVal → CVal` | First projection (Σ) |
| `vSnd` | `topolei_cubical_vsnd` | `CVal → CVal` | Second projection (Σ) |
| `readback` | `topolei_cubical_readback` | `CVal → CTerm` | NbE reification |
| `readbackNeu` | `topolei_cubical_readback_neu` | `CNeu → CTerm` | Neutral reification |
| `CTerm.step` (optional) | `topolei_cubical_step` | `CTerm → CTerm` | One-step reduction; see §8 |
---
## 3. Inductive object layout
All Lean inductives cross the boundary as `lean_object*` pointers with
constructor tags accessed via `lean_ctor_get_tag` / `lean_ctor_get_uint8`.
Rust recurses on the inductive shape using these accessors — **no
bindgen-generated structs**, because Lean's runtime layout encodes
small-value optimisations that bindgen cannot capture.
### 3.1 `DimVar`
```
structure DimVar where
name : String
```
Single-field structure. At the C ABI: `lean_ctor_get(obj, 0)` returns
the `String` object. Use `lean_string_cstr` to get a borrowed `*const
c_char`. Hygiene convention: `$`-prefixed names are reserved; Rust
should treat them as fresh-generated binders and avoid shadowing.
### 3.2 `DimExpr`
6 constructors, tagged 05:
| Tag | Constructor | Fields |
|-----|-------------|--------|
| 0 | `.zero` | — |
| 1 | `.one` | — |
| 2 | `.var` | 1 × DimVar |
| 3 | `.inv` | 1 × DimExpr |
| 4 | `.meet` | 2 × DimExpr |
| 5 | `.join` | 2 × DimExpr |
Rust evaluates / normalises via recursive descent. `DimExpr.normalize`
(Stage 4.1, Interval.lean) defines the canonical form; Rust's
implementation must produce `normalize`-equivalent output.
### 3.3 `FaceFormula`
6 constructors:
| Tag | Constructor | Fields |
|-----|-------------|--------|
| 0 | `.bot` | — |
| 1 | `.top` | — |
| 2 | `.eq0` | 1 × DimVar |
| 3 | `.eq1` | 1 × DimVar |
| 4 | `.meet` | 2 × FaceFormula |
| 5 | `.join` | 2 × FaceFormula |
`FaceFormula.normalize` (Stage 4.3, Face.lean) is the canonical form
Rust must produce after `.substDim` for deterministic face dispatch.
### 3.4 `CType`
5 constructors:
| Tag | Constructor | Fields |
|-----|-------------|--------|
| 0 | `.univ` | — |
| 1 | `.pi` | 2 × CType |
| 2 | `.path` | 1 × CType, 2 × CTerm |
| 3 | `.sigma` | 2 × CType |
| 4 | `.glue` | 1 × FaceFormula, 1 × CType, 5 × CTerm, 1 × CType |
### 3.5 `CTerm`
12 constructors (11 after `.step` is derived). See `Cubical/Syntax.lean`
for field order. Constructor tags are assigned in definition order:
`var`, `lam`, `app`, `plam`, `papp`, `transp`, `comp`, `compN`, `glueIn`,
`unglue`, `pair`, `fst`, `snd`.
### 3.6 `CVal` / `CNeu`
Mutual inductives. See `Cubical/Value.lean` for field order. Notable
constructors that carry `CEnv` (closures):
- `vlam`, `vplam` — λ-/dim-closures.
- `vCompFun`, `vPathTransp` — hold their capture env for later
reduction.
The env is itself an inductive cons-list of `(String, CVal)`; Rust
traverses via `lookup_nil` / `lookup_cons` tag dispatch.
---
## 4. Memory ownership conventions
Follows Lean 4's runtime:
- **`b_lean_obj_arg`** (borrowed): Rust reads the pointer but does not
manage its refcount. Default for read-only arguments (CTerm, CType,
FaceFormula, etc.).
- **`lean_obj_arg`** (owned): Rust takes ownership; must release via
`lean_dec` unless returned. Used for arguments that will be
consumed / transformed and returned as the result.
- **`lean_obj_res`** (result): returned value; caller takes ownership.
**Convention for this project:**
- All read-only inputs (CTerm, CType, FaceFormula, CEnv, DimVar, etc.)
are passed as `b_lean_obj_arg`.
- Results are `lean_obj_res` — Rust allocates via `lean_alloc_ctor`
etc. and returns.
- The `Cubical/FFI.lean` module (Stage 4.6) declares every function
with `@&` annotations matching these conventions.
---
## 5. Recursion strategy
Rust implements each FFI entry point natively via tag dispatch.
Examples:
```rust
// topolei_cubical_eval: eval : CEnv → CTerm → CVal
#[no_mangle]
pub extern "C" fn topolei_cubical_eval(
env: *const c_void, // b_lean_obj_arg — borrowed CEnv
t: *const c_void, // b_lean_obj_arg — borrowed CTerm
) -> *mut c_void { // lean_obj_res — owned CVal
let tag = unsafe { lean_ctor_get_tag(t) };
match tag {
0 => eval_var(env, t),
1 => eval_lam(env, t),
2 => eval_app(env, t),
// ... 11 arms total for CTerm
_ => unreachable!(),
}
}
```
**Callback to Lean?** Generally no — Rust implements everything
natively to avoid the FFI round-trip cost. The one exception: helper
functions like `CTerm.substDim`, `FaceFormula.substDim` may be called
back if Rust chooses to keep them Lean-implemented. Recommendation:
Rust implements its own `subst_dim` matching the Lean spec — avoids
FFI overhead and keeps the evaluator self-contained.
---
## 6. Error and divergence conventions
Lean's `partial def` for `eval` produces marker neutrals like
`.vneu (.nvar "<vApp: vplam applied as function>")` for ill-typed
states. Rust mirrors this: on unreachable arms (type errors in input
that should have been caught by the typechecker), allocate a `vneu`
neutral whose `nvar` name carries the error tag. Do **not** panic
or abort — Lean's kernel tolerates stuck forms.
For genuine divergence (the partial def's unrestricted recursion), Rust
uses a configurable fuel counter. Default: 1 000 000 reductions. On
exhaustion: return a `vneu (.nvar "<divergence>")` neutral and log.
Rust never crashes Lean.
---
## 7. `@[extern]` vs `@[implemented_by]`
Lean 4 offers two ways to bind a Lean name to a Rust implementation:
- **`@[extern "symbol"] opaque`** — the Lean declaration has *no* Lean-
native body; the symbol resolves at link time. Best for functions
that have no meaningful fallback (IO primitives, hardware access).
- **`@[implemented_by rustImpl] def original := leanFallback`** — the
Lean has a slow fallback; Rust runs at runtime. Best when the fallback
is useful for testing without Rust linked.
**Our choice: `@[implemented_by]` for all cubical-core functions.**
Rationale:
- The existing `partial def` bodies in `Eval.lean` / `Transport.lean` /
`Readback.lean` are correct Lean implementations (slow but usable for
unit tests and axiom-consistency checks). Keeping them as fallbacks
lets us run the project without the Rust crate (in a fresh clone or
before Rust builds).
- `@[implemented_by]` preserves the ability to test Lean-only behaviour
(e.g. in `EvalTest.lean`) without CI needing Cargo.
- The Rust-implemented path is taken when the compiled artifact links
to the Rust library; otherwise the fallback runs.
**`@[extern]` for types:**
- `GPUContext`, `ShaderHandle` in `GPU/Spec.lean` remain opaque extern
types — no Lean-native representation. Pattern already in use.
---
## 8. `CTerm.step` design choice
Per Stage 4.4 decision, `CTerm.step` remains opaque at the Lean spec
level. Rust has two valid implementation strategies:
### Option A — native step
```rust
#[no_mangle]
pub extern "C" fn topolei_cubical_step(t: *const c_void) -> *mut c_void {
// Direct pattern-match on CTerm constructor; emit CCHM comp-shaped
// body for path-typed transp-of-plam; identity-ish otherwise.
...
}
```
Pros: single FFI call per step; simplest Rust surface. Cons: duplicates
logic with eval + readback.
### Option B — derived step
```rust
#[no_mangle]
pub extern "C" fn topolei_cubical_step(t: *const c_void) -> *mut c_void {
let empty_env = lean_alloc_ctor(0, 0, 0); // CEnv.nil
let v = topolei_cubical_eval(empty_env, t);
topolei_cubical_readback(v)
}
```
Pros: no duplicated logic; step is a pure composition. Cons: extra
allocation overhead per step (negligible in practice).
**Recommendation: Option B.** It's semantically equivalent to the
`readback ∘ eval .nil` definition implied by the step↔eval bridge, and
reduces Rust maintenance surface. The T4 axiom
(`transp_plam_is_plam_path`) is satisfied by Option B via the
`.vPathTransp``.plam j (.compN …)` readback arm.
---
## 9. Lake + Cargo integration
### 9.1 Directory layout
```
topolei/
├── lakefile.toml -- Lean build
├── native/ -- top-level native code dir
│ ├── CMakeLists.txt -- existing C++ GPU/Canvas (glfw, gl, glew)
│ ├── include/topolei/... -- C++ headers
│ ├── src/canvas.cpp -- existing C++ GPU/Canvas
│ └── cubical/ -- Rust cubical backend (THIS CRATE)
│ ├── Cargo.toml -- no_std, dual native+wasm targets
│ ├── README.md
│ ├── include/
│ │ └── topolei_cubical.h -- C header, ABI v1
│ ├── src/
│ │ ├── lib.rs -- #![no_std], panic_handler
│ │ ├── lean_runtime.rs -- hand-rolled Lean C ABI
│ │ ├── ffi.rs -- #[no_mangle] exports
│ │ └── (future Phase BC) -- cubical/, subst/ modules
│ └── target/ -- Cargo output (gitignored)
└── Topolei/
└── Cubical/
├── FFI.lean -- @[extern] + @[implemented_by] (Stage 4.6)
└── (existing files)
```
The Rust crate is isolated at `native/cubical/` — separate build
system from the C++ GPU/Canvas legacy code at `native/` top-level.
Each has its own toolchain (Cargo vs CMake) and own artifacts.
### 9.2 `native/Cargo.toml`
```toml
[package]
name = "topolei-native"
version = "0.1.0"
edition = "2024"
[lib]
crate-type = ["staticlib"] # For Lean to link against.
[dependencies]
lean-sys = { version = "0.1", features = ["runtime"] }
[profile.release]
lto = true
codegen-units = 1
```
### 9.3 `lakefile.toml` additions
```toml
[[external_lib]]
name = "topolei_cubical_native"
extra_link_args = ["-L", "native/target/release", "-ltopolei_native"]
# Build step: invoke `cargo build --release -p topolei-native`
# before Lean compilation; see lakefile script pre-build hook.
```
(Exact lakefile syntax depends on the Lean toolchain version; verify
at implementation time.)
### 9.4 C header
```c
// native/include/topolei_cubical.h
#pragma once
#include <lean/lean.h>
lean_obj_res topolei_cubical_eval(b_lean_obj_arg env, b_lean_obj_arg t);
lean_obj_res topolei_cubical_vapp(b_lean_obj_arg f, b_lean_obj_arg arg);
lean_obj_res topolei_cubical_vpapp(b_lean_obj_arg v, b_lean_obj_arg r);
// ... one declaration per FFI function
```
The header is generated from `Cubical/FFI.lean`'s `@[extern]` /
`@[implemented_by]` declarations (Stage 4.6 produces the Lean side; the
header is hand-maintained in parallel).
---
## 10. Testing strategy
- **Unit tests (Lean-only fallback):** `EvalTest.lean` already tests
the Lean `partial def` implementations. With Rust linked, these
tests run Rust's impl transparently via `@[implemented_by]`.
- **Axiom consistency:** for every Rust-discharge axiom, a CI step
verifies the Rust implementation matches by running representative
term reductions and comparing against the Lean fallback's output.
- **Differential fuzzing:** random CTerm generation → compare
`rust_eval env t` vs `lean_fallback_eval env t` for thousands of
terms. Divergence is a bug.
---
## 11. Versioning and ABI stability
The FFI contract is versioned:
```
// C header
#define TOPOLEI_FFI_ABI_VERSION 1
```
Any change to a function signature, inductive layout, or ownership
convention increments this number. Rust and Lean both embed the
version; link-time mismatch is a hard error.
---
## 12. Non-goals
- The Rust backend does **not** implement any of:
- Zigzag n-category engine (Lean port; `ZIGZAG_PORT.md`).
- Numerical layer (separate stream, `NUMERICAL.md`).
- Cell-layer semantics (Phase 2, pure Lean).
- Boundary / Security modal model (Phase 5b, pure Lean).
- These are all Lean-native; adding Rust to them would violate the
"one Rust component" discipline (STATUS.md architecture section).
---
*End of FFI_DESIGN.md. Companion document: `FFI_COMPLETENESS.md`
(Stage 4.7) — the per-function axiom-completeness audit.*