cubical-transport-hott-lean4/native/cubical/WASM.md
Maximus Gorog c2e3ecb3e3
Some checks are pending
Lean Action CI / build (push) Waiting to run
Initial commit: topolei — cubical-transport HoTT in Lean 4 + Rust FFI
Implements the cells-spec vision: a computation space that preserves
auditability, correctness, interactivity. Phase 1 (Lean kernel +
naga-IR Rust backend) is closed; foundation hypothesis stack
(Selection H1+H2, Subobject H3, Trace H5, Obs.Ctx C2, Cubical.Trace)
landed.

Highlights:
- Cubical-HoTT syntax + value/eval/readback in Lean
- naga-IR pipeline (no GLSL string crosses FFI; 17/17 probes pass)
- Honesty audit: every non-transport (sealed cells, vertex shader,
  Y-flip, presentation conventions) is documented as such
- Polymorphic Trace α as free monoid; Cubical.Trace gives
  CTerm → Trace CTerm by structural fold (homomorphism = definition)
- Selection as Huet zipper; Subobject as Boolean algebra over WCell
- All theorems proven; the proof IS the implementation

See STATUS.md for the resume guide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 20:40:45 -06:00

225 lines
9.2 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# WASM.md — Wasm Target Integration
*Phase D.wasm deliverable, 2026-04-24. Describes the
`topolei-cubical` crate's wasm32 build, its import/export contract, and
how to embed it in a Lean-wasm composite artifact.*
---
## 1. Build
```sh
cd native/cubical
cargo build --release --target wasm32-unknown-unknown
# Output: target/wasm32-unknown-unknown/release/topolei_cubical.wasm
```
Requirements: the `wasm32-unknown-unknown` Rust target installed
(`rustup target add wasm32-unknown-unknown`). No emscripten, no C
toolchain needed for the Rust side — `build.rs` skips the `shim.c`
compilation on wasm targets.
Current artifact size: ~20KB (`no_std`, full cubical evaluator).
---
## 2. Wasm ABI contract
The compiled module has these wasm-level imports and exports.
### 2.1 Exports (14 + memory + globals)
All 14 cubical-evaluator entry points plus wasm runtime bookkeeping:
| Export | Role |
|-----------------------------------------|-----------------------------------------------|
| `topolei_cubical_eval` | Main evaluator: `(CEnv, CTerm) → CVal` |
| `topolei_cubical_vapp` | Function application |
| `topolei_cubical_vpapp` | Dim application |
| `topolei_cubical_vtransp` | Value-level transport |
| `topolei_cubical_vhcomp` | Homogeneous composition |
| `topolei_cubical_vcomp_term` | Hetero comp at term level |
| `topolei_cubical_vcompn_term` | Multi-clause comp |
| `topolei_cubical_vfst` | Σ first projection |
| `topolei_cubical_vsnd` | Σ second projection |
| `topolei_cubical_readback` | NbE reification |
| `topolei_cubical_readback_neu` | Neutral reification |
| `topolei_cubical_step` | `readback ∘ eval .nil` (Option B) |
| `topolei_cubical_dimexpr_normalize` | DimExpr canonicalisation |
| `topolei_cubical_face_normalize` | FaceFormula canonicalisation |
| `memory` | Linear memory (shared with host) |
| `__heap_base` | Address of first allocatable byte |
| `__data_end` | Address of end of static data |
Every `topolei_cubical_*` export takes `lean_object*` (i.e. `i32` wasm
pointers) and returns `lean_object*`. The Lean object layout on
wasm32 is:
```
struct lean_object {
uint32_t m_rc; // offset 0, 4 bytes
uint16_t m_cs_sz; // offset 4, 2 bytes
uint8_t m_other; // offset 6, 1 byte (num_objs for ctors)
uint8_t m_tag; // offset 7, 1 byte
// fields follow at offset 8; each pointer is 4 bytes on wasm32.
};
```
Scalar encoding: `(value << 1) | 1`. Heap objects have low bit 0.
### 2.2 Imports (in `env` namespace)
Eight shim functions the host must provide. These are Lean's
inline-only helpers (`lean_obj_tag`, `lean_ctor_get`, etc.) exposed as
real wasm imports:
| Import | Lean equivalent | Purpose |
|-----------------------------|----------------------|------------------------------------------------------|
| `topolei_shim_obj_tag` | `lean_obj_tag` | `(o: i32) → i32` — constructor tag |
| `topolei_shim_ctor_get` | `lean_ctor_get` | `(o, i) → field pointer` |
| `topolei_shim_ctor_set` | `lean_ctor_set` | `(o, i, v) → ()` |
| `topolei_shim_alloc_ctor` | `lean_alloc_ctor` | `(tag, num_objs, scalar_sz) → new ctor` |
| `topolei_shim_inc` | `lean_inc` | refcount +1 (no-op on scalars) |
| `topolei_shim_dec` | `lean_dec` | refcount 1 |
| `topolei_shim_string_eq` | `lean_string_eq` | `(a, b) → bool` |
| `topolei_shim_mk_string` | `lean_mk_string` | Allocate Lean String from null-terminated C string |
A Lean-wasm runtime must satisfy these imports. Two paths:
1. **Natively via C compilation.** Compile `shim.c` alongside the
Lean runtime in the wasm build (emscripten, say, or clang
wasm32). The shim delegates to Lean's inlines; Lean's own runtime
provides `lean_alloc_small`, `lean_dec_ref_cold`, etc.
2. **Via runtime binding.** When instantiating the wasm module in a
host (Node/browser/wasmtime), provide JS/Rust implementations of
the 8 shim functions that interact with the shared wasm memory.
The harness at `test/wasm_harness.mjs` uses approach (2) with a
minimal JS implementation — enough to demonstrate the module works.
---
## 3. Integration with a Lean-wasm composite artifact
The cells-spec vision is a Lean 4 wasm binary with topolei cubical
support, running in a browser or wasmtime. The composite artifact
comprises:
```
lean-wasm-composite/
├── lean-kernel.wasm # Lean's own runtime compiled to wasm
├── topolei-cubical.wasm # This crate
├── glue.js / glue.wasm # Import/export wiring
└── main.{js,html} # Entry point
```
Two approaches to wiring them:
### 3.1 Host-side multiplexing (recommended)
Instantiate both modules in the host (JS or another wasm runtime).
The host holds a single `WebAssembly.Memory` instance and shares it
between modules. Imports on `topolei-cubical` resolve to functions
exported by `lean-kernel.wasm` (the shim wrappers that call Lean's
inlines).
**Advantages:** no linker required, modules are swappable, each can
be updated independently. Clean separation of concerns.
**Pseudo-code:**
```js
const lean = await WebAssembly.instantiate(leanKernelWasm, {});
const topolei = await WebAssembly.instantiate(topoleiCubicalWasm, {
env: {
topolei_shim_obj_tag: lean.instance.exports.topolei_shim_obj_tag,
topolei_shim_ctor_get: lean.instance.exports.topolei_shim_ctor_get,
// ... etc
},
});
// Share memory: either use lean's memory (topolei declares memory import)
// or bridge via copy routines.
```
### 3.2 Static link via wasm-ld
Link both modules at build time using `wasm-ld --whole-archive` or
similar. Produces a single wasm file. Simpler deployment but less
flexible for updates.
**Requirements:** both modules must be compiled with compatible
calling conventions and share a single memory segment. wasm-ld
handles this automatically for reactors (non-main wasm).
---
## 4. Verification harness
`test/wasm_harness.mjs` is a standalone Node.js script that loads the
wasm module and runs 10 correctness tests:
- **Scalar passthrough** (4 tests) — `.zero`, `.one`, `.bot`, `.top`
are tagged-pointer scalars; their normalizers return identity.
- **Heap reductions** (5 tests) — `.inv .zero → .one`,
`.inv .one → .zero`, double-negation, face absorption
(`.meet .top .bot → .bot`), recursive `.meet` normalization.
- **Structural inspection** (1 test) — verifies a `.meet` of two
`.inv`s normalizes to a `.meet` of scalars (Rust traverses and
allocates correctly).
Run:
```sh
cd native/cubical
node test/wasm_harness.mjs
```
Expected: `── 10 / 10 tests passed ──`.
The harness provides its own minimal allocator (bump-pointer from
`__heap_base`) and Lean-layout helpers — it is NOT a full Lean
runtime, but it is sufficient to exercise every Rust code path that
doesn't require strings.
---
## 5. Limitations and TODOs
### 5.1 String handling
The harness does not implement `lean_string_eq` or `lean_mk_string`.
Tests that exercise env lookup (`eval .nil (.var "x")`) or string
messages would require a full String-object implementation in the
host. A real Lean-wasm runtime provides these automatically.
### 5.2 Memory growth
The wasm module declares an initial memory size. Complex
computations may exceed it; the host should monitor
`memory.buffer.byteLength` and call `memory.grow()` as needed.
`topolei_shim_alloc_ctor`'s allocator implementation decides when to
grow.
### 5.3 Pointer size (wasm32 vs wasm64)
Currently targeting `wasm32-unknown-unknown` — 32-bit pointers. A
future `wasm64` build would change the field offset calculations in
the harness (`FIELD_SIZE = 8` instead of `4`). The Rust code adapts
automatically via `core::mem::size_of::<*const _>()`, but consumers
of `WASM.md` should be aware.
### 5.4 Refcount correctness
The JS harness stubs `topolei_shim_inc` / `_dec` as no-ops (we don't
GC). This is fine for one-shot test runs but would leak in a
long-running application. A real host must implement refcounting.
---
## 6. Related documents
- `FFI_DESIGN.md` — full C ABI contract (includes wasm as a target).
- `FFI_COMPLETENESS.md` — per-function axiom-coverage audit.
- `KERNEL_BOUNDARY.md` — what this delivers vs. Lean kernel limits.
- `native/cubical/README.md` — crate-level build + development notes.