Compile signature changes from `Source.Expr -> Code` to
`Nat -> Source.Expr -> Code`, where the Nat is the absolute address
where the output is placed. For v0.4's constructs (intLit, boolLit,
add, sub, mul) the offset is just threaded — no compile output
depends on it. The infrastructure is now ready for control-flow
constructs (ifte) that need absolute jump addresses.
All 5 cases of `compile_correct` are updated for the offset-aware signature.
Each case now references `compile pre.size e` instead of `compile e`.
For binops, the second sub-expression is compiled at offset
`pre.size + (compile pre.size e1).size`.
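The offset threading described above can be sketched as follows. This is a minimal, self-contained illustration, not the repository's code: the `OffsetSketch` namespace and the cut-down `Expr` and `Instr` types are assumptions made for the example.

```lean
namespace OffsetSketch

-- Hypothetical cut-down source language (assumption, not the repo's Source.Expr).
inductive Expr where
  | intLit : Int → Expr
  | add : Expr → Expr → Expr

-- Hypothetical cut-down instruction set.
inductive Instr where
  | push : Int → Instr
  | add : Instr

abbrev Code := Array Instr

-- `off` is the absolute address at which the output will be placed.
-- For these constructs it is only threaded; no emitted instruction mentions it.
def compile (off : Nat) : Expr → Code
  | Expr.intLit n => #[Instr.push n]
  | Expr.add e1 e2 =>
    let c1 := compile off e1
    -- the second operand starts right after the code for the first
    let c2 := compile (off + c1.size) e2
    c1 ++ c2 ++ #[Instr.add]

#eval (compile 0 (Expr.add (Expr.intLit 5) (Expr.intLit 3))).size  -- 3

end OffsetSketch
```

Because the offset is unused here, `compile 0 e` and `compile k e` produce the same array in this sketch; the argument only starts to matter once jump targets are emitted.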
Generic helpers introduced:
  getElem_at_offset (pre X suf : Code) (i)
    : (pre ++ X ++ suf)[pre.size + i] = X[i]
  - bridges an outer-array lookup to an inner-array lookup at index i.
Compile_X_get_op now parameterized by offset.
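A concrete instance of the `getElem_at_offset` statement, using plain `Nat` arrays in place of `Code`. The arrays and the panicking `[i]!` lookup are choices made for this illustration only; the lemma itself is stated with a bounds-checked lookup.

```lean
def pre : Array Nat := #[10, 20]
def X   : Array Nat := #[30, 40]
def suf : Array Nat := #[50]

-- Outer lookup at pre.size + i, with i = 1 ...
#eval (pre ++ X ++ suf)[pre.size + 1]!  -- 40
-- ... agrees with the inner lookup at i.
#eval X[1]!                             -- 40
```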
Engineering notes for ifte (deferred to v0.5):
- The compile output for ifte is
cc ++ #[jmpFalse else_at] ++ ct ++ #[jmp end_at] ++ cf
where else_at, end_at are absolute addresses.
- Looking up the .jmpFalse or .jmp instructions inside this
nested-append structure requires 4-deep getElem_append_left/
getElem_append_right rewrites. Each rewrite has a bound proof
that must thread through dependent type matching. Lean's `rw`
fails with "motive not type correct" on the natural proof.
- The path forward is likely a different decomposition that
avoids dependent-rewrite chains - perhaps using Array.getElem?
(Option-form) which doesn't carry bounds, then bridging back to
Array.getElem at the leaf.
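For reference, the address arithmetic behind the ifte layout might look like the sketch below. Everything here is hypothetical, since ifte is deferred to v0.5: the `IfteSketch` namespace, the reduced `Expr` and `Instr` types, and the assumption that `jmp k` and `jmpFalse k` take absolute targets.

```lean
namespace IfteSketch

inductive Expr where
  | intLit : Int → Expr
  | boolLit : Bool → Expr
  | ifte : Expr → Expr → Expr → Expr

inductive Instr where
  | push : Int → Instr
  | pushB : Bool → Instr
  | jmp : Nat → Instr        -- assumed: absolute target
  | jmpFalse : Nat → Instr   -- assumed: absolute target

abbrev Code := Array Instr

def compile (off : Nat) : Expr → Code
  | Expr.intLit n => #[Instr.push n]
  | Expr.boolLit b => #[Instr.pushB b]
  | Expr.ifte c t f =>
    let cc := compile off c
    let tStart := off + cc.size + 1       -- skip the jmpFalse slot
    let ct := compile tStart t
    let elseAt := tStart + ct.size + 1    -- skip the jmp slot
    let cf := compile elseAt f
    let endAt := elseAt + cf.size         -- first address after the whole ifte
    cc ++ #[Instr.jmpFalse elseAt] ++ ct ++ #[Instr.jmp endAt] ++ cf

-- At offset 0 this yields: pushB true, jmpFalse 4, push 1, jmp 5, push 2
#eval (compile 0 (Expr.ifte (Expr.boolLit true) (Expr.intLit 1) (Expr.intLit 2))).size  -- 5

end IfteSketch
```

The sketch covers code generation only. The hard part noted above, looking up the jump instructions inside the nested appends during the correctness proof, is not addressed by it.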
Zero sorries / axioms / admits in v0.4. Full build clean (26 jobs).
tsm-lean
A Lean 4 formalization of a Tiny Stack Machine: the third concrete kernel, parallel to golang-lean (TGC) and octive-lean (TOC).
The substrate-level asymmetry: TGC and TOC have named variables, whereas TSM values live by position on a stack. This forces the cross-language abstraction to factor over the "operand-access mechanism" instead of baking name lookup into the framework. It maps directly onto real bytecode targets such as WebAssembly, JVM, CPython, .NET CIL, and SECD.
Build
lake build
Run the demo
lake exe tsm-lean
# → final stack: [TsmLean.Core.Value.vInt 16] ((5 + 3) * 2)
# → final pc: 5
Layout
| Path | What's there |
|---|---|
| `TsmLean/Core/Syntax.lean` | `Instr`, `Value`, `Code` |
| `TsmLean/Core/Semantics.lean` | `State`, `step` (function), `MultiStep` (relation) |
| `TsmLean/Core/Determinism.lean` | `step_deterministic`, `MultiStep.deterministic` |
| `TsmLean/Core/Eval.lean` | fuel-bounded `run` + `run_sound` |
| `TsmLean/Core/Types.lean` | `Ty`, `StackTy`, `HasTypeInstr` |
| `TsmLean/Core/TypeSoundness.lean` | `HasTypeV`, `HasTypeStack` |
| `TsmLean/Core/Preservation.lean` | `stack_preservation`, `progress` |
| `Main.lean` | demo program |
Theorems proven
- `step_deterministic`: single-step is functional.
- `MultiStep.deterministic`: multi-step paths to halted states are unique.
- `run_sound`: successful fuel-bounded execution corresponds to a `MultiStep` derivation ending at a halted state.
- `stack_preservation`: if the stack matches an instruction's input type and the step succeeds, the post-stack matches its output type.
- `progress`: well-typed non-halt instructions always make a step.
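To illustrate why single-step determinism comes cheaply when `step` is a function, here is a generic sketch. It is stated for an arbitrary `step : σ → Option σ`, not for the repository's actual `State` and `step`, and the theorem name is made up for this example.

```lean
-- If `step` is a function into `Option`, two successful results from the
-- same state must coincide.
theorem step_deterministic_sketch {σ : Type} (step : σ → Option σ)
    {s s₁ s₂ : σ} (h₁ : step s = some s₁) (h₂ : step s = some s₂) :
    s₁ = s₂ := by
  rw [h₁] at h₂            -- h₂ : some s₁ = some s₂
  exact Option.some.inj h₂
```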
The first three are the operational counterparts of the big-step theorems in TGC and TOC. The last two are the small-step type-soundness theorems (Pierce-style), which TGC/TOC's big-step formulations don't have direct analogues for.
Zero sorries, axioms, or admits.
Status
v0.1: per-instruction (local) preservation. Global program-level type soundness — the JVM-style stackmap that ensures all reachable PCs have consistent stack types — is the next layer up.
Instruction set
push n pushB b pop dup swap
add sub mul eq lt
jmp k jmpFalse k halt
Thirteen instructions. No call / ret yet; direct jumps only. Adding function-call frames is a future extension.
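A minimal sketch of how a subset of these instructions could be interpreted, assuming a `step` function into `Option` and a fuel-bounded `run`. The definitions below are written for this example and may differ from those in `TsmLean/Core`; only `push`, `add`, `mul`, `jmp`, and `halt` are covered.

```lean
namespace TsmSketch

inductive Value where
  | vInt : Int → Value
  | vBool : Bool → Value

inductive Instr where
  | push : Int → Instr
  | add | mul
  | jmp : Nat → Instr
  | halt

structure State where
  pc : Nat
  stack : List Value

-- One machine step; `none` means halted or stuck
-- (halt, out-of-range pc, or a stack of the wrong shape).
def step (code : Array Instr) (s : State) : Option State :=
  match code[s.pc]?, s.stack with
  | some (Instr.push n), st =>
    some (State.mk (s.pc + 1) (Value.vInt n :: st))
  | some Instr.add, Value.vInt b :: Value.vInt a :: st =>
    some (State.mk (s.pc + 1) (Value.vInt (a + b) :: st))
  | some Instr.mul, Value.vInt b :: Value.vInt a :: st =>
    some (State.mk (s.pc + 1) (Value.vInt (a * b) :: st))
  | some (Instr.jmp k), st => some (State.mk k st)
  | _, _ => none

-- Fuel-bounded execution: stop when fuel runs out or no step is possible.
def run (code : Array Instr) : Nat → State → State
  | 0, s => s
  | fuel + 1, s =>
    match step code s with
    | some s' => run code fuel s'
    | none => s

-- (5 + 3) * 2
def demo : Array Instr :=
  #[Instr.push 5, Instr.push 3, Instr.add, Instr.push 2, Instr.mul, Instr.halt]

#eval (run demo 100 (State.mk 0 [])).pc  -- 5

end TsmSketch
```

Unlike the `run` described in the layout table, this sketch does not distinguish a clean halt from a stuck state; it simply returns the last state reached.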