Third concrete kernel, parallel to golang-lean's TGC and octive-lean's
TOC. The substrate-level asymmetry: TSM has values living by *position*
on a stack, not by name. This breaks the named-variable assumption that
TGC and TOC silently share.
Maps onto real bytecode targets: WebAssembly, JVM, CPython, .NET CIL,
SECD. Anything proved here transfers.
TsmLean/Core/ — seven files, parallel structure to TGC/TOC:
  Syntax.lean        - Instr (12 opcodes), Value (int/bool), Code
  Semantics.lean     - State, step (function), MultiStep (relation)
  Determinism.lean   - step_deterministic, MultiStep.deterministic
  Eval.lean          - fuel-bounded run + run_sound
  Types.lean         - Ty, StackTy, HasTypeInstr
                       (per-instruction stack-type transitions)
  TypeSoundness.lean - HasTypeV, HasTypeStack
  Preservation.lean  - stack_preservation, progress
                       (canonical Pierce-style small-step type soundness)
Theorems proven, zero sorries / axioms / admits:
  step_deterministic      - single-step is functional
  MultiStep.deterministic - multi-step paths to halt are unique
  run_sound               - successful run -> MultiStep derivation
  stack_preservation      - stack typing preserved by step
  progress                - well-typed non-halt instructions step
Demo (Main.lean): (5 + 3) * 2 evaluated on the stack machine.
push 5; push 3; add; push 2; mul; halt
-> stack [vInt 16] at pc 5.
The structural asymmetry relative to TGC/TOC: TSM uses small-step
semantics given by a function `step : State -> Option State`, where
TGC/TOC used big-step inductive relations `Env -> Term -> Value -> Env`.
The canonical type-soundness theorems also flip: TGC/TOC proved
preservation under big-step semantics (which has no progress analogue);
TSM proves both progress AND preservation, each per-instruction.
This is the third datapoint that the cross-language factoring needs.
tsm-lean
A Lean 4 formalization of a Tiny Stack Machine — third concrete kernel parallel to golang-lean (TGC) and octive-lean (TOC).
The substrate-level asymmetry: TGC and TOC have named variables, whereas TSM has values living by position on a stack. This forces the cross-language abstraction to factor over the "operand-access mechanism" instead of baking name lookup into the framework. The design maps directly onto real bytecode targets: WebAssembly, JVM, CPython, .NET CIL, SECD.
Build
lake build
Run the demo
lake exe tsm-lean
# → final stack: [TsmLean.Core.Value.vInt 16] ((5 + 3) * 2)
# → final pc: 5
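To make the small-step design concrete, here is a minimal, self-contained sketch of a `step` function and a fuel-bounded `run` over just the four instructions the demo needs. It is not the repository's code: the namespace `StepSketch`, the constructor `vBool`, the `State` field names, the operand order for `add`/`mul`, and the choice that `step` returns `none` at `halt` are all assumptions made for illustration.

```lean
namespace StepSketch

/-- Runtime values. `vInt` appears in the demo output; `vBool` is an assumed name. -/
inductive Value where
  | vInt  : Int → Value
  | vBool : Bool → Value
  deriving Repr

/-- Only the instructions used by the demo program. -/
inductive Instr where
  | push : Int → Instr
  | add
  | mul
  | halt
  deriving Repr

/-- Machine state: program counter plus operand stack (top of stack first). -/
structure State where
  pc    : Nat
  stack : List Value
  deriving Repr

/-- One small step. `none` means no step is possible:
    the pc is at `halt`, out of range, or the stack has the wrong shape. -/
def step (code : List Instr) (s : State) : Option State :=
  match code[s.pc]?, s.stack with
  | some (Instr.push n), st =>
      some { pc := s.pc + 1, stack := Value.vInt n :: st }
  | some Instr.add, Value.vInt b :: Value.vInt a :: st =>
      some { pc := s.pc + 1, stack := Value.vInt (a + b) :: st }
  | some Instr.mul, Value.vInt b :: Value.vInt a :: st =>
      some { pc := s.pc + 1, stack := Value.vInt (a * b) :: st }
  | _, _ => none

/-- Fuel-bounded driver: `some s` only if a `halt` is reached before the fuel runs out. -/
def run (code : List Instr) : Nat → State → Option State
  | 0, _ => none
  | fuel + 1, s =>
    match code[s.pc]? with
    | some Instr.halt => some s
    | _ =>
      match step code s with
      | some s' => run code fuel s'
      | none    => none

/-- (5 + 3) * 2 as a stack program. -/
def demo : List Instr :=
  [Instr.push 5, Instr.push 3, Instr.add, Instr.push 2, Instr.mul, Instr.halt]

-- Halts at pc 5 with `vInt 16` as the only stack entry.
#eval run demo 10 { pc := 0, stack := [] }

end StepSketch
```

Operands are addressed purely by stack position here: `add` and `mul` never look anything up by name, which is the asymmetry with TGC/TOC described above.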
Layout
| Path | What's there |
|---|---|
| TsmLean/Core/Syntax.lean | Instr, Value, Code |
| TsmLean/Core/Semantics.lean | State, step (function), MultiStep (relation) |
| TsmLean/Core/Determinism.lean | step_deterministic, MultiStep.deterministic |
| TsmLean/Core/Eval.lean | fuel-bounded run + run_sound |
| TsmLean/Core/Types.lean | Ty, StackTy, HasTypeInstr |
| TsmLean/Core/TypeSoundness.lean | HasTypeV, HasTypeStack |
| TsmLean/Core/Preservation.lean | stack_preservation, progress |
| Main.lean | demo program |
Theorems proven
- `step_deterministic`: single-step is functional.
- `MultiStep.deterministic`: multi-step paths to halted states are unique.
- `run_sound`: successful fuel-bounded execution corresponds to a `MultiStep` derivation ending at a halted state.
- `stack_preservation`: if the stack matches an instruction's input type and the step succeeds, the post-stack matches its output type.
- `progress`: well-typed non-halt instructions always make a step.
The first three are the operational counterparts of the big-step theorems in TGC and TOC. The last two are the small-step type-soundness theorems (Pierce-style), which TGC/TOC's big-step formulations don't have direct analogues for.
Zero sorries, axioms, or admits.
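The two determinism results follow almost directly from `step` being a function. The sketch below states and proves them over an abstract state type, so it stands alone without the machine definitions. The namespace `DeterminismSketch`, the constructor names `refl` and `cons`, and the reading of "halted" as "`step` returns `none`" are assumptions; the repository's actual statements may be phrased differently.

```lean
namespace DeterminismSketch

variable {σ : Type}

/-- A step relation read off a function is deterministic:
    two successors of the same state must coincide. -/
theorem step_deterministic (step : σ → Option σ) {s s₁ s₂ : σ}
    (h₁ : step s = some s₁) (h₂ : step s = some s₂) : s₁ = s₂ :=
  Option.some.inj (h₁.symm.trans h₂)

/-- Reflexive-transitive closure of `step`. -/
inductive MultiStep (step : σ → Option σ) : σ → σ → Prop where
  | refl (s : σ) : MultiStep step s s
  | cons {s s' t : σ} : step s = some s' → MultiStep step s' t → MultiStep step s t

/-- Two runs from the same state that both end where `step` is stuck
    end in the same state. -/
theorem MultiStep.deterministic {step : σ → Option σ} {s t₁ t₂ : σ}
    (h₁ : MultiStep step s t₁) :
    MultiStep step s t₂ → step t₁ = none → step t₂ = none → t₁ = t₂ := by
  induction h₁ with
  | refl _ =>
    intro h₂ n₁ n₂
    cases h₂ with
    | refl => rfl
    -- The first run stopped here, so the second cannot take a step.
    | cons hs _ => rw [n₁] at hs; cases hs
  | cons hs _ ih =>
    intro h₂ n₁ n₂
    cases h₂ with
    -- The second run stopped here, so the first cannot take a step.
    | refl => rw [n₂] at hs; cases hs
    | cons hs' rest =>
      -- Both runs take the same first step; continue by induction.
      have e := step_deterministic step hs hs'
      subst e
      exact ih rest n₁ n₂

end DeterminismSketch
```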
Status
v0.1: per-instruction (local) preservation. Global program-level type soundness — the JVM-style stackmap that ensures all reachable PCs have consistent stack types — is the next layer up.
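As an illustration of what "per-instruction" typing means, the following sketch gives a typing relation for a handful of instructions. `Ty`, `StackTy`, and `HasTypeInstr` are names from the layout above, but their definitions here are guesses: the constructors `Ty.int`/`Ty.bool`, the instruction subset, and the exact stack shapes are assumptions, not the repository's definitions.

```lean
namespace TypingSketch

inductive Ty where
  | int
  | bool
  deriving Repr, DecidableEq

/-- A stack type lists the types of the stack entries, top first. -/
abbrev StackTy := List Ty

/-- A small subset of the instruction set, enough to show the pattern. -/
inductive Instr where
  | push  : Int → Instr
  | pushB : Bool → Instr
  | pop
  | add
  | lt

/-- `HasTypeInstr i before after`: running `i` on a stack of type `before`
    yields a stack of type `after`. Each rule is local to one instruction. -/
inductive HasTypeInstr : Instr → StackTy → StackTy → Prop where
  | push  (n : Int) (σ : StackTy)  : HasTypeInstr (Instr.push n) σ (Ty.int :: σ)
  | pushB (b : Bool) (σ : StackTy) : HasTypeInstr (Instr.pushB b) σ (Ty.bool :: σ)
  | pop   (τ : Ty) (σ : StackTy)   : HasTypeInstr Instr.pop (τ :: σ) σ
  | add   (σ : StackTy) : HasTypeInstr Instr.add (Ty.int :: Ty.int :: σ) (Ty.int :: σ)
  | lt    (σ : StackTy) : HasTypeInstr Instr.lt (Ty.int :: Ty.int :: σ) (Ty.bool :: σ)

-- `add` consumes two ints and produces one.
example : HasTypeInstr Instr.add [Ty.int, Ty.int] [Ty.int] :=
  HasTypeInstr.add []

-- `lt` consumes two ints and produces a bool.
example : HasTypeInstr Instr.lt [Ty.int, Ty.int] [Ty.bool] :=
  HasTypeInstr.lt []

end TypingSketch
```

A relation of this shape says nothing about jump targets, which is why a program-level stackmap is needed before local preservation becomes global type soundness.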
Instruction set
push n pushB b pop dup swap
add sub mul eq lt
jmp k jmpFalse k halt
Thirteen instructions, counting `halt`. No `call` / `ret` yet: direct jumps only. Adding function-call frames is a future extension.
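For readers who want to see the listing as a datatype, here is one plausible Lean 4 encoding with one constructor per entry above. It is a sketch only: the namespace `SyntaxSketch`, the constructor `vBool`, the use of `Nat` for jump targets (absolute versus relative is not specified here), and `List` as the representation of `Code` are assumptions.

```lean
namespace SyntaxSketch

/-- Runtime values: integers and booleans. -/
inductive Value where
  | vInt  : Int → Value
  | vBool : Bool → Value
  deriving Repr, DecidableEq

/-- One constructor per instruction in the listing. -/
inductive Instr where
  -- stack manipulation
  | push     : Int → Instr
  | pushB    : Bool → Instr
  | pop | dup | swap
  -- arithmetic and comparison
  | add | sub | mul | eq | lt
  -- control flow (jump target encoding is an assumption)
  | jmp      : Nat → Instr
  | jmpFalse : Nat → Instr
  | halt
  deriving Repr, DecidableEq

/-- A program is a flat sequence of instructions indexed by the pc. -/
abbrev Code := List Instr

end SyntaxSketch
```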