The CompCert-style substrate-projection theorem at miniature scale:
source-level evaluation and TSM-bytecode execution agree on the value
produced.
TsmLean/Compile/ — three files:
Source.lean - small expression language. v0.1 covers integer
literals only; the framework is structured so
arithmetic, comparison, control flow, and
variables extend mechanically.
Compile.lean - compile : Source.Expr -> TSM.Code
v0.1: intLit n -> #[push n]
Correctness.lean - theorem compile_correct:
Source.Eval e v ->
forall pre suf rest,
MultiStep
{ code := pre ++ compile e ++ suf,
pc := pre.size, stack := rest }
{ code := same,
pc := pre.size + (compile e).size,
stack := v :: rest }
Plus a standalone corollary for the no-prefix case.
The infrastructure is in place for compositional extension:
MultiStep.trans - transitive closure of multi-step
MultiStep.single - lift single step to multi-step
step_push - per-instruction step lemma (push)
getElem_compile - lookup-in-larger-code helper
Adding a constructor to Source (e.g., add) requires:
- one constructor in Source.Expr
- one rule in Source.Eval
- one match arm in compile
- one step_X helper (one-liner)
- one case in compile_correct's induction
Demonstrates the pipeline:
- Source language with big-step semantics
- Compiler producing TSM bytecode
- Correctness theorem bridging the two
Zero sorries / axioms / admits across the entire project.
|
||
|---|---|---|
| TsmLean | ||
| .gitignore | ||
| lake-manifest.json | ||
| lakefile.toml | ||
| lean-toolchain | ||
| Main.lean | ||
| README.md | ||
| TsmLean.lean | ||
tsm-lean
A Lean 4 formalization of a Tiny Stack Machine — third concrete kernel parallel to golang-lean (TGC) and octive-lean (TOC).
The substrate-level asymmetry: TGC and TOC have named variables. TSM has values living by position on a stack. Forces the cross-language abstraction to factor over "operand-access mechanism" instead of baking name-lookup into the framework. Maps directly to real bytecode targets — WebAssembly, JVM, CPython, .NET CIL, SECD.
Build
lake build
Run the demo
lake exe tsm-lean
# → final stack: [TsmLean.Core.Value.vInt 16] ((5 + 3) * 2)
# → final pc: 5
Layout
| Path | What's there |
|---|---|
TsmLean/Core/Syntax.lean |
Instr, Value, Code |
TsmLean/Core/Semantics.lean |
State, step (function), MultiStep (relation) |
TsmLean/Core/Determinism.lean |
step_deterministic, MultiStep.deterministic |
TsmLean/Core/Eval.lean |
fuel-bounded run + run_sound |
TsmLean/Core/Types.lean |
Ty, StackTy, HasTypeInstr |
TsmLean/Core/TypeSoundness.lean |
HasTypeV, HasTypeStack |
TsmLean/Core/Preservation.lean |
stack_preservation, progress |
Main.lean |
demo program |
Theorems proven
step_deterministic— single-step is functional.MultiStep.deterministic— multi-step paths to halted states are unique.run_sound— successful fuel-bounded execution corresponds to aMultiStepderivation ending at a halted state.stack_preservation— if the stack matches an instruction's input type and the step succeeds, the post-stack matches its output type.progress— well-typed non-halt instructions always make a step.
The first three are the operational counterparts of the big-step theorems in TGC and TOC. The last two are the small-step type-soundness theorems (Pierce-style), which TGC/TOC's big-step formulations don't have direct analogues for.
Zero sorries, axioms, or admits.
Status
v0.1: per-instruction (local) preservation. Global program-level type soundness — the JVM-style stackmap that ensures all reachable PCs have consistent stack types — is the next layer up.
Instruction set
push n pushB b pop dup swap
add sub mul eq lt
jmp k jmpFalse k halt
Twelve instructions. No call / ret yet — direct jumps only. Adding function-call frames is a future extension.