Commit graph

21 commits

Author SHA1 Message Date
Maximus Gorog
4bf2dd38cd Add 'common-lean/' from commit 'a0b719e170f66a736093d0d1c51b9d5883471db2'
git-subtree-dir: common-lean
git-subtree-mainline: bd2e14214d
git-subtree-split: a0b719e170
2026-05-12 02:59:14 -06:00
Maximus Gorog
bd2e14214d Add 'tsm-lean/' from commit '2e9061abead6f2daa464b39a79c17a949db30785'
git-subtree-dir: tsm-lean
git-subtree-mainline: 6592cd058d
git-subtree-split: 2e9061abea
2026-05-12 02:59:14 -06:00
Maximus Gorog
6592cd058d Add 'octive-lean/' from commit '4b6fcec565a170d7029d4ccba21be2ecd0512d13'
git-subtree-dir: octive-lean
git-subtree-mainline: fd3d42ae33
git-subtree-split: 4b6fcec565
2026-05-12 02:59:14 -06:00
Maximus Gorog
fd3d42ae33 Add 'golang-lean/' from commit 'f5f17019224c6a6c319387214ceb8e29d09251c6'
git-subtree-dir: golang-lean
git-subtree-mainline: 6487c7046f
git-subtree-split: f5f1701922
2026-05-12 02:59:14 -06:00
Maximus Gorog
6487c7046f Initial commit: crosslang monorepo skeleton.
Umbrella repo gathering golang-lean (TGC), octive-lean (TOC),
tsm-lean (TSM), and common-lean (apex with cross-language
typeclasses). Subprojects arrive next via `git subtree add` so
per-file history survives the move.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 02:59:08 -06:00
Maximus Gorog
4b6fcec565 Add SymPy bridge and migrate DSL to brace-block syntax.
Symbolic computation via a persistent Python subprocess: new `.sym`
Value variant carries (srepr, pretty), `OctiveLean.SymPyBridge` owns
the subprocess and round-trips expressions, and `evalBinOp`/unary
negation route through SymPy when either operand is `.sym`.  Corpus
adds sym_basic, sym_solve_simplify, sym_calc; demos add Lorenz,
Van der Pol, gravity, SymToolboxDemo, Lab7Interp.

DSL surface changes from `octave! ... octave_end` to `octave! { ... }`.
RosettaStone rewritten against the new syntax; PlotDemo updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 02:58:50 -06:00
Maximus Gorog
a0b719e170 Initial commit: Common — cross-language abstraction over TGC, TOC, TSM.
A new Lake project with path dependencies on the three sibling kernels
(golang-lean, octive-lean, tsm-lean). Provides typeclass abstractions
that capture what's structurally shared.

Common/ — three files:

  Lang.lean — two typeclasses:
    BigStepLang: Term, Value, State, Eval, deterministic.
                 Captures TGC's and TOC's big-step shape.
    SmallStepLang: State, step.
                   Captures TSM's small-step shape (determinism is
                   automatic from `step` being a function).
    Generic theorems exported once:
      BigStepLang.unique_value, BigStepLang.unique_state
      SmallStepLang.step_deterministic
    Plus a `CompileCorrect` structure for substrate-projection
    theorems (the categorical kernel of CompCert-style results).

  Instances.lean — three instances:
    TGCLang : BigStepLang   (State := Heap × Env)
    TOCLang : BigStepLang   (State := Env)
    TSMLang : SmallStepLang (State := TSM.State)
    Each instance bridges the local kernel's existing
    `BigStep`/`step` + `deterministic` proof into the typeclass.

  Demo.lean — three application examples:
    The SAME `BigStepLang.unique_value` theorem applied to TGC and
    TOC; `SmallStepLang.step_deterministic` to TSM. One abstract
    definition, three concrete payoffs.

The factoring is real: any new theorem stated over the typeclasses
applies to all instances. Future theorems (compilation correctness,
type soundness, simulation) can be expressed abstractly and proved
once.

Zero sorries / axioms / admits across all four projects.
2026-05-10 06:52:25 -06:00
Maximus Gorog
f5f1701922 Add RosettaStone.lean — Go ↔ Lean translation reference.
Side-by-side comparisons of Go syntax with two Lean encodings:

  * Surface AST (GolangLean.Expr) — mirrors go/ast.Node from upstream.
  * Tiny Go Core (GolangLean.Core.Term) — kernel calculus with proven
    operational + type semantics.

Twelve sections covering: literals, arithmetic, comparison, conditional,
variables/let-binding, functions/application, references/deref/assign,
sequencing. Each example includes a `#eval` running the term through
the proven Core.eval; output matches what's expected at run time.

Demonstrated programs:
  - `(5 + 3) * 2` → 16
  - `let x = 3 in let y = 7 in x < y` → true
  - `(λ x. x*2) 21` → 42
  - `let double = λ x. x*2 in double (double 5)` → 20
  - `let p = &42 in *p = 100; *p` → 100 (heap: [vInt 100])
  - higher-order: `(λ f. λ x. f (f x)) double 3` → 12

Each #eval is a runtime witness for a Core.BigStep derivation. Apply
Core.eval_sound to get a proof of the BigStep relation, and
Core.preservation to get the value's type.

Supporting changes:
  - Added Repr instances for Core.Value and Core.EnvList (manual,
    handling the mutual inductive — derive doesn't compose across the
    mutual block).
  - Added RosettaStone as a lean_lib in lakefile.toml.
2026-05-10 06:49:22 -06:00
Maximus Gorog
2e9061abea Refactor compile to be offset-aware (v0.4).
Compile signature changes from `Source.Expr -> Code` to
`Nat -> Source.Expr -> Code`, where the Nat is the absolute address
where the output is placed. For v0.4's constructs (intLit, boolLit,
add, sub, mul) the offset is just threaded — no compile output
depends on it. The infrastructure is now ready for control-flow
constructs (ifte) that need absolute jump addresses.

All 5 cases of compile_correct updated for the offset-aware signature.
Each case now references compile pre.size e instead of compile e.
For binops, the second sub-expression is compiled at offset
pre.size + (compile pre.size e1).size.

Generic helpers introduced:
  getElem_at_offset (pre X suf : Code) (i)
    : (pre ++ X ++ suf)[pre.size + i] = X[i]
  - bridges outer-array lookup to inner-array lookup at index.

Compile_X_get_op now parameterized by offset.

Engineering notes for ifte (deferred to v0.5):
  - The compile output for ifte is
    cc ++ #[jmpFalse else_at] ++ ct ++ #[jmp end_at] ++ cf
    where else_at, end_at are absolute addresses.
  - Looking up the .jmpFalse or .jmp instructions inside this
    nested-append structure requires 4-deep getElem_append_left/
    getElem_append_right rewrites. Each rewrite has a bound proof
    that must thread through dependent type matching. Lean's `rw`
    fails with "motive not type correct" on the natural proof.
  - The path forward is likely a different decomposition that
    avoids dependent-rewrite chains - perhaps using Array.getElem?
    (Option-form) which doesn't carry bounds, then bridging back to
    Array.getElem at the leaf.

Zero sorries / axioms / admits in v0.4. Full build clean (26 jobs).
2026-05-10 06:12:03 -06:00
Maximus Gorog
f9ed9ec4c7 Extend compiler with bool literals + sub + mul (v0.3).
Source.Expr: intLit, boolLit, add, sub, mul.
Each new constructor follows the pattern from v0.2:
  - one Source constructor + one Eval rule
  - one compile arm
  - one step_X lemma (push, pushB, add, sub, mul) - one line each
  - one compile_X_get_op lemma, all delegating to a generic
    getElem_at_op_boundary helper that handles the c1++c2++[op] shape
  - one case in compile_correct (~50 lines, mostly mechanical)

The pattern is fully grooved: each new arithmetic op is now a copy-
paste of the add case with substituted constructor names. The
substantive proof work happened once (in v0.2 for add); subsequent
arithmetic ops add no new proof shapes.

Generic helpers introduced:
  getElem_at_op_boundary (c1 c2 : Code) (op : Instr) (h) :
    (c1 ++ c2 ++ #[op])[c1.size + c2.size]'h = op
  - applies to add/sub/mul interchangeably; specialized via
    one-line wrappers (compile_add_get_op etc.)

Zero sorries / axioms / admits. Full build: 26 jobs clean.
2026-05-10 06:01:44 -06:00
Maximus Gorog
ec65229050 Extend source-to-TSM compiler with addition (v0.2).
Source.Expr now has intLit and add. Compile and correctness theorem
both extend.

The add case of compile_correct exercises the compositional structure:
  - IH on e1 (with extended suffix) gives the multistep for the first
    operand's evaluation.
  - IH on e2 (with extended prefix) gives the multistep for the second.
  - A single .add step at the boundary closes the trace.
  - Each intermediate state's PC is computed via array-size arithmetic
    threaded through omega.

New supporting lemmas:
  step_add               - per-instruction step for .add
  compile_add_get_op     - the instruction at the end of compile (.add e1 e2)
                           is .add. Extracted so the dependent-rewrite issue
                           with array bound proofs is contained in one place.

Engineering knowledge gained (recurring patterns when extending):
  - Array.getElem_append_left/right take the bound as an explicit positional
    arg, not via (h := ...).
  - rw on indices that appear in dependent bound proofs fails with "motive
    not type correct"; factor the lookup into a separate lemma.
  - convert tactic appears not to be available; rw + exact substitutes.
  - simp + omega closes most arithmetic on Array.size after expansion.
  - step lemmas with implicit args (a, b) need explicit (a := _) in calls
    where context doesn't determine them.

Adding a constructor still follows the v0.1 recipe — one Source
constructor, one Eval rule, one compile arm, one step_X helper, one
compile_X_get_op lemma, one case in compile_correct's induction. Each
case is ~25-40 lines of proof.

Zero sorries / axioms / admits.
2026-05-10 05:53:39 -06:00
Maximus Gorog
fff0091f89 Add source-to-TSM compiler with proven correctness (v0.1).
The CompCert-style substrate-projection theorem at miniature scale:
source-level evaluation and TSM-bytecode execution agree on the value
produced.

TsmLean/Compile/ — three files:

  Source.lean       - small expression language. v0.1 covers integer
                      literals only; the framework is structured so
                      arithmetic, comparison, control flow, and
                      variables extend mechanically.

  Compile.lean      - compile : Source.Expr -> TSM.Code
                      v0.1: intLit n -> #[push n]

  Correctness.lean  - theorem compile_correct:
                        Source.Eval e v ->
                        forall pre suf rest,
                          MultiStep
                            { code := pre ++ compile e ++ suf,
                              pc := pre.size, stack := rest }
                            { code := same,
                              pc := pre.size + (compile e).size,
                              stack := v :: rest }
                      Plus a standalone corollary for the no-prefix case.

The infrastructure is in place for compositional extension:

  MultiStep.trans       - transitive closure of multi-step
  MultiStep.single      - lift single step to multi-step
  step_push             - per-instruction step lemma (push)
  getElem_compile       - lookup-in-larger-code helper

Adding a constructor to Source (e.g., add) requires:
  - one constructor in Source.Expr
  - one rule in Source.Eval
  - one match arm in compile
  - one step_X helper (one-liner)
  - one case in compile_correct's induction

Demonstrates the pipeline:
  - Source language with big-step semantics
  - Compiler producing TSM bytecode
  - Correctness theorem bridging the two

Zero sorries / axioms / admits across the entire project.
2026-05-10 05:38:01 -06:00
Maximus Gorog
987f205ce5 Initial commit: Tiny Stack Machine (TSM) in Lean 4.
Third concrete kernel, parallel to golang-lean's TGC and octive-lean's
TOC. The substrate-level asymmetry: TSM has values living by *position*
on a stack, not by name. This breaks the named-variable assumption that
TGC and TOC silently share.

Maps onto real bytecode targets: WebAssembly, JVM, CPython, .NET CIL,
SECD. Anything proved here transfers.

TsmLean/Core/ — seven files, parallel structure to TGC/TOC:

  Syntax.lean        - Instr (12 opcodes), Value (int/bool), Code
  Semantics.lean     - State, step (function), MultiStep (rel'n)
  Determinism.lean   - step_deterministic, MultiStep.deterministic
  Eval.lean          - fuel-bounded run + run_sound
  Types.lean         - Ty, StackTy, HasTypeInstr
                       (per-instruction stack-type transitions)
  TypeSoundness.lean - HasTypeV, HasTypeStack
  Preservation.lean  - stack_preservation, progress
                       (canonical Pierce-style small-step type soundness)

Theorems proven, zero sorries / axioms / admits:

  step_deterministic           single-step is functional
  MultiStep.deterministic      multi-step paths to halt are unique
  run_sound                    successful run -> MultiStep derivation
  stack_preservation           stack typing preserved by step
  progress                     well-typed non-halt instructions step

Demo (Main.lean): (5 + 3) * 2 evaluated on the stack machine.
  push 5; push 3; add; push 2; mul; halt
  -> stack [vInt 16] at pc 5.

The structural asymmetry from TGC/TOC: TSM uses small-step semantics
with a function `step : State -> Option State`, where TGC/TOC used
big-step inductive relations `Env -> Term -> Value -> Env`. The
canonical type-soundness theorems also flip: TGC/TOC proved
preservation under big-step (which has no progress analogue);
TSM proves both progress AND preservation, each per-instruction.

This is the third datapoint that the cross-language factoring needs.
2026-05-10 05:12:10 -06:00
Maximus Gorog
d86d500b4f Prove preservation theorem for TOC big-step semantics.
OctiveLean/Core/Preservation.lean — the TOC analogue of TGC's
preservation. Statement:

  HasType Γ e T  ∧  HasTypeEnv env Γ  ∧  BigStep env e v env'
    ⟹  HasTypeV v T  ∧  HasTypeEnv env' Γ

No heap-typing extension (Octave has no heap). Γ is unchanged across
big-steps (assign requires x already typed).

Three structural changes were required to make preservation provable
under env-mutation, all small:

  * letIn semantics shifted to scope-restoring: BigStep.letInR now
    returns env1 (the env after evaluating the bound expression)
    rather than env2 (after the body). This drops body's mutations
    at scope-end, matching the lambda-calculus tradition. Determinism
    and Eval updated to match.

  * HasTypeV.vClos uses a two-part premise (he_dom + he_typed)
    instead of nested ∃ — the kernel rejects nested inductive
    parameters with locally-bound vars. The two parts are equivalent
    to HasTypeEnv via the new HasTypeV.vClos_to_env inversion lemma.

  * Inversion via HasTypeV.vClos_to_env exposes the closure's typing
    context as an existential — preservation's appR case uses this
    to construct the body's HasTypeEnv via extend_letIn.

The cross-language symmetry that emerged:

  TGC preservation  : threads heap-typings, weakens via extension.
  TOC preservation  : threads env directly, no extension needed.

In both, the rule cases collapse into the same three structural
shapes — terminal, IH-chain, contradiction-collapse. The case bodies
differ in HOW state is propagated (heap-typing for TGC, env for TOC)
but the SHAPE of each case is identical. That's the cross-language
abstraction speaking.

Zero sorries / axioms / admits across both projects.
2026-05-10 04:32:52 -06:00
Maximus Gorog
567d3d1902 Add Tiny Octave Core (TOC) parallel to golang-lean's TGC.
OctiveLean/Core/ — the kernel-level formal layer that octive-lean's
existing surface BigStep didn't have. Six files:

  Syntax.lean       — Term, BinOp. Twelve constructors: ten shared
                      with TGC plus assign (var mutation in env) and
                      whileT (loop). No refs (Octave has no &/*).
  Semantics.lean    — Value, EnvList, BigStep. Signature
                      `BigStep : Env -> Term -> Value -> Env -> Prop`
                      threading env (vs TGC's `Heap x Env`).
  Determinism.lean  — BigStep.deterministic. Same case shapes as TGC
                      for the shared ten constructors. The whileFR/whileTR
                      cross-case mirrors TGC's ifTR/ifFR via Bool.noConfusion.
  Eval.lean         — fuel-bounded eval + eval_sound. whileT recursion
                      consumes one fuel unit per iteration. Function
                      calls discard the body's post-env (Octave/MATLAB
                      local-scope semantics).
  Types.lean        — Ty (no ref), TyEnv, HasType. Twelve typing rules
                      mirroring the constructors.
  TypeSoundness.lean — HasTypeV inductive, function-form HasTypeEnv.
                      vClos uses two-part formulation (he_dom + he_typed)
                      instead of nested existential — Lean's kernel rejects
                      the natural ∃-form due to nested-inductive parameter
                      restrictions. extend_typed (assign preservation) and
                      extend_letIn (letIn preservation) lemmas.

Cross-language symmetry surfaced this iteration:
  * Determinism proof: ten cases mirror TGC line-for-line. Two new
    cases (assign, while) follow the same three structural shapes.
  * Eval signature: state type differs (Env vs Heap×Env), constructors'
    eval recurrences are otherwise isomorphic.
  * Type system: Ty diverges (no ref vs no whileT/assign), but the
    typing-rule shapes are identical — variables, let, app, if, binop, seq.
  * Typing of runtime data is where the languages most diverge:
    TGC's structural HasTypeEnv works because env is scoped; TOC needs
    function-form because env mutates. This is the *real* asymmetry that
    a cross-language layer would have to abstract over.

Preservation deferred — has a soundness gap with letIn-shadowing-then-
assign that needs a freshness premise on letIn typing. To follow.

Zero sorries / axioms / admits. Full lake build clean (73 jobs).
2026-05-10 04:26:12 -06:00
Maximus Gorog
ed636a1aa2 Prove preservation theorem for TGC big-step semantics.
GolangLean/Core/Preservation.lean:

theorem preservation:
  HasType Γ e T -> HasTypeH ht h -> HasTypeEnv ht env Γ ->
  BigStep h env e v h' ->
    ∃ ht', HeapTy.extends ht ht' /\ HasTypeH ht' h' /\
            HasTypeV ht' v T /\ HasTypeEnv ht' env Γ

The standard big-step type-soundness result: well-typed terminating
programs produce well-typed values, with heap conformance preserved.
Proof is by induction on the big-step derivation, fourteen cases.

Supporting infrastructure:
  HeapTy.extends_push    - heap-typing extends across a push
  HasTypeH.push          - heap conformance preserved by push
  HasTypeH.setIfInBounds - heap conformance preserved by in-bounds update
  binop_apply_sound      - operator typing matches operator semantics

The closure case (appR) uses the mutual HasTypeV/HasTypeEnv weakening
lemmas from TypeSoundness to thread heap-typings across the three
sub-derivations. The assign case (assignR) uses the heap-update
lemma to preserve conformance. The if-cases collapse the cross-rule
(ifT vs ifF) ambiguity via Bool.noConfusion on the condition's IH.

Zero sorries / axioms / admits across the project. Full lake build clean.
2026-05-10 04:05:08 -06:00
Maximus Gorog
3174918193 Add executable evaluator (Phase A) and type system + soundness infra (Phase B).
Phase A — GolangLean/Core/Eval.lean:
  def eval : Nat -> Heap -> Env -> Term -> Option (Value x Heap)
  Fuel-bounded recursive evaluator, total over the fuel.
  theorem eval_sound: eval succeeds => BigStep holds.
  Bridges executable computation to inductive specification.

Phase B — GolangLean/Core/Types.lean:
  inductive Ty {unit, int, bool, arrow, ref}
  TyEnv := List (String x Ty); BinOp.typeOf
  inductive HasType : TyEnv -> Term -> Ty -> Prop
  Standard simply-typed lambda calculus + ML-style references.

Phase B — GolangLean/Core/TypeSoundness.lean:
  abbrev HeapTy := Array Ty
  mutual inductive HasTypeV / HasTypeEnv  (value & env typing under heap-typing)
  def HasTypeH                            (heap conforms to heap-typing)
  def HeapTy.extends                       (prefix-extension of heap-typings)
  thm HeapTy.extends_refl, extends_trans
  thm HasTypeV.weaken, HasTypeEnv.weaken   (mutual; under heap-typing extension)
  thm HasTypeEnv.lookup_correspondence    (well-typed env yields well-typed values)

The preservation theorem itself
  HasType /\ HasTypeH /\ HasTypeEnv /\ BigStep
    ==> ∃ ht', extends /\ HasTypeH' /\ HasTypeV' /\ HasTypeEnv'
is the next deliverable; the infrastructure here is what its proof
case-analysis depends on.

Zero sorries / axioms / admits across the project. Full lake build clean.
2026-05-10 03:51:14 -06:00
Maximus Gorog
878a84e072 Add Tiny Go Core (TGC): kernel calculus with proven determinism.
GolangLean/Core/ holds a small calculus that surface Go is intended to
desugar into. Three files:

  Syntax.lean       - Term, BinOp; thirteen syntactic forms covering
                      let-binding, lambda, application, references
                      (Go's & / *), conditionals, sequencing.

  Semantics.lean    - Value, EnvList, Heap, BinOp.apply, BigStep relation.
                      Heap is Array Value; references are indices.
                      Closures capture EnvList lexically, as in Go.
                      Fourteen big-step constructors, one per syntactic form
                      (with ifte split into ifTR / ifFR).

  Determinism.lean  - theorem BigStep.deterministic:
                        BigStep h env e v1 h1 -> BigStep h env e v2 h2 ->
                        v1 = v2 /\ h1 = h2
                      Proof by induction on the first derivation, case
                      analysis on the second. The ifTR/ifFR cross-cases
                      close by contradiction via Bool.noConfusion.

No sorries, no axioms, no admits. The kernel is small enough to extend
compositionally: each new syntactic form adds one constructor and one
case to each proof. Type system and concurrency layer come later.

Strategic note: this kernel is shaped so the same construction will
work for any sequential calculus. When octive-lean grows a parallel
Tiny Octave Core, the determinism proof's structure will line up
case-for-case where the languages share constructors. That alignment
is the seed of the cross-language layer.
2026-05-10 02:23:58 -06:00
Maximus Gorog
2173e196fe Initial scaffold: Lean 4 reimplementation of Go.
Mirrors the layout of octive-lean: lakefile, justfile, gitignored
upstream clone, a top-level library module, and a Main entry point
that switches between REPL and file execution.

Module skeleton in GolangLean/:
  Token, AST           — real ports of go/token and go/ast
  Scanner, Parser      — stubs throwing notImpl, point at upstream
  Value, Env, Error    — runtime data shapes
  Eval, Builtins, REPL — stubs that compile and run as a placeholder
  PureEval, BigStep,
  ValueEquiv           — formal-semantics layer (mirroring octive-lean)
                         where cross-language proof eventually lives.

The proof layer is shaped identically to octive-lean's so that
theorems about Go semantics will share their form with the Octave
ones — that shared shape is the candidate for the future
cross-language core.

Upstream reference go-upstream/ (shallow clone of golang/go) is
gitignored.
2026-05-10 02:12:19 -06:00
Maximus Gorog
23162fb93a Add README, CONTRIBUTING, editorconfig, gitattributes, justfile
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:44:24 -06:00
Maximus Gorog
db79eb3fde Initial commit: Lean 4 reimplementation of GNU Octave
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:40:46 -06:00