History

Max Gorog 4d29b7236d PIPELINE §5 step 3: target VM build infrastructure + containment posture	2026-05-04 01:31:40 -05:00

§4.2 calls for target VMs we BUILD, not VMs we fetch. §4.13 demands every target ship the same isolation posture (no upstream egress, no host-shared FS, unprivileged QEMU, fresh snapshot per episode). This commit lands the infrastructure for both.

New surface:

* orchestrator/target_spec.py
  Loads + validates `vm/targets/<name>/spec.toml`. Containment fields are not knobs: each has exactly ONE safe value, and a spec asserting the unsafe value is rejected at load time. There's no `--containment-override`; weakening §4.13 requires amending PIPELINE.md and operator sign-off.
* tools/build_target.py
  Orchestrates build → verify → publish for a single target. Spec invalid → exit 78 (sysadmin error). build.sh failure → image not published. verify.sh failure → image discarded; that's the §4.2 acceptance gate. Publishes sha256 + the manifest.toml stanza the operator copies in to admit the image (§16 substantive amendment with sign-off per §15).
* vm/targets/<name>/{spec.toml,build.sh,verify.sh}
  Template structure. spec.toml is the contract; build.sh produces $OUT_PATH; verify.sh boots the produced image under the §4.13 containment posture and asserts every promise.
* vm/targets/shellshock/
  First real working target. CVE-2014-6271 (Apache mod_cgi + bash 4.2 mis-parsing function-export environment values). Replaces the SourceForge Metasploitable2 path that §3 evidence proved unverifiable. Bash 4.2 is built from sha256-pinned GNU source inside an Alpine 3.21 cloudinit guest; the build script asserts the produced bash actually triggers shellshock; the verifier re-asserts it under restrict=on with a real CVE-2014-6271 probe.
* vm/targets/README.md
  How operators add a target. Walks the spec → build → verify → manifest amendment loop.

Containment regression tests (tests/test_containment.py): 20 new assertions, parameterized over every target with a build/verify trio:

* verify.sh MUST contain `restrict=on` on its netdev (§4.13)
* verify.sh MUST contain `snapshot=on` on the boot drive (§4.13)
* verify.sh + build.sh MUST NOT contain -virtfs / -fsdev / 9pfs
* verify.sh + build.sh MUST NOT wrap qemu-system in `sudo`
* Every target must ship the complete spec.toml + build.sh + verify.sh trio; no half-built targets (§1 default-to-removal)

Spec validation tests (tests/test_target_spec.py): 13 new tests over spec parse, name/dir mismatch, missing fields, out-of-range port, and the §4.13 containment field validators (each unsafe value rejected with a clear error).

The shellshock target's image is NOT yet published to manifest.toml's [[targets.images]]; that's the §15 sign-off amendment that lands after a successful operator-driven build_target.py run on a lab host with KVM. Building takes ~10 min on x86_64; cannot run on the Pi under TCG. Operator drives the first build, verifies the sha256, then amends manifest.toml in a follow-up commit.

261 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
__init__.py	Add v0 orchestrator + first oracle collector (host /proc)	2026-04-28 23:40:25 -06:00
__main__.py	Add v0 orchestrator + first oracle collector (host /proc)	2026-04-28 23:40:25 -06:00
episode.py	PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml	2026-05-04 01:25:01 -05:00
fleet.py	PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml	2026-05-04 01:25:01 -05:00
manifest.py	PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml	2026-05-04 01:25:01 -05:00
README.md	Scaffold project: docs, repo skeleton, transport + deploy design	2026-04-28 23:21:00 -06:00
target_spec.py	PIPELINE §5 step 3: target VM build infrastructure + containment posture	2026-05-04 01:31:40 -05:00
ulid.py	Add v0 orchestrator + first oracle collector (host /proc)	2026-04-28 23:40:25 -06:00

README.md

orchestrator/

The state machine that drives a single episode:

snapshot_load → clean → armed → infecting → infected_running → dormant → reverting
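
The phase chain above can be sketched as a small enum with a transition guard. This is a minimal illustration, not the code in this directory: the names `Phase`, `next_phase`, and `advance` are hypothetical, and the sketch assumes the chain is strictly linear. Whether the real orchestrator allows shortcuts, such as jumping straight to reverting on failure, is not stated here.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Episode phases, in the order given by the README (names are illustrative)."""

    SNAPSHOT_LOAD = "snapshot_load"
    CLEAN = "clean"
    ARMED = "armed"
    INFECTING = "infecting"
    INFECTED_RUNNING = "infected_running"
    DORMANT = "dormant"
    REVERTING = "reverting"


# Enum members iterate in definition order, which matches the phase chain.
_ORDER = list(Phase)


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after REVERTING."""
    idx = _ORDER.index(current)
    return _ORDER[idx + 1] if idx + 1 < len(_ORDER) else None


def advance(current: Phase, requested: Phase) -> Phase:
    """Move to `requested` only if it is the immediate successor (assumed rule)."""
    if next_phase(current) is not requested:
        raise ValueError(
            f"illegal transition: {current.value} -> {requested.value}"
        )
    return requested
```

Rejecting anything but the immediate successor keeps a skipped phase from going unnoticed in the episode's labels.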

Responsibilities:

Bring up the host-only bridge and verify isolation before the guest starts.
Boot the guest from a named snapshot.
Spawn the five telemetry collectors (collectors/) with a shared episode id and shared monotonic clock origin.
Drive the Metasploit Framework over RPC to fire the configured exploit module.
Upload + execute the configured malware sample once a session is open.
Emit phase transitions to labels.jsonl at the moment the action is taken.
Revert the snapshot at episode end.
Write meta.json with the result summary.
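
Two of the responsibilities above, the shared monotonic clock origin and emitting phase transitions to labels.jsonl, can be sketched together. This is a hedged illustration only: the `LabelWriter` class and the record fields `episode_id`, `phase`, and `t_ns` are assumptions, not the schema this project uses. It also assumes every collector runs on the same host, where the monotonic clock is system-wide (true on Linux), so offsets from one origin are comparable across processes.

```python
import json
import time
from pathlib import Path


class LabelWriter:
    """Append phase transitions to a JSON Lines file (hypothetical schema)."""

    def __init__(self, path: Path, episode_id: str, clock_origin_ns: int) -> None:
        self._path = path
        self._episode_id = episode_id
        # Origin captured once per episode and handed to every collector.
        self._origin_ns = clock_origin_ns

    def emit(self, phase: str) -> dict:
        """Write one record stamped at call time, relative to the shared origin."""
        record = {
            "episode_id": self._episode_id,
            "phase": phase,
            "t_ns": time.monotonic_ns() - self._origin_ns,
        }
        # Append mode, one JSON object per line, so earlier records are kept.
        with self._path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return record


if __name__ == "__main__":
    origin = time.monotonic_ns()
    writer = LabelWriter(Path("labels.jsonl"), "example-episode", origin)
    writer.emit("armed")
```

Because the timestamp is taken inside `emit`, the caller would invoke it directly alongside the action it labels rather than batching records for later.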

Implementation lives in this directory and is imported as orchestrator.*.