diff --git a/AGENTS.md b/AGENTS.md
index 36eff3e..c4dc238 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -7,536 +7,118 @@ The conventions below also apply to sibling repos (`wg-enroll`,
 
 ---
 
-## ⚡ FIRST: are you an on-device agent on a lab host?
+## ⚡ FIRST: read PIPELINE.md
 
-**Read this table before doing anything else.** Find your symptom in
-the left column and run the EXACT command in the right column.
-Do not invent variations. Do not paraphrase.
+**[PIPELINE.md](./PIPELINE.md) is the canonical plan for this repo.**
+Read it before changing anything in the data-collection / emulation /
+labelling pipeline. If anything in this file or any other doc
+contradicts PIPELINE.md, PIPELINE.md wins and the other doc is wrong.
 
-| Symptom (what you see) | What to run (verbatim) |
-|---|---|
-| `journalctl -u cis490-shipper` shows `412 commit-rejected` or `code commit rejected` | `cd /opt/cis490 && sudo -u cis490 git pull origin main && sudo /opt/cis490/scripts/install-lab-host.sh` |
-| `journalctl -u cis490-shipper` shows `400` and `missing X-Cis490-Code-Commit` | Same command as above. |
-| `systemctl status cis490-shipper` shows `inactive (dead)` after install | `sudo systemctl enable --now cis490-shipper cis490-orchestrator` |
-| Shipper logs `waiting on mTLS material` | **Do nothing** — `cis490-cert-fetch.timer` polls bootstrap.wg every 5 min; once the cert lands, the script restarts the shipper automatically. To force a fetch right now: `sudo systemctl start cis490-cert-fetch.service` then `sudo journalctl -u cis490-cert-fetch -n 20`. |
-| Receiver index isn't growing (Pi side: `tail -f /var/lib/cis490/index.jsonl`) | `/opt/cis490/.venv/bin/python /opt/cis490/tools/cis490_doctor.py --role lab-host` — read the **red** rows; each one prints the exact fix command. |
-| `bootstrap.wg fetch failed` after install | `echo '10.100.0.1 bootstrap.wg collector.wg' \| sudo tee -a /etc/hosts && sudo /opt/cis490/scripts/install-lab-host.sh` |
-| `cis490-doctor` says VERSION stamp missing | `sudo /opt/cis490/scripts/install-lab-host.sh` |
-| You're not sure what's wrong | `/opt/cis490/.venv/bin/python /opt/cis490/tools/cis490_doctor.py --role lab-host` |
-| The canonical pull-and-reinstall command above doesn't fix it (auto-update timer keeps failing, `git pull` errors, you have a local commit, etc.) | **Read [FIXYOURSELF.md](./FIXYOURSELF.md).** It's the explicit decision tree for stuck states the auto-update timer can't recover from. |
-
-**HARD RULES — do not break these even if they "feel like they'd help":**
-
-- **Do not run `openssl`, `step-cli`, mint keys, or write CSRs.** Cert
-  delivery is fully automated by `install-lab-host.sh`. If you're
-  about to touch a private key on a lab host, stop.
-- **Do not set `verify_tls = false`.** Skipping TLS verification is not
-  "skipping it for now"; it's leaving a hole.
-- **Do not invent a `code_version` field in old episode `meta.json`
-  files.** That field exists to keep buggy data out of the index.
-  The drain helper handles them properly.
-- **Do not disable, mask, or `systemctl stop` `cis490-shipper` to silence
-  log noise.** Fix the cause via the table above; the noise stops
-  on its own once the host has the new code.
-- **Do not `git pull` and skip running `install-lab-host.sh`.** A bare
-  `git pull` updates files on disk but does NOT restart the systemd
-  units, so the running daemons keep executing the old code.
-- **Do not file a Forgejo issue without first running `cis490-doctor`
-  and pasting its output.**
-
-If your symptom isn't in the table or the canonical command above
-doesn't fix it, file a Forgejo issue (see the "File an issue" section
-near the bottom).
+This file is for general engineering conventions. The pipeline
+correctness story lives in PIPELINE.md.
 
 ---
 
-## How a lab host gets to "shipping data" — the canonical bring-up
+## What this project is
 
-If you (an AI agent OR a human) are looking at a fresh lab host and
-asking "is this thing actually generating data for the central
-collector?", run this in order. **Cloning the repo by itself does
-nothing. Running launchers from a manual clone bypasses the
-systemd services that do the actual work.**
+CIS490 trains a behavioral malware-detection model from labelled
+episodes captured on lab-host VMs running real or mimic workloads,
+optionally driven into infected states by Metasploit modules. The
+producer is the orchestrator on each lab host; the consumer is the
+receiver on the Pi (`office-print`, `10.100.0.1`).
 
-```sh
-# 0. (One-time, on the Pi only.) Initialize the CIS490 client CA + a
-#    leaf cert for THIS lab host. Get its WG IP from `wg-enroll-admin
-#    show <usb>` first.
-sudo /home/max/.env/wg-pki/scripts/init-cis490-client-ca.sh   # idempotent
-sudo /home/max/.env/wg-pki/scripts/deploy-cis490-cert.sh \
-     <host_id> <wg_ip>           # mints + scp's + extracts + chmods
+The producer must ship only ground-truth episodes. The receiver must
+reject anything that doesn't meet the bar. See PIPELINE.md.
 
-# 1. (On the lab host.) Install the lab-host role. ONE COMMAND DOES
-#    EVERYTHING — repo to /opt/cis490, venv build, systemd units,
-#    Alpine baseline qcow2, cidata ISO, VERSION stamp, mTLS cert
-#    auto-fetch from bootstrap.wg, Tier-3+4 deploy (msfrpcd +
-#    Metasploitable2 + theZoo malware samples + bridge), pre-stamp
-#    queue drain, and a `daemon-reload + systemctl restart` of the
-#    shipper + orchestrator on re-runs. Idempotent — safe to re-run.
-sudo /opt/cis490/scripts/install-lab-host.sh
-# (or, if running from a clone elsewhere:)
-#   sudo ./scripts/install-lab-host.sh
+## Hard rules — do not break these
 
-# 2. Edit /etc/cis490/lab-host.toml — set host_id (the only required
-#    edit). Then re-run step 1 so the cert auto-fetch can resolve
-#    bootstrap.wg/v1/cert/<host_id>.
+- **Do not silently downgrade a host.** If a collector is silent, an
+  exploit doesn't land, or a dependency is missing, the host produces
+  zero episodes and says so loudly. There is no "ship what we can"
+  fallback.
+- **Do not write a label that an event didn't justify.** Phase
+  labels come from observed events, not from the schedule clock. See
+  PIPELINE.md §4.5.
+- **Do not add a module to the catalog without verifying it lands a
+  session against its declared target.** See PIPELINE.md §4.3.
+- **Do not add per-host config overrides.** One canonical manifest;
+  hosts that can't run it produce nothing. See PIPELINE.md §4.1.
+- **Do not bypass the dirty-tree gate** except via the
+  `CIS490_ALLOW_DIRTY=1` env var (logged, stamped, audited). No
+  "skip preflight," no `verify_tls=false`, no other override knobs.
+- **Do not run `openssl`, `step-cli`, mint keys, or write CSRs.**
+  Cert delivery is automated. If you find yourself touching a
+  private key on a lab host, stop.
+- **Do not file a Forgejo issue without first running
+  `cis490-doctor` and pasting its output.**
 
-# 3. Verify everything before enabling the timer-driven services:
-/opt/cis490/.venv/bin/python /opt/cis490/tools/cis490_doctor.py \
-    --role lab-host
-# → green/yellow rows means READY; red rows print the exact fix
-#   command. Re-run until clean.
+## How a lab host gets to "shipping data"
 
-# 4. Turn on the services. From this moment on, the orchestrator runs
-#    one fleet wave on each Restart= cycle, and the shipper picks up
-#    completed episodes and PUTs them to https://collector.wg over mTLS.
-sudo systemctl enable --now cis490-shipper cis490-orchestrator
+This section will be rewritten as PIPELINE.md §4 lands. The current
+`scripts/install-lab-host.sh` does most of the right things but does
+not yet enforce the canonical manifest, target-VM build, catalog
+verification, or preflight. Until those land, treat the install
+script as in-flight and assume a fresh lab host will produce nothing
+until the bar is met.
 
-# 5. (On the Pi.) Watch the index grow:
-sudo tail -f /var/lib/cis490/index.jsonl
-```
+The bar (when in place) will be:
 
-**There is no manual Tier-3 step.** Steps 1 + 2 deploy msfrpcd,
-Metasploitable2 (auto-fetched from a public mirror with TOFU sha256
-pinning — no Rapid7 registration), and Tier-4 real-malware samples
-from theZoo (no API key, no signup). The orchestrator switches to
-Tier-3 episodes automatically once the prereqs are on disk.
+1. Repo cloned to `/opt/cis490`, working tree clean, HEAD on
+   `origin/main`.
+2. Every binary required by the active collector set and the module
+   catalog present on `PATH`.
+3. Every target VM image built from the in-repo spec, sha256-pinned.
+4. Every module in the catalog passes `scripts/verify-catalog.sh`
+   against its target.
+5. Every collector in the active set passes its emit-test.
+6. `orchestrator/preflight.py` exits 0.
 
-**Hosts self-update.** `install-lab-host.sh` enables
-`cis490-autoupdate.timer`, which runs every 30 min (with up to 10 min
-of randomized delay) and does `git fetch + git pull --ff-only +
-install-lab-host.sh` whenever origin/main has moved. So once a host
-has done the canonical bring-up ONCE, it self-heals on every
-subsequent maintainer push — you don't need to remember to pull. The
-timer logs to `journalctl -u cis490-autoupdate.service`. If the
-host's checkout has diverged from origin (operator hand-edits,
-half-applied changes), auto-update bails rather than guessing — that
-shows up as a unit failure with a clear log message.
-
-If `index.jsonl` doesn't grow within a wave-interval (~60 s after
-`systemctl enable --now`), run `cis490-doctor` again. The most
-common silent failures it catches:
-
-- `*.wg` DNS missing (wg-enroll provisions it; manual workaround is
-  one line in `/etc/hosts`)
-- mTLS cert chain not installed under `/etc/cis490/certs/`
-- `cis490-shipper` service inactive (forgot step 4)
-- `qemu-system-x86_64` not on PATH
-
-`cis490-doctor --json` is machine-readable for use by other agents.
-
-## Shipper says "400 missing" or "412 commit-rejected": pull and reinstall
-
-If `journalctl -u cis490-shipper` shows a steady stream of
-`-> fatal (400)` or `-> 412 commit-rejected` lines, the receiver is
-rejecting episodes because their `meta.json::code_version.commit`
-isn't in the receiver's allow-list (or isn't being sent at all). This
-happens when this lab host is running code older than the receiver
-will accept.
-
-The fix is always the same — pull main and re-run the installer:
-
-```sh
-cd /opt/cis490
-sudo -u cis490 git pull origin main
-sudo /opt/cis490/scripts/install-lab-host.sh
-```
-
-`install-lab-host.sh` does the rest:
-1. Re-stamps `/opt/cis490/VERSION` to the new HEAD.
-2. Drains pre-stamp episodes via
-   `tools/quarantine_unstamped.py` so the queue stops looping on
-   them. Drained episodes go to `/var/lib/cis490/data/quarantine/`
-   with a `quarantine_reason.json` per-episode for triage.
-3. Restarts `cis490-shipper` and `cis490-orchestrator` so the new code
-   takes effect.
-
-Do **not** disable the shipper to silence the log noise — once a host
-has the new code, traffic resumes immediately. Do **not** mint a fake
-`code_version` field in old episodes to bypass the gate; that field
-exists specifically to keep buggy pre-fix data out of the training
-index.
-
-If the receiver is rejecting *new* episodes too (you've pulled and
-restarted, but still see 412), the receiver's allow-list window may
-not yet include your commit — wait 5s for its Forgejo refresh, or
-push your commit to `origin/main` first if you're testing
-unmerged work.
-
-## Tier 3 + Tier 4 deploy (zero-touch via install-lab-host.sh)
-
-`install-lab-host.sh` runs Tier-3 deploy automatically on its second
-pass (after the mTLS cert lands). No operator interaction is needed:
-metasploit-framework auto-installs via the Rapid7 omnibus, the
-Metasploitable2 image auto-fetches from a public mirror with TOFU
-sha256 pinning, the host-only bridge auto-comes-up, and a live
-exploit fire is verified before the script returns.
-
-To re-run the deploy by hand or on a host where Tier 3 was skipped:
-
-```sh
-sudo /opt/cis490/scripts/install-tier-3-4.sh
-```
-
-It's idempotent — re-running on an already-deployed host is a no-op
-except for the verify step. Inputs are all optional env vars:
-
-| var | effect |
-|---|---|
-| `SKIP_VERIFY` | skip the live `vsftpd_234_backdoor` smoke run |
-| `SKIP_BRIDGE` | skip `br-malware` setup (limits to 2 of 5 modules) |
-| `SKIP_TIER4` | skip the Tier-4 auto-fetch (DEPRECATED — leaves you with mimic-only data, defeats the project) |
-
-The fleet runner auto-detects Tier-3 readiness via
-`orchestrator/fleet.py::_msfrpcd_available()`. Once
-`cis490-msfrpcd.service` is up and `metasploitable2.qcow2` is on
-disk, the next wave produces Tier-3 episodes (`meta.exploit.module_name`
-populated). No orchestrator restart is required, but a restart speeds
-up the switch.
-
-### Tier-4 (real malware execution) is mandatory, fully automated
-
-**Real-binary episodes are the project's training target — Tier-4 is
-NOT optional.** A lab-host deploy that lands without real samples
-fails loudly; mimic-only data does not answer the research question.
-
-There is **no operator step**. No API key, no signup, no manual
-provisioning. `install-tier-3-4.sh` runs `tools/auto_fetch_samples.py`
-which:
-
-1. Clones (or pulls) `theZoo` from
-   `https://github.com/ytisf/theZoo` to `/var/lib/cis490/theZoo`
-   (~500 MB shallow clone, public, GPL-3.0, security-research repo)
-2. For each `[[sample]]` in `manifest.toml` without a sha256, locates
-   a directory in `theZoo/malware/Binaries/` whose name matches
-   the entry's `family` (case-insensitive substring + prefix priority)
-3. Extracts the password-protected `.zip` (well-known password
-   `infected`)
-4. Picks the largest non-text payload as the binary, computes its
-   sha256, copies to `/opt/cis490/samples/store/<sha256>`
-5. Rewrites `manifest.toml` in place, atomically (tempfile +
-   `os.replace` preserving stat), adding `source = "theZoo"`,
-   `sha256 = "<hex>"`, and the upstream URL
-
-If `auto_fetch_samples.py` lands zero binaries (theZoo layout drift,
-git clone failure, or a family has no matching directory),
-`install-tier-3-4.sh` exits non-zero. **No silent mimic-only fallback.**
-
-The orchestrator's next selection that picks a sample with
-`kind == "real"` runs the real binary via the chunked-upload path
-(`exploits.driver._resolve_workload`). The mimic profile remains the
-fallback for episodes that select a sample whose binary isn't on
-disk. Trainers filter on `meta.sample.kind ∈ {"real", "mimic"}`.
-
-### Confirm Tier 3+4 are flowing
-
-```sh
-# On the Pi maintainer side:
-sudo python3 -c "
-import json, glob, subprocess, tarfile, io
-from collections import Counter
-mods = Counter(); kinds = Counter()
-for tar in glob.glob('/var/lib/cis490/episodes/*/*.tar.zst'):
-    z = subprocess.check_output(['zstd','-q','-d','--stdout',tar],stderr=subprocess.DEVNULL)
-    with tarfile.open(fileobj=io.BytesIO(z)) as t:
-        for m in t.getmembers():
-            if m.name.endswith('meta.json') and m.isfile():
-                meta = json.load(t.extractfile(m))
-                mods[(meta.get('exploit') or {}).get('module_name','<none>')] += 1
-                kinds[(meta.get('sample') or {}).get('kind','<none>')] += 1
-                break
-print('exploit modules used:', dict(mods))
-print('sample kinds:', dict(kinds))
-"
-```
-
-If `mods` is `{'<none>': N}` and `kinds` is `{'mimic': N}`, Tier 3
-hasn't kicked in yet on any lab host — re-run
-`install-tier-3-4.sh` there. If `mods` shows
-`{'vsftpd_234_backdoor': N, ...}` and `kinds` shows a non-zero
-`'real'` count, both tiers are live.
-
-### Don't shortcut
-
-- DO NOT install `metasploit-framework` system-wide outside
-  `install-msfrpcd.sh`. The script wires the systemd unit + creds;
-  a manual install bypasses the orchestrator's
-  `_msfrpcd_available()` probe.
-- DO NOT add bogus sha256 entries to `manifest.toml` —
-  `auto_fetch_samples.py` hash-verifies every binary it stages.
-- DO NOT add real-binary entries by hand when `auto_fetch_samples.py`
-  exists. Hand-edits are racy with the auto-fetcher's tempfile
-  rewrite.
+Once that's true, `systemctl enable --now cis490-shipper
+cis490-orchestrator` brings the host online. The orchestrator runs
+the canonical experiment; the shipper PUTs sealed episodes to the
+receiver. Episodes that don't pass the acceptance gate go to
+`data/rejected/<id>/` locally and are never shipped.
 
 ## Securing the connection (mTLS) — DO NOT mint your own certs
 
-The lab-host ↔ Pi connection is mTLS over WireGuard. **The cert
-delivery is fully automated.** You should never run `openssl`, write
-a CSR, edit a Caddyfile, or generate a private key on the lab host.
-If you find yourself doing any of that, you're off the runbook.
+The lab-host ↔ Pi connection is mTLS over WireGuard. Cert delivery
+is automated via `bootstrap.wg/v1/cert/<host_id>`. You should never
+run `openssl`, write a CSR, edit a Caddyfile, or generate a private
+key on the lab host. If you find yourself doing any of that, you're
+off the runbook.
 
-**The actual cert flow:**
-
-1. The lab host comes up on WireGuard via `wg-enroll` (USB-driven,
-   one-time, separate project). After this, the lab host can reach
-   `bootstrap.wg` and `collector.wg` on the `10.100.0.0/24` overlay.
-2. `scripts/install-lab-host.sh`, on its way through, pulls the leaf
-   cert + CA bundle from `https://bootstrap.wg/v1/cert/<host_id>`
-   over plain TLS (CA bundled in `etc/caddy-root.crt`). Trust
-   boundary is "this peer is on the WG mesh" — `iptmonads` already
-   gates the bootstrap port to enrolled peers.
-3. The fetch step is a no-op if `host_id` is still the default
-   `REPLACE_ME` in `/etc/cis490/lab-host.toml`. **This is the most
-   common reason agents think cert delivery is broken.**
-
-**The one fix that resolves 95 % of "cert/TLS/connection" reports:**
-
-```sh
-# 1. Make sure host_id is set:
-sudo grep '^host_id' /etc/cis490/lab-host.toml
-# If it says "REPLACE_ME", edit it to the real host_id you registered.
-
-# 2. Re-run the installer. It will fetch the cert from bootstrap.wg.
-sudo /opt/cis490/scripts/install-lab-host.sh
-
-# 3. Confirm certs landed:
-ls -l /etc/cis490/certs/   # expect lab-host.pem, lab-host.key, wg-ca.pem
-
-# 4. Smoke-test the pipe:
-sudo -u cis490 /opt/cis490/.venv/bin/python -m shipper \
-    --config /etc/cis490/lab-host.toml --ping
-# {"ok": true, ...}  → done.
-```
-
-If step 2 prints `WARN: bootstrap.wg fetch failed`, the cause is
-almost always one of:
-
-- `bootstrap.wg` DNS not resolving → add to `/etc/hosts`:
-  `echo '10.100.0.1 bootstrap.wg collector.wg' | sudo tee -a /etc/hosts`
-- `wg0` interface not up → `sudo wg show` should list a peer; if not,
-  re-run wg-enroll.
-- The Pi's `cis490-bootstrap.service` is down → file an issue against
-  the receiver-side host, not against this repo.
-
-**What you should NOT do, even if it feels like it would help:**
-
-- Generate certs with `openssl` or `step-cli` on the lab host.
-- Copy certs from another lab host.
-- Set `verify_tls = false` in `lab-host.toml` to "skip TLS for now."
-- Restart the shipper repeatedly hoping it self-heals — it already
-  retries on every request without restart.
-- File a Forgejo issue titled "shipper can't connect" without first
-  running the four-line block above and pasting its output.
+The most common reason cert fetch appears to fail is `host_id` still
+being `REPLACE_ME` in `/etc/cis490/lab-host.toml`. Check that first.
 
 The shipper's `waiting on mTLS material` log line is **expected**
-during first-boot until the cert lands. It is not an error to fix.
-The transport rebuilds the SSL context on each request, so the
-moment certs land in `/etc/cis490/certs/`, the next ping/ship
-attempt succeeds — no restart needed.
+during first boot until the cert lands. It is not an error. The
+transport rebuilds the SSL context on each request, so the moment
+certs land in `/etc/cis490/certs/`, the next attempt succeeds — no
+restart needed.
 
-## Common bring-up gotchas (read this before debugging an install)
+## Filing issues
 
-Smaller models acting as on-device agents have hit these traps. Each
-one is now fixed in main, but if you're on an older clone you may
-still see the symptom — pull `origin/main` first, then re-read.
+When you run into an issue you cannot fully resolve in the current
+turn, file it as a Forgejo issue on the relevant repo. Do not
+silently log a TODO comment, leave a partial workaround, or assume
+someone else will remember.
 
-### Run tools from `/opt/cis490`, not from a manual clone
+File issues for:
+- A build / test / typecheck failure you can't fix in scope.
+- A bug you discover but aren't tasked with fixing.
+- A missing dep, missing config, or env-only failure that blocks
+  E2E.
+- A design gap you've worked around but want a follow-up to fix
+  properly.
 
-When you run `cis490-doctor` from a clone like `~/.env/CIS490/`,
-Python prepends the clone path to `sys.path`. Subprocesses spawned
-by the doctor (e.g., `python -m shipper --ping`) inherit the calling
-CWD and pick up the clone's `shipper/` package instead of the
-service venv at `/opt/cis490/`. Symptom: tracebacks reference the
-clone path, or `No module named exploits` despite `package = false`.
+Don't file when:
+- The user is in the conversation and you can just tell them.
+- It's already filed (search first:
+  `GET /api/v1/repos/<owner>/<repo>/issues?state=open&q=<keyword>`).
+- It's truly a non-issue (a one-line edit you're about to make this
+  same turn).
 
-**Fix already in main:** the doctor passes `cwd=/opt/cis490` to the
-shipper subprocess and inserts `repo_root` into `sys.path` itself.
-**Operator action:** always invoke either as
-`/opt/cis490/.venv/bin/python /opt/cis490/tools/cis490_doctor.py`
-or via `cd /opt/cis490 && ./tools/cis490_doctor.py`. Don't run from a
-clone unless you know what you're doing.
-
-### Shipper logs "waiting on mTLS material" — this is expected, not a bug
-
-The `cis490-shipper` unit is enabled by `install-lab-host.sh` *before*
-the Pi has issued the host's mTLS leaf. The transport pre-flights the
-configured `ca_bundle` / `client_cert` / `client_key` paths and, if
-any are missing, defers building the SSL context. You'll see one
-warning per process lifetime:
-
-```
-shipper waiting on mTLS material (client_cert path missing: …); will retry each request
-```
-
-The unit stays up. Each ping/ship attempt re-tries the build. Once
-the Pi runs `deploy-cis490-cert.sh <host_id> <wg_ip>` and the leaf
-lands at `/etc/cis490/certs/`, the next request succeeds and the
-transport logs `mTLS material now on disk; shipper transport ready`.
-
-**Do not** try to "fix" the warning by restarting the unit, deleting
-the config, or hand-rolling certs — just confirm the Pi-side step
-ran and wait one scan interval.
-
-### Outdated clone? Pull main first.
-
-A long list of install-time bugs (cp self-copy, missing service
-restart, fatal-loop quarantine, ca_bundle pointing at the wrong
-chain, busybox pgrep flags, pycdlib in the wrong dep group, missing
-vm/images/ symlink target, doctor sys.path) have been fixed and are
-all resolved in main. **If you hit any "this used to work" symptom
-on a host that hasn't pulled in a while, the canonical command is
-always the same:**
-
-```sh
-cd /opt/cis490 && sudo -u cis490 git pull origin main && \
-    sudo /opt/cis490/scripts/install-lab-host.sh
-```
-
-That one command:
-
-- Re-stamps `/opt/cis490/VERSION` so episodes get a valid
-  `code_version.commit` — required by the receiver's gate.
-- Drains pre-stamp episodes from `data/episodes/` to
-  `data/quarantine/` via `tools/quarantine_unstamped.py` so the queue
-  stops looping on them.
-- Runs `daemon-reload` and `systemctl restart cis490-shipper
-  cis490-orchestrator` so the live daemons pick up the new code
-  (a bare `git pull` does NOT do this — Python module objects in the
-  running process are frozen at last service start).
-- Re-runs the Tier-3+4 deploy idempotently if the cert is on disk.
-
-After it returns, the shipper will be running as `Type=notify` with
-`WatchdogSec=180` — systemd kills + restarts it if a scan pass hangs.
-
-### The classifier is multi-source — don't gut episodes on /proc alone
-
-`tools/prune_episodes.py` cross-checks four telemetry sources before
-flagging an episode as flat:
-
-- `telemetry-proc.jsonl` — host qemu-system /proc CPU%
-- `netflow.jsonl` — bridge_pcap byte counters (network profiles)
-- `telemetry-qmp.jsonl` — virtio blockstats per-phase delta (io-walk,
-  ransomware-shape)
-- `telemetry-guest.jsonl` — in-guest agent load_1m (low-and-slow,
-  any host with a working agent)
-
-An episode flags as `flat-cpu` only when EVERY available source
-shows no inter-phase variation. If `/proc` is flat but qmp blockstats
-show 90 MB written during `infected_running`, the episode is kept —
-the host /proc collector loses signal under contention but qmp sees
-through. This is essential on laptop-class lab hosts (e.g.
-elliott-thinkpad) where the guest is co-scheduled with 13 other VMs
-and the per-VM /proc CPU% gets buried.
-
-All four sources stamp `t_wall_ns`; phase mapping uses that, not
-`t_mono_ns`, because /proc and labels are orchestrator-relative
-while netflow/guest are wall-clock-anchored. If you add a new
-collector, emit `t_wall_ns` from CLOCK_REALTIME on every row or your
-data will silently bucket into "(pre)".
-
-### Don't trust the in-guest probe alone — cross-check host CPU
-
-The `pre_kill_probe.yes` / `pre_kill_probe.sh` fields in
-`workload_killed` events are produced by `pgrep` running inside an
-Alpine guest. busybox's pgrep does NOT support the `-c` flag. Older
-versions of `VMLoadController._probe()` used `pgrep -c yes`, which
-exits 1 with a usage banner on busybox; the `|| echo 0` fallback then
-always reported `yes=0` regardless of whether the workload was
-running. This caused 244 episodes from `elliott-thinkpad` and
-`k-gamingcom` to be incorrectly labelled `workload-silent`.
-
-The fix landed in main (probe now uses `pgrep yes | wc -l`); episodes
-shipped after that commit have correct probe values. For older
-episodes still on disk, the prune classifier now requires `flat-cpu`
-(host-side CPU envelope confirms no signal) AND the probe to flag
-workload-silent — a probe-only zero is no longer trusted. So you can
-safely run `cis490-prune --archive` against the existing data without
-losing valid episodes.
-
-If you write any new in-guest diagnostic that runs commands via
-SerialClient, assume busybox/ash semantics: no `disown` builtin, no
-GNU `pgrep -c`, no bash `/dev/tcp`, no `[[ ]]`. Always pair an
-in-guest signal with the host-side `/proc` measurement before you
-declare an episode bad.
-
-### One traceback at a time
-
-When the doctor lights up multiple red rows, fix the topmost one and
-re-run rather than batching attempts. Each red row prints the exact
-operator command it expects you to run. Don't paraphrase or invent
-adjacent commands; the doctor is the source of truth for what's
-missing.
-
-## How an agent generates data on demand (without waiting for the timer)
-
-```sh
-# One labeled episode (90 s) with a chosen sample profile:
-sudo -u cis490 /opt/cis490/.venv/bin/python \
-    /opt/cis490/tools/run_real_vm_demo.py \
-    --data-root /var/lib/cis490/data \
-    --sample mirai-class-bot
-
-# Force the shipper to run one pass:
-sudo systemctl start cis490-shipper.service   # (if disabled)
-# or:
-sudo -u cis490 /opt/cis490/.venv/bin/python -m shipper \
-    --config /etc/cis490/lab-host.toml --once
-
-# Confirm on the Pi:
-ssh <pi> 'sudo tail -3 /var/lib/cis490/index.jsonl'
-```
-
-Any tool that wants to *count* what's been shipped:
-
-```sh
-ssh <pi> 'sudo /opt/cis490/.venv/bin/python /opt/cis490/tools/index_reader.py \
-    --index /var/lib/cis490/index.jsonl --count-by host_id'
-```
-
-## Quick test for "is the pipe alive at all"
-
-The `--ping` mode of the shipper does an authenticated POST to
-`/v1/ping` and writes nothing to the database. Use it as the
-smallest E2E test:
-
-```sh
-sudo -u cis490 /opt/cis490/.venv/bin/python -m shipper \
-    --config /etc/cis490/lab-host.toml --ping
-# {"ok": true, "status_code": 200, ...} on success.
-# Any other status is the operator's signal to read the error string.
-```
-
----
-
-## File an issue when you hit a problem
-
-**When you run into an issue you cannot fully resolve in the current
-turn, file it as a Forgejo issue on the relevant repo before moving
-on.** Do not silently log a TODO comment, leave a partial workaround,
-or assume someone else will remember. The issue tracker is the
-durable record.
-
-This applies to:
-
-- a build / test / typecheck failure you can't fix in scope
-- a bug you discover but aren't tasked with fixing
-- a missing dep, missing config, or env-only failure that blocks E2E
-- a design gap you've worked around but want a follow-up to fix
-  properly
-- a scope-out you made (e.g. "deferred Tier 4 sample fetch") that
-  needs an owner so it doesn't get lost
-
-Don't file an issue when:
-
-- the user is in the conversation and you can just *tell* them
-- it's already filed (search first: `GET /api/v1/repos/<owner>/<repo>/issues?state=open&q=<keyword>`)
-- it's truly a non-issue (a one-line edit you're about to make this
-  same turn)
-
-## How to file (Forgejo API)
-
-The local Forgejo at `http://10.100.0.1:3000` accepts API calls with a
-token-bearer header:
+### How to file (Forgejo API)
 
 ```sh
 curl -s -X POST \
@@ -552,19 +134,19 @@ curl -s -X POST \
 The token comes from the user's session — never embed one in code or
 commits.
 
-### What a good issue body contains
+### Good issue body
 
 1. **Context** — one sentence on what was being attempted.
-2. **What happened** — the actual error, log line, or unexpected
-   behavior. Paste exact output.
+2. **What happened** — the actual error or unexpected behavior. Paste
+   exact output.
 3. **What was tried** — every workaround you attempted and why it
    didn't stick.
 4. **Suggested next step** — the smallest change that would resolve
-   it, if you have a guess. "Unknown" is a fine answer.
+   it, if you have a guess. "Unknown" is fine.
 5. **Related** — link the commit / PR / file:line where the issue
    surfaced.
 
-### What a good title looks like
+### Good titles
 
 | Bad | Good |
 |---|---|
@@ -572,25 +154,22 @@ commits.
 | `caddy thing` | `Caddy: client_auth requires absolute path; relative trusted_ca_cert_file silently fails` |
 | `fix later` | `shipper: 5xx backoff cap is 5min, doc says 1min — pick one` |
 
-## After filing
-
-- Reference the issue number in the next commit message:
-  `Refs spectral/<repo>#<n>` or `Closes spectral/<repo>#<n>` if your
-  current change actually fixes it.
-- If the issue is on a different repo than the one you're committing
-  to, fully qualify: `spectral/wg-pki#3`.
+After filing, reference the issue in the next commit message:
+`Refs spectral/<repo>#<n>`, or `Closes spectral/<repo>#<n>` if the
+change fixes it. Fully qualify cross-repo refs: `spectral/wg-pki#3`.
 
 ## Other conventions
 
-- **Don't put off the hard parts.** Frame "deferred-with-reason" only
+- **Don't put off the hard parts.** "Deferred-with-reason" is only
   for genuine blockers (binary not present on this machine, external
   service unreachable). For anything you *could* do but find awkward
-  — bridge setup, cross-arch quirks, fleet concurrency — do it. The
-  user has flagged this twice when work was scoped down prematurely.
-  When something genuinely is blocked by an operator artifact, file
-  the Forgejo issue and *automate the bring-up* (e.g., installer
-  script + sha256-verifying fetcher) so the moment the artifact lands
-  it Just Works.
+  — bridge setup, cross-arch quirks, fleet concurrency — do it.
+- **No architectural bandaids in the pipeline.** Compensating layers
+  (auto-update timers, fix-yourself decision trees, prescriptive
+  symptom→command tables, trainer-side prune scripts that paper over
+  silent collectors) are not allowed in the data-collection /
+  emulation / labelling path. Fix the producer instead. See
+  PIPELINE.md.
 - **Naming:** never coin USB / device / service names on the user's
   behalf. Ask first. Reusing an old name is especially bad.
 - **`/etc` configs:** `Read` first, copy second. Never overwrite a
diff --git a/FIXYOURSELF.md b/FIXYOURSELF.md
deleted file mode 100644
index ae41593..0000000
--- a/FIXYOURSELF.md
+++ /dev/null
@@ -1,249 +0,0 @@
-# FIXYOURSELF.md — when the lab host can't unstick itself automatically
-
-**You are reading this because you (an on-device AI agent) are running
-on a CIS490 lab host and one of these is true:**
-
-- `cis490-autoupdate.timer` keeps failing.
-- The shipper logs show steady `412 commit-rejected` or `400 missing`.
-- `cis490-doctor --role lab-host` flagged you red.
-
-The normal "pull and reinstall" path (AGENTS.md row 1) sometimes can't
-recover on its own — most often because someone made a local commit
-on this host, or `/opt/cis490` got into a state `git pull --ff-only`
-refuses to fix. This doc is the explicit decision tree for those
-cases. Run the commands literally. Do not paraphrase.
-
----
-
-## Step 1 — Identify your state
-
-Run all of these. Save the output. Match the result against the
-table.
-
-```sh
-cd /opt/cis490
-sudo -u cis490 git rev-parse HEAD            # → LOCAL
-sudo -u cis490 git rev-parse origin/main 2>&1 # → REMOTE (or error)
-sudo -u cis490 git fetch origin main 2>&1    # → fresh fetch result
-sudo -u cis490 git log -1 --format='%H %ci %s'  # → what HEAD is
-sudo -u cis490 git status --porcelain        # → uncommitted changes?
-```
-
-Then:
-
-| If you see… | Your state | Go to |
-|---|---|---|
-| LOCAL == REMOTE and `git status` empty | Not stuck — run §Z to be safe | §Z |
-| LOCAL != REMOTE and `git merge-base --is-ancestor HEAD origin/main` returns 0 | Behind main, no local commits | §A |
-| LOCAL != REMOTE and the merge-base check returns 1 (NON-zero) | **You have a local commit not on origin/main** | §B |
-| `git fetch` prints a network error | Connectivity broken | §C |
-| `/opt/cis490/.git` is missing | No git checkout — populated via `cp -aT` originally | §D |
-| `git status` shows tracked files modified | Uncommitted edits on this host | §E |
-
-If multiple match: §C blocks everything else (fix network first), then
-§D, then §E, then §B, then §A.
-
----
-
-## §A — Behind main, clean tree
-
-```sh
-cd /opt/cis490
-sudo -u cis490 git pull --ff-only origin main
-sudo /opt/cis490/scripts/install-lab-host.sh
-```
-
-`install-lab-host.sh` re-stamps VERSION, drains the pre-stamp queue,
-and restarts the daemons. Verify with §Z.
-
----
-
-## §B — You have a local commit not on origin/main
-
-This is the elliott-thinkpad case (2026-05-01..02). You committed
-something locally, the maintainer's `origin/main` doesn't have it,
-and the receiver's allow-list rejects every episode you ship. Pick
-ONE of B.1, B.2, B.3 — read all three first.
-
-### B.1 — Push your commit to origin/main (RECOMMENDED if your change is real)
-
-```sh
-cd /opt/cis490
-sudo -u cis490 git log -3 --stat HEAD      # what you're about to push — read it
-sudo -u cis490 git push origin HEAD:main
-```
-
-If `git push` succeeds: the receiver allow-list picks it up within 5
-seconds, new episodes start landing immediately. Verify with §Z.
-
-If `git push` fails with "permission denied" or "auth required": you
-don't have push credentials. Skip to B.2 or B.3.
-
-If `git push` fails with "non-fast-forward": your local commit
-diverged AROUND origin/main (you committed against an older base).
-Try:
-
-```sh
-sudo -u cis490 git pull --rebase origin main
-# If the rebase has conflicts you can't resolve in 60 seconds, abort:
-#   sudo -u cis490 git rebase --abort
-# Then go to B.2 or B.3.
-sudo -u cis490 git push origin HEAD:main
-```
-
-### B.2 — Discard your local commit, reset to origin
-
-⚠ **This deletes your local commit.** Only do this if you know it
-wasn't a real fix or the maintainer told you to discard it.
-
-```sh
-cd /opt/cis490
-sudo -u cis490 git fetch origin main
-sudo -u cis490 git log --oneline HEAD ^origin/main  # what you're about to lose
-sudo -u cis490 git reset --hard origin/main
-sudo /opt/cis490/scripts/install-lab-host.sh
-```
-
-### B.3 — Stop, file an issue, wait
-
-If you can't decide between B.1 and B.2 (e.g. you don't know if your
-commit is a real fix), do this:
-
-```sh
-cd /opt/cis490
-HOST_ID=$(grep '^host_id' /etc/cis490/lab-host.toml | cut -d'"' -f2)
-LOCAL_SHA=$(sudo -u cis490 git rev-parse HEAD)
-DETAIL=$(sudo -u cis490 git log -3 --stat HEAD | head -100)
-
-# File the issue (replace <TOKEN> with the operator's Forgejo token —
-# do NOT embed yours in commits)
-curl -sS -X POST \
-  -H "Authorization: token <TOKEN>" \
-  -H "Content-Type: application/json" \
-  http://10.100.0.1:3000/api/v1/repos/spectral/CIS490/issues \
-  -d "$(python3 -c "import json,os; print(json.dumps({
-    'title': f\"$HOST_ID: stuck on local commit ${LOCAL_SHA:0:12}\",
-    'body':  f\"### What's at HEAD\n\n\`\`\`\n${DETAIL}\n\`\`\`\n\nNeed maintainer to choose: push HEAD to main, or reset --hard origin/main here?\"
-  }))")"
-```
-
-Then leave the daemons running. The shipper will keep auto-quarantining
-the 412s — backlog grows but doesn't crash anything. Wait for a
-maintainer comment.
-
----
-
-## §C — Network broken
-
-```sh
-ping -c 1 10.100.0.1                # the Pi
-sudo wg show                         # is wg0 up?
-sudo systemctl restart wg-quick@wg0  # bring it back up
-sudo systemctl restart cis490-shipper cis490-orchestrator
-```
-
-If `ping 10.100.0.1` still fails after a `wg-quick` restart, this is
-a WireGuard / wg-enroll / iptmonads problem outside this repo. File
-an issue at `spectral/wg-enroll` or `spectral/iptmonads` and stop.
-
----
-
-## §D — `/opt/cis490/.git` missing
-
-The host was originally set up with `cp -aT` (no `.git/`). That makes
-auto-update impossible. Re-clone:
-
-```sh
-# Stop services so we don't race with the orchestrator mid-episode
-sudo systemctl stop cis490-shipper cis490-orchestrator
-
-# Preserve config/data — only /opt/cis490 (the code) gets replaced.
-# /etc/cis490/ and /var/lib/cis490/ are NOT touched.
-sudo mv /opt/cis490 /opt/cis490.pre-fix
-sudo git clone http://maxgit.wg:3000/spectral/CIS490.git /opt/cis490
-sudo chown -R cis490:cis490 /opt/cis490
-
-sudo /opt/cis490/scripts/install-lab-host.sh
-# Once verified, you can drop the backup:
-#   sudo rm -rf /opt/cis490.pre-fix
-```
-
----
-
-## §E — Uncommitted edits on tracked files
-
-```sh
-cd /opt/cis490
-sudo -u cis490 git status --short    # see what's modified
-sudo -u cis490 git diff              # see exactly what changed
-```
-
-If the changes are intentional (e.g. you fixed a bug), commit them
-first and then go to §B:
-
-```sh
-sudo -u cis490 git add <files>
-sudo -u cis490 git commit -m "<short description>"
-# Now go to §B.
-```
-
-If the changes are accidental / left over from debugging, discard
-them:
-
-```sh
-sudo -u cis490 git checkout -- .
-# Now go to §A.
-```
-
----
-
-## §Z — Verify you're unstuck
-
-```sh
-# 1. Daemons up?
-systemctl is-active cis490-shipper cis490-orchestrator
-# Both should say "active".
-
-# 2. VERSION present and matches HEAD?
-cat /opt/cis490/VERSION
-sudo -u cis490 git -C /opt/cis490 rev-parse HEAD
-# The "commit" in VERSION should equal the rev-parse output.
-
-# 3. Recent ship results — should NOT be 412/400-fatal-only:
-sudo journalctl -u cis490-shipper --since '2 minutes ago' | tail -20
-
-# 4. Maintainer-side confirmation (if you can reach the Pi):
-#   sudo tail -5 /var/lib/cis490/index.jsonl
-# Should show recent received_at_wall timestamps with your host_id.
-
-# 5. Run the doctor — every red row prints its own fix:
-/opt/cis490/.venv/bin/python /opt/cis490/tools/cis490_doctor.py --role lab-host
-```
-
----
-
-## Hard rules (don't break these)
-
-These are repeated from AGENTS.md because the situations covered in
-this doc are exactly when an agent is most tempted to break them.
-
-- **Do NOT** edit `meta.json` files to add a fake `code_version`. The
-  gate exists to keep buggy data out of the index.
-- **Do NOT** disable / mask / `systemctl stop` `cis490-shipper` to
-  silence log noise. Fix the cause.
-- **Do NOT** set `verify_tls = false` in `lab-host.toml`. That's a
-  bypass, not a fix.
-- **Do NOT** `git commit` on `/opt/cis490` to "make the gate
-  happy" — committing without pushing is exactly how the
-  elliott-thinkpad incident started. If you commit, push (§B.1) or
-  don't commit at all.
-- **Do NOT** delete `/var/lib/cis490/data/quarantine/`. Those are
-  forensic records of what the gate rejected; they're capped at 30
-  days by the cleanup pass.
-- **Do NOT** clobber `/etc/cis490/certs/`. Cert delivery is
-  automated; rerun `install-lab-host.sh` if certs are missing.
-
-If you find yourself wanting to do any of the above, stop and file
-an issue (§B.3 has the curl command). The maintainer would much
-rather resolve a stale lab host by reading an issue than by
-reverse-engineering what an agent did to escape a stuck state.
diff --git a/PIPELINE.md b/PIPELINE.md
new file mode 100644
index 0000000..04d7949
--- /dev/null
+++ b/PIPELINE.md
@@ -0,0 +1,900 @@
+# PIPELINE.md — the CIS490 generative pipeline honesty plan
+
+**This document is canonical.** It supersedes any guidance in
+`AGENTS.md`, `FIXYOURSELF.md`, `README.md`, or other repo docs that
+contradicts it. If another doc says something different, this doc wins
+and the other doc is wrong (file an issue or fix it).
+
+This is not an architecture overview. This is a fix list. Read it,
+implement it, do not split it into phases.
+
+**Before proposing any change to the pipeline, re-read §1, §7, and §8
+and run your proposal against §8's checklist.** Then proceed.
+
+---
+
+## 1. Principle
+
+Every episode that reaches the dataset must be ground-truth. Every
+host runs the same experiment with the same configured catalog. Every
+exploit module and every collector in the catalog has been proven to
+work end-to-end before it is eligible to run. There are no
+compensating layers — no auto-update timers that drag stale peers
+forward, no "fix-yourself" decision trees, no per-host divergence
+absorbed by trainer-side filters, no labels written by clock when the
+event they describe didn't happen.
+
+If a host can't meet the bar, it produces zero episodes and says so
+loudly. A small honest dataset beats a large dishonest one.
+
+**Default to removal, not addition.** If a problem can be fixed by
+deleting code or removing a layer, prefer that. Adding a layer is
+the suspect default and should be justified against §7 and §8 before
+proceeding.
+
+---
+
+## 2. What the experiments are for
+
+CIS490 trains a behavioral malware-detection model. The dataset is
+the ground-truth labelled record of what the host looked like during
+known-clean, known-armed, known-infecting, and known-infected phases
+of a real exploit chain against a real target service. The model
+learns to distinguish those phases from in-deployment
+behavior. **Every dishonest label is a poisoned training example.**
+
+This is why the producer's job is not "ship lots of episodes." It is
+"ship episodes whose labels are true."
+
+---
+
+## 3. What is currently broken (evidence)
+
+Numbers from the 200-episode quality probe on 2026-05-03:
+
+1. **Labels lie.** 0 of 67 Tier-3 exploit fires resulted in a
+   `session_open` event. All 67 logged `session_open_timeout`. Yet
+   every one of those 67 episodes is labelled
+   `phase=infected_running` because the schedule-driven labeller
+   transitions on a clock, not on observed events. The
+   `infected_running` label in the dataset means "the schedule said
+   so," not "an attacker session was actually open on this host."
+2. **Collectors are silent.**
+   - `perf` produces 0 rows on 100% of episodes on both hosts.
+   - `guest-agent` produces 0 rows on 100% of episodes on both hosts.
+   - `qmp`, `netflow`, and `pcap` produce 0 rows on 100% of
+     k-gamingcom episodes (different config from elliott).
+   - The host `tcpdump` is missing on k-gamingcom; `pcap_unavailable`
+     is logged then ignored.
+3. **The catalog is unverified.** Modules are added to the rotation
+   without a per-module verification that the module actually lands a
+   session against its declared target. `samba_usermap_script` has a
+   100% failure rate against the configured Metasploitable2 target
+   and was still in the rotation.
+4. **Hosts run divergent experiments.** elliott and k-gamingcom have
+   different per-host manifests, different collector coverage,
+   different qemu invocations. The dataset is a union of two
+   different experiments, not 200 samples from one.
+5. **Working trees are dirty.** 200/200 episodes report `dirty=true`,
+   so `code_version.commit` is unverifiable provenance.
+
+Each of these is a failure of the producer. Receiver-side filtering
+and trainer-side prune scripts are bandaids that hide them.
+
+---
+
+## 4. The fix — line items
+
+Every item below must land. They are not phases. They are parts of
+one cohesive correctness story; any of them missing leaves the
+pipeline half-honest. Each item names its acceptance test.
+
+### 4.1 Canonical manifest
+
+There is exactly one manifest, version-pinned in the repo at
+`manifest.toml`. Every lab host loads the same manifest. There is no
+per-host manifest override, no per-host collector enable/disable
+flag, no per-host qemu argument list. Hosts that cannot run the
+canonical manifest exit 78 at orchestrator startup.
+
+**Acceptance:** `find . -name manifest.toml -not -path './.git/*'`
+returns exactly one path. There is no `--manifest` CLI flag on the
+orchestrator that takes a different path; the path is hard-coded.
+Removing this line item would re-create the host divergence we just
+exited.
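
Editor's sketch, not part of the commit: the `find` acceptance check above could also be expressed in Python so it can run inside the test suite. The function name `find_manifests` and the throwaway directory layout are assumptions for illustration; only the file name `manifest.toml` and the top-level `.git/` exclusion come from the text.

```python
import tempfile
from pathlib import Path


def find_manifests(repo_root):
    """List every manifest.toml under repo_root, skipping the top-level .git/."""
    root = Path(repo_root)
    return sorted(
        p
        for p in root.rglob("manifest.toml")
        if p.relative_to(root).parts[0] != ".git"
    )


# Demo on a throwaway tree: one canonical manifest plus a stray per-host copy.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "manifest.toml").write_text("# canonical\n")
    (root / "hosts").mkdir()
    (root / "hosts" / "manifest.toml").write_text("# per-host override\n")
    (root / ".git").mkdir()
    (root / ".git" / "manifest.toml").write_text("# ignored\n")
    found = [p.relative_to(root).as_posix() for p in find_manifests(root)]

# Anything other than exactly one path fails the 4.1 acceptance test.
violates_4_1 = len(found) != 1
```

In this demo tree the stray `hosts/manifest.toml` makes the check fail, which is the divergence §4.1 is meant to rule out.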
+
+### 4.2 Target VMs we build, not VMs we fetch
+
+Every target VM image is built from a declarative spec checked into
+the repo (Packer, mkosi, debootstrap, whatever — declarative). The
+image build produces a sha256-pinned artifact. The build script
+verifies, before producing the artifact, that:
+
+- The vulnerable service is up after first boot.
+- The service is on the port the module catalog declares.
+- The service version matches the version the module catalog
+  declares.
+
+`Metasploitable2` from a SourceForge mirror is removed. We don't
+ship episodes targeting black-box images.
+
+**Acceptance:** `scripts/build-target-<name>.sh` exists for every
+target referenced by an exploit module. Running it produces an image
+whose post-boot state passes the spec's verification step. The
+verification step's exit code gates the build's exit code.
+
+### 4.3 Module catalog admission criteria
+
+A module is in the catalog *only if* it passes a recorded end-to-end
+verification run against its declared target. The verification is:
+
+1. Boot the target snapshot.
+2. Fire the module via msfrpcd.
+3. Observe a `session_open` event (not `session_open_timeout`).
+4. Observe at least one shell command round-trip on the session.
+5. Confirm guest-side artifact (file written, process spawned —
+   per-module).
+
+If any step fails, the module does not enter the catalog. There is
+no "tentatively included" tier. Modules already in the catalog are
+re-verified by `scripts/verify-catalog.sh` (new) on every release;
+failures remove the module from the catalog.
+
+**Acceptance:** every entry in `exploits/modules/*.toml` has a
+companion `verified_against = "<target_name>"` and
+`last_verified = "<commit_sha>"` field. `scripts/verify-catalog.sh`
+re-runs every entry and exits 0 only if every one passes.
+
+### 4.4 Collector admission criteria
+
+A collector is in the active set *only if* it passes a recorded
+end-to-end verification run that confirms it emits non-zero rows
+against a known-busy probe workload.
+
+For each of the six collectors (`proc`, `qmp`, `netflow`, `perf`,
+`guest`, `pcap`):
+
+1. Diagnose the current zero-row failure (read the code, run
+   standalone, find the actual cause). Fix the cause.
+2. Add a unit-or-integration test that runs the collector for N
+   seconds against a synthesized workload (a busy-loop process for
+   `proc`/`perf`, a packet generator for `netflow`/`pcap`, a QMP
+   blockstats query for `qmp`, a guest heartbeat for `guest`) and
+   asserts ≥1 row.
+3. The test must run in CI and on every install via the install
+   script.
+
+A collector that cannot pass admission is removed from the active
+set with a recorded reason — not silently included with zero rows.
+
+**Acceptance:** `pytest tests/test_collectors_emit.py -k <name>`
+passes for each name. The CI run gates merges.
+
+### 4.5 Event-driven labelling
+
+Phase labels are written from observed events, never from the
+schedule clock. The schedule becomes a *time budget* — maximum time
+the orchestrator will wait in each phase — not a label source.
+
+Specifically:
+
+- `clean` is written at episode start.
+- `armed` is written when the orchestrator instructs the driver to
+  fire (this is observable in code).
+- `infecting` is written when the `exploit_fire` event is observed.
+- `infected_running` is written **only** when the `session_open`
+  event is observed.
+- If `session_open_timeout` is observed instead, the episode
+  terminates with a `failed` label and is rejected (see §4.6).
+- `dormant` and subsequent `infected_running` transitions are
+  written from observed in-session idle / activity, not from clock.
+
+Per-module timeouts replace the global 30s timeout. Default 120s,
+configurable per module in `exploits/modules/*.toml`.
+
+**Acceptance:** for every shipped episode, every entry in
+`labels.jsonl` has a corresponding event in `events.jsonl` with a
+matching `t_mono_ns` within ±100ms. An invariant test asserts this.
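
Editor's sketch, not part of the commit: one possible shape for the invariant test, assuming each line of `labels.jsonl` and `events.jsonl` is a JSON object with an integer `t_mono_ns` field. The `phase` and `event` keys, the `episode_start` event name, and the function name are hypothetical. The sketch checks timing only, because the text does not define which event type corresponds to every label.

```python
import json

TOLERANCE_NS = 100_000_000  # ±100 ms expressed in nanoseconds


def unmatched_labels(labels_lines, events_lines, tolerance_ns=TOLERANCE_NS):
    """Return labels that have no event within tolerance_ns of their t_mono_ns."""
    event_times = [
        json.loads(line)["t_mono_ns"] for line in events_lines if line.strip()
    ]
    missing = []
    for line in labels_lines:
        if not line.strip():
            continue
        label = json.loads(line)
        t = label["t_mono_ns"]
        if not any(abs(t - et) <= tolerance_ns for et in event_times):
            missing.append(label)
    return missing


# A clock-written label: the schedule said "infected_running" at 5 s,
# but the only nearby event is a timeout four seconds later.
labels = [
    '{"phase": "clean", "t_mono_ns": 1000}',
    '{"phase": "infected_running", "t_mono_ns": 5000000000}',
]
events = [
    '{"event": "episode_start", "t_mono_ns": 1200}',
    '{"event": "session_open_timeout", "t_mono_ns": 9000000000}',
]
missing = unmatched_labels(labels, events)
```

A stricter variant would also require the matched event to be the one that justifies the label (for example `session_open` for `infected_running`); that mapping would have to come from the labeller itself.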
+
+### 4.6 Episode acceptance gate at finalization
+
+Before sealing meta and writing `done.marker`, the orchestrator
+verifies:
+
+- Every collector in the active set produced ≥1 row.
+- Every label has a matching event (§4.5 invariant).
+- For Tier-3 episodes: a `session_open` event exists.
+- `dirty=true` is absent OR `dirty_override=true` is present (see
+  §4.9).
+
+If any check fails, the episode goes to `data/rejected/<id>/` with a
+`rejected_reason.json` describing which check failed. `done.marker`
+is not written. The shipper never sees it.
+
+**Acceptance:** `tests/test_acceptance_gate.py` covers each rejection
+condition. A passing test asserts a clean episode is accepted; for
+each failure mode, the test asserts the episode is moved to
+`rejected/` with the expected reason.
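
Editor's sketch, not part of the commit: the four checks above can be read as a pure function from an episode summary to a list of rejection reasons, which keeps each failure mode easy to test in isolation. The dictionary layout, the reason strings, and the function name are assumptions for illustration; the real orchestrator may represent episodes differently.

```python
def rejection_reasons(episode):
    """Return every failed acceptance check; an empty list means accept."""
    reasons = []
    # Every collector in the active set must have produced at least one row.
    for name, rows in episode["collector_rows"].items():
        if rows < 1:
            reasons.append(f"collector_zero_rows:{name}")
    # Every label must have a matching event (the 4.5 invariant).
    if not episode["labels_match_events"]:
        reasons.append("label_without_event")
    # Tier-3 episodes need an observed session_open event.
    if episode["tier"] == 3 and "session_open" not in episode["event_names"]:
        reasons.append("tier3_no_session_open")
    # A dirty tree is only acceptable with an explicit override.
    if episode["dirty"] and not episode.get("dirty_override", False):
        reasons.append("dirty_without_override")
    return reasons


clean_episode = {
    "collector_rows": {"proc": 120, "perf": 40},
    "labels_match_events": True,
    "tier": 3,
    "event_names": ["exploit_fire", "session_open"],
    "dirty": False,
}

# Shaped like the failures reported in section 3: silent perf, no session, dirty tree.
bad_episode = {
    "collector_rows": {"proc": 120, "perf": 0},
    "labels_match_events": True,
    "tier": 3,
    "event_names": ["exploit_fire", "session_open_timeout"],
    "dirty": True,
}

clean_reasons = rejection_reasons(clean_episode)
bad_reasons = rejection_reasons(bad_episode)
```

Collecting all reasons rather than stopping at the first one gives `rejected_reason.json` the full picture when several checks fail at once.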
+
+### 4.7 Producer preflight
+
+`orchestrator/preflight.py` runs at orchestrator startup. One bar
+(no light/deep split). Checks:
+
+- Every binary required by the active collector set + active module
+  catalog is on `PATH`.
+- `/dev/kvm` accessible by the service user.
+- `kernel.perf_event_paranoid <= 2`.
+- `cfg.bridge_iface` exists; `tcpdump` can capture on it.
+- `msfrpcd` reachable; `auth.login` returns a token.
+- For every module in catalog: `module.info` is fetchable.
+- For every sample in catalog: file present on disk; sha256 matches.
+- Probe-boot baseline-v1 snapshot; observe guest-agent heartbeat
+  within N seconds.
+- `git status --porcelain` empty (or `CIS490_ALLOW_DIRTY=1`).
+- HEAD is on a commit currently in `origin/main`.
+
+Failures are collected (every failed check logged with diagnosis +
+remediation), then `sys.exit(78)`.
+
+**Acceptance:** `tests/test_preflight.py` covers each check
+individually with mocked subprocess/filesystem. `python -m
+orchestrator.preflight` runs the checks and prints a structured
+report. Exit codes: 0 ok, 78 sysadmin error.
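
Editor's sketch, not part of the commit: a minimal illustration of the collect-then-exit behaviour described above, where every check runs and all failures are reported before the process exits 78. Only the binary-on-`PATH` check is shown; the function names, the message format, and the placeholder binary name are hypothetical and are not the contents of `orchestrator/preflight.py`.

```python
import shutil
import sys

PREFLIGHT_FAILED = 78  # exit code named in the text above


def check_binaries(names):
    """Return one failure message per binary that is not on PATH."""
    return [f"binary not on PATH: {n}" for n in names if shutil.which(n) is None]


def run_checks(checks):
    """Run every check and collect all failures instead of stopping at the first."""
    failures = []
    for check in checks:
        failures.extend(check())
    return failures


def main(checks):
    """Log each failure to stderr and return the process exit code."""
    failures = run_checks(checks)
    for failure in failures:
        print(f"PREFLIGHT FAIL: {failure}", file=sys.stderr)
    return PREFLIGHT_FAILED if failures else 0


# Demo with a binary name that should not exist on any host.
demo_checks = [lambda: check_binaries(["cis490-no-such-binary-for-demo"])]
demo_failures = run_checks(demo_checks)
demo_exit_code = main(demo_checks)
```

An entry point would then call `sys.exit(main(all_checks))`, so the orchestrator exits 78 on any failure.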
+
+### 4.8 Receiver-side rejection (defense in depth)
+
+**The receiver is defense-in-depth, NOT the primary correctness
+mechanism.** The producer is. Receiver rejection exists to catch
+peers running stale or broken code; it is never a substitute for
+fixing the producer. A change that strengthens receiver rejection
+without strengthening the producer is the
+defensive-instead-of-corrective pattern (§7.9).
+
+The receiver enforces the same correctness invariants the
+orchestrator does. A peer running stale code that produces dishonest
+episodes still gets rejected at ingest:
+
+- Reject any meta with `dirty=true` and no `dirty_override=true`.
+- Reject any meta where `phases_observed` contains `infected_running`
+  but `events.jsonl` (extracted from the tarball) lacks
+  `session_open`.
+- Reject any meta where any configured-collector row count is zero.
+- Existing commit-allow-list gate continues.
+
+Rejections return 422 with a JSON body naming the failed check.
+Rejected tarballs are not written to the index.
+
+**Acceptance:** `tests/test_receiver_rejects.py` covers each new
+rejection condition.
+
+### 4.9 Override discipline
+
+The only escape hatch from the dirty-tree gate is the
+`CIS490_ALLOW_DIRTY=1` environment variable. When set:
+
+- Orchestrator logs `WARN: dirty tree override active`.
+- meta.json gains `dirty_override: true`.
+- Receiver accepts a `dirty=true` episode only if `dirty_override`
+  is also `true`.
+- Every override use is auditable from the dataset.
+
+There are no other override knobs. No `verify_tls=false`, no "skip
+preflight," no "include this collector even if it emits zero rows."
+
+### 4.10 Regression-test discipline
+
+Every fix in this plan lands with a test that would have caught the
+regression at PR time. Tests are not a follow-up. A PR that fixes
+the perf collector without a perf-emit test is incomplete and gets
+sent back.
+
+CI runs:
+- All unit tests.
+- `scripts/verify-catalog.sh` against a smoke target subset (catalog
+  verification full run is gated to release commits — too expensive
+  for every PR).
+- The collector-emit integration tests (§4.4) on real binaries.
+
+### 4.11 systemd integration
+
+- `cis490-orchestrator.service` adds
+  `RestartPreventExitStatus=78`. A preflight failure stays loud and
+  stuck instead of cycling restarts.
+- On preflight failure, orchestrator writes
+  `/var/lib/cis490/preflight.failed.json` with the failed checks +
+  timestamps. Doctor surfaces this in its next report. The
+  fleet-health alert distinguishes "preflight failed" from "host
+  silent."
+
+### 4.12 Cleanup of compensating layers
+
+The following are deleted, replaced, or trimmed as part of this
+change. Their existence was load-bearing for the dishonest pipeline;
+the honest one doesn't need them.
+
+- `FIXYOURSELF.md` — entire file deleted. Stuck states no longer
+  exist as a class because the gates make them impossible.
+- `cis490-autoupdate.timer` + `scripts/auto-update.sh` — deleted.
+  Hosts run pinned commits. New code is rolled out by the operator,
+  not auto-pulled.
+- `cis490-cert-fetch.timer` — replaced by a one-shot first-boot
+  fetch in `install-lab-host.sh`. No periodic re-fetch.
+- `tools/quarantine_unstamped.py` — deleted. Pre-stamp episodes
+  cannot exist because no episode is written without a valid stamp.
+- `tools/check_fleet_health.py` — keep, but delete the "fatal-only"
+  alert branch (that branch existed because we were shipping fatals;
+  with the gate, we don't).
+- `tools/prune_episodes.py`'s "kept episode despite flat /proc
+  because qmp showed write" cross-check logic — deleted. Episodes
+  that don't pass the producer-side gate don't reach the trainer.
+- AGENTS.md "symptom→fix table" — deleted (the
+  symptoms it covers are now impossible).
+- AGENTS.md "Hosts self-update" section — deleted.
+
+### 4.13 Containment bar
+
+Real malware execution requires explicit containment. Target VMs
+exist in an isolation context that is part of the canonical
+experiment, not a deployment detail. A future change that weakens
+any of the items below is a containment regression and is rejected
+regardless of what experimental realism it claims to add.
+
+For every target VM in the catalog (§4.2):
+
+- **Network:** target attaches to a bridge with NO upstream egress.
+  No NAT to the host network, no internet route, no DNS resolution
+  beyond what the experiment provides. Outbound C2 callbacks
+  resolve to a sinkhole inside the experiment, never to the
+  internet.
+- **Filesystem:** no shared mount with the host. No 9p, no
+  virtio-fs with host paths. The target's disk is the snapshot it
+  was booted from, period.
+- **Privilege:** QEMU runs as the unprivileged service user. KVM
+  access is via group membership only; no setuid wrappers, no
+  privileged TUN ownership transfer, no passthrough of host
+  devices not explicitly required by the catalog.
+- **Lifetime:** every target boots from a fresh snapshot. State
+  from one episode never crosses into the next. The snapshot is
+  reverted at episode end, not "cleaned."
+- **Escape monitoring:** any QEMU exit that is not a clean shutdown
+  is logged with full QMP state and the episode is marked `failed`.
+  Two unclean exits on the same target image within a release
+  window trigger admission-criteria re-verification (§4.3) for
+  every module targeting that image.
+
+**Acceptance:** `tests/test_containment.py` asserts each target
+build (a) has no upstream egress route from inside the guest,
+(b) has no host-shared filesystem mount, (c) runs QEMU as the
+unprivileged service user, (d) reverts to snapshot at episode end.
+The test runs in CI and on every install.
+
+---
+
+## 5. Build order
+
+There is no half-honest intermediate state. The order below
+sequences the work; it does not phase the deployment. Everything
+lands to `main` in one merge.
+
+1. Fix the four root-cause defects:
+   - Diagnose + fix the perf collector (read code, run standalone,
+     find why it's silent, fix).
+   - Diagnose + fix the guest-agent collector (mount baseline image,
+     verify agent installed, fix build).
+   - Diagnose + fix k-gamingcom's missing qmp/netflow/pcap (compare
+     configs, eliminate divergence — §4.1).
+   - Diagnose + fix `samba_usermap_script` against its target
+     (manual msfconsole drive, find why the bind shell never
+     connects, fix or remove from catalog — §4.3).
+2. Land the canonical manifest (§4.1).
+3. Land the target-VM build pipeline (§4.2) and containment
+   tests (§4.13) together — target VMs are not in the catalog
+   without containment.
+4. Land the catalog admission criteria + verifier (§4.3).
+5. Land the collector admission criteria + tests (§4.4).
+6. Land the event-driven labeller (§4.5).
+7. Land the acceptance gate (§4.6).
+8. Land the preflight (§4.7).
+9. Land the receiver-side rejection (§4.8).
+10. Land the override discipline + cleanup (§4.9, §4.12).
+11. Land systemd integration + alert distinguishing (§4.11).
+
+After merge: lab hosts pull the new manifest, run preflight, fail
+loudly if they don't meet the bar, produce zero episodes until they
+do. The operator brings each host to bar — fixing one root cause at
+a time, loudly. The dataset goes quiet, then comes back honest.
+
+---
+
+## 6. Out of scope (and why)
+
+- **Schedule jitter for label-leakage resistance.** Real concern,
+  but doesn't affect honesty — only generalization. Address after
+  honest data is flowing.
+- **New collectors (audit, ftrace, etc.).** Adding collectors before
+  the existing six are honest is putting more weight on a broken
+  floor.
+- **Trainer changes.** This plan stops at the dataset boundary. The
+  trainer no longer needs to filter dishonest episodes because they
+  don't exist.
+- **Multi-architecture targets.** All target VMs are x86_64 for now.
+
+Each of these is fine to defer because they don't paper over a
+correctness defect. They add value on top of an honest pipeline; the
+pipeline isn't honest yet.
+
+---
+
+## 7. Anti-patterns (named — match every proposal against this list)
+
+Each pattern below is a shape a proposal can take that has been
+rejected as architectural sleight-of-hand. **Match every proposal
+against this list before submitting it.** A proposal that matches
+a named pattern is rejected; abandon it and propose a corrective
+fix instead.
+
+The patterns are named so future sessions can recognize them in
+their own work. A bandaid with a nice name (preflight, acceptance
+gate, retry layer, fleet-health) is still a bandaid.
+
+**§7 is non-exhaustive.** New sleight-of-hand patterns will exist
+that aren't named here. The §8 decision tests are the actual
+filter; a proposal that fails §8 is rejected even if it matches
+no named pattern. Do not read §7 as a closed taxonomy and conclude
+"my proposal isn't on the list, so it's fine." If §8 says no, the
+answer is no, regardless of whether a named match exists.
+
+### 7.1 Compensating-layer pattern
+
+**Definition.** Adding a layer (timer, watcher, retry, alert,
+recovery doc) that absorbs a failure mode upstream of itself
+instead of fixing the upstream cause.
+
+**Example from session 2026-05-02..03.** `cis490-autoupdate.timer`
+to drag stale peers forward. The actual fix was the operator's
+deploy process; the timer existed because deployment was unreliable
+and we patched around the unreliability instead of fixing it.
+
+**Test.** If I removed this layer right now, would the original
+problem reappear immediately? If yes, the layer is a compensating
+bandaid for an unfixed root cause.
+
+**What to do instead.** Fix the upstream cause. If you cannot in
+this change, fail loudly (§9) and stop.
+
+### 7.2 Phasing-as-deferral pattern
+
+**Definition.** Splitting a correctness fix into "phase 1, phase 2,"
+"light vs deep," or "land this now, the harder part later." Any
+sequencing that ships a half-honest intermediate state.
+
+**Example from session 2026-05-02..03.** "Land preflight first,
+labeller refactor later." The intermediate state ships dishonest
+data because the labeller is still clock-driven.
+
+**Test.** Does each intermediate merge ship dishonest data, or
+rely on a layer that won't exist yet? If yes, no phasing.
+
+**What to do instead.** Reduce scope (drop a feature, narrow the
+active set) until the change is small enough to land in one merge.
+Do not defer the hard part.
+
+### 7.3 Single-instance-fix pattern
+
+**Definition.** Fixing one item from a class while leaving the
+other items as future work.
+
+**Example from session 2026-05-02..03.** "I'll diagnose perf and
+samba in parallel" while guest-agent, qmp, netflow, and the rest
+of the module catalog stay broken.
+
+**Test.** Is this a class of N items, of which I'm fixing < N? If
+yes, fix all or remove the unfixed from the active set.
+
+**What to do instead.** Either fix every member of the class, or
+shrink the active catalog to just the verified members. Unverified
+members do not ship.
+
+### 7.4 Per-host-divergence pattern
+
+**Definition.** Accepting that two hosts behave differently as a
+working assumption.
+
+**Example from session 2026-05-02..03.** "Which host should I
+investigate samba on, elliott or k-gamingcom?" — implying the
+answer matters because hosts are different.
+
+**Test.** Given identical workloads on identical canonical-manifest
+hosts, would the produced episodes be identical? If no, the
+divergence is the bug.
+
+**What to do instead.** Eliminate the divergence (one canonical
+manifest, one canonical target VM build, one canonical collector
+set — §4.1). If a host can't run the canonical experiment, it
+produces zero episodes.
+
+### 7.5 Black-box-trust pattern
+
+**Definition.** Treating an externally-built artifact as if it
+behaves correctly under our experiments without a verifiable spec
+for what it should do.
+
+**Example from session 2026-05-02..03.** Metasploitable2 from a
+SourceForge mirror — we don't know what version of Samba is
+running, whether the service is up, or whether the image has been
+altered. We were shipping modules targeting it anyway.
+
+**Test.** Do we have a verifiable spec for this artifact's
+behavior? If no, we don't trust it.
+
+**What to do instead.** Build the artifact from a declarative spec
+we control (§4.2). If we can't, remove modules targeting it from
+the catalog.
+
+### 7.6 Investigation-as-deferral pattern
+
+**Definition.** Proposing investigation when a verifiable gate
+would suffice. The investigation itself becomes the deferred work.
+
+**Example from session 2026-05-02..03.** "I need to diagnose why
+perf is silent before I can write the gate." A gate of the form
+"perf must produce ≥1 row" works without knowing the cause; it
+forces the diagnosis to happen as part of the fix.
+
+**Test.** Can the gate be expressed as an assertion ("X must
+produce > 0 rows" / "X must observe Y event") without knowing the
+root cause? If yes, write the gate first.
+
+**What to do instead.** Write the strictest possible gate first.
+The investigation is the work of making the gate pass.
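+
+A gate of the "must produce ≥1 row" form can be written without
+knowing the root cause. The sketch below assumes each collector
+writes one JSONL file per episode named `<collector>.jsonl`; that
+layout and the names `GateFailure` and `assert_collector_emitted`
+are hypothetical.
+
+```python
+import json
+from pathlib import Path
+
+
+class GateFailure(Exception):
+    """Raised when a collector fails its emit gate."""
+
+
+def count_rows(path: Path) -> int:
+    """Count non-empty lines that parse as JSON objects.
+
+    A missing file counts as zero rows. A malformed line raises
+    json.JSONDecodeError, which is itself a loud failure.
+    """
+    if not path.exists():
+        return 0
+    rows = 0
+    for line in path.read_text().splitlines():
+        if line.strip() and isinstance(json.loads(line), dict):
+            rows += 1
+    return rows
+
+
+def assert_collector_emitted(episode_dir: Path, collector: str,
+                             minimum: int = 1) -> int:
+    """Fail unless the collector wrote at least `minimum` rows."""
+    # Assumed layout: <episode_dir>/<collector>.jsonl (hypothetical).
+    rows = count_rows(episode_dir / f"{collector}.jsonl")
+    if rows < minimum:
+        raise GateFailure(f"{collector}: {rows} rows, need >= {minimum}")
+    return rows
+```
+
+The gate says nothing about why `perf` is silent; it only makes
+the silence impossible to ship.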
+
+### 7.7 Speculation-as-evidence pattern
+
+**Definition.** Asserting a claim as fact without measurement.
+
+**Example from session 2026-05-02..03.** "30s vs 120s won't change
+this — if the exploit were almost working, we'd see occasional
+opens." No data was gathered; the claim was projected.
+
+**Test.** Do I have a measurement that supports this claim? If no,
+I am speculating.
+
+**What to do instead.** Say "I don't know yet." Either gather data
+or design the fix to be correct under both possibilities.
+
+### 7.8 Out-of-scope-for-correctness pattern
+
+**Definition.** Naming a correctness-affecting item as "out of
+scope" to avoid the harder problem.
+
+**Example from session 2026-05-02..03.** "Manifest canonicalization
+is out of scope, flagged as known issue." Per-host config divergence
+is the source of half the data quality problems; excluding it from
+scope was a deferral.
+
+**Test.** Does excluding this item leave the system half-honest?
+If yes, it is in scope.
+
+**What to do instead.** Reduce other scope (drop a feature, narrow
+the active set) to fit. Correctness items cannot be deferred.
+
+### 7.9 Defensive-instead-of-corrective pattern
+
+**Definition.** Building rejection logic at the consumer instead of
+fixing the producer that produces the rejected output.
+
+**Example from session 2026-05-02..03.** Receiver-side rejection of
+dishonest episodes without fixing why the producer produces them.
+Defense-in-depth (both ends gated) is good; defense-without-
+corrective (only consumer gated) is a bandaid.
+
+**Test.** Does this fix make the dishonest behavior IMPOSSIBLE
+upstream, or only unobservable downstream? If only unobservable,
+the producer is still broken.
+
+**What to do instead.** Fix the producer first. The consumer-side
+gate is defense-in-depth on top of a corrected producer, never a
+substitute.
+
+### 7.10 Recovery-layer pattern
+
+**Definition.** Building documentation, scripts, timers, or
+runbooks for "what to do when X is stuck." Applies anywhere in
+the pipeline — producer, receiver, trainer, dashboard, install
+scripts, on-device agents, anywhere a "recovery from a state
+that shouldn't exist" layer is contemplated. Producer-side is
+just the most common location.
+
+**Example from session 2026-05-02..03.** `FIXYOURSELF.md` — a
+250-line decision tree for recovering hosts whose auto-update
+timer couldn't fix them. The states it covered shouldn't have been
+possible if the producer were correct.
+
+**Test.** Can the stuck state happen at all if the relevant
+component is correct? If no, delete the recovery layer and fix
+the component.
+
+**What to do instead.** Make the stuck state impossible. If you
+can't, fail loudly (§9) and stop.
+
+---
+
+## 8. Decision tests before proposing a change
+
+Before adding any code, doc, layer, or feature, answer all of the
+following. **Any uncomfortable answer means stop and re-evaluate.**
+
+1. Does this change make the dishonest behavior IMPOSSIBLE, or
+   only less likely / less observable?
+2. Does this change scale to every instance of the problem class,
+   or only one?
+3. If I removed this change, would the underlying problem return
+   immediately?
+4. Am I adding a layer? If yes, can I instead remove the layer
+   that allowed the failure?
+5. Does this proposal match any pattern in §7? If yes, abandon it
+   and propose a corrective fix.
+6. Is the change complete in one merge? If not, why is the
+   intermediate state honest?
+7. Am I doing this because it's correct, or because it's the
+   easiest thing that looks like progress?
+
+If you cannot answer all seven cleanly, stop. Ask the operator.
+Do not proceed.
+
+---
+
+## 9. What to do when blocked
+
+When you cannot fix something cleanly in scope:
+
+- **Fail loudly.** Exit with a distinguishable code (e.g., 78).
+  Write a structured failure record. Do not retry silently.
+- **Stop.** Do not continue producing output as if the failure
+  didn't happen.
+- **Ask the operator.** Tell the user what's blocked, what you
+  tried, and what you need to proceed.
+- **Do not build a recovery layer.** That is the recovery-layer
+  pattern (§7.10).
+- **Do not propose phased fixes.** That is the phasing-as-deferral
+  pattern (§7.2).
+- **Do not narrow scope silently.** If the active set must shrink
+  to make the change tractable, name it explicitly and get sign-off.
+
+The operator prefers a small honest system that fails loudly over a
+large half-broken one that limps. A loud failure is more useful
+than a silent bandaid.
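+
+A minimal sketch of "fail loudly", assuming a JSON file is an
+acceptable structured failure record. The record fields and the
+function names are hypothetical; the exit code 78 is the example
+value given above.
+
+```python
+import json
+import sys
+import time
+from pathlib import Path
+
+EXIT_BLOCKED = 78  # example code from the text; distinguishable from 0/1
+
+
+def build_failure_record(component: str, reason: str,
+                         tried: list[str]) -> dict:
+    """Assemble a structured record (field names are assumptions)."""
+    return {
+        "component": component,
+        "reason": reason,
+        "tried": tried,
+        "ts": time.time(),
+        "exit_code": EXIT_BLOCKED,
+    }
+
+
+def fail_loudly(record_path: Path, component: str, reason: str,
+                tried: list[str]) -> None:
+    """Write the record, report on stderr, and exit. No retry."""
+    record = build_failure_record(component, reason, tried)
+    record_path.write_text(json.dumps(record, indent=2) + "\n")
+    print(f"BLOCKED: {component}: {reason}", file=sys.stderr)
+    sys.exit(EXIT_BLOCKED)
+```
+
+The `tried` list carries the "what you tried" part of the report
+to the operator, so the record and the ask stay in one place.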
+
+---
+
+## 10. Definitions of ground truth
+
+For each collector, "real row" means the row was actually emitted
+by the underlying mechanism for *this episode*, not synthesized,
+defaulted, or carried over from a previous run.
+
+| Collector | Ground truth means |
+|---|---|
+| `proc` | Row read from `/proc/<qemu_pid>/{stat,io,status}` for the live qemu PID of this episode's target VM, while that PID is alive. |
+| `qmp` | Row obtained from a successful QMP `query-status` / `query-blockstats` round-trip on `cfg.qmp_socket` for this episode's qemu PID. |
+| `netflow` | Row computed from packet capture on `cfg.bridge_iface` for traffic involving this episode's target VM during the episode wall-clock window. |
+| `perf` | Row produced by `perf` (or equivalent) sampling this episode's qemu PID. Not from a previous run, not from a different PID. |
+| `guest` | Row received from the in-guest agent over the virtio-serial channel during the episode wall-clock window. The agent must be running in *this episode's* guest, not a stale one. |
+| `pcap` | Bytes captured from `cfg.bridge_iface` during the episode wall-clock window, written to `network.pcap`. |
+
+For each phase, "label justified" means the corresponding event was
+observed:
+
+| Phase | Justified by |
+|---|---|
+| `clean` | Episode start (orchestrator-emitted). |
+| `armed` | Orchestrator instructs the driver to fire (orchestrator-emitted). |
+| `infecting` | `exploit_fire` event observed in `events.jsonl`. |
+| `infected_running` | `session_open` event observed in `events.jsonl`. **Not** `session_open_timeout`, **not** schedule-clock. |
+| `dormant` | Observed in-session idle (no traffic / no command activity for N seconds). |
+| `failed` | `session_open_timeout` or other terminal driver failure. Episode is rejected (§4.6). |
+
+A row that doesn't meet the ground-truth bar is not a row. A label
+that isn't justified is not a label. The acceptance gate (§4.6)
+enforces both.
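+
+The event-backed rows of the phase table can be sketched as a
+lookup. This assumes each line of `events.jsonl` is a JSON object
+with an `event` field, which is a hypothetical schema; only the
+two phases justified by a named event are covered, and the helper
+names are illustrative.
+
+```python
+import json
+from pathlib import Path
+
+# Event that must be observed before the phase label is justified.
+REQUIRED_EVENT = {
+    "infecting": "exploit_fire",
+    "infected_running": "session_open",
+}
+
+
+def observed_events(events_path: Path) -> set[str]:
+    """Collect event names from events.jsonl (assumed 'event' field)."""
+    events = set()
+    for line in events_path.read_text().splitlines():
+        if line.strip():
+            events.add(json.loads(line)["event"])
+    return events
+
+
+def unjustified_labels(labels: list[str], observed: set[str]) -> list[str]:
+    """Return labels whose justifying event was never observed."""
+    return [
+        label for label in labels
+        if label in REQUIRED_EVENT and REQUIRED_EVENT[label] not in observed
+    ]
+```
+
+Note that `session_open_timeout` does not satisfy the lookup for
+`infected_running`, because the match is on the exact event name.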
+
+---
+
+## 11. Honest reporting
+
+When you (a future session) report status to the operator:
+
+- **Distinguish merged from verified.** "Code merged" is not
+  "behavior verified in production." A passing test on a CI host
+  is not the same as a working system on a lab host.
+- **Distinguish proposed from implemented.** "I proposed X" is not
+  "X is in the repo."
+- **Audit your cumulative pattern.** At the end of a session,
+  re-read your own changes against §7. It is possible to add three
+  reasonable-looking layers in sequence that cumulatively form a
+  compensating-layer pattern, even if no individual one looks like
+  a bandaid.
+- **Name compensating layers you've built.** If the §7 audit
+  finds matches, name them and propose their removal.
+- **Don't summarize cumulative changes as "fixes" without
+  auditing.** "I shipped 12 commits this session" is not the same
+  as "the pipeline is honest now."
+- **Verify before agreeing or refuting.** When the operator says
+  something is done that you can verify, verify it before agreeing.
+  When they say something is broken that you can verify, verify it
+  before refuting.
+
+---
+
+## 12. Glossary
+
+Terms used throughout this document, pinned to one definition.
+
+| Term | Definition |
+|---|---|
+| **Canonical manifest** | The single, version-pinned `manifest.toml` at the repo root. Every host loads this exact file. There is no per-host override (§4.1). |
+| **Active set** | The collectors enabled in the canonical manifest for a given run. A collector is in the active set only if it has passed admission criteria (§4.4). |
+| **Catalog** | The set of exploit modules in `exploits/modules/*.toml` that have passed admission (§4.3). Modules not in the catalog do not run. |
+| **Ground truth** | A row or label is ground truth when it was emitted by the underlying mechanism for *this* episode, with the justifying event observed. See §10. |
+| **Episode boundary** | An episode begins when the orchestrator emits the first `clean` label and ends when `done.marker` is written or the episode is moved to `rejected/`. All collector rows must fall inside this wall-clock window. |
+| **Configured collector** | A collector listed as enabled in the canonical manifest. Distinct from "running collector" (the process actually started) and "active set" (the intersection of manifest-listed and admission-passing collectors). For acceptance purposes, only the configured set matters. |
+| **Admission criteria** | The bar a module / collector / target / override knob must pass to be in the active pipeline. See §4.3, §4.4, §13. |
+| **Honest** | Of an episode: every label justified by an observed event, every configured collector emitted ≥1 ground-truth row, working tree was clean (or override-stamped), HEAD on `origin/main`. Of the pipeline: every accepted episode is honest. |
+| **Bandaid / compensating layer** | A layer that absorbs a failure mode originating upstream of itself instead of fixing the upstream cause. See §7.1. |
+| **Override** | A knob that loosens an admission criterion or gate. There is exactly one — `CIS490_ALLOW_DIRTY` (§14). |
+| **Operator** | The human maintainer with sign-off authority. Distinct from agents that propose changes. See §15. |
+| **Containment regression** | A change that weakens any of the §4.13 isolation requirements. Rejected regardless of claimed experimental value. |
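+
+The four clauses in the **Honest** definition can be read as one
+predicate over an episode. The sketch below is illustrative only:
+`EpisodeFacts` and its fields are hypothetical names, and how each
+fact is measured is left to the gates described elsewhere.
+
+```python
+from dataclasses import dataclass, field
+
+
+@dataclass
+class EpisodeFacts:
+    """Hypothetical summary of what was observed for one episode."""
+    unjustified_labels: list[str] = field(default_factory=list)
+    rows_per_configured_collector: dict[str, int] = field(default_factory=dict)
+    tree_clean: bool = True
+    dirty_override_stamped: bool = False
+    head_on_origin_main: bool = True
+
+
+def honesty_violations(facts: EpisodeFacts) -> list[str]:
+    """Return every clause of the definition the episode violates."""
+    problems = [f"label not justified: {name}"
+                for name in facts.unjustified_labels]
+    problems += [
+        f"collector emitted no rows: {name}"
+        for name, rows in sorted(facts.rows_per_configured_collector.items())
+        if rows < 1
+    ]
+    if not (facts.tree_clean or facts.dirty_override_stamped):
+        problems.append("dirty tree without override stamp")
+    if not facts.head_on_origin_main:
+        problems.append("HEAD not on origin/main")
+    return problems
+```
+
+An episode is honest under this sketch only when the returned
+list is empty.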
+
+---
+
+## 13. Admission scope (what triggers the bar)
+
+Any change to the following is in admission scope and must pass
+the §4 admission criteria and receive §15 operator sign-off:
+
+- Any module in `exploits/modules/*.toml`.
+- Any collector in the active set.
+- Any field of `manifest.toml`.
+- Any phase rule or label-emission code in the labeller.
+- Any gate in the producer or receiver.
+- Any schedule entry (phase budget, per-module timeout).
+- Any target VM build spec or its containment posture (§4.13).
+- Any override knob (the closed list in §14).
+
+The following are NOT in admission scope and can be changed
+without admission ceremony, but must still pass the §8 decision
+tests:
+
+- Internal refactors that do not change observable behavior of
+  any of the above.
+- Test code, fixtures, CI configuration.
+- Documentation that does not contradict §1.
+- Build/install scripts, insofar as they don't change what gets
+  shipped or how it's labelled.
+
+A future session that argues "this is just infrastructure" or
+"this is just tooling" to dodge admission scope should re-read
+this section. Anything that touches what gets shipped, how it's
+labelled, what runs on the host, the containment posture, or how
+the gate decides is in scope. The "infrastructure / tooling"
+framing is a recurring sleight-of-hand vector and triggers
+automatic rejection.
+
+---
+
+## 14. Override knobs (closed list)
+
+The complete list of override knobs in CIS490, version-pinned to
+this document:
+
+| Knob | Effect | Where audited |
+|---|---|---|
+| `CIS490_ALLOW_DIRTY=1` (env var, orchestrator) | Allows the orchestrator to start with a dirty git tree. Stamps `dirty_override: true` in every `meta.json` produced. Receiver accepts only with matching stamp. | per-episode in `meta.json` |
+
+That is the entire list. Adding a knob to this list is itself an
+admission event (§13) requiring operator sign-off (§15) and an §8
+review.
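+
+A minimal sketch of the orchestrator side of this knob. It assumes
+the stamp reflects whether the knob was set for the run; the
+function names are hypothetical, and the receiver-side match is
+not modelled here.
+
+```python
+import os
+
+KNOB = "CIS490_ALLOW_DIRTY"
+
+
+def knob_is_set(env=None) -> bool:
+    """True only for the exact value '1', per the table above."""
+    env = os.environ if env is None else env
+    return env.get(KNOB) == "1"
+
+
+def check_start(tree_is_dirty: bool, env=None) -> None:
+    """Refuse to start on a dirty tree unless the knob is set."""
+    if tree_is_dirty and not knob_is_set(env):
+        raise RuntimeError(f"dirty git tree and {KNOB} is not set to 1")
+
+
+def stamp_meta(meta: dict, env=None) -> dict:
+    """Return a copy of meta with dirty_override recorded (assumed rule)."""
+    stamped = dict(meta)
+    stamped["dirty_override"] = knob_is_set(env)
+    return stamped
+```
+
+Passing `env` explicitly keeps the check testable without
+mutating the process environment.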
+
+**Knobs that have been considered and rejected** (do not propose
+again without re-reading the rationale):
+
+- `verify_tls=false` — TLS verification is a correctness boundary;
+  bypassing it is the defensive-instead-of-corrective pattern
+  (§7.9).
+- `skip_preflight=1` — preflight is the gate; bypassing it makes
+  the gate non-functional.
+- `experimental_collector=true` — bypassing collector admission
+  is the single-instance-fix pattern (§7.3) wearing a flag.
+- `diagnostic_mode=true` — generic bypass; in practice would be
+  applied to hide failures, not investigate them.
+- `dry_run` for the producer — episodes that aren't shipped go to
+  `rejected/`; no dry-run flag needed.
+
+If a future session proposes a new override knob, the burden is on
+the proposal: pass §8, get operator sign-off, amend §14 in the
+same merge. "Add the knob now and amend §14 later" is the
+phasing-as-deferral pattern (§7.2) applied to documentation.
+
+---
+
+## 15. Sign-off discipline
+
+Admission decisions are made by the operator, not by agents acting
+alone. Specifically:
+
+- **Adding a module to the catalog** requires operator sign-off.
+  An agent runs `scripts/verify-catalog.sh`, presents the
+  verification result, and the operator decides whether the module
+  enters the catalog.
+- **Adding a collector to the active set** requires operator
+  sign-off. Agent runs the emit-test, operator decides.
+- **Promoting a target VM build** requires operator sign-off after
+  §4.2 verification and §4.13 containment tests pass.
+- **Adding an override knob** (§14) requires operator sign-off.
+- **Amending PIPELINE.md** requires operator sign-off (§16).
+
+**Removing** anything from the catalog or active set does NOT
+require operator sign-off — the bar is asymmetric. Tightening
+is always permitted; loosening requires sign-off.
+
+The operator is the human with maintainer credentials on the
+repository. Agents propose, run verification, and present results;
+the operator decides admission.
+
+If an agent is acting in a non-interactive context (CI run,
+scheduled job) where no operator is available to sign off, the
+agent does not admit anything. It produces verification output
+and stops.
+
+---
+
+## 16. Amending PIPELINE.md
+
+This document is not immutable, but it is the canonical statement
+of the bar. Amendments are governed by the same discipline as
+admission decisions:
+
+1. Any change to §1 (principle), §4 (fix items), §7 (anti-patterns),
+   §8 (decision tests), §10 (ground truth), §13 (admission scope),
+   §14 (override list), or §15 (sign-off) is a substantive
+   amendment.
+2. Substantive amendments require operator sign-off (§15) and must
+   pass §8 decision tests applied to the amendment itself.
+3. The amendment lands in the same merge as the code change it
+   justifies. "Amend the doc later" is the phasing pattern (§7.2).
+4. Editorial changes (typos, formatting, link fixes, glossary
+   wording) do not require sign-off but should be flagged in the
+   commit message.
+
+For a future session that wants to add a feature or layer the
+document forbids, the path is to amend the document, not to work
+around it.
+"This isn't covered by PIPELINE.md, so I'll just do it" is the
+out-of-scope-for-correctness pattern (§7.8) applied to the
+meta-document. Anything that touches admission scope (§13) is
+covered even if not named explicitly.
+
+If you find the document is wrong — internally inconsistent,
+contradicts observed reality, prescribes something impossible —
+file a Forgejo issue against the repo with the contradiction
+documented. Do not silently work around the doc.
+
+---
+
+## 17. What this plan supersedes
+
+The following docs are deleted or rewritten as part of landing this
+plan:
+
+| Doc | Action |
+|---|---|
+| `FIXYOURSELF.md` | Deleted. Compensating-layer doc; the states it covers don't exist after §4.6. |
+| `AGENTS.md` "symptom→fix table" | Deleted. Bandaid-driven. |
+| `AGENTS.md` "Hosts self-update" section | Deleted. Hosts run pinned commits. |
+| `AGENTS.md` "Tier 3+4 deploy zero-touch" claim | Rewritten. Targets are built locally now, not auto-fetched. |
+| `AGENTS.md` "trust the in-guest probe alone, cross-check host CPU" | Deleted. The producer-side gate makes this fictional cross-check unnecessary. |
+| `TIER3-BRINGUP.md` | Kept as historical record — labelled bug report, not current guidance. |
+| `README.md` Tier-3+4 narrative | Reviewed and aligned. |
+
+If you are a future session reading this and find another doc that
+contradicts §1–§6 of this file: this file is right and the other
+doc is wrong. Fix the other doc.