CIS490/tests
max eda6164897 fix: lab-host install loop after commit-gate cutover
Why services weren't starting after the gate went live:

1. install-lab-host.sh self-copy. The receiver's 400 remediation tells
   the agent to `cd /opt/cis490 && git pull && sudo
   ./scripts/install-lab-host.sh`. That makes REPO_ROOT==INSTALL_ROOT,
   so `cp -aT $REPO_ROOT $INSTALL_ROOT` fails with "are the same
   file" and `set -e` aborts before the systemd units are installed or
   anything restarts. The script now detects the same-dir case and
   skips the cp; chown still runs.

2. Services never restart. install-lab-host.sh and install-tier-3-4.sh
   both ended by *telling the operator* to restart, then exiting. The
   running shipper/orchestrator kept executing pre-gate code from the
   old module objects, so new `code_version` stamping never reached an
   episode. Both scripts now `systemctl restart` the units they own
   when those units are enabled.

3. Shipper queue fatal-loop. queue.py incremented its `fatal` counter
   but didn't move the episode out of `data/episodes/`, so the next
   scan re-tarred and re-PUT the same dir and got 400 again. With
   4465+ pre-stamp episodes on k-gamingcom this burned ~1 PUT/sec
   across 5+ hours of receiver log. Fatal episodes now move to
   data/quarantine/<id>/ with a quarantine_reason.json beside them;
   the outbox tarball is deleted.

4. Pre-stamp backlog drain. tools/quarantine_unstamped.py is a
   one-shot that scans data/episodes/ and quarantines anything without
   a 40-char-hex code_version.commit. Wired into install-lab-host.sh
   step 9 so a re-install drains the queue automatically. Idempotent;
   safe to run while the shipper is active.
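
For illustration, the fatal-quarantine move from item 3 could look
roughly like the sketch below. This is not the code in queue.py: the
function name `quarantine_episode`, its parameters, the JSON fields,
and the collision handling are all hypothetical, and it assumes
quarantine_reason.json is written inside the quarantined directory.

```python
import json
import shutil
import time
from pathlib import Path
from typing import Optional


def quarantine_episode(
    episode_dir: Path,
    quarantine_root: Path,
    reason: str,
    outbox_tarball: Optional[Path] = None,
) -> Path:
    """Move a fatally rejected episode out of the scan directory.

    Hypothetical helper: name, signature, and JSON layout are assumptions.
    """
    quarantine_root.mkdir(parents=True, exist_ok=True)
    dest = quarantine_root / episode_dir.name
    if dest.exists():
        # Assumed collision policy: keep both copies under distinct names.
        dest = quarantine_root / f"{episode_dir.name}.{time.time_ns()}"

    # Once the directory leaves data/episodes/, the next scan cannot
    # re-tar and re-PUT it.
    shutil.move(str(episode_dir), str(dest))

    # Record why the episode was pulled (field names are assumptions).
    reason_doc = {"reason": reason, "quarantined_at": time.time()}
    (dest / "quarantine_reason.json").write_text(json.dumps(reason_doc, indent=2))

    # Drop the stale outbox tarball so it is not retried either.
    if outbox_tarball is not None:
        outbox_tarball.unlink(missing_ok=True)
    return dest
```

The point of the move is that a fatal response becomes terminal for
that directory: the scan loop only sees data/episodes/, so a
quarantined episode cannot generate another PUT.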

Tests cover the queue's new fatal-quarantine path and every drain
behaviour (kept/quarantined/dry-run/idempotent/missing-meta/collision).
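
A minimal sketch of the drain check those tests exercise, assuming
each episode directory holds a meta.json with a `code_version.commit`
field. It is not the contents of tools/quarantine_unstamped.py: the
names `is_stamped` and `drain`, the lowercase-only hex match, and the
leave-in-place collision policy are assumptions.

```python
import json
import re
import shutil
from pathlib import Path

# Assumption: a valid stamp is exactly 40 lowercase hex characters.
_COMMIT_RE = re.compile(r"[0-9a-f]{40}")


def is_stamped(episode_dir: Path) -> bool:
    """True if meta.json carries a 40-char-hex code_version.commit."""
    try:
        meta = json.loads((episode_dir / "meta.json").read_text())
    except (OSError, ValueError):
        # Missing or unparseable meta.json counts as unstamped.
        return False
    code_version = meta.get("code_version") if isinstance(meta, dict) else None
    commit = code_version.get("commit") if isinstance(code_version, dict) else None
    return isinstance(commit, str) and _COMMIT_RE.fullmatch(commit) is not None


def drain(episodes_root: Path, quarantine_root: Path, dry_run: bool = False) -> dict:
    """Quarantine unstamped episodes; report what was (or would be) done."""
    result = {"kept": [], "quarantined": [], "collisions": []}
    if not episodes_root.is_dir():
        return result
    for ep in sorted(p for p in episodes_root.iterdir() if p.is_dir()):
        if is_stamped(ep):
            result["kept"].append(ep.name)
            continue
        dest = quarantine_root / ep.name
        if dest.exists():
            # Assumed policy: never overwrite an existing quarantine entry.
            result["collisions"].append(ep.name)
            continue
        result["quarantined"].append(ep.name)
        if not dry_run:
            quarantine_root.mkdir(parents=True, exist_ok=True)
            shutil.move(str(ep), str(dest))
    return result
```

A second run finds nothing left to move, which is what makes the
sketch idempotent; with `dry_run=True` it only reports what would be
quarantined.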

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:36:21 -05:00
__init__.py Add receiver: PUT /v1/episodes ingest with sha256 verify and idempotency 2026-04-28 23:34:04 -06:00
test_auto_fetch_samples.py auto_fetch_samples: pick Linux i386 ELF; manifest matches theZoo 2026-05-01 03:28:26 -05:00
test_episode.py meta.json: stamp code_version (commit, branch, dirty) per episode 2026-05-01 01:29:01 -05:00
test_exploits.py fleet: rotate exploit modules per (host, slot, ep); Tier 3 by default 2026-04-30 02:22:49 -05:00
test_fleet.py Solvable Tier-3 holes: callback payloads, busybox workloads, bridge by default 2026-04-30 02:32:52 -05:00
test_guest_agent.py Collectors 2/4/5 + fleet runner + sample manifest + Tier-3 setup scripts 2026-04-30 00:02:27 -05:00
test_pcap.py Collectors 2/4/5 + fleet runner + sample manifest + Tier-3 setup scripts 2026-04-30 00:02:27 -05:00
test_perf_qemu.py Close out the open issues: bridge pcap wiring, perf collector, Tier-4 2026-04-30 00:17:49 -05:00
test_proc_qemu.py Add v0 orchestrator + first oracle collector (host /proc) 2026-04-28 23:40:25 -06:00
test_prune.py Multi-signal prune classifier: rescue valid episodes /proc misses 2026-04-30 19:10:01 -05:00
test_qmp.py Close out the deployment-readiness gaps 2026-04-30 00:31:55 -05:00
test_quarantine_unstamped.py fix: lab-host install loop after commit-gate cutover 2026-05-01 11:36:21 -05:00
test_receiver.py Add receiver: PUT /v1/episodes ingest with sha256 verify and idempotency 2026-04-28 23:34:04 -06:00
test_shipper.py fix: lab-host install loop after commit-gate cutover 2026-05-01 11:36:21 -05:00
test_tier3_local_verify.py tools/verify_tier3_local.py: Pi-runnable Tier-3 verifier 2026-05-01 03:41:21 -05:00
test_tier4.py Close out the deployment-readiness gaps 2026-04-30 00:31:55 -05:00
test_ulid.py Add v0 orchestrator + first oracle collector (host /proc) 2026-04-28 23:40:25 -06:00
test_version_gate.py version_gate: Forgejo as canonical commit source (no fs perms needed) 2026-05-01 01:42:45 -05:00
test_vm_load_controller.py workload audit trail: meta.sample + per-phase events + pre-kill probe 2026-04-30 02:12:34 -05:00