CIS490

Max Gorog d9f913fc97 PIPELINE §5 step 6: event-driven labeller (§4.5)		2026-05-04 01:43:16 -05:00

Phase labels are written ONLY when justifying events arrive. The schedule clock is now a budget (an upper bound), never a label source. This is the core honesty fix the §3 evidence demanded:

Before: every Tier-3 episode wrote `infected_running` from the schedule clock regardless of whether session_open ever fired. Per §10 every dishonest label is a poisoned training example. 67/67 of the §3 probe episodes were poisoned this way.

After: `infecting` writes ONLY when exploit_fire is observed in events.jsonl. `infected_running` writes ONLY when session_open is observed. Either timing out or seeing session_open_timeout terminates the walker with a `failed` label that the §4.6 acceptance gate will reject.

PHASE_JUSTIFYING_EVENTS in orchestrator/episode.py declares which events justify which phases:

    "clean": None                          # orchestrator-emitted
    "armed": None                          # orchestrator-emitted
    "infecting": ("exploit_fire",)
    "infected_running": ("session_open",)

TERMINAL_FAILURE_EVENTS = {"session_open_timeout"} short-circuit any event-driven wait into a `failed` label.

`dormant` is intentionally OFF the canonical schedule. §4.5 calls for dormant to be event-driven (session_idle / session_active) too, but the driver doesn't emit those yet. Per §1 default-to-removal we ship without dormant rather than label it from the clock; when the driver gains those emits, dormant re-enters the schedule with proper justification.

EpisodeRunner now owns:

  * `_event_log`: every emit_event appends here
  * `_event_cv`: condition variable for waiters
  * `_wait_for_event(names, since_t_mono_ns, timeout_s)`: returns the first matching event in the log with t_mono >= threshold; the threshold catches events that fired during the previous on_phase callback.

When an event-driven phase's justifier already arrived (e.g. exploit_fire emitted by driver._fire() inside on_phase("armed")), the walker uses the EVENT's t_mono on the label, not the time the walker noticed. The label means "this is when this thing actually happened."

manifest.toml: dropped the dormant cycle from the canonical schedule. Episode is shorter (~30s) but every label is event-justified.

14 new tests in tests/test_event_driven_labeller.py covering:

  * justifier mapping invariants
  * _wait_for_event semantics (already-arrived, future, timeout, since-threshold, first-of-multiple-names)
  * walker behavior (orchestrator-emitted phases, event-driven phases, missing event → failed, terminal-failure-event short-circuit, stop event, event-t_mono on label, phase_transition events with justified_by)

286 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
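
The wait-and-label behavior described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in orchestrator/episode.py: the `Event` and `EventLog` classes, the `label_phase` helper, and its `(label, t_mono_ns)` return shape are all hypothetical names invented for this sketch. Only the two mappings and the `(names, since_t_mono_ns, timeout_s)` parameter list are taken from the commit message.

```python
import threading
import time
from dataclasses import dataclass

# Mappings as described in the commit message.
PHASE_JUSTIFYING_EVENTS = {
    "clean": None,                          # orchestrator-emitted
    "armed": None,                          # orchestrator-emitted
    "infecting": ("exploit_fire",),
    "infected_running": ("session_open",),
}
TERMINAL_FAILURE_EVENTS = {"session_open_timeout"}


@dataclass
class Event:  # hypothetical record type
    name: str
    t_mono_ns: int


class EventLog:  # hypothetical stand-in for _event_log + _event_cv
    def __init__(self):
        self._events = []
        self._cv = threading.Condition()

    def emit(self, name, t_mono_ns=None):
        """Append an event and wake every waiter."""
        if t_mono_ns is None:
            t_mono_ns = time.monotonic_ns()
        ev = Event(name, t_mono_ns)
        with self._cv:
            self._events.append(ev)
            self._cv.notify_all()
        return ev

    def wait_for_event(self, names, since_t_mono_ns, timeout_s):
        """Return the first logged event whose name is in `names` and whose
        t_mono_ns >= since_t_mono_ns, or None once timeout_s has elapsed."""
        deadline = time.monotonic() + timeout_s
        scanned = 0  # log index already checked, so rescans stay cheap
        with self._cv:
            while True:
                for ev in self._events[scanned:]:
                    if ev.name in names and ev.t_mono_ns >= since_t_mono_ns:
                        return ev
                scanned = len(self._events)
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None
                self._cv.wait(remaining)


def label_phase(log, phase, since_t_mono_ns, budget_s):
    """Return (label, t_mono_ns) for one phase; the clock is only a budget."""
    justifiers = PHASE_JUSTIFYING_EVENTS[phase]
    if justifiers is None:
        # Orchestrator-emitted phase: no justifying event is required.
        return (phase, time.monotonic_ns())
    names = set(justifiers) | TERMINAL_FAILURE_EVENTS
    ev = log.wait_for_event(names, since_t_mono_ns, budget_s)
    if ev is None:
        return ("failed", time.monotonic_ns())   # budget exhausted
    if ev.name in TERMINAL_FAILURE_EVENTS:
        return ("failed", ev.t_mono_ns)          # short-circuit
    return (phase, ev.t_mono_ns)                 # the event's own timestamp
```

In this sketch an event that was already logged before the wait began is returned immediately, and the label carries that event's timestamp rather than the time of the call. A timeout yields `failed` instead of a clock-derived label.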
__init__.py	Add receiver: PUT /v1/episodes ingest with sha256 verify and idempotency	2026-04-28 23:34:04 -06:00
test_auto_fetch_samples.py	auto_fetch_samples: pick Linux i386 ELF; manifest matches theZoo	2026-05-01 03:28:26 -05:00
test_collectors_emit.py	PIPELINE §5 step 5: collector admission emit tests (§4.4)	2026-05-04 01:37:40 -05:00
test_containment.py	PIPELINE §5 step 3: target VM build infrastructure + containment posture	2026-05-04 01:31:40 -05:00
test_doctor_shipping.py	shipper: systemd watchdog, quarantine cleanup; doctor surfaces ship errors	2026-05-01 12:02:59 -05:00
test_episode.py	meta.json: stamp code_version (commit, branch, dirty) per episode	2026-05-01 01:29:01 -05:00
test_event_driven_labeller.py	PIPELINE §5 step 6: event-driven labeller (§4.5)	2026-05-04 01:43:16 -05:00
test_exploits.py	catalog: remove samba_usermap_script — never landed sessions in prod	2026-05-03 22:48:03 -05:00
test_fleet.py	PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml	2026-05-04 01:25:01 -05:00
test_fleet_health.py	fleet-health: exit 0 when alerts found (don't mark unit failed)	2026-05-02 13:51:20 -05:00
test_guest_agent.py	Collectors 2/4/5 + fleet runner + sample manifest + Tier-3 setup scripts	2026-04-30 00:02:27 -05:00
test_host_health.py	fleet-health: proactive alerts on the Pi + per-host doctor reports	2026-05-02 13:48:31 -05:00
test_manifest.py	PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml	2026-05-04 01:25:01 -05:00
test_pcap.py	Collectors 2/4/5 + fleet runner + sample manifest + Tier-3 setup scripts	2026-04-30 00:02:27 -05:00
test_perf_qemu.py	Close out the open issues: bridge pcap wiring, perf collector, Tier-4	2026-04-30 00:17:49 -05:00
test_proc_qemu.py	Add v0 orchestrator + first oracle collector (host /proc)	2026-04-28 23:40:25 -06:00
test_prune.py	Multi-signal prune classifier: rescue valid episodes /proc misses	2026-04-30 19:10:01 -05:00
test_qmp.py	Close out the deployment-readiness gaps	2026-04-30 00:31:55 -05:00
test_quarantine_unstamped.py	fix: lab-host install loop after commit-gate cutover	2026-05-01 11:36:21 -05:00
test_receiver.py	Add receiver: PUT /v1/episodes ingest with sha256 verify and idempotency	2026-04-28 23:34:04 -06:00
test_shipper.py	shipper: systemd watchdog, quarantine cleanup; doctor surfaces ship errors	2026-05-01 12:02:59 -05:00
test_target_spec.py	PIPELINE §5 step 3: target VM build infrastructure + containment posture	2026-05-04 01:31:40 -05:00
test_tier3_local_verify.py	tools/verify_tier3_local.py: Pi-runnable Tier-3 verifier	2026-05-01 03:41:21 -05:00
test_tier4.py	Close out the deployment-readiness gaps	2026-04-30 00:31:55 -05:00
test_ulid.py	Add v0 orchestrator + first oracle collector (host /proc)	2026-04-28 23:40:25 -06:00
test_verify_catalog.py	PIPELINE §5 step 4: catalog admission verifier (§4.3)	2026-05-04 01:35:32 -05:00
test_version_gate.py	robustness: gate falls back to local git, queue sweeps stale tarballs	2026-05-01 11:49:38 -05:00
test_vm_load_controller.py	workload audit trail: meta.sample + per-phase events + pre-kill probe	2026-04-30 02:12:34 -05:00