Diagnoses + fixes for the silent-collector / never-lands-session
failures that the 200-episode quality probe surfaced (§3 evidence).
All four address the producer; no compensating layers added.
perf collector (rows_perf=0 on 100% of episodes):
- perf stat -j writes to stderr by default with -p; we read stdout.
Add --log-fd 1 so JSON reaches stdout where the parser sees it.
- Event names come back annotated with the privilege scope perf
actually measured ("cycles:u" under perf_event_paranoid=2). Strip
the suffix so _build_row's plain-name lookups hit. Without this
every metric was None even when perf reported real numbers.
- tests/test_collectors_emit.py covers the regression with a real
busy-loop fixture; emit-test discipline per §4.4.
guest-agent collector (rows_guest=0 on 100% of episodes):
- Alpine cloud image doesn't ship python3, so the in-guest agent's
`#!/usr/bin/env python3` shebang silently fails. Add packages:
[python3] to cidata user-data so cloud-init installs it before
the OpenRC service starts.
- Guest agent now exits nonzero (was: silent stdout fallback) when
/dev/virtio-ports/cis490.guest.agent is missing, so OpenRC
reports the failure to /var/log/cis490-agent.log instead of the
bytes vanishing into the void. Refs §1.
- Host-side collector emits guest_agent_connected /
guest_agent_first_byte / guest_agent_silent_window into the
orchestrator's events.jsonl. Future episodes show the in-guest
failure mode per-episode instead of inferring from rows_guest=0.
k-gamingcom missing qmp/netflow/pcap (also affected elliott on
Tier-3 episodes — was misclassified as host divergence):
- tools/run_tier3_demo.py was building EpisodeConfig WITHOUT
qmp_socket / guest_agent_socket / bridge_iface — even though
launch_target.sh creates the underlying chardevs and BRIDGE
supplies the iface. tools/run_real_vm_demo.py wires them
correctly; Tier-3 had a copy-paste gap.
- tests/test_collectors_emit.py adds a source-grep regression so
the wiring stays honest.
samba_usermap_script never lands session (0/67 in §3 probe):
- Bind handler default WfsDelay (~5s) gives up before bind_perl on
Metasploitable2 has finished forking + binding LPORT under
SLIRP+hostfwd. Bump to 30s; matches session_open_timeout_s in
exploits/driver.py so framework + driver agree on the wait
budget. Add ConnectTimeout=15 so the handler's bind connect has
retry budget instead of one-shot.
orchestrator/fleet.py: usable_modules + BRIDGE handling were both
unconditional, so:
- With BRIDGE set, requires_bridge modules were still being
dropped — picker only ever returned samba_usermap_script across
every slot/episode (the test_fleet_uses_all_modules_when_bridge_set
failure on HEAD).
- env.pop("BRIDGE") fired even when BRIDGE was the operator's
explicit setup, breaking modules that need bridge mode (vsftpd
backdoor on hardcoded port 6200, distccd, etc.).
Both made conditional on bridge_set so the picker walks the full
catalog under bridge mode and SLIRP-only modules still get a
clean SLIRP env when BRIDGE is unset.
receiver/app.py: half-pregnant v2 schema state in HEAD — calling
store.ingest_stream(episode_type=..., benign_profile=...) with
kwargs the matching store.py change was in the WIP stash. Removed
v2 awareness from app.py so v1 episodes (what the producer ships
today) get accepted again. SCHEMA_VERSION default reset to 1 to
match.
229 passed, 0 failed. (HEAD had 15 failures, all linked to the
half-pregnant v2 state above.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| modules | ||
| __init__.py | ||
| driver.py | ||
| modules.py | ||
| msfrpc.py | ||
| README.md | ||
| workloads.py | ||
exploits/
The Tier-3 exploit driver — fires a Metasploit module against a
vulnerable target VM, watches for the resulting session, and stamps the
session-open transition into the episode's events.jsonl so the
labeler can mark armed → infecting honestly.
Layout
exploits/
msfrpc.py tiny msgpack-over-HTTPS client for msfrpcd
driver.py MSFExploitDriver — plugged in as EpisodeRunner.on_phase
modules.py ModuleConfig + TOML loader
modules/
vsftpd_234_backdoor.toml first canned module (Metasploitable2)
...
Module configs
Each modules/*.toml describes one Metasploit module — its path, the
options to set, and the payload to use. The driver reads these files
to drive module.execute over msfrpc.
description = "..."
[module]
type = "exploit" # exploit | auxiliary | post
path = "unix/ftp/vsftpd_234_backdoor"
[module.options]
RHOSTS = "{{ target_ip }}" # placeholder substituted at runtime
RPORT = 21
[payload]
path = "cmd/unix/interact"
[payload.options] # optional
# LHOST = "{{ target_ip }}"
[session]
type = "shell"
The only placeholder supported today is {{ target_ip }}. Add more in
exploits/modules.py::ModuleConfig.render_options when needed.
Running
# 1. Start msfrpcd locally:
msfrpcd -P <password> -U msf -a 127.0.0.1 -p 55553
# 2. Drop a vulnerable target image at vm/images/<name>.qcow2 (e.g.
# Metasploitable2 — see docs/sources.md for sha256).
# 3. Drive an episode:
MSFRPC_PASSWORD=<password> uv run python tools/run_tier3_demo.py \
--module vsftpd_234_backdoor \
--target-port 21 \
--data-root data
The episode's events.jsonl will contain:
driver_setup — module + target snapshotted before fire
exploit_fire — module.execute issued
session_open — new session id observed in session.list
session_landing_probe — first command response (id) recorded
sample_executed — workload kicked off inside the session
session_dormant — workload killed
session_killed — session.stop at episode end
These pair with the standard phase labels in labels.jsonl so a
downstream loader can reconcile "what the orchestrator scheduled"
against "what actually happened on the wire".
Adding a module
- Drop a TOML at
exploits/modules/<name>.tomlper the schema above. - Pick a payload that works without a callback channel until the
br-malwarebridge is in (seevm/launch_target.sh— SLIRP +restrict=onblocks reverse-tcp by design).cmd/unix/interactand other "session on the same socket" payloads are safe. - Drive a quick check:
uv run python tools/run_tier3_demo.py --module <name>. - The new module is automatically picked up by
tools/run_tier3_demo.pyvia--module <name>; no driver code changes needed.
We do not author exploits or modify upstream Metasploit code. The driver is a pure adapter from the project's phase machine to msfrpc.