The v1 driver ran ``yes > /dev/null`` for every sample, which
produced the same envelope shape regardless of which malware family
the orchestrator claimed to be running. That's a poor training
signal: the model sees identical /proc + QMP traces tagged
"cryptominer" / "ransomware" / "RAT" with no distinguishing
features. v2 fixes this.
What landed:
exploits/workloads.py — six ``Workload`` profiles, each producing
a distinct in-session shell command pair (start_cmd / stop_cmd)
that backgrounds a profile-shaped loop:
- cpu-saturate: sustained 1-vCPU saturation (XMRig shape)
- scan-and-dial: periodic SYN-style probes across 10.200.0.0/24 plus dial-home to gateway (Mirai shape)
- io-walk: fs traversal + 4 KiB urandom writes, periodic re-read (ransomware shape)
- bursty-c2: long idle, periodic 3-packet TCP egress burst (Dridex C2 beacon shape)
- low-and-slow: minimal CPU + periodic awk-driven memory churn (Kovter / fileless shape)
- shell-resident: single long-lived TCP socket pinned to gateway with periodic 6-byte command ticks (RAT shape)
Each profile uses a /tmp/.cis490-workload-<profile>.{pid,sh} pair so
the stop_cmd can cleanly kill the loop and its descendants.
exploits/driver.py — MSFExploitDriver now accepts an optional
``Sample``. With one supplied, ``infected_running`` dispatches to
the matching workload via exploits.workloads.workload_for(); the
``sample_executed`` event records profile + sample name + sample
kind so the trainer can join cleanly. Without a sample, the v1
yes-loop path remains unchanged (backwards compat).
tools/vm_load_controller.py — the same dispatch on the Tier-2 path
(no exploit, real Alpine guest driven over the serial console).
A fleet wave now produces six visually distinct envelopes
whether the underlying mode is Tier 2 or Tier 3.
tools/run_real_vm_demo.py — accepts ``--sample <name>`` (or
SAMPLE_NAME env from the fleet runner) + auto-wires QMP + agent
sockets into the EpisodeConfig so all three new collectors
(sources 2, 4, 5) run alongside source 1 by default.
tools/run_tier3_demo.py — same ``--sample`` plumbing for the
exploit-driven path.
Tests: 86 pass (was 82). New v2 cases:
- profile dispatch routes infected_running to the workload's
start_cmd (NOT the v1 yes-loop) when a Sample is set
- all six profiles produce distinct start_cmds (the property the
ML model needs)
- unknown profile string falls back to cpu-saturate with a warning
- v1 path (no Sample) still uses yes-loop (backwards compat)
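The fallback behaviour covered by these tests can be sketched roughly as below. This is a minimal illustration, not the code in exploits/workloads.py: `WorkloadPaths` and `resolve_profile` are hypothetical names, and only the profile names, the /tmp path pattern, and the cpu-saturate fallback come from the notes above.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("exploits.workloads")

# Profile names as listed in the notes above.
PROFILES = (
    "cpu-saturate",
    "scan-and-dial",
    "io-walk",
    "bursty-c2",
    "low-and-slow",
    "shell-resident",
)
FALLBACK_PROFILE = "cpu-saturate"


@dataclass(frozen=True)
class WorkloadPaths:
    """Hypothetical stand-in for the real Workload class (assumption)."""

    profile: str

    @property
    def pid_file(self) -> str:
        # Per-profile pid file, following the pattern described above.
        return f"/tmp/.cis490-workload-{self.profile}.pid"

    @property
    def script_file(self) -> str:
        # Per-profile loop script, following the same pattern.
        return f"/tmp/.cis490-workload-{self.profile}.sh"


def resolve_profile(profile: str) -> WorkloadPaths:
    """Map a profile string to its paths, falling back with a warning."""
    if profile not in PROFILES:
        log.warning(
            "unknown profile %r, falling back to %s", profile, FALLBACK_PROFILE
        )
        profile = FALLBACK_PROFILE
    return WorkloadPaths(profile)
```

Keeping the pid and script paths derived from the profile name means two profiles never share a pid file, which is what lets a stop command target only its own loop.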
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
exploits/
The Tier-3 exploit driver — fires a Metasploit module against a
vulnerable target VM, watches for the resulting session, and stamps the
session-open transition into the episode's events.jsonl so the
labeler can mark armed → infecting honestly.
Layout
exploits/
    msfrpc.py     tiny msgpack-over-HTTPS client for msfrpcd
    driver.py     MSFExploitDriver, plugged in as EpisodeRunner.on_phase
    modules.py    ModuleConfig + TOML loader
    modules/
        vsftpd_234_backdoor.toml    first canned module (Metasploitable2)
        ...
Module configs
Each modules/*.toml describes one Metasploit module — its path, the
options to set, and the payload to use. The driver reads these files
to drive module.execute over msfrpc.
description = "..."
[module]
type = "exploit" # exploit | auxiliary | post
path = "unix/ftp/vsftpd_234_backdoor"
[module.options]
RHOSTS = "{{ target_ip }}" # placeholder substituted at runtime
RPORT = 21
[payload]
path = "cmd/unix/interact"
[payload.options] # optional
# LHOST = "{{ target_ip }}"
[session]
type = "shell"
The only placeholder supported today is {{ target_ip }}. Add more in
exploits/modules.py::ModuleConfig.render_options when needed.
Running
# 1. Start msfrpcd locally:
msfrpcd -P <password> -U msf -a 127.0.0.1 -p 55553
# 2. Drop a vulnerable target image at vm/images/<name>.qcow2 (e.g.
# Metasploitable2 — see docs/sources.md for sha256).
# 3. Drive an episode:
MSFRPC_PASSWORD=<password> uv run python tools/run_tier3_demo.py \
--module vsftpd_234_backdoor \
--target-port 21 \
--data-root data
The episode's events.jsonl will contain:
driver_setup — module + target snapshotted before fire
exploit_fire — module.execute issued
session_open — new session id observed in session.list
session_landing_probe — first command response (id) recorded
sample_executed — workload kicked off inside the session
session_dormant — workload killed
session_killed — session.stop at episode end
These pair with the standard phase labels in labels.jsonl so a
downstream loader can reconcile "what the orchestrator scheduled"
against "what actually happened on the wire".
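A downstream loader might start with a quick sanity check that the driver events appear in the order listed above. The sketch below assumes each line of events.jsonl is a JSON object whose event name sits under an `"event"` key; that key name is a guess, so adjust it to the actual schema.

```python
import json

# Event names in the order the driver is described as emitting them.
EXPECTED_ORDER = [
    "driver_setup",
    "exploit_fire",
    "session_open",
    "session_landing_probe",
    "sample_executed",
    "session_dormant",
    "session_killed",
]


def missing_or_out_of_order(lines, key="event"):
    """Return expected event names that are absent or appear out of order.

    `key` is the assumed field holding the event name (hypothetical).
    """
    names = [json.loads(line).get(key) for line in lines if line.strip()]
    problems = []
    cursor = 0  # index after the last expected event matched so far
    for expected in EXPECTED_ORDER:
        try:
            cursor = names.index(expected, cursor) + 1
        except ValueError:
            problems.append(expected)
    return problems


# Usage with in-memory lines; a real run would iterate over the file instead.
sample_lines = [json.dumps({"event": name}) for name in EXPECTED_ORDER]
print(missing_or_out_of_order(sample_lines))  # []
```

An empty list means every stage from setup to teardown was observed in sequence; a non-empty list names the stages that are missing or misplaced.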
Adding a module
- Drop a TOML at exploits/modules/<name>.toml per the schema above.
- Pick a payload that works without a callback channel until the br-malware bridge is in (see vm/launch_target.sh: SLIRP + restrict=on blocks reverse-tcp by design). cmd/unix/interact and other "session on the same socket" payloads are safe.
- Drive a quick check: uv run python tools/run_tier3_demo.py --module <name>.
- The new module is automatically picked up by tools/run_tier3_demo.py via --module <name>; no driver code changes needed.
We do not author exploits or modify upstream Metasploit code. The driver is a pure adapter from the project's phase machine to msfrpc.