The v1 driver ran ``yes > /dev/null`` for every sample, which
produced the same envelope shape regardless of which malware family
the orchestrator claimed to be running. That's a poor training
signal: the model sees identical /proc + QMP traces tagged
"cryptominer" / "ransomware" / "RAT" with no distinguishing
features. v2 fixes this by dispatching each sample to a
profile-specific workload.

What landed:

  exploits/workloads.py: six ``Workload`` profiles, each producing
  a distinct in-session shell command pair (start_cmd / stop_cmd)
  that backgrounds a profile-shaped loop:

    cpu-saturate    sustained 1-vCPU saturation (XMRig shape)
    scan-and-dial   periodic SYN-style probes across 10.200.0.0/24
                    + dial-home to gateway (Mirai shape)
    io-walk         fs traversal + 4 KiB urandom writes, periodic
                    re-read (ransomware shape)
    bursty-c2       long idle, periodic 3-packet TCP egress burst
                    (Dridex C2 beacon shape)
    low-and-slow    minimal CPU + periodic awk-driven memory churn
                    (Kovter / fileless shape)
    shell-resident  single long-lived TCP socket pinned to gateway
                    with periodic 6-byte command ticks (RAT shape)

  Each profile uses a /tmp/.cis490-workload-<profile>.{pid,sh} pair so
  the stop_cmd can cleanly kill the loop and its descendants.
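
A minimal sketch of how such a start/stop pair could be built. It assumes a frozen dataclass with the three attributes the callers read (``profile``, ``start_cmd``, ``stop_cmd``). The helper ``make_workload``, the two loop bodies, and the exact kill sequence are hypothetical and are not taken from exploits/workloads.py; only the /tmp/.cis490-workload-<profile>.{pid,sh} naming comes from this commit.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    profile: str
    start_cmd: str
    stop_cmd: str


def make_workload(profile: str, loop_body: str) -> Workload:
    """Wrap loop_body in a backgrounded shell loop (hypothetical helper)."""
    base = f"/tmp/.cis490-workload-{profile}"
    script = f"while :; do {loop_body}; done"
    # Write the loop to <base>.sh, background it, and record its PID.
    start = (
        f"printf '%s\\n' {shlex.quote(script)} > {base}.sh && "
        f"(nohup sh {base}.sh </dev/null >/dev/null 2>&1 & echo $! > {base}.pid)"
    )
    # Kill the loop's direct children, then the loop itself, then clean up.
    # Ends in `true` so the command succeeds even when nothing was running.
    stop = (
        f"pid=$(cat {base}.pid 2>/dev/null); "
        f'[ -n "$pid" ] && {{ pkill -P "$pid"; kill "$pid"; }} 2>/dev/null; '
        f"rm -f {base}.pid {base}.sh; true"
    )
    return Workload(profile, start, stop)


# Illustrative loop bodies only; the real profiles are defined in the commit.
LOOP_BODIES = {
    "cpu-saturate": "yes > /dev/null",
    "low-and-slow": "sleep 30; awk 'BEGIN { for (i = 0; i < 100000; i++) a[i] = i }'",
}

WORKLOADS = {name: make_workload(name, body) for name, body in LOOP_BODIES.items()}
```

Because the profile name is part of both file paths, two profiles never share a pid file, and each stop_cmd touches only its own loop.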

  exploits/driver.py: MSFExploitDriver now accepts an optional
  ``Sample``. With one supplied, ``infected_running`` dispatches to
  the matching workload via exploits.workloads.workload_for(); the
  ``sample_executed`` event records profile + sample name + sample
  kind so the trainer can join cleanly. Without a sample, the v1
  yes-loop path remains unchanged (backwards compat).
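
The dispatch rules above can be sketched as follows. This is an illustration under stated assumptions, not the code in exploits/driver.py: the ``Sample`` fields (``name``, ``kind``, ``profile``), the two-entry ``CATALOG`` with placeholder commands, the event keys, and the choice to emit no event on the v1 path are all hypothetical.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

log = logging.getLogger("sketch.dispatch")

# The v1 fallback command, used when no Sample is supplied.
V1_YES_LOOP = "nohup sh -c 'yes > /dev/null' </dev/null >/dev/null 2>&1 &"


@dataclass(frozen=True)
class Sample:  # assumed fields
    name: str
    kind: str
    profile: str


@dataclass(frozen=True)
class Workload:
    profile: str
    start_cmd: str
    stop_cmd: str


# Placeholder commands; the real catalog holds six profiles.
CATALOG = {
    p: Workload(p, f"start-{p}", f"stop-{p}")
    for p in ("cpu-saturate", "io-walk")
}


def workload_for(sample: Sample | None) -> Workload | None:
    """Return None without a sample; fall back to cpu-saturate if unknown."""
    if sample is None:
        return None
    workload = CATALOG.get(sample.profile)
    if workload is None:
        log.warning("unknown profile %r, using cpu-saturate", sample.profile)
        workload = CATALOG["cpu-saturate"]
    return workload


def infected_running(sample: Sample | None) -> tuple[str, dict | None]:
    """Return the command to send and the event to record (None on v1 path)."""
    workload = workload_for(sample)
    if workload is None:
        return V1_YES_LOOP, None
    event = {
        "event": "sample_executed",
        "profile": workload.profile,
        "sample_name": sample.name,
        "sample_kind": sample.kind,
    }
    return workload.start_cmd, event
```

Recording the profile that actually ran, rather than the one the sample asked for, keeps the event truthful when the fallback fires.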

  tools/vm_load_controller.py: the same dispatch on the Tier-2 path
  (no exploit, real Alpine guest driven over the serial console).
  A fleet wave now produces six visually distinct envelopes
  whether the underlying mode is Tier 2 or Tier 3.

  tools/run_real_vm_demo.py: accepts ``--sample <name>`` (or the
  SAMPLE_NAME env var set by the fleet runner) and auto-wires the QMP
  and agent sockets into the EpisodeConfig so all three new collectors
  (sources 2, 4, 5) run alongside source 1 by default.

  tools/run_tier3_demo.py: same ``--sample`` plumbing for the
  exploit-driven path.
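
One way the ``--sample`` / SAMPLE_NAME plumbing could look, as a hedged sketch. It assumes an explicit flag takes precedence over the environment variable, which the commit does not state; ``resolve_sample_name`` is a hypothetical helper and not a function from either demo script.

```python
from __future__ import annotations

import argparse
import os
from collections.abc import Mapping


def resolve_sample_name(
    argv: list[str], environ: Mapping[str, str] | None = None
) -> str | None:
    """Pick the sample name from --sample, else SAMPLE_NAME, else None."""
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="demo runner (sketch)")
    parser.add_argument("--sample", default=None, help="sample name to run")
    args = parser.parse_args(argv)
    # Assumed precedence: CLI flag first, then the fleet runner's env var.
    return args.sample or env.get("SAMPLE_NAME") or None
```

Returning None when neither source is set leaves the caller free to take the v1 path.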

Tests: 86 pass (was 82). New v2 cases:
  - profile dispatch routes infected_running to the workload's
    start_cmd (NOT the v1 yes-loop) when a Sample is set
  - all six profiles produce distinct start_cmds (the property the
    ML model needs)
  - unknown profile string falls back to cpu-saturate with a warning
  - v1 path (no Sample) still uses yes-loop (backwards compat)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
96 lines
3.6 KiB
Python
"""In-guest load controller for tier-2 episodes.

Drives a real Alpine VM through the same phase schedule the orchestrator
follows, but the load this time is generated *inside* the guest by busybox
``yes`` / ``dd`` / a small marker file. The host /proc collector still
samples the qemu-system process from outside; what's "real" here is the
workload itself, not the orchestrator's view of it.

Phase commands (all run via the SerialClient):

  clean: kill any running load, idle.
  armed: small disk write (handshake-shape).
  infecting: disk burst: 512 KiB urandom write to /tmp/payload.
  infected_running: background the sample's profile workload; without a
      sample, ``yes > /dev/null`` for sustained CPU.
  dormant: kill background load (back to idle).

Designed to mimic the envelope of an XMRig-class compromise without
running real malware. Tier-3 will replace this with msf-driven exploit
fire and a real sample.
"""

from __future__ import annotations

import logging
import sys
from pathlib import Path

from vm_serial import SerialClient

# Allow running as a script (sibling of tools/).
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from exploits.workloads import Workload, workload_for  # noqa: E402
from samples.manifest import Sample  # noqa: E402


log = logging.getLogger("cis490.vm_load_controller")


class VMLoadController:
    """Drives a real Alpine guest through the phase schedule for
    Tier 2 (no exploit). Workload is chosen by ``sample.profile``, the
    same profile catalog as the Tier-3 driver, so a fleet wave
    produces matched envelopes whether or not an exploit fires.

    Without a sample, falls back to the original cpu-saturate yes-loop
    (the original Tier-2 demo behaviour)."""

    def __init__(self, serial: SerialClient, sample: Sample | None = None) -> None:
        self.s = serial
        self.sample = sample
        self.workload: Workload | None = workload_for(sample)

    def setup(self) -> None:
        # Kill any pre-existing load and clear scratch space.
        self._kill_load()
        self.s.run("rm -f /tmp/payload /tmp/armed.log; echo setup-ok")

    def teardown(self) -> None:
        self._kill_load()

    # ---- phases ---------------------------------------------------------

    def set_phase(self, phase: str) -> None:
        log.info("vm phase -> %s (profile=%s)",
                 phase, self.workload.profile if self.workload else "v1")
        if phase == "clean":
            self._kill_load()
        elif phase == "armed":
            self.s.run("echo armed-handshake-$(date +%s) > /tmp/armed.log")
        elif phase == "infecting":
            # 128 blocks of 4 KiB = 512 KiB of random data, marked executable.
            self.s.run(
                "dd if=/dev/urandom of=/tmp/payload bs=4k count=128 2>/dev/null && "
                "chmod +x /tmp/payload"
            )
        elif phase == "infected_running":
            # Stop whatever is still running before starting the new load.
            self._kill_load()
            if self.workload is not None:
                self.s.run(self.workload.start_cmd)
            else:
                self.s.run(
                    "nohup sh -c 'yes > /dev/null' </dev/null >/dev/null 2>&1 & disown"
                )
        elif phase == "dormant":
            self._kill_load()
        else:
            log.warning("unknown phase: %s", phase)

    # ---- internals ------------------------------------------------------

    def _kill_load(self) -> None:
        if self.workload is not None:
            self.s.run(self.workload.stop_cmd)
        # Always sweep the v1 leftover commands too, in case we just
        # switched profiles mid-fleet-run.
        self.s.run("pkill yes 2>/dev/null; pkill stress-ng 2>/dev/null; true")