runners: take savevm baseline-v1 after boot so revert_at_* actually works

EpisodeConfig.revert_at_start / revert_at_end have been issuing
loadvm "baseline-v1" via QMP since the snapshot/revert wiring landed,
but no part of the system was running savevm — so loadvm targeted a
snapshot that didn't exist and silently emitted snapshot_revert_failed
every time. The reverted-baseline mode was, in effect, dead code.
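
For reference, savevm and loadvm are HMP commands, and over QMP they
are commonly reached through human-monitor-command. The sketch below
shows that request shape. The helper names are hypothetical, and
whether qmp.QMPClient frames its requests exactly this way is an
assumption (the tests only mention an HMP wrapper). Treating any
non-empty HMP output as a failure is also a simplifying assumption.

```python
import json


def hmp_request(command_line: str) -> bytes:
    """Encode an HMP command line as a QMP human-monitor-command request."""
    msg = {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
    }
    return json.dumps(msg).encode("utf-8") + b"\n"


def hmp_output(reply: dict) -> str:
    """Return the HMP text from a QMP reply; raise on a QMP-level error."""
    if "error" in reply:
        raise RuntimeError(reply["error"].get("desc", "unknown QMP error"))
    return reply.get("return", "")


def looks_failed(output: str) -> bool:
    # Assumption: a successful savevm/loadvm prints nothing, so any
    # text coming back is treated as a failure message.
    return bool(output.strip())
```

A caller that never checks the output of loadvm will not notice a
missing snapshot, which matches the failure mode described above.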

Both runners now take a savevm immediately after the guest is up
and reachable, before any workload runs:

  run_real_vm_demo.py — after SerialClient.login() succeeds (Tier 2)
  run_tier3_demo.py   — after _wait_for_tcp on the vulnerable port
                        (Tier 3, before the exploit fires)

Both call qmp.QMPClient.savevm("baseline-v1"). The call is best-effort:
if savevm fails (older qemu, non-qcow2 disk, KVM nesting issue), we log
a warning and run the episode anyway, just without revert support. If
the QMP socket file does not exist, the savevm step is skipped without
a warning.
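
The best-effort pattern can be sketched in isolation as follows.
StubQMPClient and try_savevm are hypothetical names used only for
illustration; the runners inline this logic against the real
qmp.QMPClient, whose constructor and return values are not reproduced
here.

```python
import logging

log = logging.getLogger("runner")


class StubQMPClient:
    """Hypothetical stand-in for qmp.QMPClient; not the real class."""

    def __init__(self, fail: bool = False):
        self.fail = fail
        self.closed = False

    def connect(self) -> None:
        pass

    def savevm(self, name: str) -> str:
        if self.fail:
            raise RuntimeError(f"savevm {name} not supported")
        return ""

    def close(self) -> None:
        self.closed = True


def try_savevm(client, name: str = "baseline-v1") -> bool:
    """Best-effort snapshot: True on success, False after a warning."""
    try:
        client.connect()
        try:
            out = client.savevm(name)
            log.info("savevm %s OK: %s", name, out.strip()[:160])
        finally:
            client.close()  # always release the socket once connected
    except Exception as e:
        # Swallow the error so the episode still runs without revert.
        log.warning("savevm failed; revert unusable: %s", e)
        return False
    return True
```

Calling try_savevm(StubQMPClient(fail=True)) returns False instead of
raising, so the caller can continue into the episode.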

The snapshot_name in EpisodeConfig is unified to "baseline-v1" across
both runners (Tier 3 was previously stamping "qcow2-snapshot-on" into
meta, which didn't match what loadvm would target).

Why both runners take savevm individually instead of a unified path:
the two runners boot different launchers (launch_demo.sh for the
Alpine cidata image, launch_target.sh for the vulnerable target).
Each is responsible for its own QMP socket lifecycle. A shared
savevm helper module would just be a one-line wrapper around the
existing qmp.QMPClient.savevm; not worth the indirection.

Existing test coverage: tests/test_qmp.py exercises
QMPClient.savevm/loadvm against a fake server (HMP wrapper, error
path). The runner-side call is exercised in production but not in
unit tests; covering it would need a fake launcher subprocess, which
is outside this commit's scope.
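
As a rough idea of what runner-side coverage could look like without
a launcher subprocess, the sketch below checks only the ordering
(login, then savevm, then workload setup) using unittest.mock. The
function boot_then_snapshot is hypothetical: it condenses the runner
flow for illustration and does not exist in the repository.

```python
from unittest import mock


def boot_then_snapshot(serial, qmp_client, controller) -> None:
    """Hypothetical condensed runner flow: login, savevm, then setup."""
    serial.login()
    try:
        qmp_client.savevm("baseline-v1")
    except Exception:
        pass  # best-effort, mirroring the runners' behaviour
    controller.setup()


def check_order(savevm_error=None) -> list:
    """Run the flow against mocks and return the recorded call order."""
    parent = mock.Mock()
    if savevm_error is not None:
        parent.qmp.savevm.side_effect = savevm_error
    boot_then_snapshot(parent.serial, parent.qmp, parent.controller)
    return parent.mock_calls


EXPECTED = [
    mock.call.serial.login(),
    mock.call.qmp.savevm("baseline-v1"),
    mock.call.controller.setup(),
]
```

The same expected order should hold whether savevm succeeds or
raises, since a failed snapshot must not block the workload.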

132/132 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
max 2026-04-30 02:37:05 -05:00
parent 507eac617b
commit 642f7a94d6
2 changed files with 40 additions and 2 deletions

@@ -27,6 +27,7 @@ from pathlib import Path
 sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
 sys.path.insert(0, str(Path(__file__).resolve().parent))
+from collectors import qmp  # noqa: E402
 from orchestrator.episode import EpisodeConfig, EpisodeRunner  # noqa: E402
 from samples.manifest import SampleManifest  # noqa: E402
 from vm_load_controller import VMLoadController  # noqa: E402
@@ -169,6 +170,26 @@ def main() -> int:
 serial.connect()
 serial.login(boot_timeout_s=args.boot_timeout)
+# Take a savevm AFTER the guest is fully up but BEFORE we
+# start any workload. EpisodeConfig.revert_at_{start,end} use
+# this snapshot for inter-episode reverts (the snapshot lives
+# in the qcow2's per-VM-process overlay since launch_demo.sh
+# runs with snapshot=on, so it's discarded with the VM).
+# Without this step, loadvm would target a snapshot that
+# doesn't exist and silently emit snapshot_revert_failed.
+qmp_sock = run_dir / "qmp.sock"
+if qmp_sock.exists():
+    try:
+        _qmp = qmp.QMPClient(qmp_sock)
+        _qmp.connect()
+        try:
+            out = _qmp.savevm("baseline-v1")
+            log.info("savevm baseline-v1 OK: %s", out.strip()[:160])
+        finally:
+            _qmp.close()
+    except Exception as e:
+        log.warning("savevm failed; revert_at_start unusable: %s", e)
 # Bind the controller to the runner's event log so workload
 # success/failure shows up alongside phase_transition events.
 # Sample also goes into EpisodeConfig below so meta.sample
@@ -184,7 +205,6 @@ def main() -> int:
 )
 controller.setup()
-qmp_sock = run_dir / "qmp.sock"
 agent_sock = run_dir / "agent.sock"
 cfg = EpisodeConfig(
 target_pid=qemu_pid,

@@ -33,6 +33,7 @@ from pathlib import Path
 # Allow running as a script.
 sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
+from collectors import qmp  # noqa: E402
 from exploits.driver import DriverConfig, MSFExploitDriver  # noqa: E402
 from exploits.modules import load_module_config  # noqa: E402
 from exploits.msfrpc import MSFRpcClient, MSFRpcConfig  # noqa: E402
@@ -207,6 +208,23 @@ def main() -> int:
 _wait_for_tcp(args.target_ip, args.target_port, args.target_boot_timeout)
 log.info("target service is up")
+# Pre-exploit savevm so EpisodeConfig.revert_at_{start,end}
+# has a known-good baseline to load. Best-effort: we still
+# run the episode if savevm fails (just without revert
+# support). See run_real_vm_demo.py for the same pattern.
+qmp_sock = run_dir / "qmp.sock"
+if qmp_sock.exists():
+    try:
+        _qmp = qmp.QMPClient(qmp_sock)
+        _qmp.connect()
+        try:
+            out = _qmp.savevm("baseline-v1")
+            log.info("savevm baseline-v1 OK: %s", out.strip()[:160])
+        finally:
+            _qmp.close()
+    except Exception as e:
+        log.warning("savevm failed; revert_at_start unusable: %s", e)
 client = MSFRpcClient(
 MSFRpcConfig(
 host=args.msfrpc_host,
@@ -223,7 +241,7 @@ def main() -> int:
 data_root=Path(args.data_root),
 phase_schedule=DEFAULT_SCHEDULE,
 image_name=module.name + "-target",
-snapshot_name="qcow2-snapshot-on",
+snapshot_name="baseline-v1",
 sample=sample,
 exploit_meta={
 "framework": "metasploit",