runners: take savevm baseline-v1 after boot so revert_at_* actually works

EpisodeConfig.revert_at_start / revert_at_end have been issuing
loadvm "baseline-v1" via QMP since the snapshot/revert wiring landed,
but nothing in the system ever ran savevm. loadvm therefore targeted a
snapshot that didn't exist and silently emitted snapshot_revert_failed
every time. The reverted-baseline mode was, in effect, dead code.

Both runners now take a savevm immediately after the guest is up and
reachable, before any workload runs:

  run_real_vm_demo.py: after SerialClient.login() succeeds (Tier 2)
  run_tier3_demo.py:   after _wait_for_tcp on the vulnerable port
                       (Tier 3, before the exploit fires)

Both call qmp.QMPClient.savevm("baseline-v1"). The call is best-effort:
if savevm fails (older QEMU, non-qcow2 disk, KVM nesting issue), we log
a warning and run the episode anyway, just without revert support.

The snapshot_name in EpisodeConfig is unified to "baseline-v1" across
both runners. Tier 3 was previously stamping "qcow2-snapshot-on" into
meta, which didn't match what loadvm would target.

Why both runners take savevm individually instead of through a unified
path: the two runners boot different launchers (launch_demo.sh for the
Alpine cidata image, launch_target.sh for the vulnerable target), and
each is responsible for its own QMP socket lifecycle. A shared savevm
helper module would just be a one-line wrapper around the existing
qmp.QMPClient.savevm, which is not worth the indirection.

Existing test coverage: tests/test_qmp.py exercises
QMPClient.savevm/loadvm against a fake server (HMP wrapper, error
path). The runner-side call is exercised in production but not in unit
tests; covering it would need a fake launcher subprocess, which is
outside this commit's scope.

132/132 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
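
For readers unfamiliar with the "HMP wrapper" mentioned above: QMP has no direct savevm command in the form used here, so savevm/loadvm are typically sent as HMP text through QMP's human-monitor-command. The sketch below shows roughly what that looks like on the wire. It is an illustration only, not the code in collectors/qmp.py; the helper names encode_hmp and decode_hmp_reply are hypothetical, and the real QMPClient may frame or parse messages differently.

```python
import json


def encode_hmp(command_line: str) -> bytes:
    """Wrap an HMP command line in a QMP human-monitor-command request.

    Hypothetical helper for illustration; not part of this repository.
    """
    request = {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
    }
    return json.dumps(request).encode("utf-8") + b"\r\n"


def decode_hmp_reply(raw: bytes) -> str:
    """Return the HMP text output from a QMP reply, or raise on a QMP error.

    Hypothetical helper for illustration; not part of this repository.
    """
    reply = json.loads(raw)
    if "error" in reply:
        # QMP-level failure (e.g. malformed request).
        raise RuntimeError(reply["error"].get("desc", "unknown QMP error"))
    # HMP output comes back as a plain string under "return".
    return reply.get("return", "")


# The two requests the runners and EpisodeConfig would correspond to.
savevm_request = encode_hmp("savevm baseline-v1")
loadvm_request = encode_hmp("loadvm baseline-v1")
```

Note that a failure inside the HMP command itself (for example, loadvm on a snapshot that was never saved) is generally reported as text in the "return" string rather than as a QMP "error" object, which is one way a missing snapshot can go unnoticed if the caller does not inspect the output.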
2026-04-30 02:37:05 -05:00 · 642f7a94d6
commit 642f7a94d6
parent 507eac617b
2 changed files with 40 additions and 2 deletions
--- a/tools/run_real_vm_demo.py
+++ b/tools/run_real_vm_demo.py
@@ -27,6 +27,7 @@ from pathlib import Path
 sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
 sys.path.insert(0, str(Path(__file__).resolve().parent))

+from collectors import qmp  # noqa: E402
 from orchestrator.episode import EpisodeConfig, EpisodeRunner  # noqa: E402
 from samples.manifest import SampleManifest  # noqa: E402
 from vm_load_controller import VMLoadController  # noqa: E402
@@ -169,6 +170,26 @@ def main() -> int:
        serial.connect()
        serial.login(boot_timeout_s=args.boot_timeout)

+        # Take a savevm AFTER the guest is fully up but BEFORE we
+        # start any workload. EpisodeConfig.revert_at_{start,end} use
+        # this snapshot for inter-episode reverts (the snapshot lives
+        # in the qcow2's per-VM-process overlay since launch_demo.sh
+        # runs with snapshot=on, so it's discarded with the VM).
+        # Without this step, loadvm would target a snapshot that
+        # doesn't exist and silently emit snapshot_revert_failed.
+        qmp_sock = run_dir / "qmp.sock"
+        if qmp_sock.exists():
+            try:
+                _qmp = qmp.QMPClient(qmp_sock)
+                _qmp.connect()
+                try:
+                    out = _qmp.savevm("baseline-v1")
+                    log.info("savevm baseline-v1 OK: %s", out.strip()[:160])
+                finally:
+                    _qmp.close()
+            except Exception as e:
+                log.warning("savevm failed; revert_at_start unusable: %s", e)
+
        # Bind the controller to the runner's event log so workload
        # success/failure shows up alongside phase_transition events.
        # Sample also goes into EpisodeConfig below so meta.sample
@@ -184,7 +205,6 @@ def main() -> int:
        )
        controller.setup()

-        qmp_sock = run_dir / "qmp.sock"
        agent_sock = run_dir / "agent.sock"
        cfg = EpisodeConfig(
            target_pid=qemu_pid,
--- a/tools/run_tier3_demo.py
+++ b/tools/run_tier3_demo.py
@@ -33,6 +33,7 @@ from pathlib import Path
 # Allow running as a script.
 sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

+from collectors import qmp  # noqa: E402
 from exploits.driver import DriverConfig, MSFExploitDriver  # noqa: E402
 from exploits.modules import load_module_config  # noqa: E402
 from exploits.msfrpc import MSFRpcClient, MSFRpcConfig  # noqa: E402
@@ -207,6 +208,23 @@ def main() -> int:
        _wait_for_tcp(args.target_ip, args.target_port, args.target_boot_timeout)
        log.info("target service is up")

+        # Pre-exploit savevm so EpisodeConfig.revert_at_{start,end}
+        # has a known-good baseline to load. Best-effort — we still
+        # run the episode if savevm fails (just without revert
+        # support). See run_real_vm_demo.py for the same pattern.
+        qmp_sock = run_dir / "qmp.sock"
+        if qmp_sock.exists():
+            try:
+                _qmp = qmp.QMPClient(qmp_sock)
+                _qmp.connect()
+                try:
+                    out = _qmp.savevm("baseline-v1")
+                    log.info("savevm baseline-v1 OK: %s", out.strip()[:160])
+                finally:
+                    _qmp.close()
+            except Exception as e:
+                log.warning("savevm failed; revert_at_start unusable: %s", e)
+
        client = MSFRpcClient(
            MSFRpcConfig(
                host=args.msfrpc_host,
@@ -223,7 +241,7 @@ def main() -> int:
            data_root=Path(args.data_root),
            phase_schedule=DEFAULT_SCHEDULE,
            image_name=module.name + "-target",
-            snapshot_name="qcow2-snapshot-on",
+            snapshot_name="baseline-v1",
            sample=sample,
            exploit_meta={
                "framework": "metasploit",