EpisodeConfig.revert_at_start / revert_at_end have been issuing
loadvm "baseline-v1" via QMP since the snapshot/revert wiring landed,
but nothing in the system ever ran savevm. loadvm therefore targeted a
snapshot that didn't exist and silently emitted snapshot_revert_failed
every time. The reverted-baseline mode was, in effect, dead code.

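For context, a minimal sketch of how such a failure can surface when
savevm/loadvm are tunnelled through QMP's human-monitor-command. The
function names hmp_request and hmp_failed are hypothetical and are not
the ones in qmp.py; the sketch also assumes that savevm/loadvm print
nothing on success, so any text in the reply counts as an error.

```python
def hmp_request(command_line):
    """Build the QMP request that tunnels one HMP command line."""
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
    }


def hmp_failed(reply):
    """Return True if the reply indicates the HMP command failed.

    Assumption: savevm/loadvm produce no output on success, so any
    non-empty "return" string is treated as an error message. A
    QMP-level "error" object is always a failure.
    """
    if "error" in reply:
        return True
    return bool(str(reply.get("return", "")).strip())
```

With this shape, hmp_request("loadvm baseline-v1") yields the request
to send, and an empty "return" string is the only reply treated as
success.
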
Both runners now take a savevm immediately after the guest is up
and reachable, before any workload runs:

  - run_real_vm_demo.py: after SerialClient.login() succeeds (Tier 2)
  - run_tier3_demo.py: after _wait_for_tcp on the vulnerable port
    (Tier 3, before the exploit fires)

Both call qmp.QMPClient.savevm("baseline-v1"). The call is best-effort:
if savevm fails (older QEMU, non-qcow2 disk, KVM nesting issue), we log
a warning and run the episode anyway, just without revert support.

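The best-effort pattern can be sketched roughly as follows. This is an
illustration only: take_baseline_snapshot is a hypothetical name used
so the logic can be read in isolation (the runners call
QMPClient.savevm directly), and the sketch assumes savevm raises an
exception on failure, which may not match the real client.

```python
import logging

log = logging.getLogger("runner")

BASELINE_SNAPSHOT = "baseline-v1"


def take_baseline_snapshot(qmp_client, name=BASELINE_SNAPSHOT):
    """Try to save a baseline snapshot; never abort the episode.

    Returns True if the snapshot was taken, False if savevm failed.
    Assumption: qmp_client.savevm(name) raises on failure.
    """
    try:
        qmp_client.savevm(name)
    except Exception as exc:  # best-effort: any failure only disables revert
        log.warning(
            "savevm %r failed (%s); continuing without revert support",
            name,
            exc,
        )
        return False
    return True
```

A caller can use the return value to decide whether a later loadvm is
worth attempting at all.
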
The snapshot_name in EpisodeConfig is unified to "baseline-v1" across
both runners (Tier 3 was previously stamping "qcow2-snapshot-on" into
meta, which didn't match what loadvm would target).

Why both runners take savevm individually instead of a unified path:
the two runners boot different launchers (launch_demo.sh for the
Alpine cidata image, launch_target.sh for the vulnerable target).
Each is responsible for its own QMP socket lifecycle. A shared
savevm helper module would just be a one-line wrapper around the
existing qmp.QMPClient.savevm, which is not worth the indirection.

Existing test coverage: tests/test_qmp.py exercises
QMPClient.savevm/loadvm against a fake server (HMP wrapper, error
path). The runner-side call is exercised in production but not in
unit tests; covering it would need a fake launcher subprocess, which
is outside this commit's scope.

132/132 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>