CIS490/tools
max 642f7a94d6 runners: take savevm baseline-v1 after boot so revert_at_* actually works
EpisodeConfig.revert_at_start / revert_at_end have been issuing
loadvm "baseline-v1" via QMP since the snapshot/revert wiring landed,
but nothing in the system ever ran savevm. Every loadvm therefore
targeted a snapshot that didn't exist and failed quietly, emitting
snapshot_revert_failed each time. The reverted-baseline mode was, in
effect, dead code.
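
For context, a minimal sketch of why the failure was easy to miss,
assuming savevm/loadvm go through QMP's human-monitor-command (the
"HMP wrapper" covered in tests/test_qmp.py). In that path an HMP error
usually comes back as text inside a successful QMP reply, not as a QMP
error. The helper names below are hypothetical and are not taken from
qmp.py:

```python
def hmp_command(command_line):
    """Build the QMP request that runs one HMP command line."""
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": command_line},
    }


def hmp_failed(reply):
    """Treat a QMP error, or any non-empty HMP output, as failure.

    Assumption: savevm/loadvm print nothing on success, so any text in
    "return" is an error message from the monitor.
    """
    if "error" in reply:
        return True
    return bool(str(reply.get("return", "")).strip())


request = hmp_command("loadvm baseline-v1")
ok_reply = {"return": ""}
# Placeholder text only; the exact QEMU wording is not reproduced here.
bad_reply = {"return": "Error: <monitor message>"}
```

A caller that only checks for an "error" key would treat bad_reply as
success, which is the kind of gap that lets a missing snapshot go
unnoticed.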

Both runners now take a savevm immediately after the guest is up
and reachable, before any workload runs:

  run_real_vm_demo.py — after SerialClient.login() succeeds (Tier 2)
  run_tier3_demo.py   — after _wait_for_tcp on the vulnerable port
                        (Tier 3, before the exploit fires)

Both call qmp.QMPClient.savevm("baseline-v1"). This is best-effort: if
savevm fails (older QEMU, non-qcow2 disk, KVM nesting issue), we log a
warning and run the episode anyway, just without revert support.
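
The best-effort call described above could look roughly like this. It
is a sketch only: QMPError, FakeQMP and take_baseline are hypothetical
stand-ins, and the real runners may catch different exception types
and log differently.

```python
import logging

log = logging.getLogger("runner")

SNAPSHOT_NAME = "baseline-v1"


class QMPError(Exception):
    """Hypothetical error type; the real qmp module may raise another."""


class FakeQMP:
    """Stand-in for qmp.QMPClient so the sketch runs without QEMU."""

    def __init__(self, fail=False):
        self.fail = fail
        self.saved = []

    def savevm(self, name):
        if self.fail:
            raise QMPError("savevm not supported on this disk")
        self.saved.append(name)


def take_baseline(client, name=SNAPSHOT_NAME):
    """Try savevm once; return True if revert is available afterwards."""
    try:
        client.savevm(name)
    except (QMPError, OSError) as exc:
        # Do not abort the episode: it still produces data, only
        # without revert_at_start / revert_at_end support.
        log.warning("savevm %r failed, continuing without revert: %s",
                    name, exc)
        return False
    return True
```

The boolean return lets a runner decide whether issuing loadvm later
makes sense at all, instead of finding out from a failed revert.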

The snapshot_name in EpisodeConfig is unified to "baseline-v1" across
both runners (Tier 3 was previously stamping "qcow2-snapshot-on" into
meta, which didn't match what loadvm would target).
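
To illustrate why the name has to match, a small sketch with a reduced,
hypothetical EpisodeConfig (only the three fields named in this message;
the real class has more) and a hypothetical consistency check that is
not part of this commit:

```python
from dataclasses import dataclass

BASELINE_SNAPSHOT = "baseline-v1"


@dataclass
class EpisodeConfig:
    """Reduced, hypothetical shape; not the real class definition."""
    revert_at_start: bool = False
    revert_at_end: bool = False
    snapshot_name: str = BASELINE_SNAPSHOT


def snapshot_is_consistent(cfg, saved_name):
    """True if revert is off, or loadvm would target the saved snapshot."""
    wants_revert = cfg.revert_at_start or cfg.revert_at_end
    return (not wants_revert) or cfg.snapshot_name == saved_name


tier2 = EpisodeConfig(revert_at_start=True)
old_tier3 = EpisodeConfig(revert_at_end=True,
                          snapshot_name="qcow2-snapshot-on")
```

With the old Tier 3 value, the check fails against a snapshot saved as
"baseline-v1", which mirrors the mismatch fixed here.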

Why both runners take savevm individually instead of a unified path:
the two runners boot different launchers (launch_demo.sh for the
Alpine cidata image, launch_target.sh for the vulnerable target).
Each is responsible for its own QMP socket lifecycle. A shared
savevm helper module would just be a one-line wrapper around the
existing qmp.QMPClient.savevm; not worth the indirection.

Existing test coverage: tests/test_qmp.py exercises
QMPClient.savevm/loadvm against a fake server (HMP wrapper, error
path). The runner-side call is exercised in production but not in
unit tests; covering it would need a fake launcher subprocess, which
is outside this commit's scope.

132/132 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:37:05 -05:00
File | Last commit | Date
build_cidata.py | Collectors 2/4/5 + fleet runner + sample manifest + Tier-3 setup scripts | 2026-04-30 00:02:27 -05:00
cis490_doctor.py | Solvable Tier-3 holes: callback payloads, busybox workloads, bridge by default | 2026-04-30 02:32:52 -05:00
fetch_sample.py | Close out the open issues: bridge pcap wiring, perf collector, Tier-4 | 2026-04-30 00:17:49 -05:00
index_reader.py | Close out the deployment-readiness gaps | 2026-04-30 00:31:55 -05:00
load_mimic.py | Synthetic envelope demo: phase-driven load mimic + plotter | 2026-04-28 23:53:20 -06:00
plot_envelope.py | Close out the deployment-readiness gaps | 2026-04-30 00:31:55 -05:00
run_envelope_demo.py | Synthetic envelope demo: phase-driven load mimic + plotter | 2026-04-28 23:53:20 -06:00
run_fleet.py | fleet: rotate exploit modules per (host, slot, ep); Tier 3 by default | 2026-04-30 02:22:49 -05:00
run_real_vm_demo.py | runners: take savevm baseline-v1 after boot so revert_at_* actually works | 2026-04-30 02:37:05 -05:00
run_tier3_demo.py | runners: take savevm baseline-v1 after boot so revert_at_* actually works | 2026-04-30 02:37:05 -05:00
show_envelope.sh | Interactive envelope plot via WebAgg (browser-based) | 2026-04-29 00:06:22 -06:00
vm_load_controller.py | workload audit trail: meta.sample + per-phase events + pre-kill probe | 2026-04-30 02:12:34 -05:00
vm_serial.py | Tier 2: real Alpine VM, real workload, real envelope | 2026-04-29 08:38:53 -06:00