Wraps the gaps surfaced in the "what is not implemented" audit so the
fleet really is shippable end-to-end. Verified live on the Pi:
- cis490-shipper --ping → HTTP 200 through Caddy + mTLS via the
new wg-pki client CA leaf
- real episode dir → tar+zstd → PUT → HTTP 201 stored
- re-ship same bytes → 200 (idempotent)
- re-ship different bytes under same id → 409 (conflict)
Changes:
orchestrator/episode.py
- EpisodeConfig.revert_at_start / revert_at_end (Tier 0+ snapshot/
revert per docs/architecture.md). When set + qmp_socket present,
EpisodeRunner issues loadvm <snapshot_name> and emits
snapshot_revert / snapshot_revert_failed events on the same
monotonic clock as everything else.
collectors/qmp.py
- savevm() / loadvm() helpers using human-monitor-command, plus a
test against the fake QMP server.
exploits/workloads.py
- chunked_real_binary_upload() returns a ChunkedUpload plan: 8 KiB
base64 chunks (~6 KiB binary each) so msfrpc never sees a buffer-
busting payload. Includes a finalize step that sha256-verifies on
the guest before exec.
- real_binary_workload() now wraps the chunked plan for backwards
compat with single-shot callers.
exploits/driver.py
- Tier-4 dispatch walks the chunked plan in MSFExploitDriver:
each chunk is a separate session_shell_write; finalize verifies;
exec only runs on sha-ok. New events: real_binary_upload_begin,
real_binary_verify, real_binary_aborted.
etc/cis490-orchestrator.service
- Reads /etc/cis490/lab-host.env (FLEET_HOST_ID + optional BRIDGE).
- Grants AmbientCapabilities CAP_NET_RAW (tcpdump for source 4) +
CAP_SYS_ADMIN + CAP_PERFMON (perf for source 3) so collectors
work under hardening.
scripts/install-lab-host.sh
- Writes /etc/cis490/lab-host.env on first install with FLEET_HOST_ID
defaulting to `hostname -s`.
- Best-effort: fetches the Alpine baseline qcow2 (sha512-pinned) and
builds cidata.iso with the in-guest agent embedded; symlinks both
into /opt/cis490/vm/images/ so launchers find them.
scripts/fetch-alpine-baseline.sh
- Idempotent fetcher for the Alpine 3.21 cloud-init nocloud qcow2
matching the sha512 in docs/sources.md.
tools/plot_envelope.py
- Rebuilt to render whatever telemetry the episode dir contains:
proc → QMP block ops → perf IPC/miss-rate → bridge pkts/SYNs →
guest agent load/mem. Missing sources are silently skipped.
tools/index_reader.py
- cis490-index CLI: filter receiver's index.jsonl by host / sample
/ time range, sort, count-by group. Closest thing to a query
interface until we stand up Postgres/Timescale.
samples/README.md
- Rewritten to match the new manifest schema, the kind=real vs mimic
split, the per-(host, slot, ep) selection mechanic, and the
chunked-upload safety story.
Tests: 106 pass (was 102). New cases:
- test_qmp.py — savevm + loadvm (HMP wrapper + error path)
- test_tier4.py — chunked plan splitting, sha-pinned finalize,
end-to-end driver walks all chunks + verify + exec via the fake
msfrpc client
Closes the "what is not implemented" punch list.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
50 lines
1.7 KiB
Desktop File
50 lines
1.7 KiB
Desktop File
[Unit]
|
|
Description=CIS490 lab-host episode orchestrator (fleet mode)
|
|
Documentation=https://maxgit.wg/spectral/CIS490
|
|
# Episodes need KVM. msfrpcd (for Tier 3+) is brought up out-of-band
|
|
# by cis490-msfrpcd.service when installed.
|
|
After=network-online.target wg-quick@wg0.service
|
|
Wants=network-online.target
|
|
|
|
[Service]
|
|
Type=simple
|
|
User=cis490
|
|
Group=cis490
|
|
WorkingDirectory=/opt/cis490
|
|
# /etc/cis490/lab-host.env is written by scripts/install-lab-host.sh;
|
|
# carries FLEET_HOST_ID, BRIDGE, and any operator-supplied overrides.
|
|
EnvironmentFile=/etc/cis490/lab-host.env
|
|
# Fleet mode: detect host capacity, run that many concurrent episodes
|
|
# per wave with samples drawn from the manifest. Each invocation runs
|
|
# one wave and exits; systemd respawns per Restart= below, giving us
|
|
# a continuous stream of fresh-sample episodes per host. The shipper
|
|
# picks them up as `done.marker` files appear.
|
|
ExecStart=/opt/cis490/.venv/bin/python /opt/cis490/tools/run_fleet.py \
|
|
--data-root /var/lib/cis490/data \
|
|
--manifest /opt/cis490/samples/manifest.toml \
|
|
--waves 1
|
|
Restart=always
|
|
RestartSec=15
|
|
|
|
# Hardening — note we explicitly grant CAP_NET_RAW for tcpdump
|
|
# (source 4) and CAP_SYS_ADMIN for perf (source 3) when the operator
|
|
# enables those. Both are inherited by per-episode subprocesses.
|
|
NoNewPrivileges=false
|
|
PrivateTmp=true
|
|
ProtectSystem=strict
|
|
ProtectHome=true
|
|
ReadWritePaths=/var/lib/cis490 /tmp
|
|
SupplementaryGroups=kvm
|
|
AmbientCapabilities=CAP_NET_RAW CAP_NET_ADMIN CAP_SYS_ADMIN CAP_PERFMON
|
|
CapabilityBoundingSet=CAP_NET_RAW CAP_NET_ADMIN CAP_SYS_ADMIN CAP_PERFMON CAP_DAC_READ_SEARCH
|
|
|
|
# Hardening
|
|
NoNewPrivileges=true
|
|
PrivateTmp=true
|
|
ProtectSystem=strict
|
|
ProtectHome=true
|
|
ReadWritePaths=/var/lib/cis490
|
|
SupplementaryGroups=kvm
|
|
|
|
[Install]
|
|
WantedBy=multi-user.target
|