Wraps the three remaining 🚧 items from the README so every collector the
threat-model promises is actually live, and the Tier-4 path (real-malware
fetch + upload + exec) works end-to-end as soon as a sha256 lands in
samples/store/. Closes spectral/CIS490#4, #5, #6.

== #6 - Bridge pcap wiring ==

EpisodeConfig grows three optional fields:

    bridge_iface: str | None        # e.g. "br-malware"
    bridge_ip: str = "10.200.0.1"
    pcap_snaplen: int = 256

When bridge_iface is set, EpisodeRunner spawns tcpdump for the duration of
the schedule (network.pcap), stops it cleanly on episode end, and runs
collectors.pcap.bucketize() to produce netflow.jsonl per the 100-ms schema
in docs/data-model.md. EpisodeResult + meta.result gain rows_netflow +
pcap_bytes counters.

vm/launch_demo.sh + launch_target.sh now switch between SLIRP usermode and
tap+bridge based on $BRIDGE: the operator pre-creates the tap as a bridge
member, so the launcher needs no sudo. run_real_vm_demo.py picks BRIDGE up
from env so the fleet runner can opt entire waves into pcap mode by
exporting BRIDGE before invocation.

== #5 - Source 3 perf collector ==

collectors/perf_qemu.py shells out to ``perf stat -p <pid> -I 100 -j`` and
parses the per-event JSON stream. Aggregates one row per interval across
the canonical event set (cycles/instructions/cache-{refs,misses}/
branches/branch-misses/page-faults/context-switches), computes IPC +
cache-miss rate. Tolerates missing events (``<not counted>`` /
``<not supported>``) without dropping the row, and skips cleanly when
``perf`` isn't on PATH or the process can't be attached.

EpisodeConfig.enable_perf=True opts into the collector. It is off by
default because perf needs CAP_SYS_ADMIN or perf_event_paranoid <= 1. When
enabled, it runs as a parallel thread alongside the other collectors;
EpisodeResult.rows_perf records the count.
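The parse-and-aggregate step of the perf collector can be sketched roughly as below. This is not the code in collectors/perf_qemu.py; the function names (aggregate_intervals, _to_float, _ratio) and the output keys (t, ipc, cache_miss_rate) are hypothetical. The JSON field names (interval, event, counter-value) are an assumption based on the per-event lines that recent perf builds print for ``perf stat -I ... -j``.

```python
import json


def _to_float(raw):
    """Return a float, or None for '<not counted>' / '<not supported>'."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


def _ratio(num, den):
    """Divide only when both counters are present and the denominator is non-zero."""
    if num is None or den is None or den == 0:
        return None
    return num / den


def aggregate_intervals(lines):
    """Fold per-event JSON lines into one row per interval timestamp.

    Field names ('interval', 'event', 'counter-value') are assumed, not
    taken from the repository.
    """
    rows = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # perf can interleave non-JSON noise; skip it
        if not isinstance(rec, dict) or "interval" not in rec or "event" not in rec:
            continue
        row = rows.setdefault(rec["interval"], {"t": rec["interval"]})
        # A missing event is stored as None so the row itself survives.
        row[rec["event"]] = _to_float(rec.get("counter-value"))

    out = []
    for t in sorted(rows):
        row = rows[t]
        row["ipc"] = _ratio(row.get("instructions"), row.get("cycles"))
        row["cache_miss_rate"] = _ratio(
            row.get("cache-misses"), row.get("cache-references")
        )
        out.append(row)
    return out
```

Storing None for uncounted events, rather than dropping the row, mirrors the tolerance described above: derived ratios simply become None when an input is missing.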
== #4 - Tier 4 (real-malware fetch + upload + exec) ==

tools/fetch_sample.py: pulls a sample by sha256 from MalwareBazaar (API key
from env or samples/.bazaar.token), unzips with the standard "infected"
password, verifies the resulting binary's sha256, lands at
samples/store/<sha256>. Idempotent: already-staged correct binaries return
immediately.

samples/manifest.py: Sample.binary_path(store_root) resolves to the staged
binary path, or None for mimics / not-yet-fetched real samples.

exploits/workloads.py: real_binary_workload(bytes, sample) builds a
Workload that base64-uploads the binary into the shell session via a
heredoc, decodes + chmods + execs it in the background, captures the PID
for clean stop on dormant. Per-profile pid/bin paths so concurrent samples
in the same guest don't collide.

exploits/driver.py: dispatch order is now:

  1) sample.kind == "real" + binary staged at sample_store_root
     → real_binary_workload (Tier 4)
  2) profile mimic from workloads.workload_for() (Tier 3 v2)
  3) None → driver v1 fallback yes-loop

DriverConfig.sample_store_root is the new field; run_tier3_demo.py wires it
to repo_root/samples/store. driver_setup event records sample_sha256 so
trainers can join Tier-4 episodes against the manifest by hash.

samples/store/.gitkeep added (binaries themselves are gitignored).

Tests: 102 pass (was 86). New suites:

  tests/test_perf_qemu.py - parser + builder + perf-missing fallback
  tests/test_tier4.py - real_binary_workload base64 round-trip, stop-cmd
      kills pidfile, per-profile path isolation, driver dispatch chooses
      real vs mimic correctly, fetcher input validation and
      cached-fast-path

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
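As a rough illustration of the heredoc upload described for real_binary_workload, the sketch below builds a start command (decode, chmod, exec in background, record PID) and a matching stop command. It is not the repository's implementation: build_upload_cmds, the CIS490_EOF marker, and the /tmp/cis490-<profile>.{bin,pid} paths are all hypothetical names chosen for the example.

```python
import base64
import shlex


def build_upload_cmds(binary: bytes, profile: str, workdir: str = "/tmp"):
    """Return (start_cmd, stop_cmd) shell snippets for one profile.

    Paths are keyed by profile so two samples in the same guest do not
    overwrite each other's binary or pidfile. All names are illustrative.
    """
    bin_path = shlex.quote(f"{workdir}/cis490-{profile}.bin")
    pid_path = shlex.quote(f"{workdir}/cis490-{profile}.pid")

    # encodebytes() emits newline-terminated 76-character lines, which
    # keeps each heredoc line short enough for a serial shell session.
    payload = base64.encodebytes(binary).decode("ascii")

    start_cmd = (
        # Quoted marker: the shell performs no expansion inside the heredoc.
        f"base64 -d > {bin_path} <<'CIS490_EOF'\n"
        f"{payload}"
        "CIS490_EOF\n"
        f"chmod +x {bin_path}\n"
        f"{bin_path} >/dev/null 2>&1 &\n"
        f"echo $! > {pid_path}\n"
    )
    stop_cmd = (
        f'kill "$(cat {pid_path})" 2>/dev/null; '
        f"rm -f {pid_path} {bin_path}"
    )
    return start_cmd, stop_cmd
```

A quoted heredoc marker is used so that characters in the payload are never interpreted by the guest shell; the base64 alphabet makes that unlikely anyway, but quoting removes the question.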
75 lines
2.8 KiB
Bash
Executable file
#!/usr/bin/env bash
# Boot the Cirros qcow2 under KVM with QMP and a monitor socket exposed.
#
# This is the v0 VM launcher for phase 2: validate that the orchestrator
# and host /proc collector work against a real qemu-system process. No
# host-only bridge yet, no exploit driver, no payload: just boot and
# idle. We add the bridge and exploit machinery in later phases.
#
# Run dir is exported so the orchestrator can read the qemu pid:
#   $RUN_DIR/qemu.pid
#   $RUN_DIR/qmp.sock
#   $RUN_DIR/monitor.sock

set -euo pipefail

REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
IMAGE="${IMAGE:-$REPO_ROOT/vm/images/alpine-baseline.qcow2}"
CIDATA="${CIDATA:-$REPO_ROOT/vm/images/cidata.iso}"
# SLOT lets the fleet runner spin up N concurrent VMs without socket /
# port collisions. Default RUN_DIR + ssh hostfwd port keep single-VM
# usage unchanged.
SLOT="${SLOT:-0}"
RUN_DIR="${RUN_DIR:-/tmp/cis490-vm-$SLOT}"
SSH_PORT="${SSH_PORT:-$((2222 + SLOT))}"
# When BRIDGE is set, attach a tap to the host-only bridge instead of
# using SLIRP usermode networking. The tap must already exist and be a
# member of the bridge; see vm/setup_bridge.sh + (operator) ip tuntap.
BRIDGE="${BRIDGE:-}"
TAP="${TAP:-cis490tap$SLOT}"

mkdir -p "$RUN_DIR"
QMP_SOCK="$RUN_DIR/qmp.sock"
MON_SOCK="$RUN_DIR/monitor.sock"
PID_FILE="$RUN_DIR/qemu.pid"

if [[ ! -f "$IMAGE" ]]; then
    echo "no image at $IMAGE" >&2
    exit 1
fi
if [[ ! -f "$CIDATA" ]]; then
    echo "no cidata at $CIDATA; build it with: uv run python tools/build_cidata.py $CIDATA" >&2
    exit 1
fi

AGENT_SOCK="$RUN_DIR/agent.sock"

# snapshot=on routes guest writes through a temporary overlay so the qcow2
# on disk is never mutated; every boot starts from the same bytes.
#
# Second virtio-serial port (cis490.guest.agent) carries telemetry
# from the in-guest agent. Surfaces inside the guest at
# /dev/virtio-ports/cis490.guest.agent and on the host at $AGENT_SOCK.
exec qemu-system-x86_64 \
    -name cis490-vm \
    -machine q35,accel=kvm \
    -cpu host \
    -smp 1,sockets=1,cores=1,threads=1 \
    -m 256 \
    -drive file="$IMAGE",format=qcow2,if=virtio,snapshot=on \
    -drive file="$CIDATA",format=raw,if=virtio,readonly=on \
    $(if [[ -n "$BRIDGE" ]]; then \
        echo -n "-netdev tap,id=n0,ifname=$TAP,script=no,downscript=no "; \
    else \
        echo -n "-netdev user,id=n0,hostfwd=tcp:127.0.0.1:$SSH_PORT-:22 "; \
    fi) \
    -device virtio-net-pci,netdev=n0 \
    -device virtio-serial-pci,id=cis490vs0 \
    -chardev socket,id=cis490agent,path="$AGENT_SOCK",server=on,wait=off \
    -device virtserialport,chardev=cis490agent,name=cis490.guest.agent \
    -nographic \
    -serial unix:"$RUN_DIR/serial.sock",server=on,wait=off \
    -monitor unix:"$MON_SOCK",server=on,wait=off \
    -qmp unix:"$QMP_SOCK",server=on,wait=off \
    -pidfile "$PID_FILE" \
    -display none
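On the orchestrator side, the per-slot defaults the launcher derives from SLOT could be mirrored with something like the following. This is a sketch only: SlotLayout, slot_layout, and read_qemu_pid are hypothetical names, and the helper reproduces just the script's defaults (it does not honor RUN_DIR, SSH_PORT, or TAP overrides from the environment).

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class SlotLayout:
    """Default per-slot paths and ports, mirroring the launcher's defaults."""

    run_dir: Path
    ssh_port: int
    tap: str

    @property
    def pid_file(self) -> Path:
        return self.run_dir / "qemu.pid"

    @property
    def qmp_sock(self) -> Path:
        return self.run_dir / "qmp.sock"

    @property
    def monitor_sock(self) -> Path:
        return self.run_dir / "monitor.sock"

    @property
    def agent_sock(self) -> Path:
        return self.run_dir / "agent.sock"


def slot_layout(slot: int = 0) -> SlotLayout:
    # Same arithmetic as the script: RUN_DIR=/tmp/cis490-vm-$SLOT,
    # SSH_PORT=2222+SLOT, TAP=cis490tap$SLOT.
    return SlotLayout(
        run_dir=Path(f"/tmp/cis490-vm-{slot}"),
        ssh_port=2222 + slot,
        tap=f"cis490tap{slot}",
    )


def read_qemu_pid(layout: SlotLayout) -> Optional[int]:
    """Return the pid QEMU wrote via -pidfile, or None if absent or malformed."""
    try:
        return int(layout.pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None
```

For example, `slot_layout(3)` yields SSH port 2225 and run directory /tmp/cis490-vm-3, so a fleet runner can compute where each VM's sockets will appear before the VM has booted.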