# Lab Setup

How to bring up the host, build the guest, and verify the snapshot loop.

## Host prerequisites

```
qemu-system-x86_64 >= 8.0
qemu-img >= 8.0
bridge-utils
tcpdump / tshark
linux-tools-common   (for `perf`)
zstd
python >= 3.11
uv                   (https://github.com/astral-sh/uv)
```

`scripts/install-lab-host.sh` installs all of these and wires up systemd; see [`deploy.md`](deploy.md).

KVM must be enabled in the kernel and the user must be in the `kvm` group:

```
ls /dev/kvm        # must exist
groups             # must include kvm
```

## Network: host-only malware bridge

`br-malware` (10.200.0.1/24) is the only network the guest sees, and it is host-only: no NAT, no upstream route. The host's WG interface is on a *separate* link (`wg0`) used only for shipping completed episodes to the collector; the bridge and WG never touch.

| Interface | Purpose |
|---|---|
| `br-malware` (10.200.0.1/24) | host-only bridge, only NIC attached to the guest |
| guest `eth0` | DHCP from a dnsmasq bound only to `br-malware` |
| host WG (`wg0`) | shipping channel to the collector; not connected to the bridge |

> Detailed firewall rules and the egress-drop safety net are out of scope for
> this document and live in the deploy script. The relevant invariant for
> readers is: **the guest cannot route off `br-malware`, period.**

## Guest: Metasploitable 2

1. Download from the [Rapid7 mirror](https://information.rapid7.com/download-metasploitable-2017.html) (verify sha256 against the published value before use).
2. Convert VMware → qcow2:
   ```
   qemu-img convert -O qcow2 -p Metasploitable.vmdk metasploitable2.qcow2
   ```
3. First boot (no snapshot yet): let it come up, log in (msfadmin/msfadmin), confirm services are listening on the expected ports, shut down cleanly.
4. Take the baseline snapshot:
   ```
   qemu-img snapshot -c baseline-v1 metasploitable2.qcow2
   ```

Internal qcow2 snapshots load in well under a second; this is the "factory reset" mechanism for every episode.

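
Step 1 of the guest build asks for a sha256 check on the download before it is used. A minimal sketch of that check follows. The helper names `sha256_of` and `verify_download` are hypothetical and not part of this repo, and the expected digest has to be copied by hand from the published value; none is hard-coded here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so a multi-GB image never sits in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call `verify_download` with the path of the downloaded archive and the published digest, and stop the build if it returns `False`.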
## Single-vCPU constrained-device emulation

```
-cpu host
-smp 1,sockets=1,cores=1,threads=1
-m 512
-machine type=q35,accel=kvm
```

Plus a host-side cgroup CPU cap on the QEMU process (e.g. 80% of one core) so the guest behaves like a small, constrained device under load.

## Telemetry channels

### virtio-serial for the in-guest agent

```
-device virtio-serial-pci
-chardev socket,path=/run/qemu/guest-agent.sock,server=on,wait=off,id=ga
-device virtserialport,chardev=ga,name=cis490.guest.agent
```

The in-guest agent opens `/dev/virtio-ports/cis490.guest.agent` and writes JSONL to it. Host side, the orchestrator reads from the unix socket. No network is involved, so the malware cannot interfere with this channel.

### QMP for live oracle queries

```
-qmp unix:/run/qemu/qmp.sock,server=on,wait=off
```

The orchestrator polls `query-stats`, `query-blockstats`, and netdev stats over this socket.

### perf stat on the QEMU process

```
perf stat -p <qemu-pid> -I 100 \
    -e cycles,instructions,cache-references,cache-misses,branches,branch-misses,page-faults,context-switches \
    -x , -o telemetry-perf.csv
```

The collector tails the CSV, parses, and emits JSONL.

### tcpdump on `br-malware`

```
tcpdump -i br-malware -w network.pcap -B 4096 -s 200
```

Post-process to `netflow.jsonl` with 100ms buckets.

## Snapshot loop sanity check

A green light before any data collection:

1. `qemu-img snapshot -l metasploitable2.qcow2` shows `baseline-v1`.
2. Boot the VM with the qcow2.
3. Touch a file in the guest. Shut down.
4. `qemu-img snapshot -a baseline-v1 metasploitable2.qcow2`.
5. Boot again. The file is gone. ✅

## Safety checks before running real samples

- `ip route show table all | grep br-malware` shows no route off the bridge.
- `dig @host example.com` from a guest fails (no DNS for malware).
- The host's WG interface is **not** bridged to `br-malware`.

(See `scripts/install-lab-host.sh` for the firewall plumbing; it isn't the focus of this project.)

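
The first safety check can be turned into a small filter over the output of `ip route show table all`. The sketch below takes one possible reading of "no route off the bridge": any line that names the bridge device and also carries a `via` next hop or a `default` destination is flagged. Both that reading and the function name `suspicious_bridge_routes` are assumptions, not something the deploy script is known to implement.

```python
def suspicious_bridge_routes(route_dump: str, bridge: str = "br-malware") -> list[str]:
    """Return routing-table lines that use the bridge and look like a way off it.

    Assumption: connected, local, and broadcast entries for the bridge subnet
    are expected, while a gateway ("via") or a default route through the
    bridge is not.
    """
    flagged = []
    for raw in route_dump.splitlines():
        tokens = raw.split()
        if bridge not in tokens:
            continue  # line belongs to some other interface
        if "via" in tokens or "default" in tokens:
            flagged.append(raw.strip())
    return flagged


# Example input: pass the stdout of
#   subprocess.run(["ip", "route", "show", "table", "all"],
#                  capture_output=True, text=True, check=True)
# and treat a non-empty result as a failed safety check.
```

An empty list corresponds to the check passing under this reading. It does not replace the firewall rules that actually enforce the invariant.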
## Where to put VMs and snapshots

```
vm/images/         # qcow2 disk images (gitignored)
vm/snapshots/      # named snapshot exports if we ever externalize them
```

Both directories are gitignored. The repo only carries the *recipes* for reproducing them.