Lays down the design surface for the CIS490 behavioral-malware-detection dataset and model. No code yet; schema and topology are decided first so collection can start without rework.

Docs:
- README: project goal, navigation
- architecture: lab topology, KVM choice, episode state machine, deployment-mirror reasoning
- threat-model: train/serve parity rule, oracle-vs-deployable feature split, two-model evaluation strategy
- data-model: per-episode JSONL layout, row schemas, phase enum
- transport: WG-native shipper/receiver design, idempotent uploads
- deploy: one-command install for lab-host and receiver roles
- lab-setup: KVM prereqs, VM build, snapshot, virtio-serial wiring

Skeleton: orchestrator/, collectors/, vm/, exploits/, samples/, training/ (each with a short README explaining purpose).

Extended .gitignore to exclude qcow2 images, pcaps, sample binaries, secrets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vm/
Recipes and helpers for building and snapshotting guest VMs. Disk images and snapshots themselves are gitignored — this directory carries the how, not the bytes.
    vm/
      images/              # qcow2 staging (gitignored)
      snapshots/           # exported snapshots if needed (gitignored)
      guest-agent/         # in-guest telemetry agent (shipped into the guest)
      metasploitable2.md   # download/convert/snapshot procedure (TODO)
      custom-debian/       # cloud-init for our own vulnerable Debian (TODO)
See docs/lab-setup.md for the full host + guest
bring-up procedure.
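
The convert and snapshot steps listed for metasploitable2.md are still TODO, so the following is only a rough sketch of what such a helper might look like, not the project's procedure. It assumes the upstream image arrives as a VMDK and that an internal qcow2 snapshot is wanted; the function names, file names, and the snapshot name `clean` are all hypothetical. The helper only builds `qemu-img` command lines and prints them, so nothing is executed against real images.

```python
from __future__ import annotations

import shlex
from pathlib import Path


def convert_cmd(src_vmdk: Path, dest_qcow2: Path) -> list[str]:
    """Build a qemu-img command that converts a VMDK disk to qcow2."""
    return [
        "qemu-img", "convert",
        "-f", "vmdk",      # source format (assumed, see lead-in)
        "-O", "qcow2",     # output format used under vm/images/
        str(src_vmdk), str(dest_qcow2),
    ]


def snapshot_cmd(image: Path, name: str) -> list[str]:
    """Build a qemu-img command that creates an internal snapshot."""
    return ["qemu-img", "snapshot", "-c", name, str(image)]


if __name__ == "__main__":
    # Hypothetical file names; the real ones belong in metasploitable2.md.
    image = Path("vm/images/metasploitable2.qcow2")
    print(shlex.join(convert_cmd(Path("Metasploitable.vmdk"), image)))
    print(shlex.join(snapshot_cmd(image, "clean")))
```

Returning argument lists rather than shell strings keeps the commands easy to pass to `subprocess.run` later without quoting problems.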