Lays down the design surface for the CIS490 behavioral-malware-detection dataset and model. No code yet: schema and topology are decided first so collection can start without rework.

Docs:
- README: project goal, navigation
- architecture: lab topology, KVM choice, episode state machine, deployment-mirror reasoning
- threat-model: train/serve parity rule, oracle-vs-deployable feature split, two-model evaluation strategy
- data-model: per-episode JSONL layout, row schemas, phase enum
- transport: WG-native shipper/receiver design, idempotent uploads
- deploy: one-command install for lab-host and receiver roles
- lab-setup: KVM prereqs, VM build, snapshot, virtio-serial wiring

Skeleton: orchestrator/, collectors/, vm/, exploits/, samples/, training/ (each with a short README explaining purpose).

Extended .gitignore to exclude qcow2 images, pcaps, sample binaries, secrets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
48 lines
479 B
Text
# Disk images and snapshots
*.iso
*.img
*.qcow2
*.qcow2.*
*.vmdk
*.vdi
*.raw
vm/images/
vm/snapshots/

# Telemetry output
data/episodes/
*.pcap
*.pcapng

# Malware samples: NEVER commit binaries
samples/store/
*.bin
*.elf
*.exe
*.dll
*.so.malware

# Python
__pycache__/
*.py[cod]
.venv/
venv/
.pytest_cache/
.mypy_cache/
.ruff_cache/
*.egg-info/
dist/
build/

# Editor
.vscode/
.idea/
*.swp
.DS_Store

# Local secrets (never commit)
.env
.env.local
secrets.toml
*.pat
*.token