# Threat Model & Train/Serve Parity

The single most important design rule in this project:

> **A feature used by the deployed model must exist on the deployed device.**

Violating this rule produces a model that scores 99% in the lab and is useless in the field. This document spells out which features fall on which side of that line, and why we still bother capturing both.

## The setting

The deployed model runs on a real, non-virtualized device, typically a constrained Linux endpoint (server, IoT box, or edge node; the specific form factor doesn't matter, only that it has its own kernel and isn't running under our hypervisor). It tries to detect the moment that device gets breached.

> **Not the Pi5.** In our project topology, the Pi5 is the WireGuard-side
> *collector* that receives episode tarballs from lab hosts. It is *not* the
> deployment target for the model. Don't conflate the two roles.

Two adversarial facts shape the design:

1. **Malware can lie to in-device tools.** A sufficiently privileged rootkit can hook `/proc`, intercept `perf_event_open`, and hide its own processes.
2. **There is no host-side QEMU view.** The deployed device is the actual machine. Nothing is watching it from outside *the OS itself*.

So the model has two trustworthy floors:

- **In-device features that survive most malware** (perf counters via the syscall interface, thermals, gross resource counters): these can be falsified in principle, but in practice most commodity malware doesn't bother.
- **Off-device features at the gateway** (network telemetry observed by an upstream router/gateway): these are physics-bound, because malware cannot hide the bytes it sends once they leave the NIC.

## Two roles: features vs. oracles

Every measurement we capture in the lab gets one of two roles:

| Role | What it's used for | Available in deployment? |
|---|---|---|
| **Feature** | Input to the trained model | **Must be yes** |
| **Oracle** | Ground-truth labeling during training only | No, but we have it in the lab |

The oracle channels (host `/proc/`, QMP `query-stats`, `perf -p qemu_pid`) are how we know with certainty what the guest is *actually* doing, not what it claims to be doing. We use that certainty to assign correct labels in `labels.jsonl`. The deployed model never sees them as inputs; only the oracle model in the evaluation below, which is never deployed, trains on them.

## Channel taxonomy

| # | Channel | Vantage | Role | Why |
|---|---|---|---|---|
| 1 | Host `/proc/` | outside guest | oracle | doesn't exist on real device |
| 2 | QEMU QMP `query-stats`, `query-blockstats` | outside guest | oracle | same |
| 3 | `perf stat -p ` | outside guest | oracle | same |
| 4 | Bridge-side pcap (`tcpdump -i br-malware`) | gateway | **feature** | matches Pi5 gateway in field |
| 5 | In-guest `/proc/*`, `perf_event_open`, `/sys/class/thermal/*` | inside guest | **feature** | same exact source on real device |

Note: in-guest features (5) are the same syscall surfaces we'd read on a real device. The data we capture from them in the lab and the data we capture from them on the deployed device are pulled from identical kernel APIs, and that is what makes parity hold.

## The two-model evaluation strategy

We will train two classifiers from the same dataset:

1. **Realistic model**: trained only on features (channels 4 + 5). *This is what would be deployed.*
2. **Oracle model**: trained on everything (channels 1–5). *This is the upper bound on what was learnable from this dataset.*

The interesting numbers are:

- The realistic model's accuracy → **how well the deployable system actually works**.
- The gap between realistic and oracle → **how much detection power a privileged rootkit can take away from us by lying** (the cost of in-device measurement being adversarial).
- Realistic-without-pcap vs. realistic-with-pcap, an ablation of the realistic model with channel 4 removed → **how much the gateway observer is worth** (the value of off-device telemetry).

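The channel split and the three numbers can be sketched as follows. This is a minimal illustration, not the project's evaluation code: the variant name `realistic_no_pcap`, the functions `channels_for` and `summarize`, and the `Report` type are all hypothetical, and the accuracies are assumed to be computed elsewhere on one shared held-out set.

```python
from __future__ import annotations

from dataclasses import dataclass

# Channel numbers and roles mirror the channel taxonomy table.
CHANNEL_ROLE = {1: "oracle", 2: "oracle", 3: "oracle", 4: "feature", 5: "feature"}
PCAP_CHANNEL = 4  # bridge-side pcap, the gateway vantage


def channels_for(variant: str) -> list[int]:
    """Return the channels a model variant is allowed to train on.

    Variant names are hypothetical labels used only in this sketch.
    """
    features = sorted(c for c, role in CHANNEL_ROLE.items() if role == "feature")
    if variant == "oracle":
        return sorted(CHANNEL_ROLE)  # everything, channels 1-5
    if variant == "realistic":
        return features  # deployable: feature channels only
    if variant == "realistic_no_pcap":
        return [c for c in features if c != PCAP_CHANNEL]  # ablation
    raise ValueError(f"unknown model variant: {variant!r}")


@dataclass(frozen=True)
class Report:
    realistic_accuracy: float  # how well the deployable system works
    oracle_gap: float          # detection power lost to in-device lying
    gateway_value: float       # what the gateway observer adds


def summarize(accuracy: dict[str, float]) -> Report:
    """Derive the three reported numbers from per-variant accuracies."""
    return Report(
        realistic_accuracy=accuracy["realistic"],
        oracle_gap=accuracy["oracle"] - accuracy["realistic"],
        gateway_value=accuracy["realistic"] - accuracy["realistic_no_pcap"],
    )
```

Deriving the allowed channels from the role map, rather than listing them by hand per model, keeps the parity rule in one place: a channel marked `oracle` cannot leak into a deployable variant.
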
Reporting all three is the paper-worthy result, not just one accuracy number.

## Using oracle disagreement as a label

When the in-guest agent reports CPU at 8% and the host says the QEMU process is burning 70%, that *is* the rootkit-active signal, and it's more reliable than any single in-guest measurement. Episodes where this divergence appears get a high-confidence `infected_running` label even if the in-guest data alone looks ambiguous. This is the practical payoff of capturing both sides.

## What we are not claiming

- We are not claiming to detect kernel rootkits robustly from in-guest data alone. The oracle/feature gap will quantify the limit.
- We are not claiming the trained model is safe to deploy without the gateway observer in production. For the strongest threat model, gateway-side fusion is required.
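
Returning to the oracle-disagreement rule: a minimal sketch of how the divergence check might look at labeling time. The threshold value, the function name, and the assumption that both readings are already normalized to the same scale (for example, percent of the guest's vCPU capacity) are illustrative and not taken from the project.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical threshold in percentage points; the real value would
# need tuning against clean baseline episodes.
DIVERGENCE_THRESHOLD_PCT = 30.0


def label_from_disagreement(
    guest_cpu_pct: float,
    host_cpu_pct: float,
    threshold_pct: float = DIVERGENCE_THRESHOLD_PCT,
) -> Optional[str]:
    """Return 'infected_running' when the host-side oracle sees far more
    CPU use than the in-guest agent reports; otherwise return None so the
    episode falls through to whatever other labeling rules apply.

    Assumes both readings are on the same scale.
    """
    divergence = host_cpu_pct - guest_cpu_pct
    if divergence >= threshold_pct:
        return "infected_running"
    return None
```

With the figures from the example above, `label_from_disagreement(8.0, 70.0)` returns `"infected_running"`, while a small difference such as `label_from_disagreement(8.0, 10.0)` returns `None`. The check is one-sided on purpose: a guest that under-reports is the rootkit signature, whereas a guest reporting more than the host sees is more likely a measurement artifact.
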