CIS490/docs/threat-model.md
Maximus Gorog 970698af83 Synthetic envelope demo: phase-driven load mimic + plotter
End-to-end pipeline now produces a labeled envelope from a single command.
Drives the orchestrator through an 8-phase XMRig-shaped schedule and
renders a 3-panel envelope (CPU%, RSS, IO write rate) with phase bands
sourced from labels.jsonl. Real telemetry, simulated load — validates the
collection + labeling shape before a real VM is involved.

Components:
- tools/load_mimic.py        phase-driven load generator. Reads phase
                             commands on stdin; CPU/IO behavior matches
                             the named phase (clean=idle, armed=light burst,
                             infecting=disk burst+CPU, infected_running=
                             CPU saturation+stratum-shaped writes,
                             dormant=quieter than clean).
- tools/run_envelope_demo.py spawns load_mimic, drives EpisodeRunner with
                             a default 85s schedule that includes the
                             classic infected_running → dormant → re-entry
                             pattern.
- tools/plot_envelope.py     reads telemetry + labels from an episode dir,
                             writes envelope.png with colored phase bands.

orchestrator: EpisodeRunner now takes an optional phase_schedule and an
on_phase callback. Walks the schedule emitting one label per transition.
Backwards-compatible — existing single-phase tests still green.

Doc fix (user pushback): README + architecture + threat-model no longer
imply the Pi5 is the deployment target. Pi5's actual role here is the
WireGuard-side collector for episode tarballs. Deployment target is
generic ("constrained Linux device"). The "gateway observer" concept
remains a deployment pattern, decoupled from the Pi5's collector role.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:53:20 -06:00


# Threat Model & Train/Serve Parity

The single most important design rule in this project:

> **A feature used by the deployed model must exist on the deployed device.**

Violating this rule produces a model that scores 99% in the lab and is useless in
the field. This document spells out which features fall on which side of that
line, and why we still bother capturing both.
## The setting

The deployed model runs on a real, non-virtualized device, typically a
constrained Linux endpoint (server, IoT box, edge node; the specific form
factor doesn't matter, only that it has its own kernel and isn't running
under our hypervisor). It tries to detect the moment that device gets
breached.

> **Not the Pi5.** In our project topology, the Pi5 is the WireGuard-side
> *collector* that receives episode tarballs from lab hosts. It is *not* the
> deployment target for the model. Don't conflate the two roles.

Two adversarial facts shape the design:
1. **Malware can lie to in-device tools.** A sufficiently privileged rootkit can
   hook `/proc`, intercept `perf_event_open`, and hide its own processes.
2. **There is no host-side QEMU view.** The deployed device is the actual
   machine. Nothing is watching it from outside *the OS itself*.

So the model has two trustworthy floors:

- **In-device features that survive most malware** (perf counters via the syscall
  interface, thermals, gross resource counters): easy to lie to in principle,
  but in practice most commodity malware doesn't bother.
- **Off-device features at the gateway** (network telemetry observed by an
  upstream router/gateway): physics-bound, because the malware cannot hide
  bytes that leave the NIC from an upstream observer.
## Two roles: features vs. oracles

Every measurement we capture in the lab gets one of two roles:

| Role | What it's used for | Available in deployment? |
|---|---|---|
| **Feature** | Input to the trained model | **Must be yes** |
| **Oracle** | Ground-truth labeling during training only | No, but we have it in the lab |

The oracle channels (host `/proc/<qemu_pid>`, QMP `query-stats`,
`perf stat -p <qemu_pid>`) are how we know with certainty what the guest is *actually*
doing, not what it claims to be doing. We use that certainty to assign correct
labels in `labels.jsonl`. The deployed model never sees them as input features.
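
One way to keep the feature/oracle split mechanical is to tag every captured field with its role and build model inputs only from feature-tagged fields. The sketch below is illustrative only: the field names, the `ROLE_BY_FIELD` mapping, and the `split_record` helper are hypothetical and are not taken from the project's code or telemetry schema.

```python
from enum import Enum


class Role(Enum):
    FEATURE = "feature"  # available on the deployed device or at the gateway
    ORACLE = "oracle"    # lab-only ground truth, used for labeling


# Hypothetical field names; the real telemetry schema may differ.
ROLE_BY_FIELD = {
    "guest_cpu_pct": Role.FEATURE,
    "guest_rss_bytes": Role.FEATURE,
    "bridge_tx_bytes": Role.FEATURE,
    "host_qemu_cpu_pct": Role.ORACLE,
    "qmp_blockstats_wr_bytes": Role.ORACLE,
}


def split_record(record):
    """Split one lab sample into (model_inputs, oracle_fields).

    Raises KeyError for a field with no declared role, so a new channel
    cannot reach training without an explicit feature/oracle decision.
    """
    inputs, oracle = {}, {}
    for name, value in record.items():
        role = ROLE_BY_FIELD[name]
        (inputs if role is Role.FEATURE else oracle)[name] = value
    return inputs, oracle


sample = {"guest_cpu_pct": 8.0, "host_qemu_cpu_pct": 70.0, "bridge_tx_bytes": 1200}
inputs, oracle = split_record(sample)
```

Failing loudly on an untagged field is a deliberate choice in this sketch: silently defaulting an unknown field to "feature" is exactly how a lab-only signal could leak into the deployable model.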
## Channel taxonomy

| # | Channel | Vantage | Role | Why |
|---|---|---|---|---|
| 1 | Host `/proc/<qemu_pid>` | outside guest | oracle | doesn't exist on real device |
| 2 | QEMU QMP `query-stats`, `query-blockstats` | outside guest | oracle | same |
| 3 | `perf stat -p <qemu_pid>` | outside guest | oracle | same |
| 4 | Bridge-side pcap (`tcpdump -i br-malware`) | gateway | **feature** | matches the gateway observer in the field |
| 5 | In-guest `/proc/*`, `perf_event_open`, `/sys/class/thermal/*` | inside guest | **feature** | same exact source on real device |

Note: in-guest features (5) are the same syscall surfaces we'd read on a real
deployed device. The data we capture from them in the lab and the data we
capture from them at deployment are pulled from identical kernel APIs, and that
is what makes parity hold.
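
To make the parity point concrete, here is a minimal sketch of a CPU-utilization reader built on the aggregate `cpu` line of `/proc/stat`. Because it depends only on that kernel interface, the same function would run unchanged inside the lab guest and on a deployed Linux device. The function names are hypothetical, and the two sample lines are made up for illustration.

```python
def parse_cpu_line(line):
    """Return (busy_jiffies, total_jiffies) from the aggregate 'cpu' line.

    Fields after the label are: user nice system idle iowait irq softirq
    steal guest guest_nice. Guest time is already counted in user/nice,
    so only the first eight fields are summed.
    """
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    fields = [int(x) for x in parts[1:9]]
    fields += [0] * (8 - len(fields))  # older kernels expose fewer fields
    idle = fields[3] + fields[4]       # idle + iowait
    total = sum(fields)
    return total - idle, total


def cpu_percent(prev_line, curr_line):
    """CPU busy percentage between two samples of the 'cpu' line."""
    busy0, total0 = parse_cpu_line(prev_line)
    busy1, total1 = parse_cpu_line(curr_line)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (busy1 - busy0) / elapsed


# Made-up samples; on a live system, read the first line of /proc/stat twice.
before = "cpu  100 0 50 850 0 0 0 0 0 0"
after = "cpu  180 0 70 950 0 0 0 0 0 0"
pct = cpu_percent(before, after)
```

Note that this is also exactly the kind of reading a privileged rootkit could falsify, which is why the same quantity observed from the host side is kept as an oracle.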
## The two-model evaluation strategy

We will train two classifiers from the same dataset:

1. **Realistic model**: trained only on features (channels 4 + 5).
   *This is what would be deployed.*
2. **Oracle model**: trained on everything (channels 1-5).
   *This is the upper bound on what was learnable from this dataset.*

The interesting numbers are:

- The realistic model's accuracy → **how well the deployable system actually works**.
- The gap between realistic and oracle → **how much detection power a privileged
  rootkit can take away from us by lying** (the cost of in-device measurement
  being adversarial).
- Realistic-without-pcap vs. realistic-with-pcap → **how much the gateway
  observer is worth** (the value of off-device telemetry).

Reporting all three is the paper-worthy result, not just one accuracy number.
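
The three reported quantities reduce to simple differences once each model variant has an accuracy. A minimal sketch follows. The channel sets mirror the taxonomy table; the `VARIANTS` dictionary and the `parity_report` helper are hypothetical names, and the sketch assumes an additional training run on channel 5 alone to obtain the without-pcap number. The accuracy values in the usage lines are placeholders, not measured results.

```python
# Channel numbers per model variant, following the taxonomy table.
VARIANTS = {
    "realistic": {4, 5},
    "realistic_no_pcap": {5},
    "oracle": {1, 2, 3, 4, 5},
}


def parity_report(accuracy):
    """Derive the three headline numbers from per-variant accuracies.

    `accuracy` maps each key of VARIANTS to a value in [0, 1].
    """
    missing = set(VARIANTS) - set(accuracy)
    if missing:
        raise ValueError(f"missing accuracies for: {sorted(missing)}")
    return {
        # How well the deployable system works.
        "deployable_accuracy": accuracy["realistic"],
        # What the lab-only vantage adds over in-device + gateway features.
        "oracle_gap": accuracy["oracle"] - accuracy["realistic"],
        # What the gateway observer adds over in-device features alone.
        "gateway_value": accuracy["realistic"] - accuracy["realistic_no_pcap"],
    }


# Placeholder values only; these are not results from the project.
report = parity_report(
    {"realistic": 0.75, "realistic_no_pcap": 0.5, "oracle": 1.0}
)
```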
## Using oracle disagreement as a label
When the in-guest agent reports CPU at 8% and the host says the QEMU process is
burning 70%, that *is* the rootkit-active signal — and it's more reliable than
any single in-guest measurement. Episodes where this divergence appears get a
high-confidence `infected_running` label even if the in-guest data alone looks
ambiguous. This is the practical payoff of capturing both sides.
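
A minimal sketch of that rule, assuming paired samples of in-guest and host-side CPU percentages taken at the same timestamps. The 30-point threshold, the three-sample persistence requirement, and the function name are assumptions chosen for illustration; the source does not specify them.

```python
def divergence_label(guest_cpu, host_cpu, threshold=30.0, min_consecutive=3):
    """Return 'infected_running' when the host-side CPU reading exceeds the
    guest-reported reading by more than `threshold` points for at least
    `min_consecutive` samples in a row. Return None otherwise, meaning the
    label should come from the other labeling evidence.
    """
    if len(guest_cpu) != len(host_cpu):
        raise ValueError("series must be the same length")
    run = 0
    for guest, host in zip(guest_cpu, host_cpu):
        run = run + 1 if host - guest > threshold else 0
        if run >= min_consecutive:
            return "infected_running"
    return None


# In-guest agent reports about 8% while the host sees about 70%.
hidden = divergence_label([8, 9, 8, 7], [70, 72, 69, 71])
# Guest and host roughly agree, so divergence gives no label.
honest = divergence_label([65, 70, 68], [70, 72, 69])
```

Requiring several consecutive divergent samples is meant to avoid labeling on a single sampling glitch. Host-side figures also include hypervisor overhead, so any real threshold would likely need calibrating against clean episodes.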
## What we are not claiming
- We are not claiming to detect kernel rootkits robustly from in-guest data alone.
The oracle/feature gap will quantify the limit.
- We are not claiming the trained model is safe to deploy without the gateway
observer in production — for the strongest threat model, gateway-side fusion
is required.