README: Tier 4 is shipped, source 3 is shipped — drop the stale 🚧 marks
Closing the loop on the previous wave's commits. Tier 4 (real-malware fetch + chunked upload + guest-side sha-verify + exec) and source 3 (perf stat collector) are both implemented and tested as of a88ac83; the README still tagged them as TBD / planned. Fix.

- Tier 4 status: 🚧 → ✅ code; ⏳ awaiting operator's MalwareBazaar API key + at least one sha256 entry in manifest.toml. Same shape as the Tier-3 line.
- New "Tier 4 — real malware sample" section walks through the fetch → chunked upload → guest-side sha-verify → exec flow with links to the relevant code.
- Source 3 (perf stat): "🚧 planned" → "✅ opt-in via enable_perf".
- Snapshot/revert (revert_at_start / revert_at_end via QMP loadvm) added to the Orchestrator + drivers list.
- Test-count header updated 86 → 106.
- Stale issue links to closed #4 / #5 / #6 dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent a88ac83db0
commit 637fb064df
1 changed files with 35 additions and 11 deletions
46
README.md
diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -119,6 +119,28 @@ sets up `msfrpcd` (loopback only) as a hardened systemd unit;
 [`scripts/fetch-metasploitable2.sh`](scripts/fetch-metasploitable2.sh)
 pulls + sha256-verifies a target image from operator-supplied URL.
 
+### Tier 4 — *real malware sample, fetched + uploaded + executed*
+
+A manifest entry with a `sha256` flips its `Sample.kind` to `"real"`.
+The driver then bypasses the mimic profile and runs the real-binary
+path:
+
+1. [`tools/fetch_sample.py <sha256>`](tools/fetch_sample.py) pulls the
+binary from MalwareBazaar (Auth-Key from
+`samples/.bazaar.token` or `MALWAREBAZAAR_API_KEY`), unzips with the
+standard `infected` password, sha-verifies, and lands at
+`samples/store/<sha256>` (gitignored).
+2. At `infected_running`, the driver chunked-uploads the binary into
+the shell session as 8 KiB base64 segments
+(`exploits.workloads.chunked_real_binary_upload`). 256 KiB binaries
+work without buffer-busting msfrpc.
+3. The session decodes, sha-verifies *again on the guest side*, chmods,
+and execs only if the hash matches. Mismatch fail-stops the run.
+4. `meta.sample.sha256` + per-step events
+(`real_binary_upload_begin`, `real_binary_verify`,
+`sample_executed{kind=real}`) record exactly which binary was run
+and when, so trainers can join cleanly.
+
 ### Tier maturity
 
 | Tier | What it gives | Status |
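The upload and verify steps described in the Tier 4 section above can be sketched roughly as follows. This is a minimal illustration and not the code in `exploits.workloads.chunked_real_binary_upload`: the names `iter_b64_chunks` and `reassemble_and_verify` are hypothetical, and reading "8 KiB base64 segments" as 8192 characters of base64 text per segment is an assumption. Only the segment size and the verify-before-exec rule come from the README.

```python
import base64
import hashlib
from typing import Iterator

# Assumption: each segment is 8 KiB of base64 *text*. 8192 is a multiple of 4,
# so every segment except possibly the last is a whole number of base64 quanta.
CHUNK_CHARS = 8 * 1024


def iter_b64_chunks(payload: bytes, chunk_chars: int = CHUNK_CHARS) -> Iterator[str]:
    """Yield the base64 encoding of payload in fixed-size text segments."""
    encoded = base64.b64encode(payload).decode("ascii")
    for start in range(0, len(encoded), chunk_chars):
        yield encoded[start:start + chunk_chars]


def reassemble_and_verify(chunks: list[str], expected_sha256: str) -> bytes:
    """Join and decode the segments, then fail-stop on a hash mismatch."""
    data = base64.b64decode("".join(chunks))
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        raise ValueError(f"sha256 mismatch: expected {expected_sha256}, got {actual}")
    return data


if __name__ == "__main__":
    sample = bytes(range(256)) * 1024  # 256 KiB stand-in payload, not a real binary
    digest = hashlib.sha256(sample).hexdigest()
    chunks = list(iter_b64_chunks(sample))
    assert reassemble_and_verify(chunks, digest) == sample
    print(len(chunks), "segments uploaded and verified")
```

In the real flow the second function would run inside the guest shell session rather than in Python; the point of the sketch is only that the hash check sits between decode and exec, so a corrupted transfer never runs.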
@@ -126,22 +148,22 @@ pulls + sha256-verifies a target image from operator-supplied URL.
 | 1 — real VM, idle | confidence the collectors read real KVM behaviour | ✅ done |
 | 2 — real VM, profile-driven workload | distinguishable in-guest envelopes per malware family | ✅ done |
 | 3 — real VM, real exploit fire + profile workload | honest `armed → infecting` transitions, driver v2 dispatch | ✅ code; ⏳ awaiting Metasploitable2 image + msfrpcd on a lab host |
-| 4 — real VM, real malware sample (MalwareBazaar fetch) | the full envelope we ultimately train on | 🚧 manifest schema ready (`sample.sha256` → `kind=real`); fetcher TBD |
+| 4 — real VM, real malware sample (MalwareBazaar fetch) | the full envelope we ultimately train on | ✅ code; ⏳ awaiting MalwareBazaar API key + sha256s in manifest |
 
-### Telemetry sources (all four wire into one episode dir)
+### Telemetry sources (all five wire into one episode dir)
 
 | # | Source | Vantage | Role |
 |---|--------------------------------|---------------|---------------------|
 | 1 | host `/proc/<qemu_pid>` | outside | oracle (label only) |
 | 2 | QEMU QMP queries | outside | oracle (label only) |
-| 3 | `perf stat -p <qemu_pid>` | outside | oracle (planned) |
+| 3 | `perf stat -p <qemu_pid>` | outside | oracle (label only) |
 | 4 | Bridge pcap → 100 ms netflow | gateway-side | feature (deployable)|
 | 5 | In-guest agent (virtio-serial) | inside | feature (deployable)|
 
-Sources 1, 2, 4, 5 are live as of this commit. The deploy/oracle split
-follows [`docs/threat-model.md`](docs/threat-model.md): only sources
-4 + 5 are usable as model *features* in the field — sources 1, 2, 3
-exist as labeling oracles only.
+All five are live. The deploy/oracle split follows
+[`docs/threat-model.md`](docs/threat-model.md): only sources 4 + 5
+are usable as model *features* in the field — sources 1, 2, 3 exist
+as labeling oracles only.
 
 For an interactive view of any episode (zoom/pan/hover), run:
 
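The telemetry table in the hunk above lists source 4 as a bridge pcap reduced to 100 ms netflow buckets. A minimal sketch of that reduction is shown here, assuming each packet has already been parsed into a `(timestamp_us, length)` pair. The names `FlowBucket` and `bucketize` are hypothetical and the project's own pure-Python parser may aggregate different fields.

```python
from collections import defaultdict
from dataclasses import dataclass

BUCKET_US = 100_000  # 100 ms expressed in microseconds


@dataclass
class FlowBucket:
    packets: int = 0
    byte_count: int = 0


def bucketize(packets, bucket_us=BUCKET_US):
    """Group (timestamp_us, length) pairs into fixed-width time buckets.

    Returns a dict mapping each bucket's start time in microseconds to its
    packet and byte totals. Integer timestamps avoid float rounding at the
    bucket edges.
    """
    buckets = defaultdict(FlowBucket)
    for ts_us, length in packets:
        start = (ts_us // bucket_us) * bucket_us
        buckets[start].packets += 1
        buckets[start].byte_count += length
    return dict(buckets)


if __name__ == "__main__":
    trace = [(0, 60), (99_999, 40), (100_000, 1500), (250_000, 64)]
    for start, bucket in sorted(bucketize(trace).items()):
        print(start, bucket.packets, bucket.byte_count)
```

Keying buckets by their start time rather than by index keeps the output joinable against other sources that are sampled on the same clock.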
@@ -152,7 +174,7 @@ tools/show_envelope.sh data/episodes/<episode_id>
 
 ---
 
-## Status (86/86 tests passing as of `b80986d`)
+## Status (106/106 tests passing as of `a88ac83`)
 
 **Pipeline (lab-host → Pi → tarball stored)**
 - ✅ Receiver app (HTTPS PUT, sha256-verified, idempotent) — running on the Pi behind Caddy with mTLS via the wg-pki client CA
@@ -166,16 +188,18 @@ tools/show_envelope.sh data/episodes/<episode_id>
 **Telemetry**
 - ✅ Source 1 — host `/proc/<qemu_pid>` @ 10 Hz
 - ✅ Source 2 — QEMU QMP @ 1 Hz
-- ✅ Source 4 — bridge pcap + 100 ms netflow bucketizer (pure-Python parser, no scapy/dpkt dep). Per-episode wiring in `EpisodeRunner` is tracked in [#6](http://maxgit.wg/spectral/CIS490/issues/6).
+- ✅ Source 3 — `perf stat -p <qemu_pid>` (opt-in via `enable_perf`; needs `CAP_SYS_ADMIN` / `CAP_PERFMON`)
+- ✅ Source 4 — bridge pcap + 100 ms netflow bucketizer (pure-Python parser, no scapy/dpkt dep), wired into `EpisodeRunner` via `bridge_iface`
 - ✅ Source 5 — in-guest agent over virtio-serial; cidata-embedded for first-boot install on Alpine
-- 🚧 Source 3 — `perf stat -p <qemu_pid>` ([#5](http://maxgit.wg/spectral/CIS490/issues/5))
 
 **Orchestrator + drivers**
 - ✅ Orchestrator v0 — phase-scheduled episode runner, ULID episode ids
+- ✅ Snapshot/revert via QMP `loadvm` (`revert_at_start` / `revert_at_end`) for clean baselines between episodes
 - ✅ Tier 2 driver — real Alpine VM, profile-driven in-guest workload over serial console
 - ✅ Tier 3 driver v2 — `MSFExploitDriver` + msfrpc client + per-sample workload dispatch; first canned module `vsftpd_234_backdoor.toml`
+- ✅ Tier 4 — `tools/fetch_sample.py` (MalwareBazaar by sha256) + chunked real-binary upload (`exploits.workloads.chunked_real_binary_upload`) + guest-side sha-verify-then-exec dispatch in `MSFExploitDriver`
 - ⏳ Tier 3 integration — needs operator to drop a Metasploitable2 image + run `scripts/install-msfrpcd.sh` on a lab host
-- 🚧 Tier 4 — MalwareBazaar fetch by sha256 (manifest schema is ready; tracked in [#4](http://maxgit.wg/spectral/CIS490/issues/4))
+- ⏳ Tier 4 integration — needs operator's MalwareBazaar API key + at least one `sha256` entry in `samples/manifest.toml`
 
 **Fleet (multi-VM, multi-host data generation)**
 - ✅ Resource-aware capacity detector (cores / RAM / load) — `orchestrator/fleet.py`
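The snapshot/revert bullet in the last hunk mentions QMP `loadvm`. `loadvm` is a human-monitor (HMP) command, and one common way to reach it over a QMP socket is the `human-monitor-command` passthrough. The sketch below assumes that route; whether the orchestrator does the same is not stated in this commit. The helper names and the socket path argument are hypothetical.

```python
import json
import socket


def qmp_loadvm_message(tag: str) -> bytes:
    """Build the QMP request that runs the HMP command `loadvm <tag>`."""
    request = {
        "execute": "human-monitor-command",
        "arguments": {"command-line": f"loadvm {tag}"},
    }
    return json.dumps(request).encode("utf-8") + b"\r\n"


def _read_reply(reader) -> dict:
    """Read lines until a command reply arrives, skipping asynchronous events."""
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP socket closed before a reply arrived")
        message = json.loads(line)
        if "return" in message or "error" in message:
            return message


def revert_to_snapshot(sock_path: str, tag: str) -> dict:
    """Connect to a QMP unix socket, negotiate capabilities, and load a snapshot."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        reader = sock.makefile("r", encoding="utf-8")
        reader.readline()  # greeting banner: {"QMP": {...}}
        sock.sendall(b'{"execute": "qmp_capabilities"}\r\n')
        _read_reply(reader)
        sock.sendall(qmp_loadvm_message(tag))
        # HMP output, including any error text, comes back as the "return" string.
        return _read_reply(reader)
```

A caller implementing something like `revert_at_start` would invoke `revert_to_snapshot` before the first phase of an episode and treat a non-empty `"return"` string as a failure to inspect.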