Closing the loop on the previous wave's commits. Tier 4 (real-malware
fetch + chunked upload + guest-side sha-verify + exec) and source 3
(perf stat collector) are both implemented and tested as of a88ac83;
the README still tagged them as TBD / planned. Fix.
- Tier 4 status: 🚧 → ✅ code; ⏳ awaiting operator's MalwareBazaar API
key + at least one sha256 entry in manifest.toml. Same shape as the
Tier-3 line.
- New "Tier 4 — real malware sample" section walks through the
fetch → chunked upload → guest-side sha-verify → exec flow with
links to the relevant code.
- Source 3 (perf stat): "🚧 planned" → "✅ opt-in via enable_perf".
- Snapshot/revert (revert_at_start / revert_at_end via QMP loadvm)
added to the Orchestrator + drivers list.
- Test-count header updated 86 → 106.
- Stale issue links to closed#4 / #5 / #6 dropped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the gaps surfaced in the "what is not implemented" audit so the
fleet really is shippable end-to-end. Verified live on the Pi:
- cis490-shipper --ping → HTTP 200 through Caddy + mTLS via the
new wg-pki client CA leaf
- real episode dir → tar+zstd → PUT → HTTP 201 stored
- re-ship same bytes → 200 (idempotent)
- re-ship different bytes under same id → 409 (conflict)
Changes:
orchestrator/episode.py
- EpisodeConfig.revert_at_start / revert_at_end (Tier 0+ snapshot/
revert per docs/architecture.md). When set + qmp_socket present,
EpisodeRunner issues loadvm <snapshot_name> and emits
snapshot_revert / snapshot_revert_failed events on the same
monotonic clock as everything else.
collectors/qmp.py
- savevm() / loadvm() helpers using human-monitor-command, plus a
test against the fake QMP server.
exploits/workloads.py
- chunked_real_binary_upload() returns a ChunkedUpload plan: 8 KiB
base64 chunks (~6 KiB binary each) so msfrpc never sees a buffer-
busting payload. Includes a finalize step that sha256-verifies on
the guest before exec.
- real_binary_workload() now wraps the chunked plan for backwards
compat with single-shot callers.
exploits/driver.py
- Tier-4 dispatch walks the chunked plan in MSFExploitDriver:
each chunk is a separate session_shell_write; finalize verifies;
exec only runs on sha-ok. New events: real_binary_upload_begin,
real_binary_verify, real_binary_aborted.
etc/cis490-orchestrator.service
- Reads /etc/cis490/lab-host.env (FLEET_HOST_ID + optional BRIDGE).
- Grants AmbientCapabilities CAP_NET_RAW (tcpdump for source 4) +
CAP_SYS_ADMIN + CAP_PERFMON (perf for source 3) so collectors
work under hardening.
scripts/install-lab-host.sh
- Writes /etc/cis490/lab-host.env on first install with FLEET_HOST_ID
defaulting to `hostname -s`.
- Best-effort: fetches the Alpine baseline qcow2 (sha512-pinned) and
builds cidata.iso with the in-guest agent embedded; symlinks both
into /opt/cis490/vm/images/ so launchers find them.
scripts/fetch-alpine-baseline.sh
- Idempotent fetcher for the Alpine 3.21 cloud-init nocloud qcow2
matching the sha512 in docs/sources.md.
tools/plot_envelope.py
- Rebuilt to render whatever telemetry the episode dir contains:
proc → QMP block ops → perf IPC/miss-rate → bridge pkts/SYNs →
guest agent load/mem. Missing sources are silently skipped.
tools/index_reader.py
- cis490-index CLI: filter receiver's index.jsonl by host / sample
/ time range, sort, count-by group. Closest thing to a query
interface until we stand up Postgres/Timescale.
samples/README.md
- Rewritten to match the new manifest schema, the kind=real vs mimic
split, the per-(host, slot, ep) selection mechanic, and the
chunked-upload safety story.
Tests: 106 pass (was 102). New cases:
- test_qmp.py — savevm + loadvm (HMP wrapper + error path)
- test_tier4.py — chunked plan splitting, sha-pinned finalize,
end-to-end driver walks all chunks + verify + exec via the fake
msfrpc client
Closes the "what is not implemented" punch list.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the three remaining 🚧 items from the README so every collector
the threat-model promises is actually live, and the Tier-4 path
(real-malware fetch + upload + exec) works end-to-end as soon as a
sha256 lands in samples/store/.
Closesspectral/CIS490#4, #5, #6.
== #6 — Bridge pcap wiring ==
EpisodeConfig grows three optional fields:
bridge_iface: str | None # e.g. "br-malware"
bridge_ip: str = "10.200.0.1"
pcap_snaplen: int = 256
When bridge_iface is set, EpisodeRunner spawns tcpdump for the duration
of the schedule (network.pcap), stops it cleanly on episode end, and
runs collectors.pcap.bucketize() to produce netflow.jsonl per the
100-ms schema in docs/data-model.md. EpisodeResult + meta.result
gain rows_netflow + pcap_bytes counters.
vm/launch_demo.sh + launch_target.sh now switch between SLIRP usermode
and tap+bridge based on $BRIDGE — operator pre-creates the tap as a
bridge member, no sudo from the launcher.
run_real_vm_demo.py picks BRIDGE up from env so the fleet runner can
opt entire waves into pcap mode by exporting BRIDGE before invocation.
== #5 — Source 3 perf collector ==
collectors/perf_qemu.py shells out to ``perf stat -p <pid> -I 100 -j``
and parses the per-event JSON stream. Aggregates one row per interval
across the canonical event set (cycles/instructions/cache-{refs,misses}/
branches/branch-misses/page-faults/context-switches), computes IPC +
cache-miss rate. Tolerates missing events (``<not counted>`` /
``<not supported>``) without dropping the row, and skips cleanly when
``perf`` isn't on PATH or the process can't be attached.
EpisodeConfig.enable_perf=True opts into the collector — off by default
because perf needs CAP_SYS_ADMIN or perf_event_paranoid <= 1. When
enabled, runs as a parallel thread alongside the other collectors;
EpisodeResult.rows_perf records the count.
== #4 — Tier 4 (real-malware fetch + upload + exec) ==
tools/fetch_sample.py: pulls a sample by sha256 from MalwareBazaar
(API key from env or samples/.bazaar.token), unzips with the standard
"infected" password, verifies the resulting binary's sha256, lands at
samples/store/<sha256>. Idempotent — already-staged correct binaries
return immediately.
samples/manifest.py: Sample.binary_path(store_root) resolves to the
staged binary path, or None for mimics / not-yet-fetched real samples.
exploits/workloads.py: real_binary_workload(bytes, sample) builds a
Workload that base64-uploads the binary into the shell session via a
heredoc, decodes + chmods + execs it in the background, captures the
PID for clean stop on dormant. Per-profile pid/bin paths so concurrent
samples in the same guest don't collide.
exploits/driver.py: dispatch order is now:
1) sample.kind == "real" + binary staged at sample_store_root
→ real_binary_workload (Tier 4)
2) profile mimic from workloads.workload_for() (Tier 3 v2)
3) None → driver v1 fallback yes-loop
DriverConfig.sample_store_root is the new field; run_tier3_demo.py
wires it to repo_root/samples/store. driver_setup event records
sample_sha256 so trainers can join Tier-4 episodes against the
manifest by hash.
samples/store/.gitkeep added (binaries themselves are gitignored).
Tests: 102 pass (was 86). New suites:
tests/test_perf_qemu.py — parser + builder + perf-missing fallback
tests/test_tier4.py — real_binary_workload base64 round-trip,
stop-cmd kills pidfile, per-profile path
isolation, driver dispatch chooses real vs
mimic correctly, fetcher input validation
and cached-fast-path
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
README:
- Intro now describes the multi-host fleet + cross-host sample
diversity as the primary workflow.
- Tier 2 section: profile-driven workload table replaces the old
"yes / dd" description.
- New Tier 3 section: covers driver v2 dispatch + setup automation
scripts.
- Tier maturity table refreshed (1, 2 ✅; 3 ✅ code / ⏳ image; 4 🚧).
- Telemetry-sources table moved into the per-tier story so the
oracle-vs-feature split is visible from the top of the doc.
- Status section restructured by section (Pipeline, Telemetry,
Orchestrator + drivers, Fleet) instead of a flat list. Cross-links
to the new Forgejo issues for the remaining gaps:
#4 — Tier 4 MalwareBazaar fetcher
#5 — source 3 (perf stat)
#6 — bridge pcap per-episode wiring
- Quick-start sections rewritten:
1) "fleet mode (the primary workflow)" with --capacity + --waves
2) "single episode, no fleet" covering both Tier 2 + Tier 3
3) "multi-host fleet — how cross-host diversity works" explains
the deterministic per-(host, slot, ep) selection mechanism
- Repo-layout table updated to include shipper/, scripts/, AGENTS.md,
and the workloads/fleet additions.
- Deploying section: replaces the "TODO scaffolds" wording with the
actual sudo install-receiver / install-lab-host / wg-pki bring-up
flow that's running on the Pi today.
AGENTS.md: adds a "don't put off the hard parts" convention as the
first item under Other conventions, with explicit guidance on when
"deferred-with-reason" is legitimate (genuine operator artifact
missing) and the requirement to file an issue + automate the
bring-up so it Just Works once the artifact lands.
86/86 tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The v1 driver ran ``yes > /dev/null`` for every sample, which
produced the same envelope shape regardless of which malware family
the orchestrator claimed to be running. That's a poor training
signal: the model sees identical /proc + QMP traces tagged
"cryptominer" / "ransomware" / "RAT" with no distinguishing
features. v2 fixes this.
What landed:
exploits/workloads.py — six ``Workload`` profiles, each producing
a distinct in-session shell command pair (start_cmd / stop_cmd)
that backgrounds a profile-shaped loop:
cpu-saturate — sustained 1-vCPU saturation (XMRig shape)
scan-and-dial — periodic SYN-style probes across 10.200.0.0/24
+ dial-home to gateway (Mirai shape)
io-walk — fs traversal + 4 KiB urandom writes, periodic
re-read (ransomware shape)
bursty-c2 — long idle, periodic 3-packet TCP egress burst
(Dridex C2 beacon shape)
low-and-slow — minimal CPU + periodic awk-driven memory churn
(Kovter / fileless shape)
shell-resident — single long-lived TCP socket pinned to gateway
with periodic 6-byte command ticks (RAT shape)
Each profile uses a /tmp/.cis490-workload-<profile>.{pid,sh} pair so
the stop_cmd can cleanly kill the loop and its descendants.
exploits/driver.py — MSFExploitDriver now accepts an optional
``Sample``. With one supplied, ``infected_running`` dispatches to
the matching workload via exploits.workloads.workload_for(); the
``sample_executed`` event records profile + sample name + sample
kind so the trainer can join cleanly. Without a sample, the v1
yes-loop path remains unchanged (backwards compat).
tools/vm_load_controller.py — the same dispatch on the Tier-2 path
(no exploit, real Alpine guest driven over the serial console).
A fleet wave now produces six visually distinct envelopes per
wave whether the underlying mode is Tier 2 or Tier 3.
tools/run_real_vm_demo.py — accepts ``--sample <name>`` (or
SAMPLE_NAME env from the fleet runner) + auto-wires QMP + agent
sockets into the EpisodeConfig so all three new collectors
(sources 2, 4, 5) run alongside source 1 by default.
tools/run_tier3_demo.py — same ``--sample`` plumbing for the
exploit-driven path.
Tests: 86 pass (was 82). New v2 cases:
- profile dispatch routes infected_running to the workload's
start_cmd (NOT the v1 yes-loop) when a Sample is set
- all six profiles produce distinct start_cmds (the property the
ML model needs)
- unknown profile string falls back to cpu-saturate with a warning
- v1 path (no Sample) still uses yes-loop (backwards compat)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This is the chunk that makes "real data" actually flow on multiple
hosts in parallel. End-to-end pipe was up at 613c6fa / 2579683; now
the lab-host side has the diversity + concurrency it needs.
Collectors landed:
collectors/qmp.py — source 2 (oracle). Tiny synchronous QMP
client + row builder + run loop. Tolerates
older qemu without query-stats.
collectors/guest_agent.py — source 5 (deployable). Reads the
virtio-serial host-side socket, parses
agent JSON-lines, re-stamps to the host
monotonic clock, persists.
collectors/pcap.py — source 4 (deployable). tcpdump capture
+ pure-Python pcap reader + 100 ms
netflow.jsonl bucketizer. Decodes
Ethernet/IPv4/TCP/UDP enough for the
schema in docs/data-model.md.
In-guest agent:
vm/guest-agent/cis490_agent.py — stdlib-only Python agent. Reads
/proc/{stat,meminfo,loadavg,net/dev,net/tcp*}, top-N RSS procs,
thermal. Writes JSON-lines to /dev/virtio-ports/cis490.guest.agent.
tools/build_cidata.py — embeds the agent + an OpenRC service into
user-data so first boot of the Alpine cidata image auto-starts it.
Launchers:
vm/launch_demo.sh / launch_target.sh — second virtio-serial port for
the agent socket; SLOT env support so multiple VMs run without
socket / port collisions; PORT_BASE on launch_target so multiple
target VMs hostfwd different host ports.
vm/setup_bridge.sh — creates host-only br-malware (10.200.0.1/24,
no NAT). Idempotent.
Fleet:
orchestrator/fleet.py — capacity detector (cores / RAM / load
headroom) + concurrent-slot runner. Per-slot ENV selects the
sample. FleetCapacity dataclass round-trips into meta.json so
"this episode ran with 6 concurrent VMs" is auditable post-hoc.
tools/run_fleet.py — CLI: --capacity report; --waves N runs N
waves of (max_concurrent) episodes each, every slot with a
different sample.
etc/cis490-orchestrator.service — now drives the fleet runner with
Restart=always so each invocation runs one wave and respawns,
giving a continuous stream.
Samples:
samples/manifest.toml — six profiles spanning the five major
behaviour shapes. Each entry is real OR mimic (sha256 distinguishes).
samples/manifest.py — strict TOML loader (rejects dups, unknown
categories) + deterministic select(host_id, slot, episode_index)
so different hosts on the network walk the catalog in different
orders without any coordinator.
EpisodeRunner:
orchestrator/episode.py — optional qmp_socket + guest_agent_socket
fields on EpisodeConfig; when set, additional collector threads
run alongside proc_qemu. EpisodeResult now carries rows_qmp +
rows_guest counters.
Tier-3 setup automation:
scripts/install-msfrpcd.sh — installs metasploit-framework where
the package manager has it, generates a strong password into
/etc/cis490/msfrpc.env, drops a hardened systemd unit bound to
127.0.0.1:55553. After this, run_tier3_demo.py works zero-touch
once MSFRPC_PASSWORD is sourced.
scripts/fetch-metasploitable2.sh — accepts IMAGE_URL + IMAGE_SHA256
from the operator (Rapid7 download is registration-walled), pulls,
verifies, converts vmdk → qcow2, lands at vm/images/.
Tests: 82 pass (was 51). New suites:
tests/test_qmp.py — fake QMP server, capability handshake,
blockstats, async-event interleaving,
5-failure backoff
tests/test_guest_agent.py — fake virtio socket, JSON-lines read +
re-stamp, malformed-line tolerance
tests/test_pcap.py — synthetic pcap with TCP/UDP/ARP frames,
bucketize correctness across windows
tests/test_fleet.py — capacity math (8-core idle / low-RAM /
high-load / Pi5 / 1-core box), manifest
selection determinism + diversity
What's queued for the next commit (already discussed in convo):
- MSFExploitDriver v2: map sample.profile → distinct in-session
workload so Tier-3 episodes don't all produce the same yes-loop
envelope. Critical for ML to learn varied malware shapes.
- Real-sample fetch from MalwareBazaar by sha256.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements the deployment loop end-to-end on the CIS490 side:
shipper/
config.py ShipperConfig (host_id, paths, receiver endpoint, mTLS)
transport.py httpx-based PUT + ping with mTLS + bearer support
queue.py scan data/episodes/, tar+zstd via system zstd, ship,
retire to data/shipped/. Idempotent across crashes per
the state machine in docs/transport.md.
__main__.py CLI: --ping (smoke test), --once (one pass), or daemon
receiver/app.py: new POST /v1/ping that requires the same auth as PUT
/v1/episodes but writes nothing. Used by `cis490-shipper --ping`
during lab-host bring-up to verify the WG/Caddy/mTLS path before
shipping any real bytes.
etc/
cis490-shipper.service systemd unit for the lab-host shipper
cis490-orchestrator.service systemd unit for the lab-host queue
(kept disabled by default until queue
mode lands)
lab-host.toml.example config template
scripts/
install-lab-host.sh idempotent installer; verifies prereqs,
creates cis490 service user, syncs repo to
/opt/cis490, builds venv, drops systemd units
and config template
install-receiver.sh same, for the receiver role on the central WG
node (Pi5 in our setup)
tests/test_shipper.py 11 end-to-end tests against a real Uvicorn
server hosting the receiver app. Exercises
ping, tar+ship, idempotent re-ship, 409
conflict, transient (receiver down), tarball
round-trip via system zstd.
AGENTS.md guidance for AI agents working on this and sibling repos.
Headline: when you hit an issue you can't fully fix in
scope, file a Forgejo issue rather than leaving a TODO.
51/51 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the Tier-3 exploit driver — an MSFExploitDriver that plugs into
EpisodeRunner.on_phase, fires a Metasploit module against a target VM
via msfrpcd, watches for the resulting session, and stamps each
transition (exploit_fire, session_open, session_landing_probe,
sample_executed, session_dormant, session_killed) into the episode's
events.jsonl on the orchestrator's monotonic clock.
What landed:
- exploits/msfrpc.py — minimal msgpack-over-HTTPS client (auth,
module.execute, job/session lifecycle) so we don't depend on a
third-party MSF wrapper.
- exploits/driver.py — phase-to-msfrpc adapter; idempotent fire,
session-open polling with timeout, workload start/stop, teardown.
- exploits/modules.py + exploits/modules/vsftpd_234_backdoor.toml —
TOML module configs with {{ target_ip }} placeholders, replacing the
imperative .rc-script approach the README previously hinted at.
- vm/launch_target.sh — SLIRP+restrict=on launcher for the
intentionally-vulnerable target VM (host can reach guest via
hostfwd, guest cannot reach host or internet).
- tools/run_tier3_demo.py — end-to-end runner mirroring run_real_vm_demo.
- tests/test_exploits.py — 12 new tests against a fake MSFRpcClient,
including an integration test that drives a real EpisodeRunner.
Plumbing changes:
- EpisodeRunner._emit_event → public emit_event, so external drivers
share the runner's monotonic clock and events.jsonl.
- mkdir for episode_dir moved to __init__ so emit_event is callable
before run() (driver_setup fires pre-schedule).
Status: driver + tests pass (40/40); end-to-end against a live msfrpcd
+ Metasploitable2 image is the next bring-up step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end now drives a real KVM guest through the full XMRig-shaped
phase schedule with the workload running INSIDE the guest. Telemetry is
host-side /proc/<qemu_pid>; the load is busybox `yes` (sustained CPU
saturation) and `dd if=/dev/urandom` (disk burst on infecting), driven
over the serial console at every phase transition. The plotted envelope
shows clean idle → armed → infecting (disk spike) → infected_running
(100% CPU plateau) → dormant → re-entry → final clean.
Components:
vm/launch_demo.sh now boots Alpine 3.21 nocloud-cloudinit
(Cirros 0.6.x's cirros-init blocks on the
EC2 metadata service for ~17 min before
falling through to NoCloud — abandoned).
Mounts a cidata ISO as a second drive.
tools/build_cidata.py pure-Python NoCloud ISO builder (pycdlib).
Sets root password and ssh_pwauth via
runcmd so we don't depend on a specific
cloud-init version's plain_text_passwd
handling.
tools/vm_serial.py serial-console client (stdlib socket).
Idempotent login (detects already-in-shell
state), sentinel-bracketed run() that
distinguishes shell output from the TTY
echo of input by requiring a leading
\r\n boundary on the marker.
tools/vm_load_controller.py in-guest load controller. set_phase()
dispatches the per-phase shell command
over the serial connection.
tools/run_real_vm_demo.py ties it all together: boot VM, wait for
cloud-init runcmd, log in, run the
EpisodeRunner with on_phase=controller,
shut down VM.
Deps: paramiko, pycdlib added.
docs/sources.md updated with Alpine cloud image (sha512 pinned), and
the new Python deps.
README leads with the tier-2 plot now (real VM, real workload). The
previous synthetic plot is moved below with explicit "host-side mimic,
not a VM" labelling. Tier-2 status flipped to ✅ in the tier table.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The README now leads with a 'What an episode looks like' section that
shows both:
* docs/images/synthetic-envelope.png — pipeline-validation plot. Real
telemetry of a real process whose load is shaped by tools/load_mimic.py
(Python). Explicitly labelled NOT REAL MALWARE in the caption — the
earlier wording was unclear.
* docs/images/real-vm-idle.png — real Cirros 0.6.3 booted under KVM,
same orchestrator + /proc collector pointed at the qemu-system pid.
Idle baseline; no exploit, no payload yet.
A 'What's still missing for the real-malware envelope' table makes the
tier path explicit (real VM idle → real workload in-guest → real exploit
fire → real sample).
Repository nav, deploy steps, design rationale, and threat model are
moved into <details>...</details> blocks so first-time visitors see the
demo plots and the status list without scrolling past wall-of-text.
Stale Pi-as-deployment-target wording in the design-rationale section
is fixed alongside.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
plot_envelope.py grows a --show flag. With it, matplotlib's WebAgg backend
spins up a localhost server with a real interactive figure (zoom, pan,
hover, axes lock) — equivalent to a matlab plot window without needing
tkinter or Qt locally.
tools/show_envelope.sh is a NixOS-aware wrapper: it locates libstdc++.so.6
in /nix/store (numpy's prebuilt wheel needs it on LD_LIBRARY_PATH) and then
exec's the python script with --show. Default port 8988, override via
--port. Bound to 0.0.0.0 so the figure is reachable over WG too.
tornado is added to dev deps because WebAgg requires it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vm/launch_demo.sh boots a Cirros qcow2 under KVM with QMP and a monitor
socket exposed; snapshot=on routes guest writes to a temporary overlay
so the on-disk image is never mutated (clean factory reset every boot).
End-to-end verified: vm/launch_demo.sh → orchestrator with --target-pid
<qemu pid> → 201 telemetry rows over 20s against the real qemu-system
process. The plotted envelope shows the expected idle-VM shape:
periodic ~10% CPU spikes from KVM/timer interrupts, flat 230 MiB RSS,
and a single late-boot disk write. Distinct from the synthetic
load_mimic envelope, confirming the collector reads real KVM behavior.
docs/sources.md is the works-cited doc — every tool, library, sample
source, paper, and standard the project leans on, grouped by category.
README's nav table now points at it. README's status section also lists
what's done vs. in progress so reviewers can see scope at a glance.
Note: vm/images/ stays gitignored. The Cirros 0.6.3 image is documented
with its sha256 (7d6355852aeb...) in docs/sources.md so any team member
can reproduce the bytes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end pipeline now produces a labeled envelope from a single command.
Drives the orchestrator through an 8-phase XMRig-shaped schedule and
renders a 3-panel envelope (CPU%, RSS, IO write rate) with phase bands
sourced from labels.jsonl. Real telemetry, simulated load — validates the
collection + labeling shape before a real VM is involved.
Components:
- tools/load_mimic.py phase-driven load generator. Reads phase
commands on stdin; CPU/IO behavior matches
the named phase (clean=idle, armed=light burst,
infecting=disk burst+CPU, infected_running=
CPU saturation+stratum-shaped writes,
dormant=quieter than clean).
- tools/run_envelope_demo.py spawns load_mimic, drives EpisodeRunner with
a default 85s schedule that includes the
classic infected_running → dormant → re-entry
pattern.
- tools/plot_envelope.py reads telemetry + labels from an episode dir,
writes envelope.png with colored phase bands.
orchestrator: EpisodeRunner now takes an optional phase_schedule and an
on_phase callback. Walks the schedule emitting one label per transition.
Backwards-compatible — existing single-phase tests still green.
Doc fix (user pushback): README + architecture + threat-model no longer
imply the Pi5 is the deployment target. Pi5's actual role here is the
WireGuard-side collector for episode tarballs. Deployment target is
generic ("constrained Linux device"). The "gateway observer" concept
remains a deployment pattern, decoupled from the Pi5's collector role.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end: ``python -m orchestrator --target-pid <pid> --duration N`` now
writes a complete episode directory matching docs/data-model.md, with phase
labels, events, and a 10 Hz host /proc telemetry stream. No VM yet — pid is
arbitrary so we can validate the loop against e.g. ``sleep 5`` while the lab
side comes up.
collectors/proc_qemu.py — parses /proc/<pid>/{stat,io,status} (handles parens
in comm), single-shot collect_once(), and a stop-event-driven run_loop()
that ticks at a fixed cadence and exits when the pid disappears. Tagged
``available_in_deployment: false`` per the threat-model doc.
orchestrator/episode.py — EpisodeRunner: creates data/episodes/<ulid>/,
atomic meta.json, events.jsonl + labels.jsonl writers, drives the collector
in a thread for duration_s, writes done.marker last so the shipper never
sees a half-finished episode.
orchestrator/ulid.py — tiny 26-char Crockford-base32 ULID generator.
Time-sortable, no third-party dep.
orchestrator/__main__.py — CLI entry point.
Tests (15 new, 28 total green):
- proc_qemu: real-ish stat with parens-in-comm, missing /proc/<pid>/io,
missing pid, run_loop cadence, run_loop terminates when pid disappears.
- episode: full directory shape against os.getpid(), id override,
done.marker written after meta.json finalize.
- ulid: length+alphabet, 2000-burst uniqueness, time-sortability.
Smoke-tested against ``sleep 10``: 16 rows over 1.5s at 100ms cadence,
monotonic clock, RSS stable at ~3.5 MiB as expected for an idle sleep.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements docs/transport.md as a small Starlette app. The receiver streams
episode tarballs to disk, verifies sha256 against an X-Content-SHA256 header,
atomically renames into the store on success, and appends one row to a flat
index.jsonl. No DB. Idempotent re-PUTs return 200; conflicting bodies return
409. Optional bearer-token auth (mTLS terminates at Caddy in prod).
receiver/
store.py EpisodeStore: sha-verifying streaming ingest, atomic rename,
append-only index. No HTTP.
app.py make_app(): Starlette routes + bearer guard.
config.py ReceiverConfig.load(): TOML parser.
__main__.py uvicorn entrypoint, reads --config TOML.
tests/test_receiver.py — 13 tests via httpx.ASGITransport. Covers: 201 new,
200 idempotent replay, 409 conflict, 400 sha mismatch + cleanup, 400 missing/
short header, 400 bad id, 400 bad suffix, 413 too large, 401 bearer enforcement,
schema-version pass-through.
etc/cis490-receiver.service — systemd unit with hardening flags.
etc/receiver.toml.example — config template matching docs/deploy.md.
End-to-end smoke-tested with curl: 201 → 200 → 409 path verified, file
on disk, single index row.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>