CIS490

Author	SHA1	Message	Date
Max	1fabd4a246	training: validator, feature/tensor extractors, 6 supervised models, schema-hashed checkpoints, eval suite, dashboard producers The model layer of the project, built honestly: - tools/dataset_validate.py — full-sweep validator over the receiver store (sha256, schema, monotonic labels, telemetry-row gate). On the current corpus: 64,798 accepted + 8,154 degraded + 3,701 rejected + 7 errored across 76,660 shipped episodes. data/processed/validation_v1.parquet is committed as the per-episode acceptance index. - training/_features.py — channel registry (46 channels across proc/guest/qmp/netflow), summary-stat windowing AND channel×time tensor extraction at 10s/5s windowing. Time alignment uses t_wall_ns (Unix ns) — tested fix for a real netflow-vs-host clock-base inconsistency that was silently dropping every netflow channel. - training/_split.py — three held-out recipes (host / sample / time) with profile-stratification assertions. held_out_host carries untested_profiles for cases like scan-and-dial absent from the test host (5 of 6 profiles tested cross-device, never silently averaged). - training/models/ — 6 architectures behind a common BaseModel interface: gbt (XGBoost), mlp, cnn, gru, lstm, transformer. Each trained twice (realistic / oracle) per the deployment threat model. Schema-hashed checkpoints refuse to load if _features.py changed since training (silent-input-drift protection, tested). - training/trainer/ — unified training loop: class-weighted CE, LR warmup + cosine, gradient clipping, mixed precision when CUDA, early stopping on val macro F1, best-on-val checkpoint. Same loop runs MLP/CNN/GRU/LSTM/Transformer; GBT uses XGBoost early_stopping_rounds on val mlogloss. - training/eval_/ — bootstrap 95% CIs on macro F1, per-class F1, per-profile and per-host breakdown, paired-bootstrap significance for model-vs-model gap. Confusion matrix uses union of seen labels. - training/dashboard/producers/ — replay/metrics/perf/profiles emitting the six event types the dashboard's awaiting scenes consume; on-demand tensor extraction so the Pi can run live inference without 65 GB of shards. - 17 unit tests (split coverage, features round-trip, schema mismatch, determinism, time-base alignment regression). End-to-end smoke-trained all six on a 567-episode subset; held-out test macro F1 reported with paired-bootstrap significance. The methodology now reports honest cross-device generalization, not in-distribution validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 01:19:00 -05:00
elliott	95ac56a382	fix: three install-time bugs found during first lab-host bring-up on k-gamingcom 1. pyproject.toml — move pycdlib to main deps (was dev-only; cidata build fails on first install because the venv doesn't include dev extras). 2. scripts/install-lab-host.sh — create vm/images/ dir before symlinking alpine-baseline.qcow2 and cidata.iso into INSTALL_ROOT. Without the mkdir the ln -sf silently fails (|| true), leaving the launchers unable to find the images and causing every episode to fail within 15 s. 3. tools/cis490_doctor.py — two fixes: a. Insert repo_root into sys.path at doctor startup so the inline `from exploits.modules import ...` succeeds when running from /opt/cis490 (package = false means nothing is installed into site-packages). b. Pass cwd=/opt/cis490 to the shipper --ping subprocess so python -m shipper resolves the module correctly regardless of the caller's CWD. Tested on k-gamingcom: install script now builds cidata.iso on first run, 7-slot fleet wave completes with rc=0, doctor shows 13 ok / 4 warn / 2 fail (remaining failures are mTLS certs + collector.wg DNS — both need Pi-side action, not code changes). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 15:05:00 -06:00
max	613c6fa223	Tier 3: msfrpc-driven exploit driver + first module config Adds the Tier-3 exploit driver — an MSFExploitDriver that plugs into EpisodeRunner.on_phase, fires a Metasploit module against a target VM via msfrpcd, watches for the resulting session, and stamps each transition (exploit_fire, session_open, session_landing_probe, sample_executed, session_dormant, session_killed) into the episode's events.jsonl on the orchestrator's monotonic clock. What landed: - exploits/msfrpc.py — minimal msgpack-over-HTTPS client (auth, module.execute, job/session lifecycle) so we don't depend on a third-party MSF wrapper. - exploits/driver.py — phase-to-msfrpc adapter; idempotent fire, session-open polling with timeout, workload start/stop, teardown. - exploits/modules.py + exploits/modules/vsftpd_234_backdoor.toml — TOML module configs with {{ target_ip }} placeholders, replacing the imperative .rc-script approach the README previously hinted at. - vm/launch_target.sh — SLIRP+restrict=on launcher for the intentionally-vulnerable target VM (host can reach guest via hostfwd, guest cannot reach host or internet). - tools/run_tier3_demo.py — end-to-end runner mirroring run_real_vm_demo. - tests/test_exploits.py — 12 new tests against a fake MSFRpcClient, including an integration test that drives a real EpisodeRunner. Plumbing changes: - EpisodeRunner._emit_event → public emit_event, so external drivers share the runner's monotonic clock and events.jsonl. - mkdir for episode_dir moved to __init__ so emit_event is callable before run() (driver_setup fires pre-schedule). Status: driver + tests pass (40/40); end-to-end against a live msfrpcd + Metasploitable2 image is the next bring-up step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 23:11:52 -05:00
Maximus Gorog	7216ec09bd	Tier 2: real Alpine VM, real workload, real envelope End-to-end now drives a real KVM guest through the full XMRig-shaped phase schedule with the workload running INSIDE the guest. Telemetry is host-side /proc/<qemu_pid>; the load is busybox `yes` (sustained CPU saturation) and `dd if=/dev/urandom` (disk burst on infecting), driven over the serial console at every phase transition. The plotted envelope shows clean idle → armed → infecting (disk spike) → infected_running (100% CPU plateau) → dormant → re-entry → final clean. Components: vm/launch_demo.sh now boots Alpine 3.21 nocloud-cloudinit (Cirros 0.6.x's cirros-init blocks on the EC2 metadata service for ~17 min before falling through to NoCloud — abandoned). Mounts a cidata ISO as a second drive. tools/build_cidata.py pure-Python NoCloud ISO builder (pycdlib). Sets root password and ssh_pwauth via runcmd so we don't depend on a specific cloud-init version's plain_text_passwd handling. tools/vm_serial.py serial-console client (stdlib socket). Idempotent login (detects already-in-shell state), sentinel-bracketed run() that distinguishes shell output from the TTY echo of input by requiring a leading \r\n boundary on the marker. tools/vm_load_controller.py in-guest load controller. set_phase() dispatches the per-phase shell command over the serial connection. tools/run_real_vm_demo.py ties it all together: boot VM, wait for cloud-init runcmd, log in, run the EpisodeRunner with on_phase=controller, shut down VM. Deps: paramiko, pycdlib added. docs/sources.md updated with Alpine cloud image (sha512 pinned), and the new Python deps. README leads with the tier-2 plot now (real VM, real workload). The previous synthetic plot is moved below with explicit "host-side mimic, not a VM" labelling. Tier-2 status flipped to ✅ in the tier table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 08:38:53 -06:00
Maximus Gorog	cc37fc6c4d	Interactive envelope plot via WebAgg (browser-based) plot_envelope.py grows a --show flag. With it, matplotlib's WebAgg backend spins up a localhost server with a real interactive figure (zoom, pan, hover, axes lock) — equivalent to a matlab plot window without needing tkinter or Qt locally. tools/show_envelope.sh is a NixOS-aware wrapper: it locates libstdc++.so.6 in /nix/store (numpy's prebuilt wheel needs it on LD_LIBRARY_PATH) and then exec's the python script with --show. Default port 8988, override via --port. Bound to 0.0.0.0 so the figure is reachable over WG too. tornado is added to dev deps because WebAgg requires it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 00:06:22 -06:00
Maximus Gorog	970698af83	Synthetic envelope demo: phase-driven load mimic + plotter End-to-end pipeline now produces a labeled envelope from a single command. Drives the orchestrator through an 8-phase XMRig-shaped schedule and renders a 3-panel envelope (CPU%, RSS, IO write rate) with phase bands sourced from labels.jsonl. Real telemetry, simulated load — validates the collection + labeling shape before a real VM is involved. Components: - tools/load_mimic.py phase-driven load generator. Reads phase commands on stdin; CPU/IO behavior matches the named phase (clean=idle, armed=light burst, infecting=disk burst+CPU, infected_running= CPU saturation+stratum-shaped writes, dormant=quieter than clean). - tools/run_envelope_demo.py spawns load_mimic, drives EpisodeRunner with a default 85s schedule that includes the classic infected_running → dormant → re-entry pattern. - tools/plot_envelope.py reads telemetry + labels from an episode dir, writes envelope.png with colored phase bands. orchestrator: EpisodeRunner now takes an optional phase_schedule and an on_phase callback. Walks the schedule emitting one label per transition. Backwards-compatible — existing single-phase tests still green. Doc fix (user pushback): README + architecture + threat-model no longer imply the Pi5 is the deployment target. Pi5's actual role here is the WireGuard-side collector for episode tarballs. Deployment target is generic ("constrained Linux device"). The "gateway observer" concept remains a deployment pattern, decoupled from the Pi5's collector role. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 23:53:20 -06:00
Maximus Gorog	83e111961d	Add receiver: PUT /v1/episodes ingest with sha256 verify and idempotency Implements docs/transport.md as a small Starlette app. The receiver streams episode tarballs to disk, verifies sha256 against an X-Content-SHA256 header, atomically renames into the store on success, and appends one row to a flat index.jsonl. No DB. Idempotent re-PUTs return 200; conflicting bodies return 409. Optional bearer-token auth (mTLS terminates at Caddy in prod). receiver/ store.py EpisodeStore: sha-verifying streaming ingest, atomic rename, append-only index. No HTTP. app.py make_app(): Starlette routes + bearer guard. config.py ReceiverConfig.load(): TOML parser. __main__.py uvicorn entrypoint, reads --config TOML. tests/test_receiver.py — 13 tests via httpx.ASGITransport. Covers: 201 new, 200 idempotent replay, 409 conflict, 400 sha mismatch + cleanup, 400 missing/ short header, 400 bad id, 400 bad suffix, 413 too large, 401 bearer enforcement, schema-version pass-through. etc/cis490-receiver.service — systemd unit with hardening flags. etc/receiver.toml.example — config template matching docs/deploy.md. End-to-end smoke-tested with curl: 201 → 200 → 409 path verified, file on disk, single index row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 23:34:04 -06:00
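
The ingest path described in commit 83e111961d (stream to disk, verify sha256, rename atomically, append one index row, treat re-PUTs idempotently) can be illustrated with a minimal stdlib-only sketch. This is not the repository's EpisodeStore: the function name `ingest`, the `.tar.gz` suffix, the `.part` temp suffix, and the index row fields are assumptions made for illustration, and the return values simply mirror the HTTP status codes listed in the commit message.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path
from typing import Iterable


def _sha256_file(path: Path) -> str:
    """Hash an existing file in 1 MiB blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def ingest(store: Path, episode_id: str, chunks: Iterable[bytes],
           expected_sha256: str) -> int:
    """Hypothetical ingest helper; returns an HTTP-style status code."""
    store.mkdir(parents=True, exist_ok=True)
    final = store / f"{episode_id}.tar.gz"  # assumed naming scheme
    h = hashlib.sha256()
    # Write to a temp file in the same directory so the later rename
    # stays on one filesystem and is therefore atomic.
    fd, tmp_name = tempfile.mkstemp(dir=store, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                h.update(chunk)
                tmp.write(chunk)
        digest = h.hexdigest()
        if digest != expected_sha256.lower():
            return 400  # body does not match the declared hash
        if final.exists():
            # Same bytes: idempotent replay. Different bytes: conflict.
            return 200 if _sha256_file(final) == digest else 409
        os.replace(tmp_name, final)
        with (store / "index.jsonl").open("a", encoding="utf-8") as idx:
            row = {"episode_id": episode_id, "sha256": digest}
            idx.write(json.dumps(row) + "\n")
        return 201
    finally:
        # Clean up the temp file on every path that did not rename it.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
```

Calling `ingest` twice with the same body should give 201 and then 200, and a different body under the same ID should give 409, which matches the curl smoke test described in the commit. Size limits, ID validation, and bearer auth are left out of the sketch.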

7 commits
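
Commit 1fabd4a246 mentions schema-hashed checkpoints that refuse to load if the feature definition changed since training. The sketch below shows one way such a guard could work, assuming the hash is taken over an ordered channel list plus two windowing parameters. The commit does not say what is actually hashed, so `schema_hash`, `save_checkpoint`, `load_checkpoint`, `SchemaMismatchError`, the JSON checkpoint layout, and the parameter names are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


class SchemaMismatchError(RuntimeError):
    """Raised when a checkpoint was trained against a different schema."""


def schema_hash(channels: list[str], window_s: float, stride_s: float) -> str:
    # Channel order is kept as-is because it fixes the tensor layout;
    # only the dictionary keys are sorted so the encoding is canonical.
    payload = json.dumps(
        {"channels": channels, "window_s": window_s, "stride_s": stride_s},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def save_checkpoint(path: Path, weights: dict, schema: str) -> None:
    # Store the schema hash next to the weights (assumed JSON layout).
    path.write_text(json.dumps({"schema_hash": schema, "weights": weights}))


def load_checkpoint(path: Path, current_schema: str) -> dict:
    blob = json.loads(path.read_text())
    if blob["schema_hash"] != current_schema:
        # Refuse to load instead of silently feeding drifted inputs.
        raise SchemaMismatchError(
            f"checkpoint schema {blob['schema_hash'][:12]} "
            f"!= current {current_schema[:12]}"
        )
    return blob["weights"]
```

Under these assumptions, reordering or renaming a channel, or changing either window parameter, changes the hash and makes `load_checkpoint` raise instead of returning weights.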