CIS490/tools
Max 1fabd4a246 training: validator, feature/tensor extractors, 6 supervised models, schema-hashed checkpoints, eval suite, dashboard producers
The model layer of the project, built honestly:

  - tools/dataset_validate.py — full-sweep validator over the receiver
    store (sha256, schema, monotonic labels, telemetry-row gate). On the
    current corpus: 64,798 accepted + 8,154 degraded + 3,701 rejected +
    7 errored across 76,660 shipped episodes. data/processed/validation_v1.parquet
    is committed as the per-episode acceptance index.

  - training/_features.py — channel registry (46 channels across
    proc/guest/qmp/netflow), summary-stat windowing AND channel×time
    tensor extraction at 10s/5s windowing. Time alignment uses t_wall_ns
    (Unix ns) — tested fix for a real netflow-vs-host clock-base
    inconsistency that was silently dropping every netflow channel.

  - training/_split.py — three held-out recipes (host / sample / time)
    with profile-stratification assertions. held_out_host carries
    untested_profiles for cases like scan-and-dial absent from the test
    host (5 of 6 profiles tested cross-device, never silently averaged).

  - training/models/ — 6 architectures behind a common BaseModel
    interface: gbt (XGBoost), mlp, cnn, gru, lstm, transformer. Each
    trained twice (realistic / oracle) per the deployment threat model.
    Schema-hashed checkpoints refuse to load if _features.py changed
    since training (silent-input-drift protection, tested).

  - training/trainer/ — unified training loop: class-weighted CE, LR
    warmup + cosine, gradient clipping, mixed precision when CUDA,
    early stopping on val macro F1, best-on-val checkpoint. Same loop
    runs MLP/CNN/GRU/LSTM/Transformer; GBT uses XGBoost
    early_stopping_rounds on val mlogloss.

  - training/eval_/ — bootstrap 95% CIs on macro F1, per-class F1,
    per-profile and per-host breakdown, paired-bootstrap significance
    for model-vs-model gap. Confusion matrix uses union of seen labels.

  - training/dashboard/producers/ — replay/metrics/perf/profiles
    emitting the six event types the dashboard's awaiting scenes
    consume; on-demand tensor extraction so the Pi can run live
    inference without 65 GB of shards.

  - 17 unit tests (split coverage, features round-trip, schema mismatch,
    determinism, time-base alignment regression).

End-to-end smoke-trained all six on a 567-episode subset; held-out
test macro F1 reported with paired-bootstrap significance. The
methodology now reports honest cross-device generalization, not
in-distribution validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 01:19:00 -05:00
..
auto_fetch_samples.py auto_fetch_samples: pick Linux i386 ELF; manifest matches theZoo 2026-05-01 03:28:26 -05:00
build_cidata.py PIPELINE §5 step 1: fix four root-cause defects 2026-05-03 17:05:25 -05:00
build_target.py PIPELINE §5 step 3: target VM build infrastructure + containment posture 2026-05-04 01:31:40 -05:00
check_fleet_health.py fleet-health: exit 0 when alerts found (don't mark unit failed) 2026-05-02 13:51:20 -05:00
cis490_doctor.py shipper: systemd watchdog, quarantine cleanup; doctor surfaces ship errors 2026-05-01 12:02:59 -05:00
dataset_validate.py training: validator, feature/tensor extractors, 6 supervised models, schema-hashed checkpoints, eval suite, dashboard producers 2026-05-08 01:19:00 -05:00
fetch_sample.py Close out the open issues: bridge pcap wiring, perf collector, Tier-4 2026-04-30 00:17:49 -05:00
index_backfill.py Receiver enforces X-Cis490-Code-Commit allow-list (live, auto-refreshed) 2026-05-01 01:38:50 -05:00
index_reader.py Close out the deployment-readiness gaps 2026-04-30 00:31:55 -05:00
load_mimic.py Synthetic envelope demo: phase-driven load mimic + plotter 2026-04-28 23:53:20 -06:00
plot_envelope.py Close out the deployment-readiness gaps 2026-04-30 00:31:55 -05:00
prune_episodes.py Multi-signal prune classifier: rescue valid episodes /proc misses 2026-04-30 19:10:01 -05:00
quarantine_unstamped.py fix: lab-host install loop after commit-gate cutover 2026-05-01 11:36:21 -05:00
run_campaign.py Add automated campaign runner, shipper, and systemd units 2026-04-30 14:53:40 -06:00
run_envelope_demo.py Synthetic envelope demo: phase-driven load mimic + plotter 2026-04-28 23:53:20 -06:00
run_fleet.py PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml 2026-05-04 01:25:01 -05:00
run_real_vm_demo.py Tier-2 episodes use clean-only schedule; .gitignore VERSION 2026-05-04 01:55:37 -05:00
run_tier3_demo.py Tier-3 fixes: b'' probe false-positive, requires_bridge, msgpack 2026-05-05 15:15:18 -06:00
ship_health_check.py fleet-health: proactive alerts on the Pi + per-host doctor reports 2026-05-02 13:48:31 -05:00
shipper.py Add automated campaign runner, shipper, and systemd units 2026-04-30 14:53:40 -06:00
show_envelope.sh Interactive envelope plot via WebAgg (browser-based) 2026-04-29 00:06:22 -06:00
verify_catalog.py PIPELINE §5 step 4: catalog admission verifier (§4.3) 2026-05-04 01:35:32 -05:00
verify_tier3_local.py tools/verify_tier3_local.py: Pi-runnable Tier-3 verifier 2026-05-01 03:41:21 -05:00
vm_load_controller.py Fix workload-silent false-positive on Alpine busybox guests (closes #15) 2026-04-30 17:28:48 -05:00
vm_serial.py Tier 2: real Alpine VM, real workload, real envelope 2026-04-29 08:38:53 -06:00