The model layer of the project, built honestly:
- tools/dataset_validate.py — full-sweep validator over the receiver
store (sha256, schema, monotonic labels, telemetry-row gate). On the
current corpus: 64,798 accepted + 8,154 degraded + 3,701 rejected +
7 errored across 76,660 shipped episodes. data/processed/validation_v1.parquet
is committed as the per-episode acceptance index.
- training/_features.py — channel registry (46 channels across
proc/guest/qmp/netflow), summary-stat windowing AND channel×time
tensor extraction at 10s/5s windowing. Time alignment uses t_wall_ns
(Unix ns) — tested fix for a real netflow-vs-host clock-base
inconsistency that was silently dropping every netflow channel.
- training/_split.py — three held-out recipes (host / sample / time)
with profile-stratification assertions. held_out_host carries
untested_profiles for cases like scan-and-dial absent from the test
host (5 of 6 profiles tested cross-device, never silently averaged).
- training/models/ — 6 architectures behind a common BaseModel
interface: gbt (XGBoost), mlp, cnn, gru, lstm, transformer. Each
trained twice (realistic / oracle) per the deployment threat model.
Schema-hashed checkpoints refuse to load if _features.py changed
since training (silent-input-drift protection, tested).
- training/trainer/ — unified training loop: class-weighted CE, LR
warmup + cosine, gradient clipping, mixed precision when CUDA,
early stopping on val macro F1, best-on-val checkpoint. Same loop
runs MLP/CNN/GRU/LSTM/Transformer; GBT uses XGBoost
early_stopping_rounds on val mlogloss.
- training/eval_/ — bootstrap 95% CIs on macro F1, per-class F1,
per-profile and per-host breakdown, paired-bootstrap significance
for model-vs-model gap. Confusion matrix uses union of seen labels.
- training/dashboard/producers/ — replay/metrics/perf/profiles
emitting the six event types the dashboard's awaiting scenes
consume; on-demand tensor extraction so the Pi can run live
inference without 65 GB of shards.
- 17 unit tests (split coverage, features round-trip, schema mismatch,
determinism, time-base alignment regression).
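The schema-hash guard described for training/models/ above can be pictured roughly as follows. This is a minimal sketch and not the project's code: the names `schema_hash`, `save_checkpoint`, `load_checkpoint`, and `SchemaMismatchError`, the JSON file layout, and the choice to hash a channel list plus parameters (rather than the source of _features.py itself) are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


class SchemaMismatchError(RuntimeError):
    """Raised when a checkpoint was written under a different feature schema."""


def schema_hash(channels: list, params: dict) -> str:
    # Hypothetical schema description: ordered channel names plus
    # extraction parameters, serialized deterministically before hashing.
    payload = json.dumps({"channels": channels, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def save_checkpoint(path: Path, weights: dict, schema: str) -> None:
    # Store the schema hash next to the weights so it travels with the file.
    path.write_text(json.dumps({"schema_hash": schema, "weights": weights}))


def load_checkpoint(path: Path, current_schema: str) -> dict:
    blob = json.loads(path.read_text())
    if blob["schema_hash"] != current_schema:
        # Refuse to load instead of silently feeding the model
        # inputs it was never trained on.
        raise SchemaMismatchError(
            f"checkpoint schema {blob['schema_hash'][:12]} "
            f"!= current schema {current_schema[:12]}"
        )
    return blob["weights"]
```

Because channel order is part of the hashed payload in this sketch, reordering channels counts as a schema change as well.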
End-to-end smoke-trained all six on a 567-episode subset; held-out
test macro F1 reported with paired-bootstrap significance. The
methodology now reports honest cross-device generalization, not
in-distribution validation.
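For readers unfamiliar with the paired-bootstrap comparison mentioned above, here is a minimal stdlib-only sketch. It is not the code in training/eval_/: the function names, the percentile interval, the default of 1000 resamples, and the one-sided summary statistic are assumptions chosen for illustration.

```python
import random


def macro_f1(y_true, y_pred):
    # Average per-class F1 over the union of labels seen in either list.
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def paired_bootstrap(y_true, pred_a, pred_b, n_resamples=1000, seed=0):
    """Resample episodes with replacement and score both models on the
    SAME resample each time, so the gap is compared episode for episode."""
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        f1_a = macro_f1(yt, [pred_a[i] for i in idx])
        f1_b = macro_f1(yt, [pred_b[i] for i in idx])
        diffs.append(f1_a - f1_b)
    diffs.sort()
    lo = diffs[int(0.025 * n_resamples)]        # lower end of 95% interval
    hi = diffs[int(0.975 * n_resamples) - 1]    # upper end of 95% interval
    frac_a_not_better = sum(d <= 0 for d in diffs) / n_resamples
    return lo, hi, frac_a_not_better
```

If the interval `(lo, hi)` excludes zero, the gap between the two models is unlikely to be a resampling artifact under this scheme.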
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| Name |
|---|
| producers/ |
| static/ |
| __init__.py |
| __main__.py |
| app.py |
| client.py |
| dashboard.caddy |
| feeder.py |
| PRODUCERS.md |
| README.md |
training/dashboard/
Live web display served at https://dashboard.wg. A Starlette app
on 127.0.0.1:8447 behind Caddy; messages from Python are pushed to
connected browsers over a WebSocket.
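The push model can be pictured as a fan-out from one publisher to one queue per connected browser. The sketch below is an illustration only and does not reproduce app.py: the class name `SketchBroadcaster`, the per-subscriber `asyncio.Queue`, the queue size, and the drop-on-full policy are all assumptions.

```python
import asyncio


class SketchBroadcaster:
    """Fan each published message out to every subscriber's queue."""

    def __init__(self, max_pending: int = 100) -> None:
        self._max_pending = max_pending
        self._subscribers = set()

    def subscribe(self) -> asyncio.Queue:
        # One bounded queue per connection (e.g. per open WebSocket).
        q = asyncio.Queue(maxsize=self._max_pending)
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.discard(q)

    async def publish(self, message: dict) -> int:
        """Deliver to all subscribers; return how many queues accepted it."""
        delivered = 0
        for q in list(self._subscribers):
            try:
                q.put_nowait(message)
                delivered += 1
            except asyncio.QueueFull:
                # Drop for a slow consumer rather than stall the publisher.
                pass
        return delivered
```

In a design like this, each WebSocket handler would call `subscribe()` on connect, forward whatever arrives on its queue to the browser, and call `unsubscribe()` on disconnect.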
This is intentionally a blank slate — the default page just
appends every received JSON message to a scrolling log. Build the
real widgets on top of window.dashboard.onMessage.
Run locally
uv run python -m training.dashboard
# → open http://127.0.0.1:8447
Push live data from Python
Same process (e.g. a notebook driving the page):
from training.dashboard.app import broadcaster
# Top-level await works in a notebook or async REPL; elsewhere, call this from inside a coroutine.
await broadcaster.publish({"type": "metric", "name": "loss", "value": 0.42})
Different process (orchestrator, receiver, ad-hoc shell):
curl -s http://127.0.0.1:8447/publish \
-H 'content-type: application/json' \
-d '{"type":"hello","msg":"from cron"}'
The /publish endpoint is loopback-only (403 otherwise) and is not
reverse-proxied by Caddy, so it cannot be hit from the WG mesh.
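A loopback-only gate of this kind can be expressed with the standard library's `ipaddress` module. This is a sketch under assumptions, not the check in app.py: the function name `publish_allowed` is hypothetical, and how the real handler obtains the client address is not shown here.

```python
import ipaddress
from typing import Optional


def publish_allowed(client_host: Optional[str]) -> bool:
    """Return True only when the peer address is a loopback address.

    A handler built on this would answer 403 whenever it returns False.
    """
    if not client_host:
        return False
    try:
        return ipaddress.ip_address(client_host).is_loopback
    except ValueError:
        # Not a literal IP address (e.g. a hostname): refuse.
        return False
```

Refusing anything that is not a literal loopback IP keeps the check conservative: a hostname that happens to resolve to 127.0.0.1 is still rejected.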
Producing events from your own code
If you're writing code that should drive a dashboard widget (model
inference, training-loop metrics, profile envelopes), see
PRODUCERS.md — it documents every event type
the widgets subscribe to, the loopback-only /publish contract, the
reconnect gotcha, and the systemd hardening that constrains
producers running on the Pi. There's a stdlib-only Python helper at
client.py:
from training.dashboard.client import Publisher
Publisher().publish({"type": "model_metric", "model": "lstm", "accuracy": 0.928})
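For producers that cannot import the package, the same POST can be made with nothing but the standard library. The sketch below is not client.py: `build_request` and `post_event` are hypothetical names, the 2-second timeout is an arbitrary choice, and treating any 2xx response as success is an assumption. The URL and content type are taken from the curl example above.

```python
import json
import urllib.request

# Endpoint from the curl example; reachable only from the same host.
PUBLISH_URL = "http://127.0.0.1:8447/publish"


def build_request(event: dict, url: str = PUBLISH_URL) -> urllib.request.Request:
    # Serialize the event as JSON and mark the body accordingly.
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_event(event: dict, url: str = PUBLISH_URL, timeout: float = 2.0) -> bool:
    """Best-effort publish: return False instead of raising when the
    dashboard is down, so a training loop is never interrupted by it."""
    try:
        with urllib.request.urlopen(build_request(event, url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False
```

Swallowing connection errors is a deliberate choice in this sketch: the dashboard is a display, so a producer arguably should not fail because nobody is watching.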
Customizing the page
The default static/index.html exposes window.dashboard:
window.dashboard.onMessage = (msg) => {
if (msg.type === 'metric') updateChart(msg.name, msg.value);
};
window.dashboard.send({type: 'request-snapshot'}); // browser → server
Override onMessage to dispatch to your own widgets. The blank-slate
log renderer is just there so a fresh deploy is observably alive.
Deploying to the Pi
sudo cp etc/cis490-dashboard.service /etc/systemd/system/
sudo cp training/dashboard/dashboard.caddy /etc/caddy/Caddyfile.d/
sudo systemctl daemon-reload && sudo systemctl enable --now cis490-dashboard.service
sudo systemctl reload caddy