CIS490/training/dashboard
Max 1fabd4a246 training: validator, feature/tensor extractors, 6 supervised models, schema-hashed checkpoints, eval suite, dashboard producers
The model layer of the project, built honestly:

  - tools/dataset_validate.py — full-sweep validator over the receiver
    store (sha256, schema, monotonic labels, telemetry-row gate). On the
    current corpus: 64,798 accepted + 8,154 degraded + 3,701 rejected +
    7 errored across 76,660 shipped episodes. data/processed/validation_v1.parquet
    is committed as the per-episode acceptance index.

  - training/_features.py — channel registry (46 channels across
    proc/guest/qmp/netflow), summary-stat windowing AND channel×time
    tensor extraction at 10s/5s windowing. Time alignment uses t_wall_ns
    (Unix ns) — tested fix for a real netflow-vs-host clock-base
    inconsistency that was silently dropping every netflow channel.

  - training/_split.py — three held-out recipes (host / sample / time)
    with profile-stratification assertions. held_out_host carries
    untested_profiles for cases like scan-and-dial absent from the test
    host (5 of 6 profiles tested cross-device, never silently averaged).

  - training/models/ — 6 architectures behind a common BaseModel
    interface: gbt (XGBoost), mlp, cnn, gru, lstm, transformer. Each
    trained twice (realistic / oracle) per the deployment threat model.
    Schema-hashed checkpoints refuse to load if _features.py changed
    since training (silent-input-drift protection, tested).

  - training/trainer/ — unified training loop: class-weighted CE, LR
    warmup + cosine, gradient clipping, mixed precision when CUDA,
    early stopping on val macro F1, best-on-val checkpoint. Same loop
    runs MLP/CNN/GRU/LSTM/Transformer; GBT uses XGBoost
    early_stopping_rounds on val mlogloss.

  - training/eval_/ — bootstrap 95% CIs on macro F1, per-class F1,
    per-profile and per-host breakdown, paired-bootstrap significance
    for model-vs-model gap. Confusion matrix uses union of seen labels.

  - training/dashboard/producers/ — replay/metrics/perf/profiles
    emitting the six event types the dashboard's awaiting scenes
    consume; on-demand tensor extraction so the Pi can run live
    inference without 65 GB of shards.

  - 17 unit tests (split coverage, features round-trip, schema mismatch,
    determinism, time-base alignment regression).

End-to-end smoke-trained all six on a 567-episode subset; held-out
test macro F1 reported with paired-bootstrap significance. The
methodology now reports honest cross-device generalization, not
in-distribution validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 01:19:00 -05:00
| Name | Last commit | Date |
| --- | --- | --- |
| producers | training: validator, feature/tensor extractors, 6 supervised models, schema-hashed checkpoints, eval suite, dashboard producers | 2026-05-08 01:19:00 -05:00 |
| static | training/dashboard: click a db row → render the episode envelope | 2026-05-08 01:16:54 -05:00 |
| __init__.py | training/dashboard: live deck at dashboard.wg, fed by receiver | 2026-05-07 21:26:07 -05:00 |
| __main__.py | training/dashboard: live deck at dashboard.wg, fed by receiver | 2026-05-07 21:26:07 -05:00 |
| app.py | training/dashboard: click a db row → render the episode envelope | 2026-05-08 01:16:54 -05:00 |
| client.py | training/dashboard: PRODUCERS.md + client.py for the model session | 2026-05-07 21:34:54 -05:00 |
| dashboard.caddy | training/dashboard: live deck at dashboard.wg, fed by receiver | 2026-05-07 21:26:07 -05:00 |
| feeder.py | training/dashboard: live deck at dashboard.wg, fed by receiver | 2026-05-07 21:26:07 -05:00 |
| PRODUCERS.md | training/dashboard: PRODUCERS.md + client.py for the model session | 2026-05-07 21:34:54 -05:00 |
| README.md | training/dashboard: PRODUCERS.md + client.py for the model session | 2026-05-07 21:34:54 -05:00 |

training/dashboard/

A live web display served at https://dashboard.wg. A Starlette app listens on 127.0.0.1:8447 behind Caddy and pushes messages from Python to every connected browser over a WebSocket.
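
The fan-out from one publisher to many browsers can be pictured with a small asyncio sketch. This is not the code in app.py: `SketchBroadcaster`, `subscribe`, and `unsubscribe` are hypothetical names, and the sketch assumes each WebSocket connection drains its own queue.

```python
import asyncio
import json


class SketchBroadcaster:
    """Hypothetical fan-out hub: one queue per connected client."""

    def __init__(self):
        self._queues = set()

    def subscribe(self):
        # In this sketch, called once per WebSocket connection.
        queue = asyncio.Queue()
        self._queues.add(queue)
        return queue

    def unsubscribe(self, queue):
        # Called when a connection closes so its queue stops filling.
        self._queues.discard(queue)

    async def publish(self, message):
        # Serialize once, then hand the same text to every subscriber.
        text = json.dumps(message)
        for queue in self._queues:
            await queue.put(text)
        return len(self._queues)


async def demo():
    hub = SketchBroadcaster()
    first, second = hub.subscribe(), hub.subscribe()
    await hub.publish({"type": "metric", "name": "loss", "value": 0.42})
    return [await first.get(), await second.get()]
```

Running `asyncio.run(demo())` returns the same serialized message twice, once per subscriber.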

This is intentionally a blank slate — the default page just appends every received JSON message to a scrolling log. Build the real widgets on top of window.dashboard.onMessage.

Run locally

uv run python -m training.dashboard
# → open http://127.0.0.1:8447

Push live data from Python

Same process (e.g. a notebook driving the page):

# Needs a running event loop; top-level await works in Jupyter/IPython.
from training.dashboard.app import broadcaster
await broadcaster.publish({"type": "metric", "name": "loss", "value": 0.42})

Different process (orchestrator, receiver, ad-hoc shell):

curl -s http://127.0.0.1:8447/publish \
     -H 'content-type: application/json' \
     -d '{"type":"hello","msg":"from cron"}'

The /publish endpoint is loopback-only (403 otherwise) and is not reverse-proxied by Caddy, so it cannot be hit from the WG mesh.
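
A loopback-only gate of this kind could look roughly like the following. It is a minimal sketch using only the standard library; `is_loopback` and `publish_status` are hypothetical names, and the real check in app.py may be written differently.

```python
import ipaddress


def is_loopback(client_host):
    """Return True when the peer address is in 127.0.0.0/8 or is ::1."""
    try:
        return ipaddress.ip_address(client_host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat as non-local.
        return False


def publish_status(client_host):
    # Hypothetical gate: 403 for any non-loopback peer, 200 otherwise.
    return 200 if is_loopback(client_host) else 403
```

Checking the parsed address rather than comparing strings covers the whole 127.0.0.0/8 range and IPv6 `::1` in one test.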

Producing events from your own code

If you're writing code that should drive a dashboard widget (model inference, training-loop metrics, profile envelopes), see PRODUCERS.md — it documents every event type the widgets subscribe to, the loopback-only /publish contract, the reconnect gotcha, and the systemd hardening that constrains producers running on the Pi. There's a stdlib-only Python helper at client.py:

from training.dashboard.client import Publisher
Publisher().publish({"type": "model_metric", "model": "lstm", "accuracy": 0.928})
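
For producers that cannot import the package, the same POST can be made with the standard library alone. The class below is a hedged sketch, not the contents of client.py: `SketchPublisher`, its `base_url` and `timeout` parameters, and the choice to return False instead of raising are all assumptions.

```python
import json
import urllib.request


class SketchPublisher:
    """Hypothetical stand-in that POSTs JSON events to /publish."""

    def __init__(self, base_url="http://127.0.0.1:8447", timeout=2.0):
        self.url = base_url.rstrip("/") + "/publish"
        self.timeout = timeout

    def publish(self, event):
        request = urllib.request.Request(
            self.url,
            data=json.dumps(event).encode("utf-8"),
            headers={"content-type": "application/json"},
            method="POST",
        )
        try:
            with urllib.request.urlopen(request, timeout=self.timeout) as response:
                return 200 <= response.status < 300
        except OSError:
            # URLError, HTTPError, refused connections, and timeouts all
            # derive from OSError. The sketch reports failure instead of
            # crashing the producer when the dashboard is down.
            return False
```

Swallowing the error suits fire-and-forget metrics; a producer that must not lose events would want to log or retry instead.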

Customizing the page

The default static/index.html exposes window.dashboard:

window.dashboard.onMessage = (msg) => {
  if (msg.type === 'metric') updateChart(msg.name, msg.value);
};
window.dashboard.send({type: 'request-snapshot'});  // browser → server

Override onMessage to dispatch to your own widgets. The blank-slate log renderer is just there so a fresh deploy is observably alive.

Deploying to the Pi

  1. sudo cp etc/cis490-dashboard.service /etc/systemd/system/
  2. sudo cp training/dashboard/dashboard.caddy /etc/caddy/Caddyfile.d/
  3. sudo systemctl daemon-reload && sudo systemctl enable --now cis490-dashboard.service
  4. sudo systemctl reload caddy
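
After the steps above, a quick way to confirm the service is listening is to try a TCP connection to the app port. This is a minimal sketch; `port_open` is a hypothetical helper, and it only shows that something accepts connections on 127.0.0.1:8447, not that Caddy is proxying correctly.

```python
import socket


def port_open(host="127.0.0.1", port=8447, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False
```

Run on the Pi, `port_open()` returning False suggests checking `systemctl status cis490-dashboard.service` before looking at Caddy.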