# Live inference demo — Lambda runs replay, Pi shows predictions

Architecture for the live "catching attacks" demo (scene 7, the chunking
timeline). The Pi cannot run inference: it is RAM-bound and has already
crashed once. All model loading and per-window prediction must run on
the A100.

## Topology

```
Pi (office-print, 10.100.0.1)            Lambda A100 (ssh ubuntu@<ip>)
┌──────────────────────────┐             ┌───────────────────────────┐
│ dashboard.wg             │             │ replay.py running on      │
│ /publish (loopback only) │             │ episode tarballs through  │
│  ↑                       │             │ gbt_oracle.ckpt.json      │
│  │ POST                  │             │   ↓                       │
│  │ via SSH reverse tunnel│             │ POST 127.0.0.1:8447       │
│  │                       │             │   ↑                       │
│  └─── ssh -R 8447:... ───┼─────────────┤   │                       │
│                          │             └───────────────────────────┘
└──────────────────────────┘
```

## Setup steps

1. **Stage demo episodes on Lambda** (raw tarballs; reading them on the
   Pi requires sudo):
   ```bash
   # Run on the Pi: create the target directory on Lambda.
   ssh -i ~/.ssh/lambda_ed25519 ubuntu@<lambda-ip> \
     'mkdir -p ~/cis490/data/episodes_demo'
   # Stream each tarball over SSH; sudo is only needed for the local read.
   for eid in <episode-ids>; do
     sudo cat /var/lib/cis490/episodes/<host>/${eid}.tar.zst | \
       ssh -i ~/.ssh/lambda_ed25519 ubuntu@<lambda-ip> \
         "cat > ~/cis490/data/episodes_demo/${eid}.tar.zst"
   done
   ```

2. **Open an SSH reverse tunnel** from the Pi to Lambda. This exposes the
   Pi's loopback `127.0.0.1:8447` (the dashboard's `/publish` endpoint)
   on Lambda's loopback `127.0.0.1:8447`:
   ```bash
   ssh -i ~/.ssh/lambda_ed25519 \
     -o ServerAliveInterval=30 \
     -o ServerAliveCountMax=3 \
     -o ExitOnForwardFailure=yes \
     -N -R 8447:127.0.0.1:8447 \
     ubuntu@<lambda-ip>
   ```
   Verify: from Lambda, `curl http://127.0.0.1:8447/healthz` should
   return the Pi's dashboard health JSON.

3. **Run the replay loop on Lambda**:
   ```bash
   # From the Pi: open a shell on Lambda.
   ssh -i ~/.ssh/lambda_ed25519 ubuntu@<lambda-ip>
   # Then, on Lambda:
   cd ~/cis490 && . .venv/bin/activate
   export PYTHONPATH=$PWD/repo
   nohup bash replay_loop.sh > replay_loop.log 2>&1 &
   ```
   The loop replays the staged demo episodes through the trained
   `gbt_oracle.ckpt.json` checkpoint and emits `prediction` events for
   each window.

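The Lambda-side half of steps 2 and 3 can be pictured with a small sketch: check that the tunnel answers on `/healthz`, then POST one event per window to `/publish`. Only the host, port, and the two paths come from this document. The event field names, the phase labels, and the assumption that `/publish` accepts a JSON body are hypothetical; the real `replay.py` may differ.

```python
import json
import urllib.error
import urllib.request

# Lambda-side end of the reverse tunnel opened in step 2.
BASE_URL = "http://127.0.0.1:8447"
WINDOW_SECONDS = 10  # window length shown on scene 7


def tunnel_is_up(base_url=BASE_URL, timeout=3.0):
    """Return True if /healthz answers with HTTP 200 through the tunnel."""
    try:
        with urllib.request.urlopen(base_url + "/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def build_prediction_event(window_index, predicted, actual=None):
    """Build one `prediction` event (hypothetical field names)."""
    start = window_index * WINDOW_SECONDS
    return {
        "type": "prediction",
        "window_start_s": start,
        "window_end_s": start + WINDOW_SECONDS,
        "predicted_phase": predicted,
        "actual_phase": actual,
    }


def make_request(event, base_url=BASE_URL):
    """Wrap an event in a JSON POST request without sending it."""
    return urllib.request.Request(
        base_url + "/publish",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def publish(event, base_url=BASE_URL, timeout=5.0):
    """Send the event and return the HTTP status code."""
    with urllib.request.urlopen(make_request(event, base_url), timeout=timeout) as resp:
        return resp.status
```

Splitting `make_request` from `publish` keeps the payload inspectable without a live tunnel; a caller would typically check `tunnel_is_up()` once before entering the replay loop.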
## What the user sees

- Scene 7 (chunking timeline) lights up with the predicted and actual
  phase for each 10-second window
- Scenes 8, 9, and 12 are still populated by the Pi-side lightweight
  publishers (knn streamer + multi_model_metrics + profiles streamer)

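As a rough illustration of what scene 7 displays, the sketch below assigns timestamps to 10-second windows and pairs the predicted phase with the actual one. The function names, the record layout, and the phase labels are hypothetical; only the 10-second window length comes from this document.

```python
WINDOW_SECONDS = 10  # window length shown on scene 7


def window_index(timestamp_s, episode_start_s, window_seconds=WINDOW_SECONDS):
    """Return the zero-based window that a timestamp falls into."""
    if timestamp_s < episode_start_s:
        raise ValueError("timestamp precedes episode start")
    return int((timestamp_s - episode_start_s) // window_seconds)


def timeline(predicted, actual):
    """Pair predicted and actual phases per window and flag agreement."""
    if len(predicted) != len(actual):
        raise ValueError("need one predicted and one actual phase per window")
    return [
        {"window": i, "predicted": p, "actual": a, "match": p == a}
        for i, (p, a) in enumerate(zip(predicted, actual))
    ]
```

For example, `window_index(25.0, 0.0)` falls in window 2, and a window whose predicted phase differs from the actual phase gets `"match": False`.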
## Why not run replay on Pi

The Pi has 8 GiB of RAM. `replay.py` loads every checkpoint into memory
at startup (300 MB for the KNN sidecars × multiple variants). Running it
alongside the metrics publisher's per-cycle test-set scoring crashed the
Pi. Inference belongs on the A100; the Pi's job is display and
lightweight event publishing only.
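One way to avoid repeating that crash is to check free memory before loading anything heavy. The sketch below is an assumption, not something the repo is known to contain: it parses `MemAvailable` from Linux's `/proc/meminfo` and refuses a load that would eat into a safety margin. The function names and the 1024 MiB headroom are hypothetical.

```python
def mem_available_mib(meminfo_text):
    """Parse MemAvailable (reported in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemAvailable not found")


def can_load(required_mib, meminfo_text, headroom_mib=1024):
    """True if the load fits while leaving headroom_mib free (assumed margin)."""
    return mem_available_mib(meminfo_text) >= required_mib + headroom_mib


def read_meminfo(path="/proc/meminfo"):
    """Read the kernel's memory summary (Linux only)."""
    with open(path, encoding="ascii") as fh:
        return fh.read()
```

A caller on the Pi could run `can_load(estimated_mib, read_meminfo())` and skip the load when it returns `False`; the estimate itself has to come from measuring the checkpoints.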