CIS490/scripts/lambda-inference-demo.md
Max c2a71de4b2 scene 9 bars: paint full zoo + 0–1 visible scale
2026-05-08 17:18:00 -05:00

Live inference demo — Lambda runs replay, Pi shows predictions

Architecture for the live "catching attacks" demo (scene 7 chunking timeline). The Pi cannot run inference: it is RAM-bound and has already crashed once under that load, so all model loading and per-window prediction must run on the A100.

Topology

   Pi (office-print, 10.100.0.1)             Lambda A100 (ssh ubuntu@<ip>)
   ┌───────────────────────────┐             ┌───────────────────────────┐
   │ dashboard.wg              │             │  replay.py running on     │
   │ /publish (loopback only)  │             │  episode tarballs through │
   │   ↑                       │             │  gbt_oracle.ckpt.json     │
   │   │ POST                  │             │   ↓                       │
   │   │ via SSH reverse tunnel│             │  POST 127.0.0.1:8447      │
   │   │                       │             │   ↑                       │
   │   └─── ssh -R 8447:... ───┼─────────────┤   │                       │
   │                           │             └───────────────────────────┘
   └───────────────────────────┘

Setup steps

  1. Stage the demo episodes on Lambda. The raw tarballs live on the Pi and need sudo to read, so run this from the Pi:

    ssh -i ~/.ssh/lambda_ed25519 ubuntu@<lambda-ip> \
        'mkdir -p ~/cis490/data/episodes_demo'
    for eid in <episode-ids>; do
        sudo cat /var/lib/cis490/episodes/<host>/${eid}.tar.zst | \
            ssh -i ~/.ssh/lambda_ed25519 ubuntu@<lambda-ip> \
                "cat > ~/cis490/data/episodes_demo/${eid}.tar.zst"
    done
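
    If `sudo cat` fails partway, the pipe can still leave an empty or truncated file on Lambda. One way to catch that is to compare checksums after each copy. The sketch below assumes `sha256sum` is installed on both hosts; `digest` and `digests_match` are hypothetical helper names, not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SHA-256 hex digest of whatever arrives on stdin.
digest() {
    sha256sum | cut -d' ' -f1
}

# Hypothetical helper: succeed only when both digests are non-empty and equal.
digests_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Sketch of use inside the staging loop on the Pi (not run here):
#   local_sum=$(sudo cat "/var/lib/cis490/episodes/<host>/${eid}.tar.zst" | digest)
#   remote_sum=$(ssh -i ~/.ssh/lambda_ed25519 ubuntu@<lambda-ip> \
#       "sha256sum < ~/cis490/data/episodes_demo/${eid}.tar.zst" | cut -d' ' -f1)
#   digests_match "$local_sum" "$remote_sum" || echo "checksum mismatch: ${eid}" >&2
```

    The empty-string guard matters: if both commands fail, both digests are empty, and a plain equality test would report a match.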
    
  2. Open an SSH reverse tunnel from the Pi to Lambda. This exposes the Pi's loopback 127.0.0.1:8447 (the dashboard's /publish endpoint) on Lambda's loopback 127.0.0.1:8447:

    ssh -i ~/.ssh/lambda_ed25519 \
        -o ServerAliveInterval=30 \
        -o ServerAliveCountMax=3 \
        -o ExitOnForwardFailure=yes \
        -N -R 8447:127.0.0.1:8447 \
        ubuntu@<lambda-ip>
    

    Verify: from Lambda, curl http://127.0.0.1:8447/healthz should return the Pi's dashboard health JSON.
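
    The tunnel can take a moment to come up, so a single curl may fail even when nothing is wrong. A minimal sketch of a retrying check follows; `retry` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between failed attempts. Returns 0 on the first success, 1 if every try fails.
retry() {
    local attempts delay i
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Skip the sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Sketch of use on Lambda (not run here): probe the tunnelled health endpoint.
#   retry 10 2 curl -fsS --max-time 5 http://127.0.0.1:8447/healthz
```

    With `-f`, curl exits non-zero on an HTTP error status, so `retry` treats a 4xx or 5xx response from the dashboard the same way as a refused connection.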

  3. Run replay loop on Lambda:

    ssh -i ~/.ssh/lambda_ed25519 ubuntu@<lambda-ip>
    cd ~/cis490 && . .venv/bin/activate
    export PYTHONPATH=$PWD/repo
    nohup bash replay_loop.sh > replay_loop.log 2>&1 &
    

    The loop replays the staged demo episodes through the trained gbt_oracle.ckpt.json checkpoint and emits prediction events for each window.
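
    This document does not show the event format that /publish expects, so the following is only an illustration of what one per-window event could look like when sent through the tunnel. The `prediction_event` helper and its field names (`window_start_s`, `predicted`, `actual`) are assumptions. The 10-second window length comes from the scene 7 description.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build one prediction event as a single JSON line.
# The field names are illustrative, not the dashboard's real schema.
#   $1 = zero-based window index, $2 = predicted phase, $3 = actual phase
# Assumes the phase labels contain no quotes or backslashes (no JSON escaping).
prediction_event() {
    local start
    start=$(( $1 * 10 ))   # 10-second windows: index -> start offset in seconds
    printf '{"window_start_s":%d,"predicted":"%s","actual":"%s"}\n' \
        "$start" "$2" "$3"
}

# Sketch of posting through the reverse tunnel from Lambda (not run here):
#   prediction_event 3 "<predicted-phase>" "<actual-phase>" |
#       curl -fsS -H 'Content-Type: application/json' \
#            --data-binary @- http://127.0.0.1:8447/publish
```

    Posting to 127.0.0.1:8447 on Lambda is what keeps the Pi's /publish endpoint loopback-only: the request never crosses a publicly reachable port.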

What the user sees

  • Scene 7 (chunking timeline) lights up with predicted/actual phase per 10-second window
  • Scenes 8, 9, and 12 are still populated by the Pi-side lightweight publishers (knn streamer + multi_model_metrics + profiles streamer)

Why not run replay on Pi

The Pi has 8 GiB of RAM. replay.py loads every checkpoint into memory at startup (300 MB for KNN sidecars × multiple variants), and running that alongside the metrics publisher's per-cycle test-set scoring crashed the Pi. Inference therefore belongs on the A100; the Pi's job is display and lightweight event publishing only.