CIS490/docs/dashboard-request-scenes-7-8-12.md

# Dashboard request — scenes 7, 8, 12 visibility fixes

**Audience:** dashboard session (owns `training/dashboard/`).
**Producer side (this session):**
* `training/producers/multi_model_metrics.py` — publishes
  `ModelMetric` and `ModelPerf` for **gbt, mlp, cnn, knn_semi, gru,
  lstm, bert** (every 5 s)
* `training/producers/knn.py stream` — publishes `ModelMetric`+
  `ModelPerf` for **knn**
* Lambda-side `scripts/lambda-live-detection-loop.py` — publishes
  `LiveDetection` **and now also `Prediction`** events per inference
  window

All confirmed delivering (`{"delivered":N}` from `/publish`).
Visibility issues are all in `training/dashboard/static/dashboard.js`.

The user has flagged this twice now: scene 7 (chunking) and scene 8
(model bars) are not showing real-data state in deck mode. The events
exist; the widgets just don't render them. **This is the blocker
for the talk.**

---

## Scene 7 — chunking timeline (`#chunk-row`)

**Problem.** Cells are only built inside `buildExample()`, which is wired
to `demo_start`. The `prediction` handler can only update existing
cells:

```js
on('prediction', m => {
  if (typeof m.window_idx !== 'number') return;
  const cells = rowEl.querySelectorAll('.chunk-cell');
  const cell = cells[m.window_idx];
  if (!cell) return;            // ← always returns here if no demo has built cells
  ...
});
```

If a real `prediction` event arrives without `demo_start` having
fired first, `cells.length === 0` and the event is silently dropped.

**Why we can't just publish `demo_start` from this side.** It has
destructive side-effects on other scenes: scene-9 (KNN scatter)
loads synthetic data on `demo_start`, scene-attack profile loads
synthetic curves on `demo_start`, etc. We tried this once and
clobbered the live KNN scatter.

**Fix request.** Lazy cell-build inside the `prediction` handler when
no cells exist yet:

```js
on('prediction', m => {
  // Ignore missing, negative, or fractional indices; cells[idx] would be undefined.
  if (!Number.isInteger(m.window_idx) || m.window_idx < 0) return;
  if (rowEl.children.length === 0 || rowEl.querySelector('.chunk-empty')) {
    // Clear the empty-state placeholder (if any) before building real cells.
    rowEl.innerHTML = '';
    ruleEl.innerHTML = '';
    axisEl.innerHTML = '';
  }
  // Ensure cell at index exists; pad with empty cells up to window_idx.
  let cells = rowEl.querySelectorAll('.chunk-cell');
  while (cells.length <= m.window_idx) {
    const c = document.createElement('div');
    c.className = 'chunk-cell';
    c.textContent = '';
    rowEl.appendChild(c);
    ruleEl.appendChild(Object.assign(
      document.createElement('div'), { className: 'tick' }));
    const t = document.createElement('span');
    // `cells` is a static NodeList, so its length is still the new cell's index.
    t.textContent = `${cells.length * 10}s`;
    axisEl.appendChild(t);
    cells = rowEl.querySelectorAll('.chunk-cell');
  }
  const cell = cells[m.window_idx];
  const phase = m.predicted || m.actual;
  if (!phase) return;
  cell.className = `chunk-cell ${phase}`;
  cell.textContent = phase.replace(/_/g, ' ');  // replace every underscore, not just the first
});
```

This keeps `demo_start`/`demo_stop` working and additionally lights up
the row from real `prediction` events.

If the Lambda producer re-runs episodes from window 0, you may also
want a reset on `prediction` events with `window_idx === 0` (clear all
cells, rebuild fresh). We can publish a `prediction_reset` event too
if you'd prefer an explicit signal — let us know.
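
The implicit reset on window 0 could be sketched as a small tracker that the `prediction` handler consults before building cells. `makeEpisodeTracker` and `isNewEpisode` are hypothetical names, and treating a return to index 0 after a higher index as "new episode" is an assumption about how the Lambda producer replays, not confirmed behavior.

```js
// Hypothetical helper: remembers the last window index seen and reports
// when the stream wraps back to window 0 after having advanced past it.
function makeEpisodeTracker() {
  let lastIdx = -1;  // -1 means no prediction has been seen yet
  return function isNewEpisode(windowIdx) {
    // A restart is index 0 arriving after a strictly higher index.
    // The very first 0 and a duplicate 0 are not restarts.
    const restarted = windowIdx === 0 && lastIdx > 0;
    lastIdx = windowIdx;
    return restarted;
  };
}

// Possible wiring inside dashboard.js (rowEl, ruleEl, axisEl as in the
// handler above):
//
//   const isNewEpisode = makeEpisodeTracker();
//   on('prediction', m => {
//     if (isNewEpisode(m.window_idx)) {
//       rowEl.innerHTML = '';
//       ruleEl.innerHTML = '';
//       axisEl.innerHTML = '';
//     }
//     // ...lazy cell build continues here...
//   });
```

An explicit `prediction_reset` event would make this tracker unnecessary; the sketch only covers the case where no such signal exists.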

---

## Scene 8 — model accuracy bars (`.model-row`)

**Problem.** The bar fill formula collapses to 0% for any
F1 ≤ 0.5:

```js
const visiblePct = Math.max(0, Math.min(1, (acc - 0.5) / 0.5)) * 100;
```

Our trained models on the cross-device test split land in the
0.30–0.55 range (this is the **point** of held-out-by-host evaluation:
real generalization is hard). With the current scale, every bar at or
below 0.5 F1 renders 0% wide, so at least half the bars look like
there's no data flowing.

**Fix request.** Either:

(a) Use the full 0–1 range so a 0.35-F1 bar is still visibly 35% filled:

```js
const visiblePct = Math.max(0, Math.min(1, acc)) * 100;
```

(b) Or rely on the numeric F1 next to the empty-looking bars. We already
publish it in `accuracy`, and the right-hand `.model-acc` element
renders `acc.toFixed(3)`, so the value may already be readable; please
verify it is still shown when the fill is 0%.

We strongly prefer (a). Hiding 0.30-F1 models behind a 0% bar tells the
user "no data" when the truth is "the model is honestly not great
under cross-host generalization." That's the headline finding.
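
To make the difference concrete, the two fill formulas can be compared side by side. This is a minimal sketch: the function names are hypothetical, and the sample F1 values are illustrative points chosen from the 0.30–0.55 range, not measured results.

```js
// Current formula: maps [0.5, 1.0] onto [0%, 100%]; anything <= 0.5 is 0%.
function fillPctCurrent(acc) {
  return Math.max(0, Math.min(1, (acc - 0.5) / 0.5)) * 100;
}

// Option (a): maps [0.0, 1.0] onto [0%, 100%].
function fillPctFullRange(acc) {
  return Math.max(0, Math.min(1, acc)) * 100;
}

// Illustrative inputs only.
for (const acc of [0.30, 0.35, 0.55, 0.90]) {
  console.log(
    `F1=${acc.toFixed(2)}  current=${fillPctCurrent(acc).toFixed(0)}%  ` +
    `fullRange=${fillPctFullRange(acc).toFixed(0)}%`);
}
```

Under the current formula a 0.35-F1 model gets a 0% bar; under option (a) the same model gets a 35% bar.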

---

## Scene 12 — accuracy vs inference cost scatter

**Problem A: y-axis range.** y is clamped to `[0.7, 1.0]` (or similar
high range). Every model with F1 < 0.7 stacks on the bottom edge.

**Fix.** Open the y-axis to `[0.0, 1.0]` (or auto-fit to the published
range with a small margin). The chart's whole point is "model honesty
under cross-device shift" — letting bad models show as bad is the
right answer.
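
The auto-fit variant could look roughly like the sketch below. `fitYDomain` and the 0.05 default margin are assumptions for illustration; the result is clamped to `[0, 1]` because F1 cannot leave that range.

```js
// Hypothetical helper: compute a y-axis domain from the published F1 values.
// Falls back to the full [0, 1] range when no usable values are present.
function fitYDomain(values, margin = 0.05) {
  const finite = values.filter(Number.isFinite);
  if (finite.length === 0) return [0, 1];
  const lo = Math.max(0, Math.min(...finite) - margin);
  const hi = Math.min(1, Math.max(...finite) + margin);
  return [lo, hi];
}
```

With this shape, models in the 0.30–0.55 band would get a domain of roughly `[0.25, 0.60]` instead of stacking on the floor of a fixed high range. A fixed `[0.0, 1.0]` domain is simpler and keeps the axis stable between updates, so it may still be the better default for the talk.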

**Problem B: overlapping labels.** Multiple points at the same
y-coordinate (especially when stacked at the floor) draw their model
name labels on top of each other. We've already shortened the
displayed names producer-side (`gbt-O`, `mlp-R`, `knns-O`, `trf-R`,
etc., max 6 chars). That helps but doesn't fully solve it when 5+
points cluster.

**Fix request, pick whichever is easiest:**

1. Skip label rendering when point density is high (only label points
   that are local extrema, e.g. best F1, lowest latency, or
   non-Pareto-dominated points).
2. Offset overlapping labels with a force layout (`d3-force` style) or
   even just a fixed alternating up/down/left/right pattern.
3. Show labels only on hover, with a small dot-only render at rest.

Option (3) is the cleanest visually and matches how most real "model
zoo" scatters render in papers.
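
If option 1 turns out to be easier, the non-Pareto-dominated points can be picked with a simple filter. This is a sketch under assumptions: the field names `name`, `f1`, and `latencyMs` are hypothetical and may not match what the scatter stores, and the sample values are made up for illustration.

```js
// Keep only points that no other point dominates. A point q dominates p
// when q is at least as good on both axes (higher F1, lower latency)
// and strictly better on at least one of them.
function paretoFront(points) {
  return points.filter(p =>
    !points.some(q =>
      q !== p &&
      q.f1 >= p.f1 &&
      q.latencyMs <= p.latencyMs &&
      (q.f1 > p.f1 || q.latencyMs < p.latencyMs)));
}

// Made-up values, not measurements.
const samplePoints = [
  { name: 'gbt-O',  f1: 0.50, latencyMs: 10 },
  { name: 'mlp-R',  f1: 0.40, latencyMs: 20 },  // dominated by gbt-O
  { name: 'trf-R',  f1: 0.55, latencyMs: 50 },
  { name: 'knns-O', f1: 0.30, latencyMs: 5 },
];

// Only these would get a text label; the rest render as bare dots.
console.log(paretoFront(samplePoints).map(p => p.name));
```

The filter is quadratic in the number of points, which should be harmless for a handful of models.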

---

## Verification after dashboard JS lands

Producer side keeps publishing on these channels (already running on
the Pi + Lambda):

- `prediction` (scene 7) — once Lambda producer is re-pointed at
  scene 7 events, see request below
- `model_metric` + `model_perf` (scenes 8, 12) — every 30 s from
  `multi_model_metrics.py` on the Pi
- `live_detection` (scene-live) — continuously from Lambda

Open the dashboard, watch each scene. Empty-state placeholders should
disappear within ~30 s of page load.
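
For a quicker check than watching each scene, a small tally of received event types can confirm which channels have delivered anything. This is only a sketch: `makeEventTally` is a hypothetical helper, and the wiring comment assumes `on(type, handler)` in `dashboard.js` accepts these channel names the same way it accepts `'prediction'`.

```js
// Hypothetical helper: counts events per expected type and reports
// which expected types have not been seen yet.
function makeEventTally(expectedTypes) {
  const counts = new Map(expectedTypes.map(t => [t, 0]));
  return {
    record(type) {
      // Unknown types are ignored rather than added to the tally.
      if (counts.has(type)) counts.set(type, counts.get(type) + 1);
    },
    missing() {
      return expectedTypes.filter(t => counts.get(t) === 0);
    },
  };
}

// Possible wiring (assumption about the dashboard's on() helper):
//
//   const types = ['prediction', 'model_metric', 'model_perf', 'live_detection'];
//   const tally = makeEventTally(types);
//   types.forEach(t => on(t, () => tally.record(t)));
//   setTimeout(() => console.log('no events yet for:', tally.missing()), 30000);
```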

---

## Side note for scene 7 — currently no `prediction` events flow

The Lambda producer (`live_detection_loop_v2.py`) currently emits
`live_detection` events for the scene-live swim lanes. If you want
scene 7 lit up with the same data, we can mirror the per-window output
to the `prediction` event type as well; say the word and we'll add a
second emit. Without the lazy cell build above, that second emit would
change nothing on the dashboard, so we'll hold off until the JS lands.