# Dashboard request — cap + evict for the KNN scatter, plus snapshot-replace semantics

**Audience:** dashboard session (owns `training/dashboard/`).
**Producer side:** `training/producers/knn.py stream` (currently
running on the Pi).
**Status:** request, not implementation. The producer side has reduced
its cadence as a band-aid, but the underlying fix belongs in the scene-9
(KNN scatter) handler in `training/dashboard/static/dashboard.js`.

## Problem

The current scene-9 handler is:

```js
on('embedding', m => {
  if (typeof m.x !== 'number' || typeof m.y !== 'number') return;
  const pt = { x: m.x, y: m.y, z: ..., phase: m.phase,
               predicted: m.predicted, cluster: ... };
  points.push(pt);
  addStat(pt);
  rebuildLegend();
});
```

Every `embedding` event pushes onto the `points` array with no cap.
The producer republishes its (deterministic, stable) point set on a
cycle so reconnecting browsers eventually see the scatter populate;
each cycle therefore pushes the same N points onto `points` again,
and over time the in-memory point count grows without bound. After
~10 minutes the browser starts slowing down.

The producer side has band-aided this by reducing cycle cadence
(200 points every 30 s, was 600 every 5 s). That's 18× slower
accumulation, but still a leak.
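
The 18× figure can be checked with a few lines of arithmetic. This is a sketch only: the `accumulated` helper and the variable names are hypothetical, and it assumes every republished point lands in `points` with no cap.

```js
// Points appended to an uncapped array after `elapsedSeconds`,
// counting only completed republish cycles.
function accumulated(pointsPerCycle, cycleSeconds, elapsedSeconds) {
  return pointsPerCycle * Math.floor(elapsedSeconds / cycleSeconds);
}

const oldCadence = { points: 600, seconds: 5 };   // 120 points/s
const newCadence = { points: 200, seconds: 30 };  // about 6.67 points/s

// Ratio of the two rates, cross-multiplied to stay in integers.
const slowdown = (oldCadence.points * newCadence.seconds) /
                 (newCadence.points * oldCadence.seconds);

const tenMinutes = 10 * 60;
const oldTotal = accumulated(oldCadence.points, oldCadence.seconds, tenMinutes);
const newTotal = accumulated(newCadence.points, newCadence.seconds, tenMinutes);

console.log(slowdown, oldTotal, newTotal); // 18 72000 4000
```

Both totals keep growing linearly with uptime; the reduced cadence only changes the slope.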

## Two complementary fixes the dashboard could land

### A. Cap the points array (cheapest)

Add a FIFO eviction:

```js
const MAX_POINTS = 4000;   // tune to taste
on('embedding', m => {
  // ...validate + build pt as before...
  points.push(pt);
  if (points.length > MAX_POINTS) {
    points.shift();   // or splice(0, points.length - MAX_POINTS)
  }
  addStat(pt);
  rebuildLegend();
});
```

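This bounds the length of `points` regardless of how often the producer
publishes. Once the cap is reached, only the most recent `MAX_POINTS`
points are kept, so the scatter looks the same as it does today at that
size. Two caveats: `addStat` still runs for every event, so anything it
accumulates is not rolled back when a point is evicted; and because the
producer republishes the same set, the capped array ends up holding
repeated copies of that set rather than more distinct points.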
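
As a standalone illustration of the cap, the sketch below replays a stable 200-point set through a capped push. `pushCapped` is a hypothetical helper, not something that exists in `dashboard.js`; it returns the evicted points so a caller could, if needed, adjust derived state such as stats.

```js
// Hypothetical helper: append pt, then drop the oldest entries so that
// points.length never exceeds maxPoints. Returns the evicted points.
function pushCapped(points, pt, maxPoints) {
  points.push(pt);
  const overflow = points.length - maxPoints;
  return overflow > 0 ? points.splice(0, overflow) : [];
}

// Replay the same 200 points for 30 cycles, as a republishing producer would.
const points = [];
let evictedTotal = 0;
for (let cycle = 0; cycle < 30; cycle++) {
  for (let i = 0; i < 200; i++) {
    evictedTotal += pushCapped(points, { x: i, y: i, cycle }, 4000).length;
  }
}

console.log(points.length, evictedTotal, points[0].cycle); // 4000 2000 10
```

After 6000 pushes the array holds the newest 4000 entries (cycles 10 to 29), which is twenty copies of the same 200 points. The cap fixes the leak but not the duplication.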

### B. Snapshot-replace via a new event type

For a cleaner architecture: the producer sends one `embedding_batch`
event per cycle containing the full set of points, and the handler
*replaces* the contents of `points` rather than appending. This
eliminates duplicate-publish leakage entirely and naturally supports
the "refresh shows something" use case via state replay (see the
companion request `dashboard-request-embedding-persistence.md`).

Producer would emit:

```json
{ "type": "embedding_batch",
  "points": [ {x, y, z, phase, predicted, cluster}, ... ],
  "replace": true }
```

Handler:

```js
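on('embedding_batch', m => {
  if (!Array.isArray(m.points)) return;   // ignore malformed payloads
  if (m.replace) { points.length = 0; resetStats(); }
  for (const pt of m.points) {
    if (typeof pt?.x !== 'number' || typeof pt?.y !== 'number') continue;
    points.push(pt);
    addStat(pt);
  }
  rebuildLegend();
});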
```

If you want this, the producer side will switch to it; just confirm
the event name and payload shape.
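
To show why replace semantics remove the duplication, here is a self-contained sketch that applies the same batch twice. `isValidPoint` and `applyBatch` are hypothetical names, and the payload shape is the one proposed above, not a confirmed contract.

```js
// Hypothetical validator mirroring the numeric x/y check in the
// existing per-point 'embedding' handler.
function isValidPoint(pt) {
  return pt !== null && typeof pt === 'object' &&
         typeof pt.x === 'number' && typeof pt.y === 'number';
}

// Apply an embedding_batch message to `points` in place.
// Returns false and leaves `points` untouched if the payload is malformed.
function applyBatch(points, m) {
  if (!m || !Array.isArray(m.points)) return false;
  if (m.replace) points.length = 0;          // snapshot-replace
  for (const pt of m.points) {
    if (isValidPoint(pt)) points.push(pt);   // skip malformed entries
  }
  return true;
}

const scatter = [];
const batch = {
  type: 'embedding_batch',
  replace: true,
  points: [{ x: 0, y: 1 }, { x: 2, y: 3 }, { x: 'bad', y: 4 }],
};

applyBatch(scatter, batch);
applyBatch(scatter, batch);   // republishing the same snapshot
console.log(scatter.length);  // 2
```

With `replace: true` the handler is idempotent: however many times the producer republishes, the array holds exactly one copy of the valid points.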

## Suggested order

A first (a few lines, fixes the leak today). B second, when you have
time; it pairs naturally with the snapshot/sticky-cache request for
refresh-time hydration.