From 4172ddb0c89958ea578ea8015d25a717664c2811 Mon Sep 17 00:00:00 2001
From: Max
Date: Fri, 8 May 2026 15:30:45 -0500
Subject: [PATCH] =?UTF-8?q?docs:=20request=20to=20dashboard=20side=20?=
 =?UTF-8?q?=E2=80=94=20cap=20+=20evict=20for=20the=20KNN=20scatter?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The scene-9 embedding handler appends to a `points` array without ever
capping. The producer republishes its (stable, deterministic) point set
on a cycle so reconnecting browsers eventually see the scatter; each
cycle pushes the same N points again and the in-memory count grows
without bound. Browser slows after ~10 min.

Two complementary fixes proposed:
  A. FIFO cap (1-line change in the handler — fixes the leak today)
  B. embedding_batch event with replace=true (cleaner, pairs with the
     snapshot/sticky-cache request for refresh-time hydration)

Producer side has already reduced cadence as a band-aid (200 pts every
30 s, was 600 every 5 s) — 18x slower accumulation but still unbounded.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 docs/dashboard-request-knn-cap-evict.md | 93 +++++++++++++++++++++++++
 1 file changed, 93 insertions(+)
 create mode 100644 docs/dashboard-request-knn-cap-evict.md

diff --git a/docs/dashboard-request-knn-cap-evict.md b/docs/dashboard-request-knn-cap-evict.md
new file mode 100644
index 0000000..50faab1
--- /dev/null
+++ b/docs/dashboard-request-knn-cap-evict.md
@@ -0,0 +1,93 @@
+# Dashboard request — cap + evict for the KNN scatter, plus snapshot-replace semantics
+
+**Audience:** dashboard session (owns `training/dashboard/`).
+**Producer side:** `training/producers/knn.py stream` (currently
+running on the Pi).
+**Status:** request, not implementation. The producer side has reduced
+its cadence as a band-aid, but the underlying fix lives in
+`training/dashboard/static/dashboard.js` scene-9 (KNN scatter) handler.
+
+## Problem
+
+The current scene-9 handler is:
+
+```js
+on('embedding', m => {
+  if (typeof m.x !== 'number' || typeof m.y !== 'number') return;
+  const pt = { x: m.x, y: m.y, z: ..., phase: m.phase,
+               predicted: m.predicted, cluster: ... };
+  points.push(pt);
+  addStat(pt);
+  rebuildLegend();
+});
+```
+
+Every `embedding` event pushes onto the `points` array with no cap.
+The producer republishes its (deterministic, stable) point set on a
+cycle so reconnecting browsers eventually see the scatter populate;
+each cycle therefore pushes the same N points onto `points` again,
+and over time the in-memory point count grows without bound. After
+~10 minutes the browser starts slowing down.
+
+The producer side has band-aided this by reducing cycle cadence
+(200 points every 30 s, was 600 every 5 s). That's 18× slower
+accumulation, but still a leak.
+
+## Two complementary fixes the dashboard could land
+
+### A. Cap the points array (cheapest)
+
+Add a FIFO eviction:
+
+```js
+const MAX_POINTS = 4000;  // tune to taste
+on('embedding', m => {
+  // ...validate + build pt as before...
+  points.push(pt);
+  if (points.length > MAX_POINTS) {
+    points.shift();  // or splice(0, points.length - MAX_POINTS)
+  }
+  addStat(pt);
+  rebuildLegend();
+});
+```
+
+This bounds memory regardless of how often the producer publishes.
+Existing visual quality stays the same once the cap is reached
+(the most-recent N points are kept).
+
+### B. Snapshot-replace via a new event type
+
+For a cleaner architecture: the producer sends one `embedding_batch`
+event per cycle containing the full set of points; the handler
+*replaces* the contents of `points` rather than appending. Eliminates
+duplicate-publish leakage entirely and naturally supports the
+"refresh shows something" use case via state replay (see the
+companion request `dashboard-request-embedding-persistence.md`).
+
+Producer would emit:
+
+```json
+{ "type": "embedding_batch",
+  "points": [ {x, y, z, phase, predicted, cluster}, ... ],
+  "replace": true }
+```
+
+Handler:
+
+```js
+on('embedding_batch', m => {
+  if (m.replace) { points.length = 0; resetStats(); }
+  for (const pt of m.points) { points.push(pt); addStat(pt); }
+  rebuildLegend();
+});
+```
+
+If you want this, the producer side will switch to it; just confirm
+the event name and payload shape.
+
+## Suggested order
+
+A first (1-line change, fixes the leak today). B second when you have
+time — pairs naturally with the snapshot/sticky-cache request for
+refresh-time hydration.
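
Review note: options A and B can be tried outside the browser with a small Node sketch. `createPointStore`, `append` and `applyBatch` are hypothetical names used only for illustration; the real handler in `dashboard.js` uses `on()`, `addStat()`, `resetStats()` and `rebuildLegend()`, none of which are modelled here. The x/y validation inside `applyBatch` is an added assumption, copied from the existing `embedding` handler rather than from the proposed batch handler.

```javascript
// Hypothetical stand-in for the scene-9 `points` state; not the real dashboard code.
const MAX_POINTS = 4000; // same illustrative cap as in option A

function createPointStore(maxPoints = MAX_POINTS) {
  const points = [];

  // Option A: append, then drop the oldest entries once the cap is exceeded.
  function append(pt) {
    points.push(pt);
    if (points.length > maxPoints) {
      points.splice(0, points.length - maxPoints);
    }
  }

  // Option B: clear the existing set first when the batch asks for a replace.
  function applyBatch(batch) {
    if (batch.replace) points.length = 0;
    for (const pt of batch.points) {
      // Assumption: reuse the x/y check from the current `embedding` handler.
      if (typeof pt.x !== 'number' || typeof pt.y !== 'number') continue;
      append(pt);
    }
  }

  return { points, append, applyBatch };
}

// Simulate the producer republishing the same 200 points for 100 cycles.
const store = createPointStore();
const cycle = Array.from({ length: 200 }, (_, i) => ({ x: i, y: i * 2 }));
for (let c = 0; c < 100; c++) {
  for (const pt of cycle) store.append(pt);
}
console.log(store.points.length); // 4000 (capped), not 20000

// One replace batch brings the store back to exactly one copy of the set.
store.applyBatch({ type: 'embedding_batch', points: cycle, replace: true });
console.log(store.points.length); // 200
```

One thing the sketch leaves out: if `addStat` keeps running counts, option A may also need a matching decrement for each evicted point, otherwise the stats and legend could drift away from what is actually plotted. Whether that applies depends on code not shown in this patch.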