Starlette + WebSocket dashboard that runs on the Pi as cis490-dashboard.service
(127.0.0.1:8447, Caddy-fronted at dashboard.wg). Tails
/var/lib/cis490/index.jsonl for episode events, snapshots host counts
every 30s, and broadcasts to every connected browser. New connections get a
warm snapshot (recent_episodes, total_bytes, host_counts) so reloads
don't see a cold dashboard.
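A minimal sketch of the tail-and-warm-snapshot idea described above, in Python. The `WarmSnapshot` class, the record fields `host` and `size`, and the `max_recent` cap are assumptions for illustration, not the service's actual code. For simplicity this sketch derives host_counts from the index itself, whereas the service described above snapshots host counts separately every 30s.

```python
import json
from collections import Counter, deque


class WarmSnapshot:
    """Hypothetical in-memory state handed to each new connection."""

    def __init__(self, max_recent=200):
        self.recent_episodes = deque(maxlen=max_recent)
        self.total_bytes = 0
        self.host_counts = Counter()
        self._offset = 0  # byte offset of the first unread byte in the index

    def poll(self, fh):
        """Consume complete lines appended to fh (binary mode) since the last poll."""
        fh.seek(self._offset)
        new_events = []
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # EOF or a partially written line: retry on the next poll
            self._offset += len(line)
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            self.recent_episodes.append(record)
            # ASSUMPTION: each record carries "size" (bytes) and "host" keys.
            self.total_bytes += int(record.get("size", 0))
            self.host_counts[record.get("host", "unknown")] += 1
            new_events.append(record)
        return new_events

    def as_dict(self):
        """Shape matches the warm snapshot fields named above."""
        return {
            "recent_episodes": list(self.recent_episodes),
            "total_bytes": self.total_bytes,
            "host_counts": dict(self.host_counts),
        }
```

A caller would open the index with `open(path, "rb")`, call `poll()` on a timer, broadcast the returned events, and send `as_dict()` to each newly connected browser.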

Frontend is a 10-scene scrollytelling deck following the project
outline: intro, collect, hosts, db explorer, baseline, attacks,
chunking, models, knn, perf. Sticky full-bleed canvas with a
right-aligned prose column (matrix-explorable layout). Hotkeys (arrows,
space, j/k, c, Home/End), prev/next chevrons, FAB, and an opt-in
click-to-advance toggle. Demo toggle drives synthetic data for the
five scenes that have no real producer yet (attack envelopes,
chunking, model bars, knn scatter, perf scatter); when off, those
scenes show "awaiting <event_type> events" rather than fake data.

Producers wire in by POSTing typed JSON to 127.0.0.1:8447/publish
(loopback only; Caddy 404s it externally). Event types the widgets
subscribe to: model_metric {model, accuracy}, embedding {x, y, phase},
model_perf {model, latency_us, accuracy}, prediction {episode_id,
window_idx, predicted, actual}, attack_profile {name, shape, curve},
phase {phase}.
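A sketch of what a producer might look like, using only the Python standard library. The wire format is an assumption: the text above says producers POST typed JSON but does not show the envelope, so the `type` key and the `build_event` / `publish` helper names are hypothetical. The per-type field names are copied from the event list above.

```python
import json
import urllib.request

PUBLISH_URL = "http://127.0.0.1:8447/publish"  # loopback only, per the notes above

# Fields each widget expects, copied from the event list above.
EVENT_FIELDS = {
    "model_metric": {"model", "accuracy"},
    "embedding": {"x", "y", "phase"},
    "model_perf": {"model", "latency_us", "accuracy"},
    "prediction": {"episode_id", "window_idx", "predicted", "actual"},
    "attack_profile": {"name", "shape", "curve"},
    "phase": {"phase"},
}


def build_event(event_type, **fields):
    """Validate field names and return the JSON body as bytes."""
    expected = EVENT_FIELDS.get(event_type)
    if expected is None:
        raise ValueError(f"unknown event type: {event_type}")
    if set(fields) != expected:
        raise ValueError(f"{event_type} expects fields {sorted(expected)}")
    # ASSUMPTION: the event type travels in a "type" key next to the fields.
    return json.dumps({"type": event_type, **fields}).encode("utf-8")


def publish(event_type, **fields):
    """POST one event to the dashboard and return the HTTP status code."""
    req = urllib.request.Request(
        PUBLISH_URL,
        data=build_event(event_type, **fields),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.status
```

A training script on the Pi could then call something like `publish("model_metric", model="gru", accuracy=0.91)`; the model name and value here are placeholders, not project results.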
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
301 lines
13 KiB
HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>CIS490 — live</title>
<link rel="stylesheet" href="/static/dashboard.css?v=8176c951">
</head>
<body>
<header class="topbar">
<span class="brand">CIS490</span>
<span id="status" class="status">connecting…</span>
<span class="spacer"></span>
<span class="counter"><span id="scene-idx">1</span> / <span id="scene-total">1</span></span>
<button id="prev-btn" class="ghost icon" title="Previous (← / k)">◀</button>
<button id="next-btn" class="ghost icon" title="Next (→ / space / j)">▶</button>
<button id="click-nav-btn" class="ghost" title="Click on the stage to advance to the next slide (c)">click-nav: off</button>
<button id="demo-btn" class="ghost" title="Toggle local synthetic data">demo: off</button>
</header>

<div class="layout">
<div class="canvas-wrapper" id="stage-col">
<div class="stage">

<!-- 1. intro -->
<div class="stage-view" data-view="intro">
<div class="bg-grid"></div>
<div class="intro-block">
<div class="intro-eyebrow">cis490 · live fleet telemetry</div>
<div class="intro-title">behavioral<br>malware<br>detection</div>
</div>
</div>

<!-- 2. collect -->
<div class="stage-view" data-view="collect">
<div class="metric-stack">
<div class="metric-eyebrow">episodes ingested</div>
<div class="metric-big" id="ingest-total">0</div>
<div class="metric-sub">
<span id="ingest-rate">0.0</span> / sec · last 60 s ·
total bytes on disk: <span id="ingest-bytes">0 B</span>
</div>
<svg class="sparkline" id="ingest-spark" viewBox="0 0 600 120" preserveAspectRatio="none">
<path id="ingest-spark-fill" d=""></path>
<path id="ingest-spark-path" d=""></path>
</svg>
</div>
</div>

<!-- 3. hosts -->
<div class="stage-view" data-view="hosts">
<div class="metric-stack">
<div class="metric-eyebrow">per-host shipping</div>
<div class="bars" id="host-bars">
<div class="awaiting">awaiting snapshot…</div>
</div>
</div>
</div>

<!-- 4. db — episode database explorer -->
<div class="stage-view" data-view="db">
<div class="metric-stack metric-stack-wide">
<div class="db-header">
<div class="metric-eyebrow">episode database · last 200 records</div>
<div class="db-count" id="db-count">0 of 0</div>
</div>
<div class="db-controls">
<div class="db-tabs" id="db-tabs"></div>
<input class="db-search" id="db-search" type="text"
placeholder="filter by host / id / sha…" />
</div>
<div class="db-table-wrap">
<table class="db-table">
<thead>
<tr>
<th>host</th>
<th>episode_id</th>
<th>received</th>
<th>size</th>
</tr>
</thead>
<tbody id="db-tbody"></tbody>
</table>
</div>
<div class="db-detail" id="db-detail" hidden>
<pre id="db-detail-pre"></pre>
</div>
</div>
</div>

<!-- 5. baseline -->
<div class="stage-view" data-view="baseline">
<div class="metric-stack">
<div class="metric-eyebrow">phase mix · last 5 min</div>
<div class="phase-stack" id="phase-stack"></div>
<div class="phase-legend" id="phase-legend"></div>
<div class="metric-sub">awaiting <code>phase</code> events from
the orchestrator. A clean fleet sits mostly in
<code>clean</code>; skew toward <code>infecting</code> means
the workload is firing.</div>
</div>
</div>

<!-- 6. attacks -->
<div class="stage-view" data-view="attacks">
<div class="metric-stack">
<div class="metric-eyebrow">attack envelopes · /proc signature per profile</div>
<div class="profile-grid" id="profile-grid"></div>
</div>
</div>

<!-- 7. chunking -->
<div class="stage-view" data-view="chunking">
<div class="metric-stack">
<div class="metric-eyebrow">10-second windows · model input shape</div>
<div class="chunk-rule" id="chunk-rule"></div>
<div class="chunk-row" id="chunk-row"></div>
<div class="chunk-axis" id="chunk-axis"></div>
<div class="metric-sub">each window: 100 samples (10 Hz × 10 s),
labeled by the phase that occupies its center.</div>
</div>
</div>

<!-- 8. models -->
<div class="stage-view" data-view="models">
<div class="metric-stack">
<div class="metric-eyebrow">sequence models · accuracy on held-out samples</div>
<div class="model-bars" id="model-bars"></div>
</div>
</div>

<!-- 9. knn -->
<div class="stage-view" data-view="knn">
<div class="metric-stack">
<div class="metric-eyebrow">window features · 2-D projection</div>
<svg class="scatter" id="knn-scatter" viewBox="0 0 600 360" preserveAspectRatio="xMidYMid meet"></svg>
<div class="phase-legend" id="knn-legend"></div>
</div>
</div>

<!-- 10. perf -->
<div class="stage-view" data-view="perf">
<div class="metric-stack">
<div class="metric-eyebrow">accuracy vs inference cost</div>
<svg class="scatter" id="perf-scatter" viewBox="0 0 600 360" preserveAspectRatio="xMidYMid meet"></svg>
<div class="metric-sub">x: μs / window (lower is better) ·
y: held-out accuracy (higher is better).</div>
</div>
</div>

</div>
<button id="next-fab" class="fab" data-no-advance title="Next (→)">▼</button>
</div>

<article class="article">

<section class="scene" data-stage="intro">
<div class="prose">
<p class="lede">Most malware doesn't look like malware in a database;
it looks like a process behaving badly.</p>
<p>An <strong>intrusion detection system</strong> spots the bad
behavior; an <strong>intrusion prevention system</strong> stops it.
Both depend on knowing what bad behavior <em>looks like</em> at the
level of telemetry the device can actually see.</p>
<p>This deck is the live face of the dataset we're building to teach
a model that distinction. The live panels on the left show real data
as it ships in; panels without a producer yet say they are awaiting
events.</p>
<p class="hint">scroll, click, or → to advance</p>
</div>
</section>

<section class="scene" data-stage="collect">
<div class="prose">
<h2>Collecting the dataset</h2>
<p>Each lab host on the WireGuard mesh boots a real Alpine VM, runs
a profile-driven workload inside it, and samples
<code>/proc/&lt;qemu_pid&gt;</code> at 10 Hz. Roughly every 30 seconds,
a labeled episode tarball is shipped to this Pi over mTLS.</p>
<p>The counter on the left is the running total, sourced from the
receiver's <code>index.jsonl</code> on disk. The sparkline is the
arrival rate over the last sixty seconds.</p>
</div>
</section>

<section class="scene" data-stage="hosts">
<div class="prose">
<h2>A multi-host fleet</h2>
<p>Running the same orchestrator on multiple hosts yields novel,
non-overlapping data per host with no central coordinator. Each host
pulls a different slice of the manifest, so the dataset grows in
parallel.</p>
<p>The numbers on the left are absolute episode counts on disk, refreshed
from <code>/var/lib/cis490/episodes/&lt;host&gt;/</code> every
thirty seconds.</p>
</div>
</section>

<section class="scene" data-stage="db">
<div class="prose">
<h2>The dataset, browsable</h2>
<p>Every row is one labeled episode tarball stored at
<code>/var/lib/cis490/episodes/&lt;host&gt;/&lt;id&gt;.tar.zst</code>
after the receiver verifies its SHA-256 and writes it through.</p>
<p>Filter by host with the tabs, or grep by host / episode id /
sha with the search box. Click a row for the full
<code>index.jsonl</code> record. The view holds the most recent
two hundred records; older history stays on disk, indexable
from the receiver.</p>
</div>
</section>

<section class="scene" data-stage="baseline">
<div class="prose">
<h2>A baseline of normal</h2>
<p>Before we can detect a deviation, we have to know what the fleet
looks like when it's healthy. The stacked bar shows the fraction
of the last five minutes of fleet activity that sat in each phase.
A healthy mix has plenty of <code>clean</code>.</p>
<p>If the model only ever sees <code>clean</code>, it overfits to
"everything is fine." The phase schedule fixes that by forcing the
workload to walk through every phase on every run.</p>
</div>
</section>

<section class="scene" data-stage="attacks">
<div class="prose">
<h2>Linking attack to telemetry</h2>
<p>The same six profiles run across every host, and each one
produces a different envelope in <code>/proc</code>. A
cryptominer pegs one core for minutes. A bursty C2 channel sits
idle, then exhales three packets. Ransomware walks the
filesystem and saturates I/O.</p>
<p>The thumbnails on the left are the canonical envelopes the
model has to learn to recognize: same axes, different shapes.
That shape difference is what makes detection tractable.</p>
</div>
</section>

<section class="scene" data-stage="chunking">
<div class="prose">
<h2>Ten-second windows</h2>
<p>Models eat fixed-size inputs. We chop each episode into
10-second windows (100 samples per window at 10 Hz) and
label each window with the phase that occupies its center.</p>
<p>Window size is a knob. Too short and the model can't see slow
envelopes (low-and-slow malware, idle C2). Too long and you can't
react fast enough to be a useful prevention signal. Ten seconds
is the starting point we tune around.</p>
</div>
</section>

<section class="scene" data-stage="models">
<div class="prose">
<h2>Sequence models</h2>
<p><strong>RNN, GRU, LSTM</strong>: recurrent models that read the
window one timestep at a time and carry state forward. Cheap,
mature, easy to interpret.</p>
<p><strong>BERT-style transformer</strong>: the window becomes a
sequence of "tokens"; attention captures cross-position context
instead of accumulating it through a hidden state. More
parameters, more compute, more room to overfit a small dataset.</p>
<p>Same input, same labels, four different inductive biases. The
comparison on the left is the punchline of the whole project.</p>
</div>
</section>

<section class="scene" data-stage="knn">
<div class="prose">
<h2>Nearest-neighbor as a sanity check</h2>
<p>Before anything fancy: engineer summary features per window
(mean, std, p95, slope, zero-bucket counts per channel) and run
<strong>KNN</strong> in that feature space.</p>
<p>If the phase clusters separate visibly in two dimensions, KNN
already does most of the work and a deep model is only buying
marginal improvement. If they don't separate, you've learned
something about the feature engineering before training a single
epoch.</p>
</div>
</section>

<section class="scene" data-stage="perf">
<div class="prose">
<h2>Accuracy vs complexity</h2>
<p>Bigger models tend to score better on the validation set, but
they also need more parameters, more inference time, and more
memory at the edge. The deployed model has to fit on the device
it's protecting.</p>
<p>The scatter on the left is the usable trade-off curve: every
point above and to the left of where you currently sit is a
reachable upgrade. A point in the bottom-right, slow and
inaccurate, is a model you'd never ship.</p>
</div>
</section>

<div class="scene-end-spacer"></div>
</article>
</div>

<script src="/static/dashboard.js?v=9d42eb5f"></script>
</body>
</html>