training/dashboard(references): description sidebar + better space use

Two changes per the user's feedback that the slide had unused
horizontal space and needed per-PDF context.

Layout
- The reference scene is now a 2-column grid inside the metric-stack:
  PDF iframe at ~1.7fr on the left, description panel at ~0.55fr on
  the right (min 280px). On narrow viewports (≤1100px) it falls back
  to a vertical stack with the description capped to 240px.
- Added #zoom=page-width to the iframe URL so the PDF's page fits its
  column width instead of leaving margins beside an 8.5x11 page
  rendered in a wider iframe.
- Hid the prose card on the references scene: the description panel
  inside the stack covers what the prose was saying, and freeing the
  right edge gives the description proper room.

Description content
- Backend reads <stem>.md sidecar files alongside each PDF and returns
  the contents in the /api/references payload.
- Frontend renders them with a tiny built-in markdown subset
  (headings, bold/italic, lists, inline code, paragraphs), with no
  third-party renderer dependency.
- Initial draft sidecar .md files committed for the four PDFs
  currently in references/. Each describes how the paper informs a
  specific scene of the deck (which model row, which eval protocol,
  which channel selection). Edit them in place and the panel updates
  on the next reload.
2026-05-08 12:40:27 -05:00 · 2026-05-08 12:40:27 -05:00 · 9e38f78379
commit 9e38f78379
parent 69c563275a
8 changed files with 232 additions and 19 deletions
--- a/references/A
+++ b/references/A
@@ -0,0 +1,26 @@
 # Closest direct precedent
 This paper applies deep learning to **time-series system-call traces
 inside virtual machines** for malware detection — almost exactly the
 framing of this project, just one layer deeper in the stack
 (syscall traces vs `/proc` samples).
 ## What we borrowed
 - **Windowing strategy.** The paper's fixed-length sliding-window
  formulation over a sequential telemetry stream is the same shape
  we use for our 10-second `/proc` windows fed to LSTM/GRU/RNN.
 - **Recurrent architecture as the simple-but-strong baseline.**
  Their result that an LSTM on raw sequences beats hand-crafted
  feature classifiers on the same data is the cited justification
  for our LSTM/GRU/RNN row of the model comparison.
 - **Per-VM containment posture.** Confirms our choice to run each
  episode in its own throwaway Alpine guest rather than instrumenting
  the host process directly.
 ## Where it differs
 - Their telemetry is full **syscall traces** (much richer than
  `/proc` resource counters), which is why their numbers don't
  transfer 1-to-1 to our setup. They establish *that* this works;
  we measure how well it works on a thinner, more deployable signal.
--- a/references/DANTE:
+++ b/references/DANTE:
@@ -0,0 +1,26 @@
 # LSTM on event-log sequences
 DANTE applies a **plain LSTM directly to system-log event sequences**
 to flag insider-threat behavior. Earlier in the literature than the
 transformer wave, and useful here as a methodological baseline.
 ## What we borrowed
 - **Evidence that simple recurrent models are enough.** The paper
  shows an LSTM on sequence-of-events alone — no per-task feature
  engineering — captures enough temporal structure to beat
  bag-of-events classifiers. That's the empirical ground for the
  *RNN/GRU/LSTM* entries in our model comparison being plain, not
  bespoke.
 - **Negative-evidence framing.** DANTE is also explicit about cases
  where the LSTM under-performs (low-volume users, novel event
  types). Informs the *split-by-sample, not split-by-time* eval
  protocol on the perf scene — generalising to unseen actors is
  the bar.
 ## Where it differs
 - Operates on log-event token sequences (categorical), not numeric
  resource metrics (continuous). Our channels are floats from
  `/proc`, so we use the temporal structure DANTE validates without
  inheriting the embedding setup.
--- a/references/LogBERT:
+++ b/references/LogBERT:
@@ -0,0 +1,26 @@
 # Transformer pretraining for log anomaly detection
 LogBERT trains **BERT-style masked-language-modeling on log
 sequences** and uses the resulting representations for unsupervised
 anomaly scoring. The closest published example of "BERT, but for
 host telemetry."
 ## What we borrowed
 - **The transformer entry in our model comparison.** LogBERT is the
  citation for why a transformer is even in the model lineup on
  scene 9 — it shows that attention over moderate-length log windows
  has enough signal to separate normal from anomalous *without*
  per-anomaly labels.
 - **Pretraining + fine-tune split.** Their two-stage setup
  (self-supervised pretrain on benign logs, downstream classifier
  on labeled anomalies) is the template we follow when describing
  the BERT model's training story on the *training-code* scene.
 ## Where it differs
 - Logs are categorical (template tokens); our windows are dense
  float vectors (12 channels × 100 samples). The BERT we run is the
  same architecture but reads continuous-valued tokens, so the
  masking objective is regression-on-masked-channels rather than
  cross-entropy-on-masked-token.
--- a/references/Transformer-based
+++ b/references/Transformer-based
@@ -0,0 +1,30 @@
 # Strongest published precedent for this exact setup
 This paper applies **transformer architectures to per-process
 resource-utilisation metrics** — the same shape of telemetry we
 collect from `/proc`. Closest reference to "the project we're doing,
 but already published."
 ## What we borrowed
 - **Channel selection.** Their list of `/proc` channels overlaps
  heavily with ours (`cpu_user_jiffies`, `cpu_sys_jiffies`,
  `rss_bytes`, `io_*_bytes`, `voluntary_ctxsw`, `involuntary_ctxsw`,
  page-fault counters). Our 12-channel selection is essentially
  this set, validated.
 - **Window-and-classify framing.** They confirm that a transformer
  reading short windows of these counters beats per-window
  hand-features fed to gradient-boosted trees. That is exactly the
  comparison we run: KNN-on-features vs sequence-models-on-windows.
 - **Held-out-sample evaluation.** They emphasise generalising to
  *unseen* malware families, not unseen time-slices of the same
  family. We adopt the same eval protocol on the perf scene.
 ## Where it differs
 - They use a much larger corpus and run on commercial endpoints;
  we run on three lab hosts and a Pi. Their numbers are an upper
  bound on what we can hope to reproduce — they're the target, not
  the floor.
 - They don't publish their exact dataset, so the comparison is
  architectural, not reproductive.
--- a/training/dashboard/app.py
+++ b/training/dashboard/app.py
@@ -249,9 +249,21 @@ def make_app(
                # for display and URL-encode for the path so the
                # iframe can fetch /refs/<encoded-name>.
                display_name = " ".join(p.stem.split())
+                # Sidecar markdown: <stem>.md alongside the PDF
+                # holds a free-form description of how the paper
+                # was used in the project. Optional — the
+                # frontend shows a placeholder if missing.
+                description = None
+                md_path = p.with_suffix(".md")
+                if md_path.is_file():
+                    try:
+                        description = md_path.read_text(encoding="utf-8")
+                    except OSError:
+                        log.warning("could not read sidecar %s", md_path)
                items.append({
                    "name": display_name,
                    "path": "/refs/" + quote(p.name, safe=""),
+                    "description": description,
                })
        except OSError:
            log.exception("could not list references in %s", REFS_DIR)
--- a/training/dashboard/static/dashboard.css
+++ b/training/dashboard/static/dashboard.css
@@ -286,8 +286,8 @@ body[data-theme="laser"] .bg-laser         { display: block; }
  to   { transform: rotate(360deg); }
 }
-/* ─── References scene (PDF viewer + tab strip) ─────────────────────── */
-.ref-stack { /* metric-stack-wide variant; let viewer take the height */
+/* ─── References scene (PDF viewer + tab strip + description) ──────── */
+.ref-stack { /* metric-stack-wide variant; let content area fill height */
  height: 100%;
  justify-content: flex-start;
 }
@@ -315,26 +315,78 @@ body[data-theme="laser"] .bg-laser         { display: block; }
  color: var(--accent); border-color: var(--accent);
  background: var(--accent-soft);
 }
-.ref-viewer-wrap {
+/* Two-column layout: PDF viewer on the left taking the larger
+   share, description panel on the right. The viewer's column
+   uses minmax(0, …) so the iframe won't blow out the grid when
+   the PDF reports a wide intrinsic size. */
+.ref-content {
  flex: 1 1 auto; min-height: 0;
+  display: grid;
+  grid-template-columns: minmax(0, 1.7fr) minmax(280px, 0.55fr);
+  gap: 14px;
+}
+.ref-viewer-wrap {
  background: var(--bg-elev);
  border: 1px solid var(--line); border-radius: 4px;
  overflow: hidden;
+  min-height: 0;
 }
 .ref-viewer {
  width: 100%; height: 100%;
  min-height: clamp(360px, 70vh, 900px);
  border: 0; display: block; background: var(--bg-elev);
 }
+.ref-description {
+  background: var(--bg-elev);
+  border: 1px solid var(--line); border-radius: 4px;
+  overflow-y: auto;
+  padding: 18px 22px;
+  font-size: 14px; line-height: 1.6;
+  color: var(--fg);
+  min-height: 0;
+}
+.ref-description h1, .ref-description h2 {
+  font-size: 15px; font-weight: 600; margin: 0 0 10px;
+  color: var(--fg);
+}
+.ref-description h3 { font-size: 13px; font-weight: 600; margin: 12px 0 4px; }
+.ref-description p  { margin: 0 0 10px; }
+.ref-description ul,
+.ref-description ol { margin: 0 0 10px; padding-left: 20px; }
+.ref-description li { margin: 0 0 4px; }
+.ref-description code {
+  font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
+  font-size: 0.9em; color: var(--accent);
+  background: var(--accent-soft); padding: 1px 5px; border-radius: 3px;
+}
+.ref-description strong { color: var(--fg); font-weight: 600; }
+.ref-description em     { color: var(--fg-dim); font-style: italic; }
+.ref-description .awaiting {
+  color: var(--fg-mute); font-style: italic;
+  font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
+  font-size: 12px;
+}
+/* On narrow viewports stack vertically: PDF on top, description
+   below, capped to a sensible height so the PDF still gets room. */
+@media (max-width: 1100px) {
+  .ref-content { grid-template-columns: 1fr; }
+  .ref-description { max-height: 240px; }
+}
 /* References scene wants more horizontal room than the default
   metric scenes — the PDF is the point. Drop the right padding
-   that reserves space for the prose column down to a small gutter,
-   so the iframe can stretch most of the way across. The prose card
-   still overlays the right edge with its feathered backdrop. */
+   that reserves space for the prose column. The prose for this
+   scene is hidden anyway (see below) so we can use the full width
+   for the PDF + description grid. */
 .stage-view[data-view="references"] {
-  padding-right: clamp(8px, 4vw, 96px);
+  padding-right: clamp(8px, 2vw, 48px);
 }
+/* Hide the prose card on the references scene — the description
+   panel inside the metric-stack already explains each PDF in
+   context, and freeing the right-side viewport gives the
+   description panel proper room. */
+.scene[data-stage="references"] .prose { display: none; }
 /* ─── Per-theme settings section ───────────────────────────────────── */
 .theme-bg-section { display: none; }
--- a/training/dashboard/static/dashboard.js
+++ b/training/dashboard/static/dashboard.js
@@ -1641,7 +1641,8 @@ for epoch in range(20):
  (function () {
    const tabsEl = document.getElementById('ref-tabs');
    const viewerEl = document.getElementById('ref-viewer');
-    if (!tabsEl || !viewerEl) return;
+    const descEl = document.getElementById('ref-description');
+    if (!tabsEl || !viewerEl || !descEl) return;
    let refs = [];
    let activeIdx = -1;
@@ -1661,6 +1662,7 @@ for epoch in range(20):
        empty.className = 'awaiting';
        empty.textContent = 'no PDFs found in /opt/cis490/references/';
        tabsEl.appendChild(empty);
+        renderDescription(null);
        return;
      }
      refs.forEach((r, i) => {
@@ -1680,10 +1682,46 @@ for epoch in range(20):
      if (i < 0 || i >= refs.length) return;
      activeIdx = i;
      rebuildTabs();
-      // Append a hash so that hitting the same PDF twice in a row
-      // still triggers a reload (helps if the file was updated on
-      // disk; iframes cache aggressively otherwise).
-      viewerEl.src = refs[i].path;
+      // #zoom=page-width forces the browser's PDF viewer to fit the
+      // page horizontally to the iframe — without it, an 8.5×11
+      // page leaves whitespace on either side when the iframe is
+      // wider than the page's natural width.
+      viewerEl.src = refs[i].path + '#zoom=page-width';
+      renderDescription(refs[i].description);
    }
+    // Tiny markdown-ish renderer: enough to display headings,
+    // paragraphs, bold/italic, lists, inline code. Keeps this widget
+    // dependency-free (no marked.js / showdown.js / etc).
+    function renderDescription(md) {
+      if (!md) {
+        descEl.innerHTML =
+          '<p class="awaiting">no description for this reference yet — drop a sidecar &lt;stem&gt;.md next to the PDF in /opt/cis490/references/</p>';
+        return;
+      }
+      // Escape HTML first so user content can't inject markup.
+      let s = md.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
+      // Inline: bold, italic, code.
+      s = s.replace(/\*\*([^*]+)\*\*/g, '<strong>$1</strong>')
+           .replace(/(?<!\*)\*([^*\n]+)\*(?!\*)/g, '<em>$1</em>')
+           .replace(/`([^`\n]+)`/g, '<code>$1</code>');
+      // Block-level: split on blank lines, then handle headings + lists.
+      const blocks = s.split(/\n{2,}/).map(block => {
+        const stripped = block.trim();
+        if (!stripped) return '';
+        if (stripped.startsWith('# '))   return `<h2>${stripped.slice(2)}</h2>`;
+        if (stripped.startsWith('## '))  return `<h2>${stripped.slice(3)}</h2>`;
+        if (stripped.startsWith('### ')) return `<h3>${stripped.slice(4)}</h3>`;
+        const lines = stripped.split('\n');
+        if (lines.every(l => /^[-*]\s/.test(l))) {
+          return '<ul>' + lines.map(l => `<li>${l.replace(/^[-*]\s+/, '')}</li>`).join('') + '</ul>';
+        }
+        if (lines.every(l => /^\d+\.\s/.test(l))) {
+          return '<ol>' + lines.map(l => `<li>${l.replace(/^\d+\.\s+/, '')}</li>`).join('') + '</ol>';
+        }
+        return `<p>${stripped.replace(/\n/g, '<br>')}</p>`;
+      });
+      descEl.innerHTML = blocks.join('');
+    }
    fetch('/api/references')
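Review note on the list branch of `renderDescription` in the hunk above: the `lines.every(...)` checks only match when every line of a block starts with a list marker. The sidecars in this commit, as they appear in this diff, wrap each bullet across several lines, so those blocks would likely fall through to the `<p>` branch and render with literal `- ` prefixes. Below is a minimal sketch of one way to fold wrapped lines into the preceding item before running the same checks. `foldContinuations` and `renderListBlock` are hypothetical names, not part of this commit, and the input is assumed to be already HTML-escaped, as it is inside `renderDescription`.

```javascript
// Hypothetical helper (not in this commit): merge wrapped lines of a block
// into the list item that precedes them. Lines that start with "- ", "* "
// or "1. " open a new item; anything else is treated as a continuation.
function foldContinuations(lines) {
  const marker = /^(?:[-*]|\d+\.)\s/;
  const items = [];
  for (const line of lines) {
    if (marker.test(line) || items.length === 0) {
      items.push(line);
    } else {
      items[items.length - 1] += ' ' + line.trim();
    }
  }
  return items;
}

// Hypothetical wrapper around the same ul/ol tests the commit uses.
// Returns an HTML string for list blocks, or null so the caller can
// fall back to its existing paragraph branch.
function renderListBlock(block) {
  const lines = foldContinuations(block.trim().split('\n'));
  if (lines.every(l => /^[-*]\s/.test(l))) {
    return '<ul>' + lines.map(l => `<li>${l.replace(/^[-*]\s+/, '')}</li>`).join('') + '</ul>';
  }
  if (lines.every(l => /^\d+\.\s/.test(l))) {
    return '<ol>' + lines.map(l => `<li>${l.replace(/^\d+\.\s+/, '')}</li>`).join('') + '</ol>';
  }
  return null;
}
```

Under these assumptions, `renderListBlock('- first item\n  wrapped here\n- second')` yields a two-item `<ul>` with the wrapped text joined onto the first item, while a plain paragraph block returns `null`. Folding by "does the line start with a marker" is deliberately cruder than CommonMark's indentation rules, in keeping with the tiny-subset approach the commit takes.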
--- a/training/dashboard/static/index.html
+++ b/training/dashboard/static/index.html
@@ -4,7 +4,7 @@
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>CIS490 — live</title>
-  <link rel="stylesheet" href="/static/dashboard.css?v=a591789b">
+  <link rel="stylesheet" href="/static/dashboard.css?v=afecfcf3">
 </head>
 <body>
  <!-- SVG filter defs for the lava-lamp goo effect. Width/height 0
@@ -301,15 +301,18 @@
          </div>
        </div>
-        <!-- 13. references — PDF viewer with tabs -->
+        <!-- 13. references — PDF viewer with tabs + description -->
        <div class="stage-view" data-view="references">
          <div class="metric-stack metric-stack-wide ref-stack">
            <div class="metric-eyebrow">references · papers, notes, prior work</div>
            <div class="ref-tabs" id="ref-tabs"></div>
-            <div class="ref-viewer-wrap">
-              <iframe class="ref-viewer" id="ref-viewer"
-                      title="reference viewer"
-                      sandbox="allow-same-origin allow-scripts allow-popups allow-forms"></iframe>
+            <div class="ref-content">
+              <div class="ref-viewer-wrap">
+                <iframe class="ref-viewer" id="ref-viewer"
+                        title="reference viewer"
+                        sandbox="allow-same-origin allow-scripts allow-popups allow-forms"></iframe>
+              </div>
+              <div class="ref-description" id="ref-description"></div>
            </div>
          </div>
        </div>
@@ -515,6 +518,6 @@
    </article>
  </div>
-  <script src="/static/dashboard.js?v=b1cb9f39"></script>
+  <script src="/static/dashboard.js?v=f2a8bda2"></script>
 </body>
 </html>