training/dashboard(references): description sidebar + better space use
Two changes per the user's feedback that the slide had unused horizontal
space and needed per-PDF context.

Layout
- The reference scene is now a 2-column grid inside the metric-stack: PDF
  iframe at ~1.7fr on the left, description panel at ~0.55fr on the right
  (min 280px). On narrow viewports (≤1100px) it falls back to a vertical
  stack with the description capped to 240px.
- Added #zoom=page-width to the iframe URL so the PDF's page fits its
  column width instead of leaving margins beside an 8.5x11 page rendered
  in a wider iframe.
- Hide the prose card on the references scene: the description panel
  inside the stack covers what the prose was saying, and freeing the
  right edge gives the description proper room.

Description content
- Backend reads <stem>.md sidecar files alongside each PDF and returns the
  contents in the /api/references payload.
- Frontend renders them with a tiny built-in markdown subset (headings,
  bold/italic, lists, inline code, paragraphs), with no third-party
  renderer dependency.
- Initial draft sidecar .md files committed for the four PDFs currently in
  references/. Each describes how the paper informs a specific scene of
  the deck (which model row, which eval protocol, which channel
  selection). Edit them in place and the panel updates on the next reload.
parent 69c563275a
commit 9e38f78379
8 changed files with 232 additions and 19 deletions
@@ -0,0 +1,26 @@
+# Closest direct precedent
+
+This paper applies deep learning to **time-series system-call traces
+inside virtual machines** for malware detection — almost exactly the
+framing of this project, just one layer deeper in the stack
+(syscall traces vs `/proc` samples).
+
+## What we borrowed
+
+- **Windowing strategy.** The paper's fixed-length sliding-window
+  formulation over a sequential telemetry stream is the same shape
+  we use for our 10-second `/proc` windows fed to LSTM/GRU/RNN.
+- **Recurrent architecture as the simple-but-strong baseline.**
+  Their result that an LSTM on raw sequences beats hand-crafted
+  feature classifiers on the same data is the cited justification
+  for our LSTM/GRU/RNN row of the model comparison.
+- **Per-VM containment posture.** Confirms our choice to run each
+  episode in its own throwaway Alpine guest rather than instrumenting
+  the host process directly.
+
+## Where it differs
+
+- Their telemetry is full **syscall traces** (much richer than
+  `/proc` resource counters), which is why their numbers don't
+  transfer 1-to-1 to our setup. They establish *that* this works;
+  we measure how well it works on a thinner, more deployable signal.
@@ -0,0 +1,26 @@
+# LSTM on event-log sequences
+
+DANTE applies a **plain LSTM directly to system-log event sequences**
+to flag insider-threat behavior. Earlier in the literature than the
+transformer wave, and useful here as a methodological baseline.
+
+## What we borrowed
+
+- **Evidence that simple recurrent models are enough.** The paper
+  shows an LSTM on sequence-of-events alone — no per-task feature
+  engineering — captures enough temporal structure to beat
+  bag-of-events classifiers. That's the empirical ground for the
+  *RNN/GRU/LSTM* entries in our model comparison being plain, not
+  bespoke.
+- **Negative-evidence framing.** DANTE is also explicit about cases
+  where the LSTM under-performs (low-volume users, novel event
+  types). Informs the *split-by-sample, not split-by-time* eval
+  protocol on the perf scene — generalising to unseen actors is
+  the bar.
+
+## Where it differs
+
+- Operates on log-event token sequences (categorical), not numeric
+  resource metrics (continuous). Our channels are floats from
+  `/proc`, so we use the temporal structure DANTE validates without
+  inheriting the embedding setup.
26
references/LogBERT: Log Anomaly Detection via BERT.md
Normal file
@@ -0,0 +1,26 @@
+# Transformer pretraining for log anomaly detection
+
+LogBERT trains **BERT-style masked-language-modeling on log
+sequences** and uses the resulting representations for unsupervised
+anomaly scoring. The closest published example of "BERT, but for
+host telemetry."
+
+## What we borrowed
+
+- **The transformer entry in our model comparison.** LogBERT is the
+  citation for why a transformer is even in the model lineup on
+  scene 9 — it shows that attention over moderate-length log windows
+  has enough signal to separate normal from anomalous *without*
+  per-anomaly labels.
+- **Pretraining + fine-tune split.** Their two-stage setup
+  (self-supervised pretrain on benign logs, downstream classifier
+  on labeled anomalies) is the template we follow when describing
+  the BERT model's training story on the *training-code* scene.
+
+## Where it differs
+
+- Logs are categorical (template tokens); our windows are dense
+  float vectors (12 channels × 100 samples). The BERT we run is the
+  same architecture but reads continuous-valued tokens, so the
+  masking objective is regression-on-masked-channels rather than
+  cross-entropy-on-masked-token.
@@ -0,0 +1,30 @@
+# Strongest published precedent for this exact setup
+
+This paper applies **transformer architectures to per-process
+resource-utilisation metrics** — the same shape of telemetry we
+collect from `/proc`. Closest reference to "the project we're doing,
+but already published."
+
+## What we borrowed
+
+- **Channel selection.** Their list of `/proc` channels overlaps
+  heavily with ours (`cpu_user_jiffies`, `cpu_sys_jiffies`,
+  `rss_bytes`, `io_*_bytes`, `voluntary_ctxsw`, `involuntary_ctxsw`,
+  page-fault counters). Our 12-channel selection is essentially
+  this set, validated.
+- **Window-and-classify framing.** They confirm that a transformer
+  reading short windows of these counters beats per-window
+  hand-features fed to gradient-boosted trees. That is exactly the
+  comparison we run: KNN-on-features vs sequence-models-on-windows.
+- **Held-out-sample evaluation.** They emphasise generalising to
+  *unseen* malware families, not unseen time-slices of the same
+  family. We adopt the same eval protocol on the perf scene.
+
+## Where it differs
+
+- They use a much larger corpus and run on commercial endpoints;
+  we run on three lab hosts and a Pi. Their numbers are an upper
+  bound on what we can hope to reproduce — they're the target, not
+  the floor.
+- They don't publish their exact dataset, so the comparison is
+  architectural, not reproductive.
@@ -249,9 +249,21 @@ def make_app(
         # for display and URL-encode for the path so the
         # iframe can fetch /refs/<encoded-name>.
         display_name = " ".join(p.stem.split())
+        # Sidecar markdown: <stem>.md alongside the PDF
+        # holds a free-form description of how the paper
+        # was used in the project. Optional — the
+        # frontend shows a placeholder if missing.
+        description = None
+        md_path = p.with_suffix(".md")
+        if md_path.is_file():
+            try:
+                description = md_path.read_text(encoding="utf-8")
+            except OSError:
+                log.warning("could not read sidecar %s", md_path)
         items.append({
             "name": display_name,
             "path": "/refs/" + quote(p.name, safe=""),
+            "description": description,
         })
 except OSError:
     log.exception("could not list references in %s", REFS_DIR)
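For orientation, here is a rough sketch of what one entry looks like from the frontend's side once the hunk above is applied. Only the three keys (`name`, `path`, `description`) come from the diff; the values are made up, `descriptionOf` is a hypothetical helper, and the top-level shape of the `/api/references` response is not shown in this commit, so nothing here should be read as the actual payload.

```javascript
// One reference entry with the three keys the backend hunk builds.
// Values are illustrative only.
const exampleItem = {
  name: 'Example Paper Title',
  // encodeURIComponent stands in for Python's quote(..., safe=""); the two
  // agree for spaces but differ on a few punctuation characters.
  path: '/refs/' + encodeURIComponent('Example Paper Title.pdf'),
  // null when no <stem>.md sidecar exists or it could not be read.
  description: null,
};

// Hypothetical helper: normalise the optional description so callers get
// either a non-empty markdown string or null.
function descriptionOf(item) {
  const d = item && item.description;
  return typeof d === 'string' && d.trim() !== '' ? d : null;
}

console.log(exampleItem.path, descriptionOf(exampleItem));
```

Treating a whitespace-only sidecar the same as a missing one is a design choice of this sketch; the committed code passes the string through as read.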
@@ -286,8 +286,8 @@ body[data-theme="laser"] .bg-laser { display: block; }
   to { transform: rotate(360deg); }
 }
 
-/* ─── References scene (PDF viewer + tab strip) ─────────────────────── */
-.ref-stack {  /* metric-stack-wide variant; let viewer take the height */
+/* ─── References scene (PDF viewer + tab strip + description) ──────── */
+.ref-stack {  /* metric-stack-wide variant; let content area fill height */
   height: 100%;
   justify-content: flex-start;
 }
@@ -315,26 +315,78 @@ body[data-theme="laser"] .bg-laser { display: block; }
   color: var(--accent); border-color: var(--accent);
   background: var(--accent-soft);
 }
-.ref-viewer-wrap {
+/* Two-column layout: PDF viewer on the left taking the larger
+   share, description panel on the right. The viewer's column
+   uses minmax(0, …) so the iframe won't blow out the grid when
+   the PDF reports a wide intrinsic size. */
+.ref-content {
   flex: 1 1 auto; min-height: 0;
+  display: grid;
+  grid-template-columns: minmax(0, 1.7fr) minmax(280px, 0.55fr);
+  gap: 14px;
+}
+.ref-viewer-wrap {
   background: var(--bg-elev);
   border: 1px solid var(--line); border-radius: 4px;
   overflow: hidden;
+  min-height: 0;
 }
 .ref-viewer {
   width: 100%; height: 100%;
   min-height: clamp(360px, 70vh, 900px);
   border: 0; display: block; background: var(--bg-elev);
 }
+.ref-description {
+  background: var(--bg-elev);
+  border: 1px solid var(--line); border-radius: 4px;
+  overflow-y: auto;
+  padding: 18px 22px;
+  font-size: 14px; line-height: 1.6;
+  color: var(--fg);
+  min-height: 0;
+}
+.ref-description h1, .ref-description h2 {
+  font-size: 15px; font-weight: 600; margin: 0 0 10px;
+  color: var(--fg);
+}
+.ref-description h3 { font-size: 13px; font-weight: 600; margin: 12px 0 4px; }
+.ref-description p { margin: 0 0 10px; }
+.ref-description ul,
+.ref-description ol { margin: 0 0 10px; padding-left: 20px; }
+.ref-description li { margin: 0 0 4px; }
+.ref-description code {
+  font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
+  font-size: 0.9em; color: var(--accent);
+  background: var(--accent-soft); padding: 1px 5px; border-radius: 3px;
+}
+.ref-description strong { color: var(--fg); font-weight: 600; }
+.ref-description em { color: var(--fg-dim); font-style: italic; }
+.ref-description .awaiting {
+  color: var(--fg-mute); font-style: italic;
+  font-family: ui-monospace, SFMono-Regular, Menlo, monospace;
+  font-size: 12px;
+}
+
+/* On narrow viewports stack vertically: PDF on top, description
+   below, capped to a sensible height so the PDF still gets room. */
+@media (max-width: 1100px) {
+  .ref-content { grid-template-columns: 1fr; }
+  .ref-description { max-height: 240px; }
+}
 
 /* References scene wants more horizontal room than the default
    metric scenes — the PDF is the point. Drop the right padding
-   that reserves space for the prose column down to a small gutter,
-   so the iframe can stretch most of the way across. The prose card
-   still overlays the right edge with its feathered backdrop. */
+   that reserves space for the prose column. The prose for this
+   scene is hidden anyway (see below) so we can use the full width
+   for the PDF + description grid. */
 .stage-view[data-view="references"] {
-  padding-right: clamp(8px, 4vw, 96px);
+  padding-right: clamp(8px, 2vw, 48px);
 }
+/* Hide the prose card on the references scene — the description
+   panel inside the metric-stack already explains each PDF in
+   context, and freeing the right-side viewport gives the
+   description panel proper room. */
+.scene[data-stage="references"] .prose { display: none; }
 
 /* ─── Per-theme settings section ───────────────────────────────────── */
 .theme-bg-section { display: none; }
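To make the column split concrete, the sketch below estimates how the two tracks of `.ref-content` resolve for a given container width. It approximates CSS Grid's `fr` distribution for this two-track case only; `splitColumns` is a hypothetical name, and the widths fed to it are illustrative container widths, not measurements from the dashboard.

```javascript
// Approximates how `minmax(0, 1.7fr) minmax(280px, 0.55fr)` with a 14px gap
// divides a container. Illustrative only: real layout is done by the browser.
function splitColumns(containerWidth, gap = 14) {
  const avail = Math.max(0, containerWidth - gap);
  const frUnit = avail / (1.7 + 0.55); // size of 1fr if nothing is clamped
  let description = 0.55 * frUnit;
  let viewer = 1.7 * frUnit;
  if (description < 280) {
    // The description track hits its 280px floor; the viewer takes the rest.
    description = 280;
    viewer = Math.max(0, avail - description);
  }
  return { viewer, description };
}

for (const w of [1600, 1200, 1000]) {
  const { viewer, description } = splitColumns(w);
  console.log(w, Math.round(viewer), Math.round(description));
}
```

Under these assumptions the 280px floor starts to bind once the grid container is narrower than roughly 1160px. The `@media (max-width: 1100px)` rule is keyed to the viewport rather than the container, so the two thresholds are related but not the same number.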
@@ -1641,7 +1641,8 @@ for epoch in range(20):
 (function () {
   const tabsEl = document.getElementById('ref-tabs');
   const viewerEl = document.getElementById('ref-viewer');
-  if (!tabsEl || !viewerEl) return;
+  const descEl = document.getElementById('ref-description');
+  if (!tabsEl || !viewerEl || !descEl) return;
 
   let refs = [];
   let activeIdx = -1;
@@ -1661,6 +1662,7 @@ for epoch in range(20):
       empty.className = 'awaiting';
       empty.textContent = 'no PDFs found in /opt/cis490/references/';
       tabsEl.appendChild(empty);
+      renderDescription(null);
       return;
     }
     refs.forEach((r, i) => {
@@ -1680,10 +1682,46 @@ for epoch in range(20):
     if (i < 0 || i >= refs.length) return;
     activeIdx = i;
     rebuildTabs();
-    // Append a hash so that hitting the same PDF twice in a row
-    // still triggers a reload (helps if the file was updated on
-    // disk; iframes cache aggressively otherwise).
-    viewerEl.src = refs[i].path;
+    // #zoom=page-width forces the browser's PDF viewer to fit the
+    // page horizontally to the iframe — without it, an 8.5×11
+    // page leaves whitespace on either side when the iframe is
+    // wider than the page's natural width.
+    viewerEl.src = refs[i].path + '#zoom=page-width';
+    renderDescription(refs[i].description);
+  }
+
+  // Tiny markdown-ish renderer: enough to display headings,
+  // paragraphs, bold/italic, lists, inline code. Keeps this widget
+  // dependency-free (no marked.js / showdown.js / etc).
+  function renderDescription(md) {
+    if (!md) {
+      descEl.innerHTML =
+        '<p class="awaiting">no description for this reference yet — drop a sidecar &lt;stem&gt;.md next to the PDF in /opt/cis490/references/</p>';
+      return;
+    }
+    // Escape HTML first so user content can't inject markup.
+    let s = md.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
+    // Inline: bold, italic, code.
+    s = s.replace(/\*\*([^*]+)\*\*/g, '<strong>$1</strong>')
+         .replace(/(?<!\*)\*([^*\n]+)\*(?!\*)/g, '<em>$1</em>')
+         .replace(/`([^`\n]+)`/g, '<code>$1</code>');
+    // Block-level: split on blank lines, then handle headings + lists.
+    const blocks = s.split(/\n{2,}/).map(block => {
+      const stripped = block.trim();
+      if (!stripped) return '';
+      if (stripped.startsWith('# ')) return `<h2>${stripped.slice(2)}</h2>`;
+      if (stripped.startsWith('## ')) return `<h2>${stripped.slice(3)}</h2>`;
+      if (stripped.startsWith('### ')) return `<h3>${stripped.slice(4)}</h3>`;
+      const lines = stripped.split('\n');
+      if (lines.every(l => /^[-*]\s/.test(l))) {
+        return '<ul>' + lines.map(l => `<li>${l.replace(/^[-*]\s+/, '')}</li>`).join('') + '</ul>';
+      }
+      if (lines.every(l => /^\d+\.\s/.test(l))) {
+        return '<ol>' + lines.map(l => `<li>${l.replace(/^\d+\.\s+/, '')}</li>`).join('') + '</ol>';
+      }
+      return `<p>${stripped.replace(/\n/g, '<br>')}</p>`;
+    });
+    descEl.innerHTML = blocks.join('');
   }
 
   fetch('/api/references')
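A note on the list branch of `renderDescription`: it only fires when every line of a block starts with a marker, whereas the sidecar files in this commit wrap each bullet across several lines, so those blocks would appear to fall through to the `<p>` branch. Below is a minimal sketch of one way to fold continuation lines into the preceding item. It assumes the block has already been HTML-escaped and inline-formatted, and `renderListBlock` is a hypothetical helper name, not part of the commit.

```javascript
// Hypothetical helper (not in the commit): render one blank-line-delimited
// block as a list, folding wrapped continuation lines into the previous item.
// Assumes `block` is already HTML-escaped and inline-formatted.
function renderListBlock(block) {
  const lines = block.trim().split('\n');
  const bullet = /^[-*]\s+/;
  const numbered = /^\d+\.\s+/;
  // The first line decides the list type; anything else is not a list.
  const marker = bullet.test(lines[0]) ? bullet
               : numbered.test(lines[0]) ? numbered
               : null;
  if (!marker) return null;

  const items = [];
  for (const line of lines) {
    if (marker.test(line)) {
      items.push(line.replace(marker, ''));
    } else {
      // Continuation of a wrapped item: join with a single space.
      items[items.length - 1] += ' ' + line.trim();
    }
  }
  const tag = marker === bullet ? 'ul' : 'ol';
  return `<${tag}>` + items.map(t => `<li>${t}</li>`).join('') + `</${tag}>`;
}

const sample = [
  '- <strong>Windowing strategy.</strong> The fixed-length',
  '  sliding-window formulation.',
  '- Second item.',
].join('\n');
console.log(renderListBlock(sample));
```

If something like this were adopted, it could be tried before the paragraph fallback, for example `const list = renderListBlock(stripped); if (list) return list;`, leaving the heading checks untouched.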
@@ -4,7 +4,7 @@
 <meta charset="utf-8">
 <meta name="viewport" content="width=device-width, initial-scale=1">
 <title>CIS490 — live</title>
-<link rel="stylesheet" href="/static/dashboard.css?v=a591789b">
+<link rel="stylesheet" href="/static/dashboard.css?v=afecfcf3">
 </head>
 <body>
 <!-- SVG filter defs for the lava-lamp goo effect. Width/height 0
@@ -301,15 +301,18 @@
     </div>
   </div>
 
-  <!-- 13. references — PDF viewer with tabs -->
+  <!-- 13. references — PDF viewer with tabs + description -->
   <div class="stage-view" data-view="references">
     <div class="metric-stack metric-stack-wide ref-stack">
       <div class="metric-eyebrow">references · papers, notes, prior work</div>
       <div class="ref-tabs" id="ref-tabs"></div>
-      <div class="ref-viewer-wrap">
-        <iframe class="ref-viewer" id="ref-viewer"
-                title="reference viewer"
-                sandbox="allow-same-origin allow-scripts allow-popups allow-forms"></iframe>
+      <div class="ref-content">
+        <div class="ref-viewer-wrap">
+          <iframe class="ref-viewer" id="ref-viewer"
+                  title="reference viewer"
+                  sandbox="allow-same-origin allow-scripts allow-popups allow-forms"></iframe>
+        </div>
+        <div class="ref-description" id="ref-description"></div>
       </div>
     </div>
   </div>
@@ -515,6 +518,6 @@
 </article>
 </div>
 
-<script src="/static/dashboard.js?v=b1cb9f39"></script>
+<script src="/static/dashboard.js?v=f2a8bda2"></script>
 </body>
 </html>