CIS490/references/A Deep Learning Model Leveraging Time‑Series System Call Data to Detect Malware Attacks in Virtual Machines.md
Max Gorog 9e38f78379 training/dashboard(references): description sidebar + better space use
Two changes per the user's feedback that the slide had unused
horizontal space and needed per-PDF context.

Layout
- The reference scene is now a 2-column grid inside the
  metric-stack: PDF iframe at ~1.7fr on the left, description
  panel at ~0.55fr on the right (min 280px). On narrow viewports
  (<1100px) it falls back to a vertical stack with the
  description capped to 240px.
- Added #zoom=page-width to the iframe URL so the PDF's page
  fits its column width instead of leaving margins beside an
  8.5x11 page rendered in a wider iframe.
- Hide the prose card on the references scene — the description
  panel inside the stack covers what the prose was saying, and
  freeing the right edge gives the description proper room.

Description content
- Backend reads <stem>.md sidecar files alongside each PDF and
  returns the contents in the /api/references payload.
- Frontend renders them with a tiny built-in markdown subset
  (headings, bold/italic, lists, inline code, paragraphs) — no
  third-party renderer dependency.
- Initial draft sidecar .md files committed for the four PDFs
  currently in references/. Each describes how the paper informs
  a specific scene of the deck (which model row, which eval
  protocol, which channel selection). Edit them in place and the
  panel updates on the next reload.
2026-05-08 12:40:32 -05:00

# Closest direct precedent
This paper applies deep learning to **time-series system-call traces
inside virtual machines** for malware detection. That is almost exactly
the framing of this project, one layer deeper in the stack
(syscall traces vs `/proc` samples).
## What we borrowed
- **Windowing strategy.** The paper's fixed-length sliding-window
formulation over a sequential telemetry stream is the same shape
we use for our 10-second `/proc` windows fed to LSTM/GRU/RNN.
- **Recurrent architecture as the simple-but-strong baseline.**
Their result that an LSTM on raw sequences beats hand-crafted
feature classifiers on the same data is the cited justification
for our LSTM/GRU/RNN row of the model comparison.
- **Per-VM containment posture.** The paper confirms our choice to run
each episode in its own throwaway Alpine guest rather than
instrumenting the host process directly.
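
The windowing idea can be illustrated with a minimal sketch. This is not the paper's code or this project's pipeline: the function name `sliding_windows`, the 1 Hz sampling rate, the stride of 5, and the two-feature sample layout are all assumptions made for illustration. Only the 10-second window length comes from the note above.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # hypothetical layout: (cpu_ticks, rss_kb)


def sliding_windows(
    samples: Sequence[Sample], window_len: int, stride: int = 1
) -> List[List[Sample]]:
    """Cut a sequential stream into fixed-length, possibly overlapping windows."""
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    windows = []
    # The last valid start index is len(samples) - window_len;
    # a trailing partial window is dropped rather than padded.
    for start in range(0, len(samples) - window_len + 1, stride):
        windows.append(list(samples[start:start + window_len]))
    return windows


# Hypothetical stream: 25 samples taken once per second (assumed rate),
# so 10 consecutive samples span one 10-second window.
stream = [(float(t), float(t) * 2.0) for t in range(25)]
windows = sliding_windows(stream, window_len=10, stride=5)
```

Each element of `windows` is one fixed-length sequence that a recurrent model would consume as a single training example. A stride smaller than the window length makes consecutive windows overlap, which trades more examples for more correlation between them.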
## Where it differs
- Their telemetry is full **syscall traces** (much richer than
`/proc` resource counters), which is why their numbers don't
transfer 1-to-1 to our setup. They establish *that* this works;
we measure how well it works on a thinner, more deployable signal.