Tier 3 + Tier 4 auto-deploy: zero operator interaction

Replaces the manual runbook with idempotent scripts that need no operator
input. install-lab-host.sh now runs the full Tier-3 deploy automatically as
its 8th step (once the mTLS cert is on disk), and Tier-4 auto-fetches when
MALWAREBAZAAR_API_KEY is set.

Changes:

- install-msfrpcd.sh: actually runs the Rapid7 omnibus installer when
  metasploit-framework isn't present (was: bail with "install manually").
  apt-get and dnf paths both go through the same omnibus script; the
  apt-get path sets DEBIAN_FRONTEND=noninteractive. Idempotent.

- fetch-metasploitable2.sh: bakes in the SourceForge public-mirror URL
  (https://downloads.sourceforge.net/project/metasploitable/...) so no
  operator URL is required. sha256 is now optional and TOFU-pinned —
  first run records the hash to OUT_DIR/metasploitable2.qcow2.sha256;
  subsequent runs verify against that. Skips if qcow2 already present.

- scripts/install-tier-3-4.sh (new): orchestrates the four steps
  (msfrpcd → metasploitable2 → bridge → tier-3 verify) plus optional
  Tier-4 auto-fetch. Idempotent. SKIP_VERIFY / SKIP_BRIDGE / SKIP_TIER4
  env knobs for partial deploys.

- tools/auto_fetch_samples.py (new): when MALWAREBAZAAR_API_KEY is set,
  queries MB by each manifest entry's `family` (signature match), pulls
  the first match via fetch_sample.py, and rewrites manifest.toml in
  place (atomic tempfile + os.replace, preserving ownership and mode).
  Skips entries that already have sha256.

- install-lab-host.sh: gains a step 8 that calls install-tier-3-4.sh
  automatically when mTLS certs are on disk. --skip-tier3 flag for
  operators who want Tier 2 only. Skipped silently before certs land
  so first-pass install (host_id=REPLACE_ME) still works.

- AGENTS.md: rewrote the Tier-3 section to point at the one-shot
  script. Removed the old multi-command runbook so on-device agents
  can't accidentally follow stale steps.

Net effect: a fresh lab host now gets Tier 3 (and Tier 4 if API key
present) from a single sudo invocation. No operator picks for image
URLs, no manual metasploit installs, no manual manifest edits.
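
The TOFU pinning described for fetch-metasploitable2.sh can be sketched in Python as below. This is an illustrative translation, not code from the repository: the script itself does this in bash with sha256sum, and the names `sha256_of` and `verify_or_pin` are hypothetical. The precedence it assumes is explicit hash first, then a recorded pin file, then trust on first use.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file so a large archive is never held in memory at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_or_pin(archive: Path, pin_file: Path, expected: str = "") -> str:
    """Verify against an explicit hash, else a recorded pin, else pin (TOFU).

    Hypothetical helper: raises ValueError on mismatch and returns the
    archive's actual sha256 otherwise.
    """
    actual = sha256_of(archive)
    if not expected and pin_file.exists():
        fields = pin_file.read_text().split()
        expected = fields[0] if fields else ""
    if expected and actual != expected.lower():
        raise ValueError(f"sha256 mismatch: expected {expected}, got {actual}")
    # Record (or re-record) the verified hash in sha256sum-style layout.
    pin_file.write_text(f"{actual}  {archive.name}\n")
    return actual
```

On a first run with no pin file the hash is simply recorded; a later run over a modified archive raises instead of silently re-pinning, which is the property the pin is meant to provide.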
max 2026-04-30 23:12:08 -05:00
parent 02b9d0a645
commit 683bfe9ce6
6 changed files with 522 additions and 133 deletions

159
AGENTS.md

@@ -78,121 +78,96 @@ common silent failures it catches:
`cis490-doctor --json` is machine-readable for use by other agents.
## Tier 3 + Tier 4 deploy (lab-host one-time, ~20 min)
## Tier 3 + Tier 4 deploy (zero-touch via install-lab-host.sh)
Tier 2 is the default after `install-lab-host.sh`: real Alpine guest,
mimic in-guest workloads. To get **real exploit fire** (Tier 3) and
**real malware execution** (Tier 4), each lab host needs three more
artifacts. The fleet runner auto-detects Tier-3 readiness via the
`_msfrpcd_available()` probe — once msfrpcd is up, episodes flip to
Tier 3 with no orchestrator config change.
`install-lab-host.sh` runs Tier-3 deploy automatically on its second
pass (after the mTLS cert lands). No operator interaction is needed:
metasploit-framework auto-installs via the Rapid7 omnibus, the
Metasploitable2 image auto-fetches from a public mirror with TOFU
sha256 pinning, the host-only bridge auto-comes-up, and a live
exploit fire is verified before the script returns.
### Prerequisites (per lab host)
To re-run the deploy by hand or on a host where Tier 3 was skipped:
```sh
# 1. Install Metasploit Framework + msfrpcd. Idempotent; ~1 GiB
# download the first time. Drops a strong password at
# /etc/cis490/msfrpc.env (mode 0640, root:cis490) and a systemd
# unit cis490-msfrpcd.service bound to 127.0.0.1:55553.
sudo /opt/cis490/scripts/install-msfrpcd.sh
sudo systemctl enable --now cis490-msfrpcd.service
systemctl is-active cis490-msfrpcd.service # → active
# 2. Fetch Metasploitable2 qcow2. Rapid7's official download is
# registration-walled; supply the URL+sha256 you obtained from
# your registration. Conversion from VMDK → qcow2 happens
# automatically. Lands at /var/lib/cis490/vm/images/metasploitable2.qcow2.
IMAGE_URL='<your-rapid7-or-mirror-url>' \
IMAGE_SHA256='<sha256-of-the-archive>' \
sudo OUT_DIR=/var/lib/cis490/vm/images \
/opt/cis490/scripts/fetch-metasploitable2.sh
# 3. (Optional but recommended.) Bring up the host-only bridge
# `br-malware` so callback-payload exploits (3 of the 5 bundled
# modules require it: distccd_command_exec, php_cgi_arg_injection,
# unreal_ircd_3281_backdoor) can land. Without the bridge, the
# fleet auto-restricts to non-callback modules
# (vsftpd_234_backdoor, samba_usermap_script).
sudo /opt/cis490/scripts/setup_bridge.sh
sudo /opt/cis490/scripts/install-tier-3-4.sh
```
### Verify Tier-3 fire end-to-end
It's idempotent — re-running on an already-deployed host is a no-op
except for the verify step. Inputs are all optional env vars:
```sh
# This runs ONE Tier-3 episode in the foreground using whatever
# module + sample the deterministic selector picks for slot=0,
# episode=0 on this host. Should print `module = exploit/...`,
# fire it via msfrpcd, and a normal episode summary at the end.
sudo -u cis490 \
MSFRPC_PASSWORD="$(. /etc/cis490/msfrpc.env; echo $MSFRPC_PASSWORD)" \
/opt/cis490/.venv/bin/python \
/opt/cis490/tools/run_tier3_demo.py \
--module vsftpd_234_backdoor \
--target-port 21 --target-boot-timeout 240
```
| var | effect |
|---|---|
| `SKIP_VERIFY` | skip the live `vsftpd_234_backdoor` smoke run |
| `SKIP_BRIDGE` | skip `br-malware` setup (limits to 2 of 5 modules) |
| `SKIP_TIER4` | skip the Tier-4 auto-fetch even if API key present |
| `MALWAREBAZAAR_API_KEY` | opt-in: present means Tier-4 auto-fetch runs |
If the run prints `module loaded: vsftpd_234_backdoor (exploit/unix/ftp/...)`
and `episode_id = 01...` at the end, Tier 3 is live. The orchestrator's
next wave will use Tier 3 for every episode.
The fleet runner auto-detects Tier-3 readiness via
`orchestrator/fleet.py::_msfrpcd_available()`. Once
`cis490-msfrpcd.service` is up and `metasploitable2.qcow2` is on
disk, the next wave produces Tier-3 episodes (`meta.exploit.module_name`
populated). No orchestrator restart is required, but a restart speeds
up the switch.
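
A readiness check of this kind can be sketched as follows. This is a minimal illustration only, not the body of `orchestrator/fleet.py::_msfrpcd_available()`, which is not shown in this commit; the function name `tier3_ready` is hypothetical, and the sketch assumes that "ready" means the image file exists and something accepts TCP connections on 127.0.0.1:55553. A real probe may also authenticate against msfrpcd.

```python
import socket
from pathlib import Path


def tier3_ready(image: Path, host: str = "127.0.0.1", port: int = 55553,
                timeout_s: float = 1.0) -> bool:
    """Hypothetical probe: image on disk and a listener on host:port."""
    if not image.is_file():
        return False
    try:
        # A successful TCP connect only shows that something is listening;
        # it does not prove the listener is msfrpcd or that the password works.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

Because a check like this is cheap, it can run before every wave, which is consistent with the statement above that no orchestrator restart is required.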
### Tier-4 (real malware execution)
### Tier-4 (real malware execution) is opt-in, also push-button
Tier 4 layers on top of Tier 3 — the exploit lands a session, then a
real binary is uploaded via the chunked path and executed inside the
session. Two prerequisites:
Set `MALWAREBAZAAR_API_KEY` (free signup at https://bazaar.abuse.ch/)
before running `install-tier-3-4.sh` and step 5 runs
`tools/auto_fetch_samples.py` automatically:
```sh
# 1. Add MalwareBazaar API key (free signup at https://bazaar.abuse.ch/).
echo "$MB_KEY" | sudo install -m 0600 -o cis490 -g cis490 /dev/stdin \
/opt/cis490/samples/.bazaar.token
1. For each `[[sample]]` in `samples/manifest.toml` without a
`sha256`, query MalwareBazaar by `family` (signature match)
2. Download the first matching binary (sha256-verified on the way in)
3. Edit the manifest in place — add `source`, `sha256`, `url`
4. Episodes that select that sample now run the real binary via the
chunked-upload path (`exploits.driver._resolve_workload`)
# 2. Pick a sha256 for one of your sample families from MalwareBazaar
# and download the binary. Verifies sha256 on the way in; lands at
# /opt/cis490/samples/store/<sha256>.
sudo -u cis490 /opt/cis490/.venv/bin/python \
/opt/cis490/tools/fetch_sample.py <64-hex-sha256>
# 3. Edit /opt/cis490/samples/manifest.toml: add `source`, `sha256`,
# and `url` fields to the matching entry. The orchestrator's next
# selection that hits that sample will use the real binary
# (sample.kind == "real") — meta.sample.sha256 records it for the
# trainer.
sudo systemctl restart cis490-orchestrator
```
The mimic profile remains the fallback for episodes that select a
sample whose binary isn't on disk. Trainers filter on
`meta.sample.kind ∈ {"real", "mimic"}`.
### Confirm Tier 3+4 are flowing
```sh
# On the Pi:
sudo -u cis490 /opt/cis490/.venv/bin/python -c "
import json
real = mimic = 0
modules = set()
for line in open('/var/lib/cis490/index.jsonl'):
pass # use the prune classifier instead
"
# or (better) rerun the diversity audit:
# the multi-host audit script the maintainer keeps for spot-checking
# On the Pi maintainer side:
sudo python3 -c "
import json, glob, subprocess, tarfile, io
from collections import Counter
mods = Counter(); kinds = Counter()
for tar in glob.glob('/var/lib/cis490/episodes/*/*.tar.zst'):
z = subprocess.check_output(['zstd','-q','-d','--stdout',tar],stderr=subprocess.DEVNULL)
with tarfile.open(fileobj=io.BytesIO(z)) as t:
for m in t.getmembers():
if m.name.endswith('meta.json') and m.isfile():
meta = json.load(t.extractfile(m))
mods[(meta.get('exploit') or {}).get('module_name','<none>')] += 1
kinds[(meta.get('sample') or {}).get('kind','<none>')] += 1
break
print('exploit modules used:', dict(mods))
print('sample kinds:', dict(kinds))
"
```
The on-Pi maintainer will see `meta.exploit.module_name` populated
and `meta.sample.kind == "real"` for the new episodes. If those stay
null after deploy, file an issue with the `journalctl -u cis490-orchestrator`
trace from the failed wave.
If `mods` is `{'<none>': N}` and `kinds` is `{'mimic': N}`, Tier 3
hasn't kicked in yet on any lab host — re-run
`install-tier-3-4.sh` there. If `mods` shows
`{'vsftpd_234_backdoor': N, ...}` and `kinds` shows a non-zero
`'real'` count, both tiers are live.
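
The reading of `mods` and `kinds` described above can be expressed as a small helper. This is a sketch under stated assumptions: `tier_status` and its return labels are hypothetical and are not part of the repository; only the `'<none>'` and `'real'` keys come from the audit snippet.

```python
from collections import Counter


def tier_status(mods: Counter, kinds: Counter) -> str:
    """Hypothetical classifier for the audit counters printed above."""
    # Tier 3 is live if any episode recorded an exploit module name.
    tier3 = any(n > 0 for name, n in mods.items() if name != "<none>")
    # Tier 4 is live if any episode ran a real (non-mimic) sample.
    tier4 = kinds.get("real", 0) > 0
    if tier3 and tier4:
        return "tier3+4"
    if tier3:
        return "tier3"
    if tier4:
        # Tier 4 layers on Tier 3, so real samples without any exploit
        # module would be unexpected and worth investigating.
        return "inconsistent"
    return "tier2"
```

For example, `tier_status(Counter({"<none>": 10}), Counter({"mimic": 10}))` falls through to the Tier-2 case, matching the "hasn't kicked in yet" reading.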
### Don't shortcut
- DO NOT install `metasploit-framework` system-wide outside
install-msfrpcd.sh. The script wires the systemd unit + creds; a
manual install will work but the orchestrator's `_msfrpcd_available()`
probe expects the unit to be on 127.0.0.1:55553 with a password at
`/etc/cis490/msfrpc.env`.
`install-msfrpcd.sh`. The script wires the systemd unit + creds;
a manual install bypasses the orchestrator's
`_msfrpcd_available()` probe.
- DO NOT push your MalwareBazaar API key. `samples/.bazaar.token`
is gitignored; the env var path is preferred for one-shot fetches.
- DO NOT add bogus sha256 entries to `manifest.toml` — every sha256
is verified on download. A wrong value breaks `fetch_sample.py`
and confuses every replay.
is gitignored.
- DO NOT add bogus sha256 entries to `manifest.toml`;
`fetch_sample.py` hash-verifies every download.
- DO NOT add real-binary entries by hand when `auto_fetch_samples.py`
exists. Hand-edits are racy with the auto-fetcher's tempfile
rewrite.
## Securing the connection (mTLS) — DO NOT mint your own certs


@@ -1,26 +1,34 @@
#!/usr/bin/env bash
# Fetch + sha256-verify the Metasploitable2 disk image.
# Fetch the Metasploitable2 disk image with no operator interaction.
#
# Rapid7's official download is gated behind a registration form, so
# we accept the URL + sha256 from env vars (with sane defaults pointing
# at a public mirror). The user installs this once per lab host.
# Defaults to the SourceForge public mirror — the canonical
# freely-redistributable copy of Metasploitable2 (Rapid7's own
# download is registration-walled but the same VMDK is on
# SourceForge, downloaded ~2M times). HTTPS protects the transport.
#
# Inputs (env):
# IMAGE_URL — direct download URL for the metasploitable2 archive
# IMAGE_SHA256 — expected sha256 of the archive
# OUT_DIR — where to drop the qcow2 (default vm/images/)
# Idempotent: if the qcow2 is already on disk we do nothing.
#
# Inputs (env, all optional):
# IMAGE_URL — override the default mirror URL
# IMAGE_SHA256 — verify against this hash. If unset and a sha256
# has been recorded by a prior successful fetch
# ($OUT_DIR/metasploitable2.qcow2.sha256), use that.
# If neither is available, do TOFU (trust on first
# use): record the hash of what was downloaded so
# subsequent runs verify against it.
# OUT_DIR — where to drop the qcow2 (default vm/images/)
#
# Outputs:
# $OUT_DIR/metasploitable2.qcow2 — converted from the original VMDK
# if needed.
#
# We do NOT bake an image url+hash into the repo because the canonical
# distribution is a registration-walled zip on Rapid7. Operators must
# supply both; the rest is mechanical.
# $OUT_DIR/metasploitable2.qcow2 — the disk image
# $OUT_DIR/metasploitable2.qcow2.sha256 — recorded archive hash
set -euo pipefail
IMAGE_URL="${IMAGE_URL:-}"
# SourceForge public mirror. Direct-download URL — no auth, no
# registration. /download 302s to a regional mirror.
DEFAULT_IMAGE_URL='https://downloads.sourceforge.net/project/metasploitable/Metasploitable2/metasploitable-linux-2.0.0.zip'
IMAGE_URL="${IMAGE_URL:-$DEFAULT_IMAGE_URL}"
IMAGE_SHA256="${IMAGE_SHA256:-}"
OUT_DIR="${OUT_DIR:-$(cd "$(dirname "$0")/../vm/images" 2>/dev/null && pwd)}"
WORK_DIR="${WORK_DIR:-/tmp/cis490-metasploitable-fetch}"
@@ -28,26 +36,44 @@ WORK_DIR="${WORK_DIR:-/tmp/cis490-metasploitable-fetch}"
log() { printf '[fetch-metasploitable2] %s\n' "$*" >&2; }
die() { log "FATAL: $*"; exit 1; }
[[ -n "$IMAGE_URL" ]] || die "set IMAGE_URL to the Metasploitable2 download URL"
[[ -n "$IMAGE_SHA256" ]] || die "set IMAGE_SHA256 to the expected sha256 of the archive"
mkdir -p "$OUT_DIR" "$WORK_DIR"
# Short-circuit if the qcow2 is already on disk.
if [[ -f "$OUT_DIR/metasploitable2.qcow2" ]]; then
log "$OUT_DIR/metasploitable2.qcow2 already present; nothing to do"
exit 0
fi
# Use the recorded sha256 from a prior successful fetch if present
# and the env var didn't override it. This pins TOFU across runs
# so a tampered re-download fails noisily.
SHA_FILE="$OUT_DIR/metasploitable2.qcow2.sha256"
if [[ -z "$IMAGE_SHA256" && -f "$SHA_FILE" ]]; then
IMAGE_SHA256="$(awk '{print $1}' "$SHA_FILE")"
log "using pinned sha256 from $SHA_FILE: $IMAGE_SHA256"
fi
ARCHIVE="$WORK_DIR/$(basename "$IMAGE_URL")"
log "downloading $IMAGE_URL → $ARCHIVE"
if [[ -f "$ARCHIVE" ]]; then
log "archive already present; skipping download"
log "archive already present in work dir; skipping download"
else
# -L follows SourceForge's redirect to the actual mirror.
curl -fL --retry 3 --retry-delay 5 -o "$ARCHIVE.partial" "$IMAGE_URL"
mv "$ARCHIVE.partial" "$ARCHIVE"
fi
log "verifying sha256"
ACTUAL="$(sha256sum "$ARCHIVE" | awk '{print $1}')"
if [[ "$ACTUAL" != "$IMAGE_SHA256" ]]; then
die "sha256 mismatch: expected $IMAGE_SHA256, got $ACTUAL"
if [[ -n "$IMAGE_SHA256" ]]; then
if [[ "$ACTUAL" != "$IMAGE_SHA256" ]]; then
die "sha256 mismatch: expected $IMAGE_SHA256, got $ACTUAL"
fi
log "sha256 ok"
else
log "no sha256 supplied — first-run TOFU; pinning $ACTUAL for future runs"
fi
log "sha256 ok"
# Always (re)record so future runs verify against the working hash.
echo "$ACTUAL $(basename "$ARCHIVE")" > "$SHA_FILE"
# Extract — handle either zip or 7z, since various mirrors choose one
# or the other.
@@ -65,5 +91,8 @@ log "converting $VMDK → qcow2"
command -v qemu-img >/dev/null || die "qemu-img required (apt install qemu-utils)"
qemu-img convert -O qcow2 "$VMDK" "$OUT_DIR/metasploitable2.qcow2"
# Best-effort cleanup of the work dir — keeps lab-host disk clean.
rm -rf "$WORK_DIR"
log "done: $OUT_DIR/metasploitable2.qcow2"
log "Tier-3 ready when msfrpcd is up. See scripts/install-msfrpcd.sh."


@@ -203,6 +203,28 @@ install -d -o "$SERVICE_USER" -g "$SERVICE_USER" -m 0755 "$INSTALL_ROOT/vm/image
ln -sf "$ALPINE_IMG" "$INSTALL_ROOT/vm/images/alpine-baseline.qcow2" 2>/dev/null || true
ln -sf "$CIDATA_ISO" "$INSTALL_ROOT/vm/images/cidata.iso" 2>/dev/null || true
# --- 8. Tier-3 + Tier-4 deploy (auto, idempotent) ----------------------
# Bring up msfrpcd + Metasploitable2 + bridge + verify. Skipped only if
# certs aren't on disk yet (Tier-3 fire writes episodes that the
# shipper ships, so it's pointless to run before mTLS is live) or the
# operator passed --skip-tier3.
SKIP_TIER3="${SKIP_TIER3:-}"
for arg in "$@"; do
[[ "$arg" == "--skip-tier3" ]] && SKIP_TIER3=1
done
if [[ -z "$SKIP_TIER3" && -f "$ETC_ROOT/certs/lab-host.pem" ]]; then
log "deploying Tier 3 (msfrpcd + Metasploitable2 + bridge)"
if "$INSTALL_ROOT/scripts/install-tier-3-4.sh"; then
log "Tier-3 deploy ✓"
else
log "WARN: Tier-3 deploy failed — Tier 2 will keep running."
log " Re-run later: sudo $INSTALL_ROOT/scripts/install-tier-3-4.sh"
fi
elif [[ -z "$SKIP_TIER3" ]]; then
log "skipping Tier-3 deploy (no mTLS cert yet — re-run this script after"
log "host_id is set so the cert auto-fetches first)"
fi
if [[ "$NEW_INSTALL" == "1" ]]; then
log ""
log "================================================================="


@@ -32,33 +32,55 @@ die() { log "FATAL: $*"; exit 1; }
command -v systemctl >/dev/null || die "systemd not found"
# --- 1. install metasploit-framework -----------------------------------
if ! command -v msfrpcd >/dev/null; then
log "msfrpcd not found; installing metasploit-framework"
# Auto-install paths per package manager. Rapid7's omnibus installer
# is the canonical zero-touch path for Debian/Ubuntu — it adds the
# apt repo, the GPG key, and apt-installs the framework. Other
# distros use their native package or fall back to the omnibus shell
# script.
if ! command -v msfrpcd >/dev/null && [[ ! -x /opt/metasploit-framework/bin/msfrpcd ]]; then
log "msfrpcd not found; installing metasploit-framework (~1 GiB)"
if command -v apt-get >/dev/null; then
# The Debian/Ubuntu metasploit-framework package isn't in
# the default repos for most distros. Use Rapid7's official
# nightly installer when available.
if [[ ! -x /opt/metasploit-framework/bin/msfrpcd ]]; then
log "fetching Rapid7 nightly installer"
curl -fsSL https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb \
-o /tmp/msfinstall.sh || true
log "automated install not available — install manually:"
log " https://docs.metasploit.com/docs/using-metasploit/getting-started/nightly-installers.html"
die "rerun once msfrpcd is on PATH"
fi
# Symlink the wrapper so ``msfrpcd`` is on PATH.
ln -sf /opt/metasploit-framework/bin/msfrpcd /usr/local/bin/msfrpcd
# Rapid7's omnibus installer wraps the apt-repo + GPG-key
# bootstrap + apt install in a single script. We fetch and
# exec it non-interactively. The script does:
# 1. add apt.metasploit.com to /etc/apt/sources.list.d/
# 2. install the GPG key
# 3. apt-get install -y metasploit-framework
log "running Rapid7 omnibus installer"
TMP="$(mktemp -d)"
curl -fsSL \
https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb \
-o "$TMP/msfinstall"
chmod +x "$TMP/msfinstall"
DEBIAN_FRONTEND=noninteractive "$TMP/msfinstall" </dev/null
rm -rf "$TMP"
elif command -v pacman >/dev/null; then
log "pacman -S metasploit"
pacman -Sy --noconfirm metasploit
elif command -v dnf >/dev/null; then
die "Fedora/RHEL: install metasploit-framework manually, then re-run"
# The omnibus installer also supports rpm distros via the
# same script — it auto-detects and uses dnf/yum.
log "running Rapid7 omnibus installer (dnf path)"
TMP="$(mktemp -d)"
curl -fsSL \
https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb \
-o "$TMP/msfinstall"
chmod +x "$TMP/msfinstall"
"$TMP/msfinstall" </dev/null
rm -rf "$TMP"
else
die "unknown package manager — install metasploit-framework manually"
die "unknown package manager — install metasploit-framework manually, then re-run"
fi
fi
command -v msfrpcd >/dev/null || die "msfrpcd still missing after install attempt"
# After install, msfrpcd may live at /opt/metasploit-framework/bin/
# (omnibus) or on PATH (apt repo). Symlink so callers find it.
if ! command -v msfrpcd >/dev/null; then
if [[ -x /opt/metasploit-framework/bin/msfrpcd ]]; then
ln -sf /opt/metasploit-framework/bin/msfrpcd /usr/local/bin/msfrpcd
fi
fi
command -v msfrpcd >/dev/null || die "msfrpcd still missing after install — see journalctl"
# --- 2. generate password ----------------------------------------------
install -d -m 0755 -o root -g root "$ETC_ROOT"

142
scripts/install-tier-3-4.sh Executable file

@@ -0,0 +1,142 @@
#!/usr/bin/env bash
# Tier-3 + Tier-4 deploy orchestrator. Idempotent. Zero operator
# interaction in the default path.
#
# Steps (each idempotent on its own):
# 1. install-msfrpcd.sh — auto-install metasploit-framework via
# Rapid7 omnibus + drop systemd unit
# 2. fetch-metasploitable2.sh — pull the disk image from the
# SourceForge public mirror (TOFU)
# 3. setup_bridge.sh — bring up br-malware host-only bridge
# for callback-payload modules
# 4. Tier-3 verify — fire vsftpd_234_backdoor against the
# freshly-fetched VM, confirm session
# lands and an episode is recorded
# 5. Tier-4 auto-fetch — if MALWAREBAZAAR_API_KEY is set, run
# tools/auto_fetch_samples.py to pull
# one real binary per sample family and
# update samples/manifest.toml
#
# Inputs (env, all optional):
# SKIP_VERIFY — set to skip the live Tier-3 fire test
# SKIP_BRIDGE — set to skip bridge setup (limits to non-callback modules)
# SKIP_TIER4 — set to skip the Tier-4 auto-fetch even if API key present
# MALWAREBAZAAR_API_KEY — opt-in: present means run Tier-4 fetch
#
# Run as root from anywhere on the lab host. Sub-scripts handle their
# own root checks.
set -euo pipefail
REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
INSTALL_ROOT="${INSTALL_ROOT:-/opt/cis490}"
DATA_ROOT="${DATA_ROOT:-/var/lib/cis490}"
ETC_ROOT="${ETC_ROOT:-/etc/cis490}"
log() { printf '[install-tier-3-4] %s\n' "$*" >&2; }
die() { log "FATAL: $*"; exit 1; }
[[ $EUID -eq 0 ]] || die "must run as root"
# Resolve script paths — prefer $INSTALL_ROOT (production) over
# $REPO_ROOT (dev clone) so a re-run under systemd uses the same
# scripts the orchestrator does.
script_path() {
local name="$1"
if [[ -x "$INSTALL_ROOT/scripts/$name" ]]; then echo "$INSTALL_ROOT/scripts/$name"; return
elif [[ -x "$REPO_ROOT/scripts/$name" ]]; then echo "$REPO_ROOT/scripts/$name"; return
elif [[ -x "$INSTALL_ROOT/vm/$name" ]]; then echo "$INSTALL_ROOT/vm/$name"; return
elif [[ -x "$REPO_ROOT/vm/$name" ]]; then echo "$REPO_ROOT/vm/$name"; return
else die "$name not found in $INSTALL_ROOT or $REPO_ROOT"
fi
}
# --- 1. msfrpcd --------------------------------------------------------
log "[1/5] install metasploit-framework + msfrpcd unit"
"$(script_path install-msfrpcd.sh)"
if ! systemctl is-active --quiet cis490-msfrpcd; then
log "starting cis490-msfrpcd"
systemctl enable --now cis490-msfrpcd
fi
sleep 3
if ! ss -ltn 2>/dev/null | grep -q ':55553'; then
log "cis490-msfrpcd not listening on 127.0.0.1:55553 yet — waiting up to 30s"
for _ in $(seq 1 30); do
ss -ltn 2>/dev/null | grep -q ':55553' && break
sleep 1
done
fi
ss -ltn 2>/dev/null | grep -q ':55553' || \
die "msfrpcd never bound to :55553 — check 'journalctl -u cis490-msfrpcd'"
log "msfrpcd ✓"
# --- 2. metasploitable2 image ------------------------------------------
log "[2/5] fetch Metasploitable2 disk image"
OUT_DIR="$DATA_ROOT/vm/images"
install -d -m 0755 -o cis490 -g cis490 "$OUT_DIR"
OUT_DIR="$OUT_DIR" "$(script_path fetch-metasploitable2.sh)"
chown cis490:cis490 "$OUT_DIR/metasploitable2.qcow2" 2>/dev/null || true
log "metasploitable2.qcow2 ✓"
# --- 3. bridge ---------------------------------------------------------
if [[ -z "${SKIP_BRIDGE:-}" ]]; then
log "[3/5] bring up br-malware host-only bridge"
# Report success only when setup_bridge.sh actually succeeded.
if "$(script_path setup_bridge.sh)"; then
log "br-malware ✓"
else
log "bridge setup failed (non-fatal); only non-callback modules will fire"
fi
else
log "[3/5] SKIP_BRIDGE set; limiting to non-callback modules"
fi
# --- 4. Tier-3 verify --------------------------------------------------
if [[ -z "${SKIP_VERIFY:-}" ]]; then
log "[4/5] verify Tier-3 fire (vsftpd_234_backdoor)"
set -a
# shellcheck disable=SC1091
. "$ETC_ROOT/msfrpc.env"
set +a
PY="$INSTALL_ROOT/.venv/bin/python"
[[ -x "$PY" ]] || PY="$(command -v python3)"
if ! sudo -E -u cis490 "$PY" "$INSTALL_ROOT/tools/run_tier3_demo.py" \
--module vsftpd_234_backdoor \
--target-port 21 \
--target-boot-timeout 240 \
> /tmp/cis490-tier3-verify.log 2>&1; then
log "verify run failed — log at /tmp/cis490-tier3-verify.log; dumping last 30 lines:"
tail -30 /tmp/cis490-tier3-verify.log >&2 || true
die "Tier-3 fire failed"
fi
if grep -q '^episode_id = ' /tmp/cis490-tier3-verify.log; then
log "Tier-3 verified ✓ ($(grep '^episode_id = ' /tmp/cis490-tier3-verify.log))"
else
log "verify run finished but no episode_id seen — log at /tmp/cis490-tier3-verify.log"
fi
else
log "[4/5] SKIP_VERIFY set"
fi
# --- 5. Tier-4 auto-fetch ----------------------------------------------
if [[ -z "${SKIP_TIER4:-}" && -n "${MALWAREBAZAAR_API_KEY:-}" ]]; then
log "[5/5] Tier-4 auto-fetch (MALWAREBAZAAR_API_KEY set)"
PY="$INSTALL_ROOT/.venv/bin/python"
[[ -x "$PY" ]] || PY="$(command -v python3)"
sudo -E -u cis490 "$PY" "$INSTALL_ROOT/tools/auto_fetch_samples.py" || \
log "Tier-4 auto-fetch failed (non-fatal) — Tier 3 still active"
elif [[ -z "${MALWAREBAZAAR_API_KEY:-}" ]]; then
log "[5/5] Tier-4 skipped — set MALWAREBAZAAR_API_KEY to enable real-binary fetch"
else
log "[5/5] SKIP_TIER4 set"
fi
log ""
log "================================================================="
log " Tier-3 deploy complete on $(hostname)"
log "================================================================="
log " - metasploit-framework + cis490-msfrpcd.service active"
log " - $OUT_DIR/metasploitable2.qcow2 staged"
log " - bridge: $(ip link show br-malware >/dev/null 2>&1 && echo up || echo skipped)"
log " - Tier-4: $(ls "$INSTALL_ROOT/samples/store/" 2>/dev/null | wc -l) real binaries staged"
log ""
log " Restart the orchestrator so the next wave runs Tier-3:"
log " sudo systemctl restart cis490-orchestrator"
log "================================================================="

199
tools/auto_fetch_samples.py Normal file

@@ -0,0 +1,199 @@
"""``cis490-auto-fetch-samples`` — pull one real binary per manifest
family from MalwareBazaar and update ``samples/manifest.toml``.
The selection is automatic: for each entry in ``samples/manifest.toml``
that doesn't already have a sha256, we query MalwareBazaar for a
recent sample whose ``signature`` field matches the entry's ``family``
(e.g. ``family = "XMRig"`` matches MB signature ``XMRig``). The first
result is downloaded via ``tools.fetch_sample.fetch_sample``, the
sha256 lands in ``samples/store/<sha256>``, and the manifest entry
gains ``source``, ``sha256``, and ``url`` fields.
Idempotent: entries that already have a sha256 are skipped. Manifest
edits are atomic (tempfile + os.replace) and preserve the file's
ownership and mode.
Run on the lab host as root (or as the cis490 service user, if it
has write permission to ``samples/``):
MALWAREBAZAAR_API_KEY=<key> \\
sudo -E -u cis490 /opt/cis490/.venv/bin/python \\
/opt/cis490/tools/auto_fetch_samples.py
Without an API key, exits 0 with no work done, which keeps the install
script's call site uncomplicated.
"""
from __future__ import annotations
import argparse
import json
import logging
import os
import sys
import urllib.parse
import urllib.request
from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(REPO_ROOT))
sys.path.insert(0, str(REPO_ROOT / "tools"))
from samples.manifest import SampleManifest # noqa: E402
# fetch_sample is a sibling tool — load via its module path.
import importlib.util # noqa: E402
_spec = importlib.util.spec_from_file_location(
"fetch_sample", REPO_ROOT / "tools" / "fetch_sample.py"
)
_fetch_sample = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(_fetch_sample)
log = logging.getLogger("cis490.auto_fetch_samples")
MB_ENDPOINT = "https://mb-api.abuse.ch/api/v1/"
def query_mb_by_signature(signature: str, api_key: str, *, limit: int = 5,
timeout_s: float = 30.0) -> list[dict]:
"""Return up to ``limit`` recent MB samples whose signature matches.
Uses the ``get_siginfo`` query, which returns the latest samples
for a given Yara/community signature. Falls back to an empty list
on any error so the caller can move on to the next family."""
body = urllib.parse.urlencode({
"query": "get_siginfo",
"signature": signature,
"limit": str(limit),
}).encode()
req = urllib.request.Request(
MB_ENDPOINT, data=body,
headers={"Auth-Key": api_key},
)
try:
with urllib.request.urlopen(req, timeout=timeout_s) as r:
payload = json.loads(r.read().decode("utf-8"))
except Exception as e:
log.warning("MB get_siginfo(%r) failed: %s", signature, e)
return []
if payload.get("query_status") != "ok":
log.warning("MB returned %r for signature %r",
payload.get("query_status"), signature)
return []
rows = payload.get("data") or []
return rows if isinstance(rows, list) else []
def update_manifest_entry(manifest_path: Path, name: str,
source: str, sha256: str, url: str) -> None:
"""In-place add ``source`` / ``sha256`` / ``url`` to the entry
whose ``name`` matches. Preserves ownership and mode across the
tempfile-replace dance."""
text = manifest_path.read_text()
needle = f'name = "{name}"'
idx = text.find(needle)
if idx < 0:
raise ValueError(f"name = {name!r} not found in {manifest_path}")
# Find the end of this [[sample]] block (next "[[" or EOF).
next_block = text.find("[[", idx + len(needle))
end = next_block if next_block != -1 else len(text)
block = text[idx:end]
# Skip if already has sha256.
if "sha256 =" in block and "TBD" not in block:
log.info("entry %s already has sha256; skipping", name)
return
# Insert the three new lines before the description (or at end).
insert = (
f'source = "{source}"\n'
f'sha256 = "{sha256}"\n'
f'url = "{url}"\n'
)
desc_idx = block.find("description = ")
if desc_idx >= 0:
new_block = block[:desc_idx] + insert + block[desc_idx:]
else:
new_block = block.rstrip() + "\n" + insert + "\n"
new_text = text[:idx] + new_block + text[end:]
st = manifest_path.stat()
tmp = manifest_path.with_suffix(".toml.partial")
tmp.write_text(new_text)
os.replace(tmp, manifest_path)
try:
os.chown(manifest_path, st.st_uid, st.st_gid)
except (PermissionError, OSError):
pass
os.chmod(manifest_path, st.st_mode & 0o7777)
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser(prog="cis490-auto-fetch-samples")
p.add_argument("--manifest",
default=str(REPO_ROOT / "samples" / "manifest.toml"))
p.add_argument("--store-root",
default=str(REPO_ROOT / "samples" / "store"))
p.add_argument("--limit-per-family", type=int, default=1,
help="how many MB candidates to query per family (only the first usable one is fetched)")
p.add_argument("--dry-run", action="store_true")
args = p.parse_args(argv)
logging.basicConfig(level=logging.INFO,
format="%(asctime)s %(levelname)s %(message)s")
api_key = _fetch_sample._read_api_key(REPO_ROOT)
if not api_key:
log.warning("MALWAREBAZAAR_API_KEY not set — nothing to do")
return 0
manifest_path = Path(args.manifest)
store_root = Path(args.store_root)
manifest = SampleManifest.load(manifest_path)
fetched = 0
skipped = 0
failed = 0
for sample in manifest.samples:
if sample.sha256:
log.info("%s: already real (sha256=%s); skipping",
sample.name, sample.sha256[:12])
skipped += 1
continue
log.info("%s: querying MB for family=%r", sample.name, sample.family)
rows = query_mb_by_signature(sample.family, api_key,
limit=args.limit_per_family)
if not rows:
log.warning("%s: no MB matches for family=%r — leaving as mimic",
sample.name, sample.family)
failed += 1
continue
# Pick the first non-corrupt-looking row that has a sha256.
chosen = next((r for r in rows if r.get("sha256_hash")), None)
if not chosen:
log.warning("%s: MB rows had no sha256_hash — skipping", sample.name)
failed += 1
continue
sha = chosen["sha256_hash"].lower()
url = f"https://bazaar.abuse.ch/sample/{sha}/"
if args.dry_run:
log.info("%s [dry-run]: would fetch %s", sample.name, sha)
continue
try:
_fetch_sample.fetch_sample(sha, store_root, api_key)
update_manifest_entry(manifest_path, sample.name,
source="MalwareBazaar", sha256=sha, url=url)
log.info("%s: fetched + manifest updated (sha256=%s)",
sample.name, sha[:12])
fetched += 1
except Exception as e:
log.warning("%s: fetch failed: %s — leaving as mimic", sample.name, e)
failed += 1
log.info("done: fetched=%d skipped=%d failed=%d", fetched, skipped, failed)
return 0 if (failed == 0 or fetched > 0) else 1
if __name__ == "__main__":
sys.exit(main())