diff --git a/AGENTS.md b/AGENTS.md index e4f7668..6549ebb 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -78,121 +78,96 @@ common silent failures it catches: `cis490-doctor --json` is machine-readable for use by other agents. -## Tier 3 + Tier 4 deploy (lab-host one-time, ~20 min) +## Tier 3 + Tier 4 deploy (zero-touch via install-lab-host.sh) -Tier 2 is the default after `install-lab-host.sh`: real Alpine guest, -mimic in-guest workloads. To get **real exploit fire** (Tier 3) and -**real malware execution** (Tier 4), each lab host needs three more -artifacts. The fleet runner auto-detects Tier-3 readiness via the -`_msfrpcd_available()` probe — once msfrpcd is up, episodes flip to -Tier 3 with no orchestrator config change. +`install-lab-host.sh` runs Tier-3 deploy automatically on its second +pass (after the mTLS cert lands). No operator interaction is needed: +metasploit-framework auto-installs via the Rapid7 omnibus, the +Metasploitable2 image auto-fetches from a public mirror with TOFU +sha256 pinning, the host-only bridge auto-comes-up, and a live +exploit fire is verified before the script returns. -### Prerequisites (per lab host) +To re-run the deploy by hand or on a host where Tier 3 was skipped: ```sh -# 1. Install Metasploit Framework + msfrpcd. Idempotent; ~1 GiB -# download the first time. Drops a strong password at -# /etc/cis490/msfrpc.env (mode 0640, root:cis490) and a systemd -# unit cis490-msfrpcd.service bound to 127.0.0.1:55553. -sudo /opt/cis490/scripts/install-msfrpcd.sh -sudo systemctl enable --now cis490-msfrpcd.service -systemctl is-active cis490-msfrpcd.service # → active - -# 2. Fetch Metasploitable2 qcow2. Rapid7's official download is -# registration-walled; supply the URL+sha256 you obtained from -# your registration. Conversion from VMDK → qcow2 happens -# automatically. Lands at /var/lib/cis490/vm/images/metasploitable2.qcow2. 
-IMAGE_URL='' \ -IMAGE_SHA256='' \ -sudo OUT_DIR=/var/lib/cis490/vm/images \ - /opt/cis490/scripts/fetch-metasploitable2.sh - -# 3. (Optional but recommended.) Bring up the host-only bridge -# `br-malware` so callback-payload exploits (3 of the 5 bundled -# modules require it: distccd_command_exec, php_cgi_arg_injection, -# unreal_ircd_3281_backdoor) can land. Without the bridge, the -# fleet auto-restricts to non-callback modules -# (vsftpd_234_backdoor, samba_usermap_script). -sudo /opt/cis490/scripts/setup_bridge.sh +sudo /opt/cis490/scripts/install-tier-3-4.sh ``` -### Verify Tier-3 fire end-to-end +It's idempotent — re-running on an already-deployed host is a no-op +except for the verify step. Inputs are all optional env vars: -```sh -# This runs ONE Tier-3 episode in the foreground using whatever -# module + sample the deterministic selector picks for slot=0, -# episode=0 on this host. Should print `module = exploit/...`, -# fire it via msfrpcd, and a normal episode summary at the end. -sudo -u cis490 \ - MSFRPC_PASSWORD="$(. /etc/cis490/msfrpc.env; echo $MSFRPC_PASSWORD)" \ - /opt/cis490/.venv/bin/python \ - /opt/cis490/tools/run_tier3_demo.py \ - --module vsftpd_234_backdoor \ - --target-port 21 --target-boot-timeout 240 -``` +| var | effect | +|---|---| +| `SKIP_VERIFY` | skip the live `vsftpd_234_backdoor` smoke run | +| `SKIP_BRIDGE` | skip `br-malware` setup (limits to 2 of 5 modules) | +| `SKIP_TIER4` | skip the Tier-4 auto-fetch even if API key present | +| `MALWAREBAZAAR_API_KEY` | opt-in: present means Tier-4 auto-fetch runs | -If the run prints `module loaded: vsftpd_234_backdoor (exploit/unix/ftp/...)` -and `episode_id = 01...` at the end, Tier 3 is live. The orchestrator's -next wave will use Tier 3 for every episode. +The fleet runner auto-detects Tier-3 readiness via +`orchestrator/fleet.py::_msfrpcd_available()`. 
Once +`cis490-msfrpcd.service` is up and `metasploitable2.qcow2` is on +disk, the next wave produces Tier-3 episodes (`meta.exploit.module_name` +populated). No orchestrator restart is required, but a restart speeds +up the switch. -### Tier-4 (real malware execution) +### Tier-4 (real malware execution) is opt-in, also push-button -Tier 4 layers on top of Tier 3 — the exploit lands a session, then a -real binary is uploaded via the chunked path and executed inside the -session. Two prerequisites: +Set `MALWAREBAZAAR_API_KEY` (free signup at https://bazaar.abuse.ch/) +before running `install-tier-3-4.sh` and step 5 runs +`tools/auto_fetch_samples.py` automatically: -```sh -# 1. Add MalwareBazaar API key (free signup at https://bazaar.abuse.ch/). -echo "$MB_KEY" | sudo install -m 0600 -o cis490 -g cis490 /dev/stdin \ - /opt/cis490/samples/.bazaar.token +1. For each `[[sample]]` in `samples/manifest.toml` without a + `sha256`, query MalwareBazaar by `family` (signature match) +2. Download the first matching binary (sha256-verified on the way in) +3. Edit the manifest in place — add `source`, `sha256`, `url` +4. Episodes that select that sample now run the real binary via the + chunked-upload path (`exploits.driver._resolve_workload`) -# 2. Pick a sha256 for one of your sample families from MalwareBazaar -# and download the binary. Verifies sha256 on the way in; lands at -# /opt/cis490/samples/store/. -sudo -u cis490 /opt/cis490/.venv/bin/python \ - /opt/cis490/tools/fetch_sample.py <64-hex-sha256> - -# 3. Edit /opt/cis490/samples/manifest.toml: add `source`, `sha256`, -# and `url` fields to the matching entry. The orchestrator's next -# selection that hits that sample will use the real binary -# (sample.kind == "real") — meta.sample.sha256 records it for the -# trainer. - -sudo systemctl restart cis490-orchestrator -``` +The mimic profile remains the fallback for episodes that select a +sample whose binary isn't on disk. 
Trainers filter on +`meta.sample.kind ∈ {"real", "mimic"}`. ### Confirm Tier 3+4 are flowing ```sh -# On the Pi: -sudo -u cis490 /opt/cis490/.venv/bin/python -c " -import json -real = mimic = 0 -modules = set() -for line in open('/var/lib/cis490/index.jsonl'): - pass # use the prune classifier instead -" -# or (better) rerun the diversity audit: -# the multi-host audit script the maintainer keeps for spot-checking +# On the Pi maintainer side: +sudo python3 -c " +import json, glob, subprocess, tarfile, io +from collections import Counter +mods = Counter(); kinds = Counter() +for tar in glob.glob('/var/lib/cis490/episodes/*/*.tar.zst'): + z = subprocess.check_output(['zstd','-q','-d','--stdout',tar],stderr=subprocess.DEVNULL) + with tarfile.open(fileobj=io.BytesIO(z)) as t: + for m in t.getmembers(): + if m.name.endswith('meta.json') and m.isfile(): + meta = json.load(t.extractfile(m)) + mods[(meta.get('exploit') or {}).get('module_name','')] += 1 + kinds[(meta.get('sample') or {}).get('kind','')] += 1 + break +print('exploit modules used:', dict(mods)) +print('sample kinds:', dict(kinds)) +" ``` -The on-Pi maintainer will see `meta.exploit.module_name` populated -and `meta.sample.kind == "real"` for the new episodes. If those stay -null after deploy, file an issue with the `journalctl -u cis490-orchestrator` -trace from the failed wave. +If `mods` is `{'': N}` and `kinds` is `{'mimic': N}`, Tier 3 +hasn't kicked in yet on any lab host — re-run +`install-tier-3-4.sh` there. If `mods` shows +`{'vsftpd_234_backdoor': N, ...}` and `kinds` shows a non-zero +`'real'` count, both tiers are live. ### Don't shortcut - DO NOT install `metasploit-framework` system-wide outside - install-msfrpcd.sh. The script wires the systemd unit + creds; a - manual install will work but the orchestrator's `_msfrpcd_available()` - probe expects the unit to be on 127.0.0.1:55553 with a password at - `/etc/cis490/msfrpc.env`. + `install-msfrpcd.sh`. 
The script wires the systemd unit + creds; + a manual install bypasses the orchestrator's + `_msfrpcd_available()` probe. - DO NOT push your MalwareBazaar API key. `samples/.bazaar.token` - is gitignored; the env var path is preferred for one-shot fetches. -- DO NOT add bogus sha256 entries to `manifest.toml` — every sha256 - is verified on download. A wrong value breaks `fetch_sample.py` - and confuses every replay. + is gitignored. +- DO NOT add bogus sha256 entries to `manifest.toml` — + `fetch_sample.py` hash-verifies every download. +- DO NOT add real-binary entries by hand when `auto_fetch_samples.py` + exists. Hand-edits are racy with the auto-fetcher's tempfile + rewrite. ## Securing the connection (mTLS) — DO NOT mint your own certs diff --git a/scripts/fetch-metasploitable2.sh b/scripts/fetch-metasploitable2.sh index 253962e..80275ff 100755 --- a/scripts/fetch-metasploitable2.sh +++ b/scripts/fetch-metasploitable2.sh @@ -1,26 +1,34 @@ #!/usr/bin/env bash -# Fetch + sha256-verify the Metasploitable2 disk image. +# Fetch the Metasploitable2 disk image with no operator interaction. # -# Rapid7's official download is gated behind a registration form, so -# we accept the URL + sha256 from env vars (with sane defaults pointing -# at a public mirror). The user installs this once per lab host. +# Defaults to the SourceForge public mirror — the canonical +# freely-redistributable copy of Metasploitable2 (Rapid7's own +# download is registration-walled but the same VMDK is on +# SourceForge, downloaded ~2M times). HTTPS protects the transport. # -# Inputs (env): -# IMAGE_URL — direct download URL for the metasploitable2 archive -# IMAGE_SHA256 — expected sha256 of the archive -# OUT_DIR — where to drop the qcow2 (default vm/images/) +# Idempotent: if the qcow2 is already on disk we do nothing. +# +# Inputs (env, all optional): +# IMAGE_URL — override the default mirror URL +# IMAGE_SHA256 — verify against this hash. 
If unset and a sha256 +# has been recorded by a prior successful fetch +# ($OUT_DIR/metasploitable2.qcow2.sha256), use that. +# If neither is available, do TOFU (trust on first +# use): record the hash of what was downloaded so +# subsequent runs verify against it. +# OUT_DIR — where to drop the qcow2 (default vm/images/) # # Outputs: -# $OUT_DIR/metasploitable2.qcow2 — converted from the original VMDK -# if needed. -# -# We do NOT bake an image url+hash into the repo because the canonical -# distribution is a registration-walled zip on Rapid7. Operators must -# supply both; the rest is mechanical. +# $OUT_DIR/metasploitable2.qcow2 — the disk image +# $OUT_DIR/metasploitable2.qcow2.sha256 — recorded archive hash set -euo pipefail -IMAGE_URL="${IMAGE_URL:-}" +# SourceForge public mirror. Direct-download URL — no auth, no +# registration. /download 302s to a regional mirror. +DEFAULT_IMAGE_URL='https://downloads.sourceforge.net/project/metasploitable/Metasploitable2/metasploitable-linux-2.0.0.zip' + +IMAGE_URL="${IMAGE_URL:-$DEFAULT_IMAGE_URL}" IMAGE_SHA256="${IMAGE_SHA256:-}" OUT_DIR="${OUT_DIR:-$(cd "$(dirname "$0")/../vm/images" 2>/dev/null && pwd)}" WORK_DIR="${WORK_DIR:-/tmp/cis490-metasploitable-fetch}" @@ -28,26 +36,44 @@ WORK_DIR="${WORK_DIR:-/tmp/cis490-metasploitable-fetch}" log() { printf '[fetch-metasploitable2] %s\n' "$*" >&2; } die() { log "FATAL: $*"; exit 1; } -[[ -n "$IMAGE_URL" ]] || die "set IMAGE_URL to the Metasploitable2 download URL" -[[ -n "$IMAGE_SHA256" ]] || die "set IMAGE_SHA256 to the expected sha256 of the archive" - mkdir -p "$OUT_DIR" "$WORK_DIR" +# Short-circuit if the qcow2 is already on disk. +if [[ -f "$OUT_DIR/metasploitable2.qcow2" ]]; then + log "$OUT_DIR/metasploitable2.qcow2 already present; nothing to do" + exit 0 +fi + +# Use the recorded sha256 from a prior successful fetch if present +# and the env var didn't override it. This pins TOFU across runs +# so a tampered re-download fails noisily. 
+SHA_FILE="$OUT_DIR/metasploitable2.qcow2.sha256" +if [[ -z "$IMAGE_SHA256" && -f "$SHA_FILE" ]]; then + IMAGE_SHA256="$(awk '{print $1}' "$SHA_FILE")" + log "using pinned sha256 from $SHA_FILE: $IMAGE_SHA256" +fi + ARCHIVE="$WORK_DIR/$(basename "$IMAGE_URL")" log "downloading $IMAGE_URL → $ARCHIVE" if [[ -f "$ARCHIVE" ]]; then - log "archive already present; skipping download" + log "archive already present in work dir; skipping download" else + # -L follows SourceForge's redirect to the actual mirror. curl -fL --retry 3 --retry-delay 5 -o "$ARCHIVE.partial" "$IMAGE_URL" mv "$ARCHIVE.partial" "$ARCHIVE" fi -log "verifying sha256" ACTUAL="$(sha256sum "$ARCHIVE" | awk '{print $1}')" -if [[ "$ACTUAL" != "$IMAGE_SHA256" ]]; then - die "sha256 mismatch: expected $IMAGE_SHA256, got $ACTUAL" +if [[ -n "$IMAGE_SHA256" ]]; then + if [[ "$ACTUAL" != "$IMAGE_SHA256" ]]; then + die "sha256 mismatch: expected $IMAGE_SHA256, got $ACTUAL" + fi + log "sha256 ok" +else + log "no sha256 supplied — first-run TOFU; pinning $ACTUAL for future runs" fi -log "sha256 ok" +# Always (re)record so future runs verify against the working hash. +echo "$ACTUAL $(basename "$ARCHIVE")" > "$SHA_FILE" # Extract — handle either zip or 7z, since various mirrors choose one # or the other. @@ -65,5 +91,8 @@ log "converting $VMDK → qcow2" command -v qemu-img >/dev/null || die "qemu-img required (apt install qemu-utils)" qemu-img convert -O qcow2 "$VMDK" "$OUT_DIR/metasploitable2.qcow2" +# Best-effort cleanup of the work dir — keeps lab-host disk clean. +rm -rf "$WORK_DIR" + log "done: $OUT_DIR/metasploitable2.qcow2" log "Tier-3 ready when msfrpcd is up. See scripts/install-msfrpcd.sh." 
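Note on the TOFU pinning that `fetch-metasploitable2.sh` gains above: the following is a minimal Python sketch of the same decision order (explicit hash first, then a previously pinned hash, then trust on first use). The names `sha256_of` and `verify_or_pin` are hypothetical and not part of this change; the script itself stays in bash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through sha256 so large archives are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_or_pin(archive: Path, pin_file: Path, expected: str = "") -> str:
    """Hypothetical helper mirroring the script's order of precedence.

    1. An explicit `expected` hash wins (IMAGE_SHA256 in the script).
    2. Otherwise a hash recorded by an earlier run is used, if present.
    3. Otherwise nothing is checked and the observed hash is pinned (TOFU).
    """
    actual = sha256_of(archive)
    if not expected and pin_file.is_file():
        parts = pin_file.read_text().split()
        expected = parts[0] if parts else ""
    if expected and actual != expected:
        raise ValueError(f"sha256 mismatch: expected {expected}, got {actual}")
    # Record only after verification passed (or on first use), in the
    # "<hash>  <name>" layout that sha256sum produces.
    pin_file.write_text(f"{actual}  {archive.name}\n")
    return actual
```

Because the pin is written only after the comparison succeeds, a mismatching re-download raises without overwriting the recorded hash.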
diff --git a/scripts/install-lab-host.sh b/scripts/install-lab-host.sh index e0ad874..386eefc 100755 --- a/scripts/install-lab-host.sh +++ b/scripts/install-lab-host.sh @@ -203,6 +203,28 @@ install -d -o "$SERVICE_USER" -g "$SERVICE_USER" -m 0755 "$INSTALL_ROOT/vm/image ln -sf "$ALPINE_IMG" "$INSTALL_ROOT/vm/images/alpine-baseline.qcow2" 2>/dev/null || true ln -sf "$CIDATA_ISO" "$INSTALL_ROOT/vm/images/cidata.iso" 2>/dev/null || true +# --- 8. Tier-3 + Tier-4 deploy (auto, idempotent) ---------------------- +# Bring up msfrpcd + Metasploitable2 + bridge + verify. Skipped only if +# certs aren't on disk yet (Tier-3 fire writes episodes that the +# shipper ships, so it's pointless to run before mTLS is live) or the +# operator passed --skip-tier3. +SKIP_TIER3="${SKIP_TIER3:-}" +for arg in "$@"; do + [[ "$arg" == "--skip-tier3" ]] && SKIP_TIER3=1 +done +if [[ -z "$SKIP_TIER3" && -f "$ETC_ROOT/certs/lab-host.pem" ]]; then + log "deploying Tier 3 (msfrpcd + Metasploitable2 + bridge)" + if "$INSTALL_ROOT/scripts/install-tier-3-4.sh"; then + log "Tier-3 deploy ✓" + else + log "WARN: Tier-3 deploy failed — Tier 2 will keep running." + log " Re-run later: sudo $INSTALL_ROOT/scripts/install-tier-3-4.sh" + fi +elif [[ -z "$SKIP_TIER3" ]]; then + log "skipping Tier-3 deploy (no mTLS cert yet — re-run this script after" + log "host_id is set so the cert auto-fetches first)" +fi + if [[ "$NEW_INSTALL" == "1" ]]; then log "" log "=================================================================" diff --git a/scripts/install-msfrpcd.sh b/scripts/install-msfrpcd.sh index 8fb9b57..ec7477a 100755 --- a/scripts/install-msfrpcd.sh +++ b/scripts/install-msfrpcd.sh @@ -32,33 +32,55 @@ die() { log "FATAL: $*"; exit 1; } command -v systemctl >/dev/null || die "systemd not found" # --- 1. install metasploit-framework ----------------------------------- -if ! 
command -v msfrpcd >/dev/null; then - log "msfrpcd not found; installing metasploit-framework" +# Auto-install paths per package manager. Rapid7's omnibus installer +# is the canonical zero-touch path for Debian/Ubuntu — it adds the +# apt repo, the GPG key, and apt-installs the framework. Other +# distros use their native package or fall back to the omnibus shell +# script. +if ! command -v msfrpcd >/dev/null && [[ ! -x /opt/metasploit-framework/bin/msfrpcd ]]; then + log "msfrpcd not found; installing metasploit-framework (~1 GiB)" if command -v apt-get >/dev/null; then - # The Debian/Ubuntu metasploit-framework package isn't in - # the default repos for most distros. Use Rapid7's official - # nightly installer when available. - if [[ ! -x /opt/metasploit-framework/bin/msfrpcd ]]; then - log "fetching Rapid7 nightly installer" - curl -fsSL https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb \ - -o /tmp/msfinstall.sh || true - log "automated install not available — install manually:" - log " https://docs.metasploit.com/docs/using-metasploit/getting-started/nightly-installers.html" - die "rerun once msfrpcd is on PATH" - fi - # Symlink the wrapper so ``msfrpcd`` is on PATH. - ln -sf /opt/metasploit-framework/bin/msfrpcd /usr/local/bin/msfrpcd + # Rapid7's omnibus installer wraps the apt-repo + GPG-key + # bootstrap + apt install in a single script. We fetch and + # exec it non-interactively. The script does: + # 1. add apt.metasploit.com to /etc/apt/sources.list.d/ + # 2. install the GPG key + # 3. 
apt-get install -y metasploit-framework + log "running Rapid7 omnibus installer" + TMP="$(mktemp -d)" + curl -fsSL \ + https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb \ + -o "$TMP/msfinstall" + chmod +x "$TMP/msfinstall" + DEBIAN_FRONTEND=noninteractive "$TMP/msfinstall" </dev/null elif command -v pacman >/dev/null; then log "pacman -S metasploit" pacman -Sy --noconfirm metasploit elif command -v dnf >/dev/null; then - die "Fedora/RHEL: install metasploit-framework manually, then re-run" + # The omnibus installer also supports rpm distros via the + # same script — it auto-detects and uses dnf/yum. + log "running Rapid7 omnibus installer (dnf path)" + TMP="$(mktemp -d)" + curl -fsSL \ + https://raw.githubusercontent.com/rapid7/metasploit-omnibus/master/config/templates/metasploit-framework-wrappers/msfupdate.erb \ + -o "$TMP/msfinstall" + chmod +x "$TMP/msfinstall" + "$TMP/msfinstall" </dev/null fi fi -command -v msfrpcd >/dev/null || die "msfrpcd still missing after install attempt" +# After install, msfrpcd may live at /opt/metasploit-framework/bin/ +# (omnibus) or on PATH (apt repo). Symlink so callers find it. +if ! command -v msfrpcd >/dev/null; then + if [[ -x /opt/metasploit-framework/bin/msfrpcd ]]; then + ln -sf /opt/metasploit-framework/bin/msfrpcd /usr/local/bin/msfrpcd + fi +fi +command -v msfrpcd >/dev/null || die "msfrpcd still missing after install — see journalctl" # --- 2. generate password ---------------------------------------------- install -d -m 0755 -o root -g root "$ETC_ROOT" diff --git a/scripts/install-tier-3-4.sh b/scripts/install-tier-3-4.sh new file mode 100755 index 0000000..cf8434c --- /dev/null +++ b/scripts/install-tier-3-4.sh @@ -0,0 +1,142 @@ +#!/usr/bin/env bash +# Tier-3 + Tier-4 deploy orchestrator. Idempotent. Zero operator +# interaction in the default path. +# +# Steps (each idempotent on its own): +# 1. install-msfrpcd.sh — auto-install metasploit-framework via +# Rapid7 omnibus + drop systemd unit +# 2. 
fetch-metasploitable2.sh — pull the disk image from the +# SourceForge public mirror (TOFU) +# 3. setup_bridge.sh — bring up br-malware host-only bridge +# for callback-payload modules +# 4. Tier-3 verify — fire vsftpd_234_backdoor against the +# freshly-fetched VM, confirm session +# lands and an episode is recorded +# 5. Tier-4 auto-fetch — if MALWAREBAZAAR_API_KEY is set, run +# tools/auto_fetch_samples.py to pull +# one real binary per sample family and +# update samples/manifest.toml +# +# Inputs (env, all optional): +# SKIP_VERIFY — set to skip the live Tier-3 fire test +# SKIP_BRIDGE — set to skip bridge setup (limits to non-callback modules) +# SKIP_TIER4 — set to skip the Tier-4 auto-fetch even if API key present +# MALWAREBAZAAR_API_KEY — opt-in: present means run Tier-4 fetch +# +# Run as root from anywhere on the lab host. Sub-scripts handle their +# own root checks. + +set -euo pipefail + +REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)" +INSTALL_ROOT="${INSTALL_ROOT:-/opt/cis490}" +DATA_ROOT="${DATA_ROOT:-/var/lib/cis490}" +ETC_ROOT="${ETC_ROOT:-/etc/cis490}" + +log() { printf '[install-tier-3-4] %s\n' "$*" >&2; } +die() { log "FATAL: $*"; exit 1; } + +[[ $EUID -eq 0 ]] || die "must run as root" + +# Resolve script paths — prefer $INSTALL_ROOT (production) over +# $REPO_ROOT (dev clone) so a re-run under systemd uses the same +# scripts the orchestrator does. +script_path() { + local name="$1" + if [[ -x "$INSTALL_ROOT/scripts/$name" ]]; then echo "$INSTALL_ROOT/scripts/$name"; return + elif [[ -x "$REPO_ROOT/scripts/$name" ]]; then echo "$REPO_ROOT/scripts/$name"; return + elif [[ -x "$INSTALL_ROOT/vm/$name" ]]; then echo "$INSTALL_ROOT/vm/$name"; return + elif [[ -x "$REPO_ROOT/vm/$name" ]]; then echo "$REPO_ROOT/vm/$name"; return + else die "$name not found in $INSTALL_ROOT or $REPO_ROOT" + fi +} + +# --- 1. 
msfrpcd -------------------------------------------------------- +log "[1/5] install metasploit-framework + msfrpcd unit" +"$(script_path install-msfrpcd.sh)" + +if ! systemctl is-active --quiet cis490-msfrpcd; then + log "starting cis490-msfrpcd" + systemctl enable --now cis490-msfrpcd +fi +sleep 3 +if ! ss -ltn 2>/dev/null | grep -q ':55553'; then + log "cis490-msfrpcd not listening on 127.0.0.1:55553 yet — waiting up to 30s" + for _ in $(seq 1 30); do + ss -ltn 2>/dev/null | grep -q ':55553' && break + sleep 1 + done +fi +ss -ltn 2>/dev/null | grep -q ':55553' || \ + die "msfrpcd never bound to :55553 — check 'journalctl -u cis490-msfrpcd'" +log "msfrpcd ✓" + +# --- 2. metasploitable2 image ------------------------------------------ +log "[2/5] fetch Metasploitable2 disk image" +OUT_DIR="$DATA_ROOT/vm/images" +install -d -m 0755 -o cis490 -g cis490 "$OUT_DIR" +OUT_DIR="$OUT_DIR" "$(script_path fetch-metasploitable2.sh)" +chown cis490:cis490 "$OUT_DIR/metasploitable2.qcow2" 2>/dev/null || true +log "metasploitable2.qcow2 ✓" + +# --- 3. bridge --------------------------------------------------------- +if [[ -z "${SKIP_BRIDGE:-}" ]]; then + log "[3/5] bring up br-malware host-only bridge" + "$(script_path setup_bridge.sh)" || log "bridge setup failed (non-fatal); only non-callback modules will fire" + log "br-malware ✓" +else + log "[3/5] SKIP_BRIDGE set — limiting to non-callback modules" +fi + +# --- 4. Tier-3 verify -------------------------------------------------- +if [[ -z "${SKIP_VERIFY:-}" ]]; then + log "[4/5] verify Tier-3 fire (vsftpd_234_backdoor)" + set -a + # shellcheck disable=SC1091 + . "$ETC_ROOT/msfrpc.env" + set +a + PY="$INSTALL_ROOT/.venv/bin/python" + [[ -x "$PY" ]] || PY="$(command -v python3)" + if ! 
sudo -E -u cis490 "$PY" "$INSTALL_ROOT/tools/run_tier3_demo.py" \ + --module vsftpd_234_backdoor \ + --target-port 21 \ + --target-boot-timeout 240 \ + > /tmp/cis490-tier3-verify.log 2>&1; then + log "verify run failed — log at /tmp/cis490-tier3-verify.log; dumping last 30 lines:" + tail -30 /tmp/cis490-tier3-verify.log >&2 || true + die "Tier-3 fire failed" + fi + if grep -q '^episode_id = ' /tmp/cis490-tier3-verify.log; then + log "Tier-3 verified ✓ ($(grep '^episode_id = ' /tmp/cis490-tier3-verify.log))" + else + log "verify run finished but no episode_id seen — log at /tmp/cis490-tier3-verify.log" + fi +else + log "[4/5] SKIP_VERIFY set" +fi + +# --- 5. Tier-4 auto-fetch ---------------------------------------------- +if [[ -z "${SKIP_TIER4:-}" && -n "${MALWAREBAZAAR_API_KEY:-}" ]]; then + log "[5/5] Tier-4 auto-fetch (MALWAREBAZAAR_API_KEY set)" + PY="$INSTALL_ROOT/.venv/bin/python" + [[ -x "$PY" ]] || PY="$(command -v python3)" + sudo -E -u cis490 "$PY" "$INSTALL_ROOT/tools/auto_fetch_samples.py" || \ + log "Tier-4 auto-fetch failed (non-fatal) — Tier 3 still active" +elif [[ -z "${MALWAREBAZAAR_API_KEY:-}" ]]; then + log "[5/5] Tier-4 skipped — set MALWAREBAZAAR_API_KEY to enable real-binary fetch" +else + log "[5/5] SKIP_TIER4 set" +fi + +log "" +log "=================================================================" +log " Tier-3 deploy complete on $(hostname)" +log "=================================================================" +log " - metasploit-framework + cis490-msfrpcd.service active" +log " - $OUT_DIR/metasploitable2.qcow2 staged" +log " - bridge: $(ip link show br-malware >/dev/null 2>&1 && echo up || echo skipped)" +log " - Tier-4: $(ls "$INSTALL_ROOT/samples/store/" 2>/dev/null | wc -l) real binaries staged" +log "" +log " Restart the orchestrator so the next wave runs Tier-3:" +log " sudo systemctl restart cis490-orchestrator" +log "=================================================================" diff --git a/tools/auto_fetch_samples.py 
b/tools/auto_fetch_samples.py new file mode 100644 index 0000000..9ab0b9e --- /dev/null +++ b/tools/auto_fetch_samples.py @@ -0,0 +1,199 @@ +"""``cis490-auto-fetch-samples`` — pull one real binary per manifest +family from MalwareBazaar and update ``samples/manifest.toml``. + +The selection is automatic: for each entry in ``samples/manifest.toml`` +that doesn't already have a sha256, we query MalwareBazaar for a +recent sample whose ``signature`` field matches the entry's ``family`` +(e.g. ``family = "XMRig"`` → MB signature ``XMRig``). The first +result is downloaded via ``tools.fetch_sample.fetch_sample``, the +sha256 lands in ``samples/store/``, and the manifest entry +gains ``source``, ``sha256``, and ``url`` fields. + +Idempotent: entries that already have a sha256 are skipped. Manifest +edits are atomic (tempfile + os.replace) and preserve the file's +ownership and mode. + +Run on the lab host as root (or as the cis490 service user, if it +has write permission to ``samples/``): + + MALWAREBAZAAR_API_KEY= \\ + sudo -E -u cis490 /opt/cis490/.venv/bin/python \\ + /opt/cis490/tools/auto_fetch_samples.py + +Without an API key, exits 0 with no work done — keeps the install +script's call site uncomplicated. +""" + +from __future__ import annotations + +import argparse +import json +import logging +import os +import sys +import urllib.parse +import urllib.request +from pathlib import Path + + +REPO_ROOT = Path(__file__).resolve().parent.parent +sys.path.insert(0, str(REPO_ROOT)) +sys.path.insert(0, str(REPO_ROOT / "tools")) + +from samples.manifest import SampleManifest # noqa: E402 + +# fetch_sample is a sibling tool — load via its module path. 
+import importlib.util # noqa: E402 +_spec = importlib.util.spec_from_file_location( + "fetch_sample", REPO_ROOT / "tools" / "fetch_sample.py" +) +_fetch_sample = importlib.util.module_from_spec(_spec) +_spec.loader.exec_module(_fetch_sample) + + +log = logging.getLogger("cis490.auto_fetch_samples") + + +MB_ENDPOINT = "https://mb-api.abuse.ch/api/v1/" + + +def query_mb_by_signature(signature: str, api_key: str, *, limit: int = 5, + timeout_s: float = 30.0) -> list[dict]: + """Return up to ``limit`` recent MB samples whose signature matches. + + Uses the ``get_siginfo`` query, which returns the latest samples + for a given Yara/community signature. Falls back to an empty list + on any error so the caller can move on to the next family.""" + body = urllib.parse.urlencode({ + "query": "get_siginfo", + "signature": signature, + "limit": str(limit), + }).encode() + req = urllib.request.Request( + MB_ENDPOINT, data=body, + headers={"Auth-Key": api_key}, + ) + try: + with urllib.request.urlopen(req, timeout=timeout_s) as r: + payload = json.loads(r.read().decode("utf-8")) + except Exception as e: + log.warning("MB get_siginfo(%r) failed: %s", signature, e) + return [] + if payload.get("query_status") != "ok": + log.warning("MB returned %r for signature %r", + payload.get("query_status"), signature) + return [] + rows = payload.get("data") or [] + return rows if isinstance(rows, list) else [] + + +def update_manifest_entry(manifest_path: Path, name: str, + source: str, sha256: str, url: str) -> None: + """In-place add ``source`` / ``sha256`` / ``url`` to the entry + whose ``name`` matches. Preserves ownership and mode across the + tempfile-replace dance.""" + text = manifest_path.read_text() + needle = f'name = "{name}"' + idx = text.find(needle) + if idx < 0: + raise ValueError(f"name = {name!r} not found in {manifest_path}") + # Find the end of this [[sample]] block (next "[[" or EOF). 
+ next_block = text.find("[[", idx + len(needle)) + end = next_block if next_block != -1 else len(text) + block = text[idx:end] + # Skip if already has sha256. + if "sha256 =" in block and "TBD" not in block: + log.info("entry %s already has sha256; skipping", name) + return + # Insert the three new lines before the description (or at end). + insert = ( + f'source = "{source}"\n' + f'sha256 = "{sha256}"\n' + f'url = "{url}"\n' + ) + desc_idx = block.find("description = ") + if desc_idx >= 0: + new_block = block[:desc_idx] + insert + block[desc_idx:] + else: + new_block = block.rstrip() + "\n" + insert + "\n" + new_text = text[:idx] + new_block + text[end:] + + st = manifest_path.stat() + tmp = manifest_path.with_suffix(".toml.partial") + tmp.write_text(new_text) + os.replace(tmp, manifest_path) + try: + os.chown(manifest_path, st.st_uid, st.st_gid) + except (PermissionError, OSError): + pass + os.chmod(manifest_path, st.st_mode & 0o7777) + + +def main(argv: list[str] | None = None) -> int: + p = argparse.ArgumentParser(prog="cis490-auto-fetch-samples") + p.add_argument("--manifest", + default=str(REPO_ROOT / "samples" / "manifest.toml")) + p.add_argument("--store-root", + default=str(REPO_ROOT / "samples" / "store")) + p.add_argument("--limit-per-family", type=int, default=1, + help="how many real binaries to fetch per family") + p.add_argument("--dry-run", action="store_true") + args = p.parse_args(argv) + + logging.basicConfig(level=logging.INFO, + format="%(asctime)s %(levelname)s %(message)s") + + api_key = _fetch_sample._read_api_key(REPO_ROOT) + if not api_key: + log.warning("MALWAREBAZAAR_API_KEY not set — nothing to do") + return 0 + + manifest_path = Path(args.manifest) + store_root = Path(args.store_root) + manifest = SampleManifest.load(manifest_path) + + fetched = 0 + skipped = 0 + failed = 0 + for sample in manifest.samples: + if sample.sha256: + log.info("%s: already real (sha256=%s); skipping", + sample.name, sample.sha256[:12]) + skipped += 1 + 
continue + log.info("%s: querying MB for family=%r", sample.name, sample.family) + rows = query_mb_by_signature(sample.family, api_key, + limit=args.limit_per_family) + if not rows: + log.warning("%s: no MB matches for family=%r — leaving as mimic", + sample.name, sample.family) + failed += 1 + continue + # Pick the first non-corrupt-looking row that has a sha256. + chosen = next((r for r in rows if r.get("sha256_hash")), None) + if not chosen: + log.warning("%s: MB rows had no sha256_hash — skipping", sample.name) + failed += 1 + continue + sha = chosen["sha256_hash"].lower() + url = f"https://bazaar.abuse.ch/sample/{sha}/" + if args.dry_run: + log.info("%s [dry-run]: would fetch %s", sample.name, sha) + continue + try: + _fetch_sample.fetch_sample(sha, store_root, api_key) + update_manifest_entry(manifest_path, sample.name, + source="MalwareBazaar", sha256=sha, url=url) + log.info("%s: fetched + manifest updated (sha256=%s)", + sample.name, sha[:12]) + fetched += 1 + except Exception as e: + log.warning("%s: fetch failed: %s — leaving as mimic", sample.name, e) + failed += 1 + + log.info("done: fetched=%d skipped=%d failed=%d", fetched, skipped, failed) + return 0 if (failed == 0 or fetched > 0) else 1 + + +if __name__ == "__main__": + sys.exit(main())
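Note on the manifest rewrite in `update_manifest_entry`: the following is a hedged sketch of one variant of the tempfile-plus-`os.replace` pattern, not the code in this diff. It differs in two ways: it uses a unique temp name instead of a fixed `.toml.partial`, and it applies the original mode before the rename rather than after. `atomic_write_text` is a hypothetical helper, and POSIX semantics are assumed.

```python
import contextlib
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, new_text: str) -> None:
    """Replace `path` with `new_text` so readers see the old or the new
    file, never a partial write. Hypothetical helper; assumes POSIX."""
    st = path.stat()
    # The temp file must live in the same directory so that os.replace
    # is a rename within one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".partial"
    )
    try:
        with os.fdopen(fd, "w") as f:
            f.write(new_text)
            f.flush()
            os.fsync(f.fileno())
        # Carry over the permission bits before the file becomes visible
        # under its final name.
        os.chmod(tmp_name, st.st_mode & 0o7777)
        # Ownership is best-effort: a non-root caller may not be allowed
        # to chown to another user.
        with contextlib.suppress(OSError):
            os.chown(tmp_name, st.st_uid, st.st_gid)
        os.replace(tmp_name, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

A unique temp name means two concurrent writers cannot clobber each other's partial file, although the last rename still wins; it does not make hand-edits of the manifest safe.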