CIS490

bolyai/CIS490

Commit graph

Author	SHA1	Message	Date

max

b809e1e26e

auto_fetch_samples: pick Linux i386 ELF; manifest matches theZoo

User caught it: I shipped the theZoo path without running it
end-to-end. A real fetch on the Pi exposed two bugs:

1. Family-name matcher was substring-strict. "Cryptolocker-class"
   wouldn't match the dir "CryptoLocker_22Jan2014" because "-class"
   isn't in the dir name. The matcher now expands the family name
   into a sequence of tokens (full, head-of-dash, head-of-dot,
   head-of-underscore) and tries each. First match wins.

2. Extraction picker was "largest non-text" — a bad heuristic for
   theZoo, where each Linux.* zip often contains MULTIPLE binaries
   for different platforms (Linux i386, x86-64, ARM, FreeBSD, sometimes
   even Windows PE). The largest is rarely the i386 Linux ELF that
   would actually run on Metasploitable2. The picker now sniffs ELF
   magic bytes using only the stdlib and ranks candidates in tiers:
     1. Linux i386 ELF (largest first)
     2. any other ELF (best-effort, may not execute)
     3. largest non-text (Wine fallback)
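
A minimal sketch of the tiered picker described in item 2, not the code
in auto_fetch_samples itself. The function names, the (name, size,
header) tuple shape, the "no NUL byte means text" heuristic, and the
choice to treat OSABI 0 (System V) and 3 (GNU/Linux) as Linux are all
assumptions made for illustration. The header offsets and constants
come from the ELF specification.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELFCLASS32 = 1       # e_ident[4]: 32-bit objects
ELFDATA2LSB = 1      # e_ident[5]: little-endian
EM_386 = 3           # e_machine: Intel 80386
# Assumption: OSABI 0 (System V / unspecified) and 3 (GNU/Linux) count as Linux.
LINUX_OSABIS = {0, 3}


def is_elf(header: bytes) -> bool:
    """True if the buffer starts with the ELF magic bytes."""
    return header[:4] == ELF_MAGIC


def is_linux_i386_elf(header: bytes) -> bool:
    """True for a 32-bit little-endian x86 ELF with a Linux-compatible OSABI."""
    if len(header) < 20 or not is_elf(header):
        return False
    if header[4] != ELFCLASS32 or header[5] != ELFDATA2LSB:
        return False
    if header[7] not in LINUX_OSABIS:
        return False
    (e_machine,) = struct.unpack_from("<H", header, 18)
    return e_machine == EM_386


def looks_like_text(header: bytes) -> bool:
    """Crude heuristic (an assumption): no NUL byte in the sniffed bytes."""
    return b"\x00" not in header


def pick_candidate(candidates):
    """candidates: iterable of (name, size, header_bytes).

    Walks the tiers in order and returns the largest member of the first
    tier that has any match, or None if every file looks like text.
    """
    by_size = sorted(candidates, key=lambda c: c[1], reverse=True)
    tiers = (
        is_linux_i386_elf,                  # 1. Linux i386 ELF
        is_elf,                             # 2. any other ELF
        lambda h: not looks_like_text(h),   # 3. largest non-text
    )
    for matches in tiers:
        for name, _size, header in by_size:
            if matches(header):
                return name
    return None
```

Only the first 20 bytes of each archive member are needed, so a caller
can read the header without extracting the whole file.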

Verified end-to-end on the Pi against a real theZoo clone (~500 MB,
263 family dirs, 2026-05-01 fresh pull):

  linux-encoder-ransomware  → ELF 32-bit Intel i386 SYSV (278 KB)
  linux-wirenet-rat         → ELF 32-bit Intel i386 SYSV (64 KB)
  linux-rex-ransomware      → ELF 32-bit Intel i386 SYSV Go (7.6 MB)
  linux-neurevt-bot         → ELF 32-bit Intel i386 SYSV (3.0 MB)
  linux-earthkrahang-apt    → ELF 32-bit Intel i386 GNU/Linux (5.8 MB)

5/5 picks are runnable Linux i386 ELFs. The in-place manifest rewrite
adds source/sha256/url; meta.sample.kind switches to "real"
automatically.
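
A rough sketch of an in-place manifest rewrite along these lines. The
manifest is assumed to be a JSON list of objects keyed by "name", and
"mode" is read here as the file's permission bits. Both are guesses
about the real format. update_manifest and its fetched argument are
hypothetical names, and the meta.sample.kind switch is not modelled.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def update_manifest(manifest_path: Path, fetched: dict) -> int:
    """Add source/sha256/url to entries that were fetched.

    fetched maps entry name -> (binary_path, source, url).
    Entries that already carry a sha256 are left untouched.
    Returns the number of entries updated.
    """
    original_mode = manifest_path.stat().st_mode & 0o7777
    entries = json.loads(manifest_path.read_text())

    updated = 0
    for entry in entries:
        hit = fetched.get(entry.get("name"))
        if hit is None or entry.get("sha256"):
            continue
        binary_path, source, url = hit
        entry.update(source=source, url=url, sha256=sha256_of(binary_path))
        updated += 1

    # Write next to the manifest, restore the permission bits, then swap
    # the new file into place in a single rename.
    fd, tmp_name = tempfile.mkstemp(dir=manifest_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(entries, fh, indent=2)
        fh.write("\n")
    os.chmod(tmp_name, original_mode)
    os.replace(tmp_name, manifest_path)
    return updated
```

Skipping entries that already have a sha256 keeps a second run from
changing them.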

Manifest rewritten:
  - Old families (XMRig, Mirai, Cryptolocker-class, Dridex, Kovter,
    Reverse-Shell) → mostly absent from theZoo's Linux catalog or
    matched the wrong arch.
  - New families chosen against a verified theZoo presence list:
    Linux.Encoder, Linux.Wirenet, Ransomware.Rex, Neurevt,
    EarthKrahang.
  - XMRig + Kovter remain as mimic-only fallbacks (theZoo lacks a
    runnable Linux i386 binary for these; orchestrator falls back
    to the mimic profile).

Tests added (tests/test_auto_fetch_samples.py): 13 cases covering
ELF magic detection (i386 accepted, FreeBSD/x86-64/ARM/PE32/text
all rejected), family-token expansion (the "-class" suffix bug),
extraction picker (prefers Linux i386 over larger non-Linux ELFs),
and manifest in-place rewrite (preserves mode, skips entries that
already have sha256).
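
For reference, the family-token expansion those tests exercise could
look roughly like the sketch below. family_tokens and match_family_dir
are hypothetical names, and the case-insensitive substring comparison
is an assumption, inferred from "Cryptolocker-class" needing to match
"CryptoLocker_22Jan2014".

```python
from typing import List, Optional


def family_tokens(name: str) -> List[str]:
    """Expand a family name into candidate tokens, most specific first:
    the full name, then the part before the first '-', '.', and '_'."""
    tokens = [name]
    for sep in ("-", ".", "_"):
        head = name.split(sep, 1)[0]
        if head and head not in tokens:
            tokens.append(head)
    return tokens


def match_family_dir(name: str, dirs: List[str]) -> Optional[str]:
    """Return the first directory containing any token; first match wins."""
    for token in family_tokens(name):
        needle = token.lower()
        for candidate in dirs:
            if needle in candidate.lower():
                return candidate
    return None
```

With this ordering the full name is always tried before any shortened
head, so "Linux.Encoder" only falls back to the broad token "Linux" if
no directory contains the full name.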

What's still NOT verified end-to-end (requires a lab host with
KVM x86):
  - Metasploitable2 boot under QEMU
  - vsftpd_234_backdoor exploit fire via msfrpcd
  - chunked binary upload through a real shell session
  - real binary executing inside a Metasploitable2 guest

The Pi is ARM64, so it can't run Metasploitable2. install-tier-3-4.sh's
verify step (run_tier3_demo.py) covers all four on a real lab host;
the deploy runs that verification on first run there.

171/171 tests pass.

2026-05-01 03:28:26 -05:00

1 commit