Compare commits: main...Dev_REL3_0 (8 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0fb2f3b9a6 | | |
| | 656a015443 | | |
| | d294eb9f52 | | |
| | 42626259c7 | | |
| | d2716b485e | | |
| | f4eef81807 | | |
| | ae4b80dc32 | | |
| | 1dd484dd5c | | |
9 changed files with 195 additions and 5 deletions
142
docs/fix-notes-Dev_REL3_050126.md
Normal file
@@ -0,0 +1,142 @@
# Fix Notes — Dev_REL3_050126

Branch HEAD: `656a015443f54dffeab66ae29fa726eee36a51ed`
Date: 2026-05-02
Author: elliott (k-gamingcom lab host)

## Summary

Seven bugs found and fixed during Tier-3 + Tier-4 bring-up on k-gamingcom,
following the AGENTS.md runbook. All fixes are committed to `Dev_REL3_050126`
and deployed to `/opt/cis490`.


---

## Fixes (oldest → newest)

### 1. `cis490-msfrpcd` crashes with EROFS on `/root/.msf4` — commit `1dd484d`

**File:** `scripts/install-msfrpcd.sh`

**Symptom:** The msfrpcd service failed immediately with `EROFS`:
`ProtectHome=true` in the generated systemd unit made `/root` a read-only
overlay, and msfrpcd, whose `$HOME` defaulted to `/root`, could not create
`.msf4/`.

**Fix:** Pre-create `/var/lib/cis490/msf4`, then add
`Environment=HOME=/var/lib/cis490/msf4` and `ReadWritePaths=/var/lib/cis490`
to the generated unit.

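The relationship the fix relies on (the redirected `HOME` must sit under a path the unit may write to) can be checked mechanically. The sketch below is only an illustration: `parse_settings` and `home_is_writable` are hypothetical helpers, the fragment is not the full generated unit, and the parser handles one assignment per `Environment=` line rather than systemd's complete syntax.

```python
from pathlib import PurePosixPath
from typing import Dict, List

# Fragment mirroring the settings named in the fix; not the full generated unit.
UNIT_FRAGMENT = """\
[Service]
Environment=HOME=/var/lib/cis490/msf4
ProtectHome=true
ReadWritePaths=/var/lib/cis490
"""


def parse_settings(unit_text: str) -> Dict[str, List[str]]:
    """Collect Key=Value lines; keys may repeat, so values are lists."""
    settings: Dict[str, List[str]] = {}
    for line in unit_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;[":
            continue  # skip blanks, comments, and section headers
        key, sep, value = line.partition("=")
        if sep:
            settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def home_is_writable(unit_text: str) -> bool:
    """True if HOME is set and lies at or below a ReadWritePaths entry."""
    settings = parse_settings(unit_text)
    home = None
    for assignment in settings.get("Environment", []):
        name, _, value = assignment.partition("=")
        if name == "HOME":
            home = PurePosixPath(value)
    if home is None:
        return False
    writable = [
        PurePosixPath(path)
        for entry in settings.get("ReadWritePaths", [])
        for path in entry.split()
    ]
    return any(home == w or w in home.parents for w in writable)


print(home_is_writable(UNIT_FRAGMENT))
```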

---

### 2. Two Tier-3 install bugs — `metasploitable2` symlink + msfrpcd HOME — commit `ae4b80d`

**File:** `scripts/install-tier-3-4.sh`

**Symptom A:** `install-tier-3-4.sh` fetched the Metasploitable2 image to
`$DATA_ROOT/vm/images/` but never symlinked it into `$INSTALL_ROOT/vm/images/`.
`launch_target.sh` resolved `IMAGE` relative to `$INSTALL_ROOT/vm/images/`,
found no image there, and exited immediately; `qemu.pid` never appeared.

**Fix:** Added an `install -d` + `ln -sf` step after the fetch.

**Symptom B:** The same commit also carries the `HOME` fix from item 1 over
into the install script's live-patch path.

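For readers more at home in Python than shell, the `install -d` + `ln -sf` step amounts to "make sure the directory exists, then point the link at the image, replacing any stale link". A rough equivalent is sketched here; `link_image` is a hypothetical helper, ownership and mode handling are omitted, and the demo runs in a temporary directory rather than under `/opt/cis490`.

```python
import tempfile
from pathlib import Path


def link_image(data_root: Path, install_root: Path,
               name: str = "metasploitable2.qcow2") -> Path:
    """Ensure install_root/vm/images exists, then (re)point the link at the image."""
    source = data_root / "vm" / "images" / name
    link_dir = install_root / "vm" / "images"
    link_dir.mkdir(parents=True, exist_ok=True)   # like `install -d`
    link = link_dir / name
    if link.is_symlink() or link.exists():
        link.unlink()                             # like the -f in `ln -sf`
    link.symlink_to(source)
    return link


# Demo in a throwaway directory so no real install path is touched.
with tempfile.TemporaryDirectory() as tmp:
    data_root = Path(tmp) / "data"
    install_root = Path(tmp) / "install"
    image = data_root / "vm" / "images" / "metasploitable2.qcow2"
    image.parent.mkdir(parents=True)
    image.write_bytes(b"")                        # empty stand-in for the image
    link_image(data_root, install_root)
    link = link_image(data_root, install_root)    # safe to run a second time
    print(link.is_symlink(), link.resolve() == image.resolve())
```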

---

### 3. PORT_BASE=21 is privileged; RPORT not propagated — commit `f4eef81`

**Files:** `vm/launch_target.sh`, `exploits/driver.py`, `tools/run_tier3_demo.py`

**Symptom:** `launch_target.sh` defaulted `PORT_BASE` to `$((21 + SLOT * 100))`.
Slot 0 → port 21, a privileged port (< 1024) that `cis490` (non-root) cannot
bind. QEMU printed `bind(AF_INET, ...): Permission denied` and exited before
booting the guest. Even with a bindable port, `DriverConfig` had no way to
override `RPORT`, so the exploit module would still have connected to port 21
rather than the hostfwd'd port.

**Fix:**
- `launch_target.sh`: `PORT_BASE` default → `$((2121 + SLOT * 100))`
- `DriverConfig`: added `target_port: int | None` field
- `MSFExploitDriver._fire()`: if `target_port` is set and `RPORT` is in the
  rendered options, override `RPORT` with `target_port`
- `run_tier3_demo.py`: pass `target_port=args.target_port` to `DriverConfig`
- `install-tier-3-4.sh` verify call: `--target-port 2121`

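The two halves of this fix (an unprivileged default port, and an override that reaches the module options) can be sketched in a few lines. This is an illustration under assumptions, not the project's code: `default_port_base` and `apply_target_port` are hypothetical names, and the options dictionary is a made-up stand-in for whatever the module renders.

```python
from typing import Dict, Optional, Union

PRIVILEGED_BELOW = 1024  # on Linux, binding below this needs root or CAP_NET_BIND_SERVICE


def default_port_base(slot: int, base: int = 2121) -> int:
    """Default host-side port for a slot: base + slot * 100."""
    return base + slot * 100


def apply_target_port(opts: Dict[str, Union[str, int]],
                      target_port: Optional[int]) -> Dict[str, Union[str, int]]:
    """Return a copy of opts with RPORT replaced when an override is given."""
    out = dict(opts)
    if target_port is not None and "RPORT" in out:
        out["RPORT"] = target_port
    return out


# Old default (base 21): slot 0 lands on a privileged port.
print(default_port_base(0, base=21) < PRIVILEGED_BELOW)
# New default (base 2121): the first few slots.
print([default_port_base(slot) for slot in range(3)])
# Hypothetical rendered options for a module whose static RPORT is 21.
print(apply_target_port({"RHOSTS": "127.0.0.1", "RPORT": 21}, 2121))
```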

---

### 4. `run_tier3_demo.py --data-root` defaults to relative `"data"` — commit `d2716b4`

**File:** `scripts/install-tier-3-4.sh`

**Symptom:** `run_tier3_demo.py` defaults `--data-root` to `"data"` (relative).
When invoked via `sudo -u cis490`, the CWD was `/`, so episode dirs resolved to
`/data/episodes/`, which doesn't exist; `mkdir` raised `PermissionError`.

**Fix:** Pass `--data-root "$DATA_ROOT/data"` explicitly in the install script.

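Why the relative default misbehaves is easiest to see by resolving it against different working directories. In the sketch below, `episodes_dir` is a hypothetical helper, and `/var/lib/cis490` is only an assumed value for `$DATA_ROOT`; the notes do not state the real one.

```python
from pathlib import PurePosixPath


def episodes_dir(data_root: str, cwd: str) -> PurePosixPath:
    """Resolve <data_root>/episodes as a process with the given CWD would."""
    root = PurePosixPath(data_root)
    if not root.is_absolute():
        root = PurePosixPath(cwd) / root   # a relative path depends on the CWD
    return root / "episodes"


# Relative default with CWD "/": lands at the filesystem root.
print(episodes_dir("data", "/"))
# Explicit absolute path (assumed DATA_ROOT value): independent of the CWD.
print(episodes_dir("/var/lib/cis490/data", "/"))
```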

---

### 5. msfrpc bytes/str normalisation — commit `4262625` (closes #20)

**File:** `exploits/msfrpc.py`

**Symptom:** msfrpcd encodes all response strings as the msgpack `bin` type,
which Python's msgpack always unpacks to `bytes`. `unpackb(raw=False)` only
converts the legacy `raw` type (now `str`); `bin` comes out as `bytes`
regardless. `auth.login` received
`{b'result': b'success', b'token': b'TEMP...'}`, so `resp.get("result")`
returned `None` → `MSFRpcError("auth.login failed: ...")`.

**Fix:** Added `_decode_response()`, a recursive `bytes → str` normaliser, and
called it in `_raw_call` immediately after `msgpack.unpackb`.

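The failure mode is reproducible without msfrpcd or msgpack: a dictionary keyed by `bytes` simply has no `str` keys. The sketch below is a standalone illustration; `decode_bytes` is a hypothetical stand-in written for this example, and the token value is a placeholder, not a real token.

```python
from typing import Any


def decode_bytes(value: Any) -> Any:
    """Recursively convert bytes to str inside dicts and lists."""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    if isinstance(value, dict):
        return {decode_bytes(k): decode_bytes(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode_bytes(item) for item in value]
    return value


# Shape of the response described above; the token is a placeholder.
raw = {b"result": b"success", b"token": b"TEMP-placeholder"}

print(raw.get("result"))                 # str key misses the bytes key
print(decode_bytes(raw).get("result"))   # found once keys and values are str
```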

---

### 6. Orchestrator never received `MSFRPC_PASSWORD` — commit `d294eb9`

**File:** `etc/cis490-orchestrator.service`

**Symptom:** The orchestrator unit only loaded `lab-host.env`, which contains
`FLEET_HOST_ID` and `BRIDGE` but not `MSFRPC_PASSWORD`. `run_tier3_demo.py`
checks for the env var at startup and exits with `rc=2` immediately if it is
unset, so all tier3 slots were failing in ~240 ms with `rc=2`.

**Fix:** Added `EnvironmentFile=-/etc/cis490/msfrpc.env` to the unit (the `-`
prefix silences the error on Tier-2-only hosts where the file doesn't exist).

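The effect of the `-` prefix can be modelled as "skip this file if it is missing, otherwise merge it in". The following sketch mimics that behaviour in plain Python; `load_environment_files` is a hypothetical helper, the parser ignores quoting and other details of systemd's real format, and the `BRIDGE` and password values are invented for the demo.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def load_environment_files(entries: List[str]) -> Dict[str, str]:
    """Merge KEY=VALUE files in order; a leading '-' makes a file optional."""
    env: Dict[str, str] = {}
    for entry in entries:
        optional = entry.startswith("-")
        path = Path(entry[1:] if optional else entry)
        if not path.is_file():
            if optional:
                continue                      # missing optional file: no-op
            raise FileNotFoundError(str(path))
        for line in path.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith(("#", ";")):
                continue
            key, sep, value = line.partition("=")
            if sep:
                env[key.strip()] = value.strip()
    return env


with tempfile.TemporaryDirectory() as tmp:
    lab = Path(tmp) / "lab-host.env"
    msf = Path(tmp) / "msfrpc.env"
    lab.write_text("FLEET_HOST_ID=k-gamingcom\nBRIDGE=br0\n")   # invented values
    entries = [str(lab), "-" + str(msf)]
    print("MSFRPC_PASSWORD" in load_environment_files(entries))  # file absent
    msf.write_text("MSFRPC_PASSWORD=example\n")
    print("MSFRPC_PASSWORD" in load_environment_files(entries))  # file present
```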

---

### 7. Fleet port formula produces privileged ports; boot timeout too tight — commit `656a015`

**File:** `orchestrator/fleet.py`

**Symptom A:** `PORT_BASE = target_port + slot * 1000` produced host ports
< 1024 for `samba_usermap_script` (RPORT=139, slot 0 → port 139) and
`php_cgi_arg_injection` (RPORT=80, slot 0 → port 80). `cis490` lacks
`CAP_NET_BIND_SERVICE`; QEMU's SLIRP `hostfwd` silently failed. The service
was never reachable. All 7 slots returned `rc=1` after timing out.

**Symptom B:** `--target-boot-timeout` was not passed to `run_tier3_demo.py`,
which uses a 180 s default. 7 concurrent VMs contending on I/O during boot
cannot reliably start their services within 180 s.

**Fix:**
- Port formula: `host_port = (target_port % 1000) + 2000 + slot * 1000`
  (minimum host port 2000; distinct across slots, and across modules whose
  RPORTs differ modulo 1000)
- Pass `--target-boot-timeout 300` explicitly from the fleet runner

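A quick way to see the difference between the two formulas is to evaluate both for the three modules named in these notes. The function names below are hypothetical; the formulas and RPORT values are the ones quoted above. Note that two RPORTs that agree modulo 1000 (for example 80 and a hypothetical 1080) would map to the same host port within a slot.

```python
def old_host_port(target_port: int, slot: int) -> int:
    """Formula before the fix."""
    return target_port + slot * 1000


def new_host_port(target_port: int, slot: int) -> int:
    """Formula after the fix: never below 2000."""
    return (target_port % 1000) + 2000 + slot * 1000


RPORTS = {
    "vsftpd_234_backdoor": 21,
    "php_cgi_arg_injection": 80,
    "samba_usermap_script": 139,
}

for name, rport in RPORTS.items():
    print(name, old_host_port(rport, 0), new_host_port(rport, 0))

# Every (module, slot) pair in a 7-slot fleet gets its own unprivileged port.
ports = {new_host_port(rport, slot) for rport in RPORTS.values() for slot in range(7)}
print(len(ports), min(ports))
```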

---

## Verification

After all fixes were applied:

- `install-tier-3-4.sh` step 4 produced episode `01KQJM5WGWC33P0QWJXRDJV1EN`
- `install-tier-3-4.sh` step 5 staged 6 real binaries in `samples/store/`
- Fleet wave at 19:55:57 UTC-6 confirmed slot 0 samba probing port 2139
  with a 300 s timeout; this was the first wave to actually run to completion

## Still outstanding

- Pi-side mTLS cert for k-gamingcom not yet issued (shipper in
  "waiting on mTLS material" state). Blocked on Pi operator running
  `deploy-cis490-cert.sh k-gamingcom <wg_ip>`. No action needed on
  lab-host side.

@@ -14,6 +14,9 @@ WorkingDirectory=/opt/cis490
 # /etc/cis490/lab-host.env is written by scripts/install-lab-host.sh;
 # carries FLEET_HOST_ID, BRIDGE, and any operator-supplied overrides.
 EnvironmentFile=/etc/cis490/lab-host.env
+# msfrpc.env only exists after install-tier-3-4.sh; the '-' prefix makes
+# this a no-op on Tier-2-only hosts where it hasn't run yet.
+EnvironmentFile=-/etc/cis490/msfrpc.env
 # Fleet mode: detect host capacity, run that many concurrent episodes
 # per wave with samples drawn from the manifest. Each invocation runs
 # one wave and exits; systemd respawns per Restart= below, giving us
@@ -51,6 +51,9 @@ EmitEvent = Callable[..., None]
 @dataclass
 class DriverConfig:
     target_ip: str
+    # Override the module's static RPORT when the host-side SLIRP
+    # hostfwd uses a non-privileged port (e.g. 2121 → guest:21).
+    target_port: int | None = None
     session_open_timeout_s: float = 30.0
     # Driver v1 fallback workload — used only when no Sample is passed
     # in (Sample-driven runs override these via exploits.workloads).
@@ -185,6 +188,8 @@ class MSFExploitDriver:
             log.debug("module already fired; skipping re-fire")
             return
         opts = self.module.render_options(target_ip=self.cfg.target_ip)
+        if self.cfg.target_port is not None and "RPORT" in opts:
+            opts["RPORT"] = self.cfg.target_port
         self.emit(
             "exploit_fire",
             module=self.module.module_path,
@@ -45,6 +45,24 @@ except ImportError as e: # pragma: no cover - import-time guard
 log = logging.getLogger("cis490.msfrpc")
 
 
+def _decode_response(v: Any) -> Any:
+    """Recursively convert bytes → str in a msgpack-decoded structure.
+
+    msfrpcd encodes string values as msgpack bin (binary) type, not as
+    msgpack raw/str. Python msgpack's raw=False only decodes the legacy
+    'raw' type; 'bin' always comes out as bytes. Normalise here so
+    callers can do resp.get("result") regardless of which wire encoding
+    msfrpcd uses in a given version.
+    """
+    if isinstance(v, bytes):
+        return v.decode("utf-8", errors="replace")
+    if isinstance(v, dict):
+        return {_decode_response(k): _decode_response(val) for k, val in v.items()}
+    if isinstance(v, list):
+        return [_decode_response(i) for i in v]
+    return v
+
+
 class MSFRpcError(RuntimeError):
     """Raised when msfrpcd returns an error or a malformed response."""
 
@@ -184,6 +202,8 @@ class MSFRpcClient:
         except Exception as e:
             raise MSFRpcError(f"could not decode msfrpcd response: {e}") from e
 
+        decoded = _decode_response(decoded)
+
         if isinstance(decoded, dict) and decoded.get("error") is True:
             raise MSFRpcError(
                 f"{payload[0]!r}: {decoded.get('error_class')} "
@@ -285,8 +285,12 @@ def _run_slot(
         run_dir = f"{run_dir_base}-target-{slot}"
         env["RUN_DIR"] = run_dir
         # Each slot gets a unique host-side hostfwd port so concurrent
-        # targets don't collide on the loopback port.
-        env["PORT_BASE"] = str(target_port + slot * 1000)
+        # targets don't collide on the loopback port. Base at 2000+
+        # (target_port % 1000) so privileged-port modules (samba/139,
+        # php/80, vsftpd/21) never try to bind a port < 1024 on the
+        # host — cis490 user lacks CAP_NET_BIND_SERVICE.
+        host_port = (target_port % 1000) + 2000 + slot * 1000
+        env["PORT_BASE"] = str(host_port)
         if bridge_iface:
             env["BRIDGE"] = bridge_iface
         cmd = [
@@ -296,7 +300,10 @@ def _run_slot(
             "--run-dir", run_dir,
             "--module", module.name,
             "--sample", sample.name,
-            "--target-port", str(target_port + slot * 1000),
+            "--target-port", str(host_port),
+            # Concurrent VMs contend on I/O during boot; 300 s gives
+            # a full fleet of 7 slots room to start their services.
+            "--target-boot-timeout", "300",
         ]
         tier = "tier3"
         module_name: str | None = module.name
@@ -103,6 +103,10 @@ fi
 
 # --- 3. systemd unit ----------------------------------------------------
 log "installing systemd unit"
+# msfrpcd writes ~/.msf4/ for module cache and logs. ProtectHome=true in
+# the unit makes /root inaccessible, so redirect HOME to a writable path
+# under /var/lib/cis490/. Pre-create so msfrpcd doesn't race mkdir.
+install -d -m 0755 -o root -g root /var/lib/cis490/msf4
 cat > "$UNIT" <<EOF
 [Unit]
 Description=CIS490 — Metasploit RPC daemon (loopback only)
@@ -119,6 +123,7 @@ EnvironmentFile=$ENV_FILE
 # -a <ip> bind address (loopback only — Tier-3 driver runs locally)
 # -p <port> port
 # -f foreground (no daemonization, so systemd manages PID)
+Environment=HOME=/var/lib/cis490/msf4
 ExecStart=/usr/bin/env msfrpcd -P \${MSFRPC_PASSWORD} -U \${MSFRPC_USER} -a 127.0.0.1 -p \${MSFRPC_PORT} -f
 Restart=on-failure
 RestartSec=5
@@ -126,6 +131,7 @@ NoNewPrivileges=true
 PrivateTmp=true
 ProtectSystem=full
 ProtectHome=true
+ReadWritePaths=/var/lib/cis490
 
 [Install]
 WantedBy=multi-user.target
@@ -79,6 +79,11 @@ OUT_DIR="$DATA_ROOT/vm/images"
 install -d -m 0755 -o cis490 -g cis490 "$OUT_DIR"
 OUT_DIR="$OUT_DIR" "$(script_path fetch-metasploitable2.sh)"
 chown cis490:cis490 "$OUT_DIR/metasploitable2.qcow2" 2>/dev/null || true
+# launch_target.sh resolves IMAGE relative to $REPO_ROOT/vm/images/.
+# Symlink the canonical path so it resolves correctly from /opt/cis490.
+install -d -o cis490 -g cis490 -m 0755 "$INSTALL_ROOT/vm/images"
+ln -sf "$OUT_DIR/metasploitable2.qcow2" \
+    "$INSTALL_ROOT/vm/images/metasploitable2.qcow2" || true
 log "metasploitable2.qcow2 ✓"
 
 # --- 3. bridge ---------------------------------------------------------
@@ -101,7 +106,8 @@ if [[ -z "${SKIP_VERIFY:-}" ]]; then
     [[ -x "$PY" ]] || PY="$(command -v python3)"
     if ! sudo -E -u cis490 "$PY" "$INSTALL_ROOT/tools/run_tier3_demo.py" \
         --module vsftpd_234_backdoor \
-        --target-port 21 \
+        --target-port 2121 \
+        --data-root "$DATA_ROOT/data" \
         --target-boot-timeout 240 \
         > /tmp/cis490-tier3-verify.log 2>&1; then
         log "verify run failed — log at /tmp/cis490-tier3-verify.log; dumping last 30 lines:"
@@ -260,6 +260,7 @@ def main() -> int:
         module=module,
         cfg=DriverConfig(
             target_ip=args.target_ip,
+            target_port=args.target_port,
             sample_store_root=repo_root / "samples" / "store",
         ),
         emit_event=runner.emit_event,
@@ -36,7 +36,7 @@ TAP="${TAP:-cis490target$SLOT}"
 # Ports the host should forward to the guest. Comma-separated host:guest pairs.
 # Default covers the vsftpd module's RPORT. Slot offset makes per-VM
 # fleet runs collision-free (slot 0 → 21, slot 1 → 121, slot 2 → 221, ...).
-PORT_BASE="${PORT_BASE:-$((21 + SLOT * 100))}"
+PORT_BASE="${PORT_BASE:-$((2121 + SLOT * 100))}"
 TARGET_PORTS="${TARGET_PORTS:-${PORT_BASE}:21}"
 # KVM if the host can take it; otherwise fall back to TCG. Cross-arch
 # images (Metasploitable2 is x86-only) on aarch64 hosts will need TCG.