PIPELINE §5 step 3: target VM build infrastructure + containment posture

§4.2 calls for target VMs we BUILD, not VMs we fetch. §4.13 demands
every target ship the same isolation posture (no upstream egress, no
host-shared FS, unprivileged QEMU, fresh snapshot per episode). This
commit lands the infrastructure for both.

New surface:

  * orchestrator/target_spec.py
    Loads + validates `vm/targets/<name>/spec.toml`. Containment
    fields are not knobs: each has exactly ONE safe value, and a
    spec asserting the unsafe value is rejected at load time. There's
    no `--containment-override`; weakening §4.13 requires amending
    PIPELINE.md and operator sign-off.

  * tools/build_target.py
    Orchestrates build → verify → publish for a single target. Spec
    invalid → exit 78 (sysadmin error). build.sh failure → image not
    published. verify.sh failure → image discarded; that's the §4.2
    acceptance gate. Publishes sha256 + the manifest.toml stanza the
    operator copies in to admit the image (§16 substantive amendment
    with sign-off per §15).

  * vm/targets/<name>/{spec.toml,build.sh,verify.sh}
    Template structure. spec.toml is the contract; build.sh produces
    $OUT_PATH; verify.sh boots the produced image under the §4.13
    containment posture and asserts every promise.

  * vm/targets/shellshock/
    First real working target. CVE-2014-6271 (Apache mod_cgi + bash
    4.2 mis-parsing function-export environment values). Replaces
    the SourceForge Metasploitable2 path that §3 evidence proved
    unverifiable. Bash 4.2 is built from sha256-pinned GNU source
    inside an Alpine 3.21 cloudinit guest; the build script asserts
    the produced bash actually triggers shellshock; the verifier
    re-asserts it under restrict=on with a real CVE-2014-6271 probe.

  * vm/targets/README.md
    How operators add a target. Walks the spec → build → verify →
    manifest amendment loop.

Containment regression tests (tests/test_containment.py): 20 new
assertions, parameterized over every target with a build/verify trio:

  * verify.sh MUST contain `restrict=on` on its netdev (§4.13)
  * verify.sh MUST contain `snapshot=on` on the boot drive (§4.13)
  * verify.sh + build.sh MUST NOT contain -virtfs / -fsdev / 9pfs
  * verify.sh + build.sh MUST NOT wrap qemu-system in `sudo`
  * Every target must ship the complete spec.toml + build.sh + verify.sh
    trio; no half-built targets (§1 default-to-removal)
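
The static checks listed above can be sketched as a single scan over a script's text. This is a simplified illustration, not the test module itself: `scan_script` and `SAMPLE_VERIFY_SH` are hypothetical names, the sample script is invented for the example, and the sketch only drops comment-only lines rather than handling inline comments.

```python
REQUIRED_TOKENS = ("restrict=on", "snapshot=on")
FORBIDDEN_TOKENS = ("-virtfs", "-fsdev", "9pfs", "9p,trans")


def strip_comment_lines(src: str) -> str:
    """Drop comment-only lines so documentation can mention forbidden flags."""
    return "\n".join(
        line for line in src.splitlines() if not line.lstrip().startswith("#")
    )


def scan_script(src: str) -> list[str]:
    """Return the containment problems found in one shell script's text."""
    code = strip_comment_lines(src)
    problems = []
    for token in REQUIRED_TOKENS:
        if token not in code:
            problems.append(f"missing {token}")
    for token in FORBIDDEN_TOKENS:
        if token in code:
            problems.append(f"forbidden {token}")
    if "sudo qemu-system" in code:
        problems.append("sudo-wrapped qemu-system")
    return problems


# Hypothetical verify.sh fragment, for illustration only.
SAMPLE_VERIFY_SH = """\
# never pass -virtfs here
qemu-system-x86_64 \\
  -netdev user,id=n0,restrict=on \\
  -drive file="$IMAGE_PATH",if=virtio,snapshot=on
"""
```

Because the scan is purely textual, it runs at PR time without booting anything. The trade-off is that it checks what the script says, not what QEMU actually does at runtime.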

Spec validation tests (tests/test_target_spec.py): 13 new tests covering
spec parse, name/dir mismatch, missing fields, out-of-range port, and
the §4.13 containment field validators (each unsafe value rejected
with a clear error).
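
One subtlety behind the field validators is worth spelling out: in Python `bool` is a subclass of `int`, so a naive `isinstance(value, int)` check would accept `service_port = true`. The sketch below shows the ordering that avoids this, under the assumption that a plain `ValueError` stands in for TargetSpecError; `require_port` is an illustrative name, not a function from this commit.

```python
def require_port(table: dict, key: str) -> int:
    """Return table[key] as a valid port number or raise ValueError."""
    if key not in table:
        raise ValueError(f"missing required field {key}")
    value = table[key]
    # isinstance(True, int) is True, so bool must be rejected first.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(
            f"field {key} must be int, got {type(value).__name__}"
        )
    if not 1 <= value <= 65535:
        raise ValueError(f"{key} out of range: {value}")
    return value
```

The range check mirrors the 1 to 65535 bound used for `promises.service_port` in the validator.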

The shellshock target's image is NOT yet published to manifest.toml's
[[targets.images]]; that is the §15 sign-off amendment that lands
after a successful operator-driven build_target.py run on a lab host
with KVM. Building takes ~10 min on x86_64 and cannot run on the Pi
under TCG. The operator drives the first build, verifies the sha256,
then amends manifest.toml in a follow-up commit.

261 tests passing.
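
For readers following the build → verify → publish flow, the gate can be sketched as below. This is a simplified model under stated assumptions, not the code in tools/build_target.py: `verify` is a hypothetical callable standing in for running verify.sh, and `publish_if_verified` is an illustrative name.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable, Optional


def sha256_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Stream the file in chunks so large images never sit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def publish_if_verified(
    staging: Path, final: Path, verify: Callable[[Path], bool]
) -> Optional[str]:
    """Publish the staged image only if verification passes.

    Returns the sha256 of the published image, or None if the staged
    artifact was discarded.
    """
    if not verify(staging):
        staging.unlink()  # a failed image never reaches the published path
        return None
    digest = sha256_file(staging)
    os.replace(staging, final)  # overwrites any previously published image
    return digest
```

The digest is computed before the rename, so the value the operator copies into the manifest describes exactly the bytes that were verified.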
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 207a902c3e
commit 4d29b7236d
8 changed files with 1272 additions and 0 deletions
263  orchestrator/target_spec.py  (Normal file)
@@ -0,0 +1,263 @@
"""Target VM spec loader + validator (PIPELINE.md §4.2 / §4.13).

Every target VM image in `[targets]` of the canonical manifest is
described by a `vm/targets/<name>/spec.toml` file. The spec captures:

  * What the target promises: vulnerable service, port, version, CVE
    that the build script must produce a working instance of.
  * Containment posture (§4.13): every target must declare itself
    isolated to the same standard, and a regression in any of these
    fields is a containment regression that the verifier rejects
    regardless of any "experimental realism" the change claims to add.

Build flow:
  1. tools/build_target.py <name>: runs vm/targets/<name>/build.sh,
     produces <name>.qcow2 with sha256.
  2. tools/verify_target.py <name>: boots the freshly-built image in
     a containment-correct QEMU configuration, asserts every promise
     in spec.toml.

A spec is INVALID if any §4.13 containment field is absent or set to
the unsafe value. There is no "I know what I'm doing" override;
weakening containment requires amending PIPELINE.md §4.13 and getting
operator sign-off (§15, §16), not toggling a TOML key.
"""

from __future__ import annotations

import tomllib
from dataclasses import dataclass
from pathlib import Path


class TargetSpecError(ValueError):
    """Raised when a target spec is missing, unreadable, or fails
    validation. Build/verify scripts translate this into exit 78."""


@dataclass(frozen=True)
class Promises:
    """What the build script must produce in the target VM. The
    verifier asserts every field is observably true after a clean
    boot of the produced image."""
    cve: str
    service_name: str
    service_port: int
    service_proto: str  # "tcp" | "udp"
    vulnerable_software: str
    vulnerable_version: str


@dataclass(frozen=True)
class Containment:
    """§4.13 isolation posture. Every field is required and every
    field has a single safe value; there's no "production vs dev"
    knob. A target spec asserting unsafe containment is rejected
    at load time."""
    upstream_egress: bool             # MUST be False
    shared_filesystem: bool           # MUST be False
    unprivileged_qemu: bool           # MUST be True
    fresh_snapshot_per_episode: bool  # MUST be True


@dataclass(frozen=True)
class TargetSpec:
    name: str
    description: str
    base_image: str  # e.g. "alpine-3.21-virt"; build.sh handles fetch
    promises: Promises
    containment: Containment
    spec_path: Path

    def to_meta(self) -> dict:
        """Serialize for embedding in `meta.json` so episodes carry
        target provenance (§4.2 acceptance + §10 ground truth)."""
        return {
            "name": self.name,
            "description": self.description,
            "base_image": self.base_image,
            "promises": {
                "cve": self.promises.cve,
                "service_name": self.promises.service_name,
                "service_port": self.promises.service_port,
                "service_proto": self.promises.service_proto,
                "vulnerable_software": self.promises.vulnerable_software,
                "vulnerable_version": self.promises.vulnerable_version,
            },
            "containment": {
                "upstream_egress": self.containment.upstream_egress,
                "shared_filesystem": self.containment.shared_filesystem,
                "unprivileged_qemu": self.containment.unprivileged_qemu,
                "fresh_snapshot_per_episode":
                    self.containment.fresh_snapshot_per_episode,
            },
        }


def load_target_spec(repo_root: Path | str, name: str) -> TargetSpec:
    """Load + validate `<repo_root>/vm/targets/<name>/spec.toml`.
    Raises TargetSpecError on any failure."""
    repo_root = Path(repo_root).resolve()
    spec_path = repo_root / "vm" / "targets" / name / "spec.toml"
    if not spec_path.exists():
        raise TargetSpecError(
            f"target spec not found at {spec_path}. "
            f"Every target referenced from manifest.targets must have a "
            f"spec.toml under vm/targets/<name>/ per §4.2."
        )
    try:
        # TOML is UTF-8 by definition; decode explicitly so a bad byte
        # sequence surfaces as TargetSpecError, not a bare UnicodeDecodeError.
        raw = tomllib.loads(spec_path.read_text(encoding="utf-8"))
    except (OSError, UnicodeDecodeError, tomllib.TOMLDecodeError) as e:
        raise TargetSpecError(f"cannot parse {spec_path}: {e}") from e

    return _validate(raw, spec_path, expected_name=name)


def list_target_specs(repo_root: Path | str) -> list[TargetSpec]:
    """Discover every target spec under vm/targets/. Used by
    build_target.py when invoked without a name to enumerate options,
    and by tests to assert every spec on disk validates cleanly."""
    repo_root = Path(repo_root).resolve()
    targets_dir = repo_root / "vm" / "targets"
    if not targets_dir.exists():
        return []
    specs: list[TargetSpec] = []
    for child in sorted(targets_dir.iterdir()):
        if not child.is_dir():
            continue
        spec_file = child / "spec.toml"
        if not spec_file.exists():
            continue
        specs.append(load_target_spec(repo_root, child.name))
    return specs


# ---------- validation -----------------------------------------------


def _validate(raw: dict, spec_path: Path, *, expected_name: str) -> TargetSpec:
    name = _require_str(raw, "name")
    if name != expected_name:
        raise TargetSpecError(
            f"{spec_path}: spec.name={name!r} doesn't match directory name "
            f"{expected_name!r}; keep them in sync"
        )
    description = _require_str(raw, "description")
    base_image = _require_str(raw, "base_image")

    promises_block = _require_dict(raw, "promises")
    promises = Promises(
        cve=_require_str(promises_block, "cve", ctx="promises"),
        service_name=_require_str(promises_block, "service_name", ctx="promises"),
        service_port=_require_int(promises_block, "service_port", ctx="promises"),
        service_proto=_require_str(promises_block, "service_proto", ctx="promises"),
        vulnerable_software=_require_str(
            promises_block, "vulnerable_software", ctx="promises"),
        vulnerable_version=_require_str(
            promises_block, "vulnerable_version", ctx="promises"),
    )
    if promises.service_proto not in ("tcp", "udp"):
        raise TargetSpecError(
            f"{spec_path}: promises.service_proto must be 'tcp' or 'udp', "
            f"got {promises.service_proto!r}"
        )
    if not 1 <= promises.service_port <= 65535:
        raise TargetSpecError(
            f"{spec_path}: promises.service_port out of range: "
            f"{promises.service_port}"
        )

    containment_block = _require_dict(raw, "containment")
    containment = Containment(
        upstream_egress=_require_bool(
            containment_block, "upstream_egress", ctx="containment"),
        shared_filesystem=_require_bool(
            containment_block, "shared_filesystem", ctx="containment"),
        unprivileged_qemu=_require_bool(
            containment_block, "unprivileged_qemu", ctx="containment"),
        fresh_snapshot_per_episode=_require_bool(
            containment_block, "fresh_snapshot_per_episode", ctx="containment"),
    )
    # Hard-enforce the §4.13 stance. Each field has exactly one safe
    # value; the spec is a declaration that the target satisfies it,
    # not a knob. A spec asserting an unsafe value is rejected here so
    # it never reaches the build pipeline.
    if containment.upstream_egress is not False:
        raise TargetSpecError(
            f"{spec_path}: containment.upstream_egress must be false (§4.13). "
            f"Targets with internet routing are containment regressions."
        )
    if containment.shared_filesystem is not False:
        raise TargetSpecError(
            f"{spec_path}: containment.shared_filesystem must be false (§4.13). "
            f"Targets with host-shared mounts are containment regressions."
        )
    if containment.unprivileged_qemu is not True:
        raise TargetSpecError(
            f"{spec_path}: containment.unprivileged_qemu must be true (§4.13). "
            f"Privileged QEMU is a containment regression."
        )
    if containment.fresh_snapshot_per_episode is not True:
        raise TargetSpecError(
            f"{spec_path}: containment.fresh_snapshot_per_episode must be "
            f"true (§4.13). State carrying across episodes poisons the dataset."
        )

    return TargetSpec(
        name=name,
        description=description,
        base_image=base_image,
        promises=promises,
        containment=containment,
        spec_path=spec_path,
    )


# ---------- helpers --------------------------------------------------


def _require(d: dict, key: str, kind: type, *, ctx: str = "") -> object:
    where = f"{ctx}." if ctx else ""
    if key not in d:
        raise TargetSpecError(f"missing required field {where}{key}")
    v = d[key]
    if not isinstance(v, kind):
        raise TargetSpecError(
            f"field {where}{key} must be {kind.__name__}, got {type(v).__name__}"
        )
    return v


def _require_str(d: dict, key: str, *, ctx: str = "") -> str:
    return _require(d, key, str, ctx=ctx)  # type: ignore[return-value]


def _require_int(d: dict, key: str, *, ctx: str = "") -> int:
    where = f"{ctx}." if ctx else ""
    if key not in d:
        raise TargetSpecError(f"missing required field {where}{key}")
    v = d[key]
    # bool is a subclass of int, so it must be rejected before the int check.
    if isinstance(v, bool):
        raise TargetSpecError(f"field {where}{key} must be int, got bool")
    if isinstance(v, int):
        return v
    raise TargetSpecError(
        f"field {where}{key} must be int, got {type(v).__name__}"
    )


def _require_bool(d: dict, key: str, *, ctx: str = "") -> bool:
    where = f"{ctx}." if ctx else ""
    if key not in d:
        raise TargetSpecError(f"missing required field {where}{key}")
    v = d[key]
    if not isinstance(v, bool):
        raise TargetSpecError(
            f"field {where}{key} must be bool, got {type(v).__name__}"
        )
    return v


def _require_dict(d: dict, key: str, *, ctx: str = "") -> dict:
    return _require(d, key, dict, ctx=ctx)  # type: ignore[return-value]
195  tests/test_containment.py  (Normal file)
@@ -0,0 +1,195 @@
"""§4.13 containment regression tests.

Every shell script that boots a target VM (build.sh, verify.sh,
launch_target.sh) must hold the containment posture:

  * SLIRP `restrict=on` (no upstream egress), OR an explicit
    operator-declared bridge with no internet route. (Both shellshock
    build and verify use restrict=on; build temporarily allows egress
    for package fetch but verify confirms with restrict=on.)
  * NO `-virtfs`, `-fsdev`, `-9pfs`, or any host-shared mount.
  * `snapshot=on` on the boot drive so verification doesn't mutate
    the artifact.
  * No `sudo` / setuid wrapper around `qemu-system-*` invocation.

These are static checks against the shell scripts in the repo;
they catch a regression at PR time, before any image gets built.
A complementary runtime check would have to actually boot the VM,
which is too heavy for CI; the spec.toml validator gives us the
runtime declaration that the verifier must satisfy.
"""

from __future__ import annotations

import re
from pathlib import Path

import pytest


REPO_ROOT = Path(__file__).resolve().parent.parent
TARGETS_DIR = REPO_ROOT / "vm" / "targets"


def _strip_bash_comments(src: str) -> str:
    """Drop comment-only lines so the containment scan doesn't trip on
    inline documentation that mentions the forbidden flags."""
    out = []
    for line in src.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#"):
            continue
        # Inline comments after code: trim everything from the first
        # " #" onward. This is a cheap heuristic that does not track
        # quoting, so a " #" inside a quoted string is trimmed as well.
        idx = line.find(" #")
        if idx > 0:
            line = line[:idx]
        out.append(line)
    return "\n".join(out)


def _every_target_with_verify():
    if not TARGETS_DIR.exists():
        return []
    out = []
    for child in TARGETS_DIR.iterdir():
        if not child.is_dir():
            continue
        verify = child / "verify.sh"
        if verify.exists():
            out.append((child.name, verify))
    return out


def _every_target_with_build():
    if not TARGETS_DIR.exists():
        return []
    out = []
    for child in TARGETS_DIR.iterdir():
        if not child.is_dir():
            continue
        build = child / "build.sh"
        if build.exists():
            out.append((child.name, build))
    return out


@pytest.mark.parametrize(
    "name,path",
    _every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_uses_restrict_on(name, path: Path) -> None:
    """Every verify.sh MUST boot the produced image with no upstream
    egress. The verifier proves the image satisfies its spec under the
    SAME containment posture the orchestrator will use at episode time.
    A target that needs internet access during verification can't be
    trusted to behave under §4.13."""
    src = path.read_text()
    qemu_inv = re.search(r"qemu-system-x86_64\b[^`]*?(?=\n[A-Z#]|\Z)", src,
                         re.DOTALL)
    assert qemu_inv, f"{name}: no qemu-system-x86_64 invocation in verify.sh"
    qemu_text = qemu_inv.group(0)
    assert "restrict=on" in qemu_text, (
        f"{name}: verify.sh qemu invocation must include "
        f"`restrict=on` on its netdev (§4.13). Got:\n{qemu_text}"
    )


@pytest.mark.parametrize(
    "name,path",
    _every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_no_shared_filesystem(name, path: Path) -> None:
    """Targets MUST NOT have host-shared mounts during verification."""
    src = _strip_bash_comments(path.read_text())
    for forbidden in ("-virtfs", "-fsdev", "9pfs", "9p,trans"):
        assert forbidden not in src, (
            f"{name}: verify.sh contains `{forbidden}`: host-shared "
            f"filesystem is a §4.13 containment regression."
        )


@pytest.mark.parametrize(
    "name,path",
    _every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_uses_snapshot_on(name, path: Path) -> None:
    """`snapshot=on` so verification doesn't mutate the build
    artifact's bytes. The sha256 must be stable from build to
    publish to dataset-time use."""
    src = path.read_text()
    # Search for the boot drive line; allow flexible spacing/quoting.
    drive_match = re.search(r'-drive[^\n]*file=[^\n]*\.qcow2[^\n]*', src)
    if drive_match:
        assert "snapshot=on" in drive_match.group(0), (
            f"{name}: verify.sh boot drive must use snapshot=on (§4.13)"
        )
    else:
        # No literal .qcow2 drive line (for example, the path comes from
        # a variable such as $IMAGE_PATH). Fall back to a whole-script
        # check so the test cannot pass vacuously.
        assert "snapshot=on" in src, (
            f"{name}: verify.sh must use snapshot=on on its boot drive (§4.13)"
        )


@pytest.mark.parametrize(
    "name,path",
    _every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_no_sudo_qemu(name, path: Path) -> None:
    """QEMU MUST run as the unprivileged caller. No sudo / setuid
    wrappers (§4.13)."""
    src = path.read_text()
    assert "sudo qemu-system" not in src, (
        f"{name}: verify.sh wraps qemu-system in sudo: privileged "
        f"QEMU is a §4.13 containment regression."
    )


@pytest.mark.parametrize(
    "name,path",
    _every_target_with_build() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_build_no_shared_filesystem(name, path: Path) -> None:
    """Even during build (when egress is permitted for package fetch),
    no host-shared filesystem mount."""
    src = _strip_bash_comments(path.read_text())
    for forbidden in ("-virtfs", "-fsdev", "9pfs", "9p,trans"):
        assert forbidden not in src, (
            f"{name}: build.sh contains `{forbidden}`: host-shared "
            f"filesystem is a §4.13 containment regression."
        )


@pytest.mark.parametrize(
    "name,path",
    _every_target_with_build() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_build_no_sudo_qemu(name, path: Path) -> None:
    src = path.read_text()
    assert "sudo qemu-system" not in src, (
        f"{name}: build.sh wraps qemu-system in sudo (§4.13)"
    )


# ---------------------------------------------------------------------
# Cross-cutting: every target with a build.sh must also have verify.sh
# and vice versa. Half-built targets are a §1 violation.
# ---------------------------------------------------------------------


def test_every_target_has_complete_trio() -> None:
    """spec.toml + build.sh + verify.sh: all three or none. A target
    with a spec but no build script is a stub; a build with no verify
    bypasses §4.2 acceptance. PIPELINE.md §1 default-to-removal: if
    you can't ship the trio, ship nothing."""
    if not TARGETS_DIR.exists():
        return
    for child in TARGETS_DIR.iterdir():
        if not child.is_dir():
            continue
        files = {p.name for p in child.iterdir() if p.is_file()}
        # At least one of {spec.toml, build.sh, verify.sh} present means
        # all three required.
        relevant = {"spec.toml", "build.sh", "verify.sh"} & files
        if not relevant:
            continue
        missing = {"spec.toml", "build.sh", "verify.sh"} - files
        assert not missing, (
            f"target {child.name}: incomplete trio, missing {missing}. "
            f"§1 default-to-removal: complete the trio or remove the dir."
        )
168  tests/test_target_spec.py  (Normal file)
@@ -0,0 +1,168 @@
"""Tests for orchestrator/target_spec.py: target VM spec loader
(PIPELINE.md §4.2 / §4.13)."""

from __future__ import annotations

from pathlib import Path

import pytest

from orchestrator.target_spec import (
    TargetSpecError, list_target_specs, load_target_spec,
)


REPO_ROOT = Path(__file__).resolve().parent.parent

VALID_SPEC = """
name = "fixture-target"
description = "fixture for spec validation"
base_image = "alpine-3.21-virt"

[promises]
cve = "CVE-2014-6271"
service_name = "apache"
service_port = 80
service_proto = "tcp"
vulnerable_software = "bash"
vulnerable_version = "4.2"

[containment]
upstream_egress = false
shared_filesystem = false
unprivileged_qemu = true
fresh_snapshot_per_episode = true
"""


def _write_spec(repo: Path, name: str, body: str) -> None:
    target = repo / "vm" / "targets" / name
    target.mkdir(parents=True, exist_ok=True)
    (target / "spec.toml").write_text(body)


def test_valid_spec_loads(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-target", VALID_SPEC)
    s = load_target_spec(tmp_path, "fixture-target")
    assert s.name == "fixture-target"
    assert s.promises.cve == "CVE-2014-6271"
    assert s.promises.service_port == 80
    assert s.containment.upstream_egress is False
    assert s.containment.shared_filesystem is False
    assert s.containment.unprivileged_qemu is True
    assert s.containment.fresh_snapshot_per_episode is True


def test_missing_spec_raises(tmp_path: Path) -> None:
    with pytest.raises(TargetSpecError, match="not found"):
        load_target_spec(tmp_path, "no-such-target")


def test_name_must_match_directory(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-target", VALID_SPEC.replace(
        'name = "fixture-target"', 'name = "different-name"'
    ))
    with pytest.raises(TargetSpecError, match="doesn't match directory"):
        load_target_spec(tmp_path, "fixture-target")


def test_upstream_egress_must_be_false(tmp_path: Path) -> None:
    """§4.13: containment regression rejected at spec load."""
    _write_spec(tmp_path, "fixture-target", VALID_SPEC.replace(
        "upstream_egress = false", "upstream_egress = true"
    ))
    with pytest.raises(TargetSpecError, match="upstream_egress.*must be false"):
        load_target_spec(tmp_path, "fixture-target")


def test_shared_filesystem_must_be_false(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-target", VALID_SPEC.replace(
        "shared_filesystem = false", "shared_filesystem = true"
    ))
    with pytest.raises(TargetSpecError, match="shared_filesystem.*must be false"):
        load_target_spec(tmp_path, "fixture-target")


def test_unprivileged_qemu_must_be_true(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-target", VALID_SPEC.replace(
        "unprivileged_qemu = true", "unprivileged_qemu = false"
    ))
    with pytest.raises(TargetSpecError, match="unprivileged_qemu.*must be true"):
        load_target_spec(tmp_path, "fixture-target")


def test_fresh_snapshot_must_be_true(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-target", VALID_SPEC.replace(
        "fresh_snapshot_per_episode = true",
        "fresh_snapshot_per_episode = false",
    ))
    with pytest.raises(TargetSpecError,
                       match="fresh_snapshot_per_episode.*must be true"):
        load_target_spec(tmp_path, "fixture-target")


def test_invalid_proto_rejected(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-target", VALID_SPEC.replace(
        'service_proto = "tcp"', 'service_proto = "icmp"'
    ))
    with pytest.raises(TargetSpecError, match="service_proto"):
        load_target_spec(tmp_path, "fixture-target")


def test_out_of_range_port_rejected(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-target", VALID_SPEC.replace(
        "service_port = 80", "service_port = 99999"
    ))
    with pytest.raises(TargetSpecError, match="service_port out of range"):
        load_target_spec(tmp_path, "fixture-target")


def test_missing_required_field_rejected(tmp_path: Path) -> None:
    body = VALID_SPEC.replace('cve = "CVE-2014-6271"\n', "")
    _write_spec(tmp_path, "fixture-target", body)
    with pytest.raises(TargetSpecError, match="cve"):
        load_target_spec(tmp_path, "fixture-target")


def test_to_meta_round_trips_to_json(tmp_path: Path) -> None:
    import json
    _write_spec(tmp_path, "fixture-target", VALID_SPEC)
    s = load_target_spec(tmp_path, "fixture-target")
    decoded = json.loads(json.dumps(s.to_meta()))
    assert decoded["name"] == "fixture-target"
    assert decoded["containment"]["upstream_egress"] is False


def test_list_target_specs_finds_all_valid(tmp_path: Path) -> None:
    _write_spec(tmp_path, "fixture-a", VALID_SPEC.replace(
        'name = "fixture-target"', 'name = "fixture-a"'
    ))
    _write_spec(tmp_path, "fixture-b", VALID_SPEC.replace(
        'name = "fixture-target"', 'name = "fixture-b"'
    ))
    specs = list_target_specs(tmp_path)
    assert {s.name for s in specs} == {"fixture-a", "fixture-b"}


# ---------------------------------------------------------------------
# Repo-level invariant: every spec in vm/targets/ must validate.
# This is the production tripwire: adding a target with broken
# containment fails this test before any build runs.
# ---------------------------------------------------------------------


def test_every_repo_target_spec_loads_cleanly() -> None:
    """All shipped target specs must validate. A new target that
    weakens §4.13 containment fails this assertion before its
    image even gets built (PIPELINE.md §4.2 / §4.13)."""
    specs = list_target_specs(REPO_ROOT)
    # No assertion on count: empty is fine (shipping no targets is
    # the honest interim state). What matters: every spec that IS
    # there validates.
    for s in specs:
        # Containment posture re-asserted explicitly so a future drift
        # in the validator doesn't silently accept a regression.
        assert s.containment.upstream_egress is False, s.name
        assert s.containment.shared_filesystem is False, s.name
        assert s.containment.unprivileged_qemu is True, s.name
        assert s.containment.fresh_snapshot_per_episode is True, s.name
190  tools/build_target.py  (Executable file)
@@ -0,0 +1,190 @@
"""Build a target VM from its declarative spec (PIPELINE.md §4.2).

Usage:
    python tools/build_target.py <name> [--out DIR]
    python tools/build_target.py --list

Each target lives at `vm/targets/<name>/` with three files:
  * spec.toml: what the target promises (orchestrator/target_spec.py)
  * build.sh: declarative build steps producing <name>.qcow2
  * verify.sh: boots the produced image, asserts every promise

Build flow:
  1. Load + validate the spec (containment posture pre-checked).
  2. Run build.sh with OUT_PATH set to the staged artifact.
  3. Run verify.sh against the staged artifact in a containment-correct
     QEMU configuration. Any verification failure is fatal; the
     image does NOT enter the published images dir.
  4. Compute sha256 and rename to the published path.
  5. Print the sha256; the operator copies it into manifest.toml's
     [[targets.images]] entry to admit the image (§4.2 acceptance).

Failure modes:
  * Spec invalid → exit 78
  * build.sh non-zero → exit 1, image not published
  * verify.sh non-zero → exit 1, image not published
  * sha256 doesn't match recorded → exit 1
"""

from __future__ import annotations

import argparse
import hashlib
import logging
import os
import shutil
import subprocess
import sys
from pathlib import Path

# Allow running as a script.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from orchestrator.target_spec import (  # noqa: E402
    TargetSpecError, list_target_specs, load_target_spec,
)


EXIT_SYSADMIN_ERROR = 78
DEFAULT_OUT_DIR = Path("/var/lib/cis490/vm/images")


def _sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large images are never loaded whole.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()
|
||||
|
||||
|
||||
def build_one(repo_root: Path, name: str, out_dir: Path,
              log: logging.Logger) -> int:
    """Build + verify a single target. Returns exit code."""
    try:
        spec = load_target_spec(repo_root, name)
    except TargetSpecError as e:
        log.error("%s: spec invalid: %s", name, e)
        return EXIT_SYSADMIN_ERROR

    target_dir = repo_root / "vm" / "targets" / name
    build_script = target_dir / "build.sh"
    verify_script = target_dir / "verify.sh"
    if not build_script.exists():
        log.error("%s: build.sh missing at %s", name, build_script)
        return EXIT_SYSADMIN_ERROR
    if not verify_script.exists():
        log.error("%s: verify.sh missing at %s", name, verify_script)
        return EXIT_SYSADMIN_ERROR

    out_dir.mkdir(parents=True, exist_ok=True)
    staging = out_dir / f"{name}.qcow2.staging"
    final = out_dir / f"{name}.qcow2"

    # Always start from a clean staging path; partial builds are not
    # quietly resumed — the build script is idempotent enough that
    # re-running is cheap, and resuming a partial qcow2 silently
    # corrupts artifacts (§7.1 compensating layer).
    if staging.exists():
        staging.unlink()

    log.info("[%s] building → %s", name, staging)
    env = os.environ.copy()
    env["OUT_PATH"] = str(staging)
    env["BASE_IMAGE_NAME"] = spec.base_image
    rc = subprocess.run(
        [str(build_script)],
        cwd=str(target_dir),
        env=env,
        check=False,
    ).returncode
    if rc != 0:
        log.error("[%s] build.sh exited %d; not publishing", name, rc)
        if staging.exists():
            staging.unlink()
        return 1
    if not staging.exists():
        log.error("[%s] build.sh succeeded but no artifact at %s",
                  name, staging)
        return 1

    log.info("[%s] verifying", name)
    env_v = os.environ.copy()
    env_v["IMAGE_PATH"] = str(staging)
    env_v["EXPECTED_SERVICE_NAME"] = spec.promises.service_name
    env_v["EXPECTED_SERVICE_PORT"] = str(spec.promises.service_port)
    env_v["EXPECTED_SERVICE_PROTO"] = spec.promises.service_proto
    env_v["EXPECTED_VULN_SOFTWARE"] = spec.promises.vulnerable_software
    env_v["EXPECTED_VULN_VERSION"] = spec.promises.vulnerable_version
    rc = subprocess.run(
        [str(verify_script)],
        cwd=str(target_dir),
        env=env_v,
        check=False,
    ).returncode
    if rc != 0:
        log.error(
            "[%s] verify.sh exited %d — image does NOT meet its spec; "
            "discarding %s", name, rc, staging,
        )
        staging.unlink()
        return 1

    digest = _sha256(staging)
    log.info("[%s] verified; sha256=%s", name, digest)

    if final.exists():
        final.unlink()
    shutil.move(str(staging), str(final))

    print(f"\n target: {name}")
    print(f" image: {final}")
    print(f" sha256: {digest}")
    # Plain string: there is nothing to interpolate on this line.
    print("\nAdmit by adding to manifest.toml [[targets.images]]:")
    print(f" image_name = \"{name}\"")
    print(f" sha256 = \"{digest}\"")
    print(f" build_script = \"vm/targets/{name}/build.sh\"")
    return 0


def main(argv: list[str] | None = None) -> int:
    p = argparse.ArgumentParser(prog="cis490-build-target")
    p.add_argument("name", nargs="?",
                   help="Target name (matches vm/targets/<name>/)")
    p.add_argument("--list", action="store_true",
                   help="List discoverable target specs and exit")
    p.add_argument(
        "--out", type=Path, default=DEFAULT_OUT_DIR,
        help=f"Where to publish verified images (default: {DEFAULT_OUT_DIR})",
    )
    p.add_argument("--log-level", default="INFO")
    args = p.parse_args(argv)

    logging.basicConfig(
        level=getattr(logging, args.log_level.upper(), logging.INFO),
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    log = logging.getLogger("cis490.build-target")

    repo_root = Path(__file__).resolve().parent.parent

    if args.list:
        specs = list_target_specs(repo_root)
        if not specs:
            print("no target specs under vm/targets/")
            return 0
        for s in specs:
            print(f" {s.name:30} {s.promises.cve:20} "
                  f"{s.promises.service_name}:{s.promises.service_port} "
                  f"({s.promises.vulnerable_software} "
                  f"{s.promises.vulnerable_version})")
        return 0

    if not args.name:
        p.error("name required (or pass --list)")
        return 2

    return build_one(repo_root, args.name, args.out, log)


if __name__ == "__main__":
    sys.exit(main())
109
vm/targets/README.md
Normal file
@@ -0,0 +1,109 @@
# Target VM build specs (PIPELINE.md §4.2 / §4.13)

Every Tier-3 module in `manifest.toml` `[catalog]` MUST land its session
against a target VM that this directory defines. Targets are NOT
fetched from third-party blob stores (no Metasploitable2 from
SourceForge — that was the §3 evidence); they are built locally from
declarative specs, sha256-pinned, and re-verified at every release.

## Layout

```
vm/targets/<name>/
├── spec.toml    ← what this target promises (verified at load time)
├── build.sh     ← declarative build steps; produces $OUT_PATH
└── verify.sh    ← boots the produced image, asserts every promise
```

## Adding a target

1. Create `vm/targets/<name>/`.

2. Write `spec.toml`. Every field is required; containment fields all
   have ONE safe value (no knobs):

   ```toml
   name = "<name>"
   description = "<short prose>"
   base_image = "<e.g. alpine-3.21-virt>"

   [promises]
   cve = "CVE-YYYY-NNNN"
   service_name = "samba"          # what the module catalog talks to
   service_port = 445
   service_proto = "tcp"
   vulnerable_software = "samba"   # the actual vulnerable component
   vulnerable_version = "3.0.20"

   [containment]
   upstream_egress = false             # MUST
   shared_filesystem = false           # MUST
   unprivileged_qemu = true            # MUST
   fresh_snapshot_per_episode = true   # MUST
   ```

3. Write `build.sh`. The orchestrator invokes it with
   `OUT_PATH=<out-dir>/<name>.qcow2.staging` and `BASE_IMAGE_NAME=<base>`.
   The script should produce a valid qcow2 at `$OUT_PATH` and exit 0.

4. Write `verify.sh`. The orchestrator invokes it with
   `IMAGE_PATH` set to the same staging path and `EXPECTED_*` env vars
   matching spec.promises. Boot the image in a containment-correct
   configuration (see "Verification harness" below), wait for the
   service to come up, assert the promised port + version. Exit 0
   only if every promise verifies.

5. Run the build:

   ```sh
   sudo python tools/build_target.py <name>
   ```

   On success the script prints the sha256 + the manifest.toml
   stanza to add. Build artifacts go to `/var/lib/cis490/vm/images/`
   by default.

6. Operator amends `manifest.toml`:

   ```toml
   [[targets.images]]
   image_name = "<name>"
   sha256 = "<from build_target output>"
   build_script = "vm/targets/<name>/build.sh"
   ```

   This is a substantive amendment per §16 — operator sign-off
   required. Lands in the same merge as any modules that depend on
   the target.

## Verification harness

`verify.sh` MUST boot the image with the §4.13 containment posture:

* `-netdev user,...,restrict=on` — no upstream egress
* No `-virtfs` / `-fsdev` / virtio-9p host-shared mounts
* Run QEMU as the unprivileged service user (no `sudo qemu-system-*`)
* `snapshot=on` so the build artifact isn't mutated by verification

A `tests/test_containment.py` regression asserts every spec on disk
declares the correct containment posture. A spec that asserts
weakened containment is a containment regression and `load_target_spec`
rejects it before build_target.py even invokes build.sh.

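The four rules above lend themselves to a static scan of `verify.sh`. The following is a minimal sketch of such a scan, assuming the QEMU options are written as single whitespace-free arguments; it is not the contents of `tests/test_containment.py`, and the function name `containment_violations` is hypothetical.

```python
import re


def containment_violations(script_text: str) -> list[str]:
    """Statically scan a verify.sh for departures from the §4.13 posture."""
    # Drop comment lines so prose that merely mentions a flag is not counted.
    code = "\n".join(
        line for line in script_text.splitlines()
        if not line.lstrip().startswith("#")
    )
    violations = []

    netdevs = re.findall(r"-netdev\s+(\S+)", code)
    if not netdevs:
        violations.append("no -netdev option found")
    for netdev in netdevs:
        if "restrict=on" not in netdev:
            violations.append(f"netdev without restrict=on: {netdev}")

    for drive in re.findall(r"-drive\s+(\S+)", code):
        if "snapshot=on" not in drive:
            violations.append(f"drive without snapshot=on: {drive}")

    for flag in ("-virtfs", "-fsdev"):
        if re.search(rf"(?<!\S){re.escape(flag)}\b", code):
            violations.append(f"host-shared filesystem option present: {flag}")

    if re.search(r"\bsudo\s+qemu-system-", code):
        violations.append("QEMU is launched through sudo")

    return violations


if __name__ == "__main__":
    sample = (
        "qemu-system-x86_64 "
        "-drive file=img.qcow2,format=qcow2,snapshot=on "
        "-netdev user,id=n0,restrict=on "
        "-device virtio-net-pci,netdev=n0\n"
    )
    print(containment_violations(sample))
```

A text scan like this only shows that the script asks for the right posture; it cannot prove the guest is actually contained at runtime, so it complements the boot-time checks rather than replacing them.
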
## Why this exists

Targets we don't build, we don't trust. PIPELINE.md §3 surfaced 0/67
session_open against the SourceForge Metasploitable2 image — and we
couldn't even tell whether that was a payload bug, a hostfwd bug, a
SLIRP timing race, or just the image being modified somewhere along
the supply chain. With locally-built declarative targets:

* The vulnerable service is verified up at the promised port +
  version BEFORE the image is admitted.
* The image's sha256 is recorded in manifest.toml; tampering is
  visible.
* Build is reproducible: same spec.toml + same build.sh on a fresh
  base produces the same image.

This is non-negotiable per §4.2 / §4.13. Tier-3 modules that target
unverified images stay out of the catalog.

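"Tampering is visible" comes down to recomputing the image digest and comparing it with the value pinned in `manifest.toml`. A minimal sketch of that comparison follows; the function names `sha256_file` and `image_matches_pin` are hypothetical, and how the pinned value is read out of the manifest is left out because the manifest loader is not shown here.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large qcow2 images never sit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_pin(path: Path, pinned_sha256: str) -> bool:
    """True only if the image on disk hashes to the pinned manifest value."""
    # Normalise case and surrounding whitespace of the pinned value first.
    expected = pinned_sha256.strip().lower()
    return hmac.compare_digest(sha256_file(path), expected)


if __name__ == "__main__":
    import sys

    image, pinned = Path(sys.argv[1]), sys.argv[2]
    sys.exit(0 if image_matches_pin(image, pinned) else 1)
```

Run as a script with an image path and a pinned digest, it exits 0 on a match and 1 otherwise, which makes it easy to call from a shell gate.
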
196
vm/targets/shellshock/build.sh
Executable file
@@ -0,0 +1,196 @@
#!/usr/bin/env bash
# Build the shellshock target — Alpine 3.21 + bash 4.2 + Apache mod_cgi.
#
# Inputs (env, set by tools/build_target.py):
#   OUT_PATH        — staging qcow2 path; we write the final image here
#   BASE_IMAGE_NAME — "alpine-3.21-virt" per spec.toml
#
# Build strategy:
#   1. Fetch the alpine-virt cloud image (sha256-pinned); cache to
#      /var/cache/cis490/base-images/.
#   2. Create a CoW overlay at $OUT_PATH so we don't mutate the base.
#   3. Build a cidata ISO with cloud-init user-data that, on first
#      boot:
#        - apk add apache2 + apache2-utils + python3 + curl and a
#          build toolchain (gcc, make, musl-dev, patch)
#        - download bash-4.2 source (sha256-pinned) + compile
#        - install the vulnerable bash at /usr/local/bin/bash; the
#          CGI script's shebang points there and the system shell is
#          left untouched
#        - drop a CGI script at /var/www/localhost/cgi-bin/test.cgi
#          that prints a benign greeting (the exploit doesn't need
#          anything more — it just needs ANY CGI script the User-Agent
#          header reaches)
#        - enable apache at boot (the cidata ISO is built with
#          --no-embed-agent, so no agent is installed here)
#        - touch /var/lib/cis490-build-complete and shut down
#   4. Boot the overlay+cidata in qemu, wait for shutdown, then
#      compact the result into a standalone qcow2.
#
# Idempotent: re-running with a present $OUT_PATH starts from scratch.

set -euo pipefail

if [[ -z "${OUT_PATH:-}" ]]; then
    echo "OUT_PATH not set" >&2
    exit 2
fi

REPO_ROOT="$(cd "$(dirname "$0")/../../.." && pwd)"
TARGET_DIR="$(cd "$(dirname "$0")" && pwd)"
CACHE_DIR="${CIS490_BASE_IMAGE_CACHE:-/var/cache/cis490/base-images}"
mkdir -p "$CACHE_DIR"

# alpine-virt-3.21.0-x86_64.iso would be the kernel+initramfs; we want
# the cloud-init-aware qcow2. Alpine ships nocloud images at
# https://dl-cdn.alpinelinux.org/alpine/v3.21/releases/cloud/.
BASE_URL="https://dl-cdn.alpinelinux.org/alpine/v3.21/releases/cloud/nocloud_alpine-3.21.0-x86_64-uefi-cloudinit-r0.qcow2"
BASE_FILE="$CACHE_DIR/alpine-3.21-cloudinit.qcow2"
# Pin the published sha256 of the alpine-3.21 cloud image. Update
# alongside any base-image bump (substantive §16 amendment).
BASE_SHA256="ee0b8c2e1ce8d5fa5e3fb9968fdfee9c8b1f01ae9ee8ed3b3c7c3bd9b7e1e9c8"

if [[ ! -f "$BASE_FILE" ]]; then
    echo "[build:shellshock] fetching base image $BASE_URL"
    curl -fsSL -o "$BASE_FILE.partial" "$BASE_URL"
    mv "$BASE_FILE.partial" "$BASE_FILE"
fi

actual_sha=$(sha256sum "$BASE_FILE" | awk '{print $1}')
if [[ "$actual_sha" != "$BASE_SHA256" ]]; then
    # A mismatch is fatal, so label it as a failure rather than a warning.
    echo "[build:shellshock] FAIL: base image sha256 mismatch" >&2
    echo " expected: $BASE_SHA256" >&2
    echo " got: $actual_sha" >&2
    echo " Base image hash drifted — investigate before trusting." >&2
    # Don't auto-update the pin; that's a §16 amendment.
    exit 1
fi

# Always start clean.
rm -f "$OUT_PATH"

# 6 GiB upper bound; bash source + compile artifacts + apache.
qemu-img create -f qcow2 -F qcow2 -b "$BASE_FILE" "$OUT_PATH" 6G >/dev/null

# Build cidata seed.
CIDATA_DIR="$(mktemp -d)"
trap 'rm -rf "$CIDATA_DIR"' EXIT

cat > "$CIDATA_DIR/meta-data" <<EOF
instance-id: cis490-shellshock
local-hostname: shellshock
EOF

cat > "$CIDATA_DIR/user-data" <<'EOF'
#cloud-config
hostname: shellshock
manage_etc_hosts: true
users:
  - name: cis490
    plain_text_passwd: cis490
    lock_passwd: false
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/sh
ssh_pwauth: true
chpasswd:
  expire: false
  list: |
    root:cis490

packages:
  - apache2
  - apache2-utils
  - python3
  - curl
  - gcc
  - make
  - musl-dev
  - patch

write_files:
  - path: /var/www/localhost/cgi-bin/test.cgi
    permissions: '0755'
    content: |
      #!/usr/local/bin/bash
      echo "Content-type: text/plain"
      echo
      echo "shellshock target up; bash --version: $(/usr/local/bin/bash --version | head -1)"
  - path: /etc/apache2/conf.d/cgi.conf
    content: |
      LoadModule cgi_module modules/mod_cgi.so
      ScriptAlias "/cgi-bin/" "/var/www/localhost/cgi-bin/"
      <Directory "/var/www/localhost/cgi-bin">
        AllowOverride None
        Options +ExecCGI
        Require all granted
        AddHandler cgi-script .cgi .sh
      </Directory>
  - path: /usr/local/sbin/build-vulnerable-bash.sh
    permissions: '0755'
    content: |
      #!/bin/sh
      set -e
      cd /tmp
      # bash-4.2 source from the GNU FTP site, pinned by the sha256
      # checked below.
      BASH_URL="https://ftp.gnu.org/gnu/bash/bash-4.2.tar.gz"
      curl -fsSL -o bash-4.2.tar.gz "$BASH_URL"
      echo "a27a1179ec9c0830c65c6aa5d7dab60f7ce1a2a608618570f96bfa72e95ab3d8  bash-4.2.tar.gz" | sha256sum -c -
      tar xzf bash-4.2.tar.gz
      cd bash-4.2
      ./configure --prefix=/usr/local --without-bash-malloc
      make -j2
      make install
      # Confirm the vulnerable function-export parser is present.
      env x='() { :;}; echo VULN_OK' /usr/local/bin/bash -c 'echo done' \
        | grep -q VULN_OK || { echo "bash 4.2 build failed shellshock-positive check" >&2; exit 1; }

runcmd:
  - [ /usr/local/sbin/build-vulnerable-bash.sh ]
  - [ rc-update, add, apache2, default ]
  - [ rc-service, apache2, start ]
  - [ touch, /var/lib/cis490-build-complete ]
  # Emit the marker last, only once the vulnerable bash exists, and copy
  # it to the console so build.sh can find it in the boot log. Assumption:
  # the guest console is the serial port that QEMU logs.
  - [ sh, -c, "test -x /usr/local/bin/bash && echo CIS490_BOOT_OK | tee /tmp/.cis490-boot > /dev/console" ]
  - [ poweroff ]
EOF

CIDATA_ISO="$CIDATA_DIR/cidata.iso"
"$REPO_ROOT/.venv/bin/python" "$REPO_ROOT/tools/build_cidata.py" \
    --user-data "$CIDATA_DIR/user-data" \
    --meta-data "$CIDATA_DIR/meta-data" \
    --no-embed-agent \
    "$CIDATA_ISO"

echo "[build:shellshock] booting overlay for first-boot install (~5-10 min)"

# Boot WITH upstream egress, for the build only. restrict=on would keep
# the first-boot script off the open internet, but cloud-init has to
# fetch packages and the bash source, which doesn't work under restrict.
# So we permit egress ONLY during build, then re-verify under restrict
# in verify.sh. The build is supply-chain attested via sha256 pins on
# the base image and bash source; the operator runs build_target on
# a trusted host.
qemu-system-x86_64 \
    -name cis490-build-shellshock \
    -machine q35,accel=kvm \
    -cpu host \
    -smp 2 \
    -m 1024 \
    -drive file="$OUT_PATH",format=qcow2,if=virtio \
    -drive file="$CIDATA_ISO",format=raw,if=virtio,readonly=on \
    -netdev user,id=n0 \
    -device virtio-net-pci,netdev=n0 \
    -nographic \
    -serial mon:stdio \
    -no-reboot \
    > "$OUT_PATH.boot.log" 2>&1 || true

# Verify the install reached completion.
# (The poweroff in runcmd ends qemu cleanly; -no-reboot exits.)
if ! grep -q "CIS490_BOOT_OK" "$OUT_PATH.boot.log"; then
    echo "[build:shellshock] first-boot install did not complete; see $OUT_PATH.boot.log" >&2
    exit 1
fi

# Compact the qcow2 (cloud-init artifacts, package caches).
qemu-img convert -O qcow2 -c "$OUT_PATH" "$OUT_PATH.compact"
mv "$OUT_PATH.compact" "$OUT_PATH"

echo "[build:shellshock] complete: $OUT_PATH"
37
vm/targets/shellshock/spec.toml
Normal file
@@ -0,0 +1,37 @@
# Shellshock target — CVE-2014-6271.
#
# Apache mod_cgi running an old bash (4.2) that mis-handles environment
# variables containing function definitions. The exploit smuggles a
# command via the User-Agent header to a CGI script; mod_cgi exports the
# header as an environment variable, and bash, when importing function
# definitions from its environment at startup, keeps executing past the
# end of the function body and runs the smuggled command.
#
# This is a declaratively-buildable replacement for the SourceForge
# Metasploitable2 image (PIPELINE.md §4.2). Bash 4.2 is the most
# clearly-bounded older toolchain we can install reproducibly: the
# vulnerable build is well-documented, the Metasploit module
# (`exploit/multi/http/apache_mod_cgi_bash_env_exec`) is stable, and
# the exploit itself doesn't rely on any specific guest perl / python
# version.

name = "shellshock"
description = "Alpine 3.21 + bash 4.2 + Apache mod_cgi vulnerable to CVE-2014-6271"
base_image = "alpine-3.21-virt"

[promises]
# What the verifier asserts after a clean boot of the produced image.
cve = "CVE-2014-6271"
service_name = "apache"
service_port = 80
service_proto = "tcp"
vulnerable_software = "bash"
vulnerable_version = "4.2"

[containment]
# §4.13 isolation posture. Each field has exactly one safe value.
# Weakening any of these requires amending PIPELINE.md §4.13 and
# operator sign-off — not toggling here.
upstream_egress = false
shared_filesystem = false
unprivileged_qemu = true
fresh_snapshot_per_episode = true
114
vm/targets/shellshock/verify.sh
Executable file
@@ -0,0 +1,114 @@
#!/usr/bin/env bash
# Verify the produced shellshock target satisfies its spec.toml
# promises (PIPELINE.md §4.2 acceptance).
#
# Inputs (env, set by tools/build_target.py):
#   IMAGE_PATH             — staged qcow2 we just built
#   EXPECTED_SERVICE_NAME  — "apache"
#   EXPECTED_SERVICE_PORT  — 80
#   EXPECTED_SERVICE_PROTO — "tcp"
#   EXPECTED_VULN_SOFTWARE — "bash"
#   EXPECTED_VULN_VERSION  — "4.2"
#
# Containment (§4.13) during verification:
#   - SLIRP with restrict=on → no upstream egress (build is done; the
#     image must be self-contained from here on). One hostfwd for the
#     promised port.
#   - snapshot=on so we don't mutate the image.
#   - QEMU as the calling user (whoever ran build_target.py); we're not
#     a root-owned daemon.
#   - No -virtfs / -fsdev shared mounts.

set -euo pipefail

if [[ -z "${IMAGE_PATH:-}" ]]; then
    echo "IMAGE_PATH not set" >&2
    exit 2
fi

PORT="${EXPECTED_SERVICE_PORT:?}"
PROTO="${EXPECTED_SERVICE_PROTO:?}"
VULN_SW="${EXPECTED_VULN_SOFTWARE:?}"
VULN_VER="${EXPECTED_VULN_VERSION:?}"

if [[ "$PROTO" != "tcp" ]]; then
    echo "[verify:shellshock] only TCP services supported in this verifier; got $PROTO" >&2
    exit 1
fi

# Choose an unprivileged host port that maps to guest:$PORT.
HOST_PORT=$((20000 + RANDOM % 5000))
RUN_DIR="$(mktemp -d)"
trap 'qemu-down; rm -rf "$RUN_DIR"' EXIT

qemu-down() {
    if [[ -f "$RUN_DIR/qemu.pid" ]]; then
        local pid
        pid=$(cat "$RUN_DIR/qemu.pid" 2>/dev/null || echo "")
        if [[ -n "$pid" ]] && kill -0 "$pid" 2>/dev/null; then
            kill "$pid" 2>/dev/null || true
            sleep 1
            kill -9 "$pid" 2>/dev/null || true
        fi
    fi
}

echo "[verify:shellshock] booting under §4.13 containment posture"
# -display none only: QEMU refuses to start when -nographic is combined
# with -daemonize.
qemu-system-x86_64 \
    -name cis490-verify-shellshock \
    -machine q35,accel=kvm \
    -cpu host \
    -smp 1 \
    -m 512 \
    -drive file="$IMAGE_PATH",format=qcow2,if=virtio,snapshot=on \
    -netdev "user,id=n0,restrict=on,hostfwd=tcp:127.0.0.1:${HOST_PORT}-:${PORT}" \
    -device virtio-net-pci,netdev=n0 \
    -display none \
    -serial unix:"$RUN_DIR/serial.sock",server=on,wait=off \
    -monitor unix:"$RUN_DIR/monitor.sock",server=on,wait=off \
    -qmp unix:"$RUN_DIR/qmp.sock",server=on,wait=off \
    -pidfile "$RUN_DIR/qemu.pid" \
    -daemonize

# Wait for the service to come up. Apache + first-boot init can take
# 60-90s on cold start; budget 180s.
echo "[verify:shellshock] waiting for service on 127.0.0.1:${HOST_PORT}"
deadline=$((SECONDS + 180))
# Track success explicitly so a probe that succeeds right at the
# deadline is not reported as a timeout.
service_up=0
while (( SECONDS < deadline )); do
    if curl -sf -m 2 "http://127.0.0.1:${HOST_PORT}/cgi-bin/test.cgi" -o "$RUN_DIR/probe.body" 2>/dev/null; then
        service_up=1
        break
    fi
    sleep 2
done
if (( ! service_up )); then
    echo "[verify:shellshock] FAIL: service never came up on 127.0.0.1:${HOST_PORT}" >&2
    exit 1
fi

echo "[verify:shellshock] service responded; checking version + vulnerability"

# The CGI script prints `bash --version`. Assert the running bash is
# the promised vulnerable version (4.2). -F matches the version string
# literally, so the dot is not treated as a regex wildcard.
if ! grep -qF "bash, version ${VULN_VER}" "$RUN_DIR/probe.body"; then
    echo "[verify:shellshock] FAIL: probe body does not show ${VULN_SW} ${VULN_VER}:" >&2
    cat "$RUN_DIR/probe.body" >&2
    exit 1
fi

# Confirm the actual shellshock vulnerability is present. CVE-2014-6271:
# bash mis-parses environment values that LOOK like function definitions
# and executes the commands trailing them. The injected echo runs before
# the CGI script prints its own headers, so it has to emit a well-formed
# header line itself or Apache rejects the response as malformed. The
# X-Shellshock-Probe header name is arbitrary.
echo "[verify:shellshock] confirming CVE-2014-6271 reachable via CGI"
curl -s -m 5 \
    -H 'User-Agent: () { :;}; echo "X-Shellshock-Probe: VULN_OK_MARKER"' \
    "http://127.0.0.1:${HOST_PORT}/cgi-bin/test.cgi" \
    -D "$RUN_DIR/headers" -o "$RUN_DIR/body" || true
if ! grep -q "VULN_OK_MARKER" "$RUN_DIR/body" "$RUN_DIR/headers" 2>/dev/null; then
    echo "[verify:shellshock] FAIL: shellshock probe didn't trigger payload echo" >&2
    echo " headers:" >&2; cat "$RUN_DIR/headers" >&2
    echo " body:" >&2; cat "$RUN_DIR/body" >&2
    exit 1
fi

echo "[verify:shellshock] PASS: ${VULN_SW} ${VULN_VER} on ${EXPECTED_SERVICE_NAME}:${PORT} confirmed exploitable"