CIS490/tests/test_containment.py
Max Gorog 4d29b7236d PIPELINE §5 step 3: target VM build infrastructure + containment posture
§4.2 calls for target VMs we BUILD, not VMs we fetch. §4.13 demands
every target ship the same isolation posture (no upstream egress, no
host-shared FS, unprivileged QEMU, fresh snapshot per episode). This
commit lands the infrastructure for both.

New surface:
  * orchestrator/target_spec.py
      Loads + validates `vm/targets/<name>/spec.toml`. Containment
      fields are not knobs — each has exactly ONE safe value, and a
      spec asserting the unsafe value is rejected at load time. There's
      no `--containment-override`; weakening §4.13 requires amending
      PIPELINE.md and operator sign-off.

  * tools/build_target.py
      Orchestrates build → verify → publish for a single target. Spec
      invalid → exit 78 (sysadmin error). build.sh failure → image not
      published. verify.sh failure → image discarded; that's the §4.2
      acceptance gate. Publishes sha256 + the manifest.toml stanza the
      operator copies in to admit the image (§16 substantive amendment
      with sign-off per §15).

  * vm/targets/<name>/{spec.toml,build.sh,verify.sh}
      Template structure. spec.toml is the contract; build.sh produces
      $OUT_PATH; verify.sh boots the produced image under the §4.13
      containment posture and asserts every promise.

  * vm/targets/shellshock/
      First real working target. CVE-2014-6271 (Apache mod_cgi + bash
      4.2 mis-parsing function-export environment values). Replaces
      the SourceForge Metasploitable2 path that §3 evidence proved
      unverifiable. Bash 4.2 is built from sha256-pinned GNU source
      inside an Alpine 3.21 cloudinit guest; the build script asserts
      the produced bash actually triggers shellshock; the verifier
      re-asserts it under restrict=on with a real CVE-2014-6271 probe.

  * vm/targets/README.md
      How operators add a target. Walks the spec → build → verify →
      manifest amendment loop.

Containment regression tests (tests/test_containment.py) — 20 new
assertions, parameterized over every target with a build/verify trio:

  * verify.sh MUST contain `restrict=on` on its netdev (§4.13)
  * verify.sh MUST contain `snapshot=on` on the boot drive (§4.13)
  * verify.sh + build.sh MUST NOT contain -virtfs / -fsdev / 9pfs
  * verify.sh + build.sh MUST NOT wrap qemu-system in `sudo`
  * Every target must ship the complete spec.toml + build.sh + verify.sh
    trio — no half-built targets (§1 default-to-removal)

Spec validation tests (tests/test_target_spec.py): 13 new tests over
spec parse, name/dir mismatch, missing fields, out-of-range port, and
the §4.13 containment field validators (each unsafe value rejected
with a clear error).

The shellshock target's image is NOT yet published to manifest.toml's
[[targets.images]] — that's the §15 sign-off amendment that lands
after a successful operator-driven build_target.py run on a lab host
with KVM. Building takes ~10 min on x86_64; cannot run on the Pi
under TCG. Operator drives the first build, verifies the sha256, then
amends manifest.toml in a follow-up commit.

261 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 01:31:40 -05:00

195 lines
7.3 KiB
Python

"""§4.13 containment regression tests.
Every shell script that boots a target VM (build.sh, verify.sh,
launch_target.sh) must hold the containment posture:
* SLIRP `restrict=on` — no upstream egress, OR explicit operator
-declared bridge with no internet route. (Both shellshock build
and verify use restrict=on; build temporarily allows egress for
package fetch but verify confirms-with-restrict.)
* NO `-virtfs`, `-fsdev`, `-9pfs`, or any host-shared mount.
* `snapshot=on` on the boot drive so verification doesn't mutate
the artifact.
* No `sudo` / setuid wrapper around `qemu-system-*` invocation.
These are static checks against the shell scripts in the repo —
they catch a regression at PR time, before any image gets built.
A complementary runtime check would have to actually boot the VM,
which is too heavy for CI; the spec.toml validator gives us the
runtime declaration that the verifier must satisfy.
"""
from __future__ import annotations
import re
from pathlib import Path
import pytest
REPO_ROOT = Path(__file__).resolve().parent.parent
TARGETS_DIR = REPO_ROOT / "vm" / "targets"
def _strip_bash_comments(src: str) -> str:
"""Drop comment-only lines so the containment scan doesn't trip on
inline documentation that mentions the forbidden flags."""
out = []
for line in src.splitlines():
stripped = line.lstrip()
if stripped.startswith("#"):
continue
# Inline comments after code: trim everything after `#` UNLESS
# the # is inside single or double quotes. Cheap heuristic:
# only strip when there's a space before the #.
idx = line.find(" #")
if idx > 0:
line = line[:idx]
out.append(line)
return "\n".join(out)
def _every_target_with_verify():
if not TARGETS_DIR.exists():
return []
out = []
for child in TARGETS_DIR.iterdir():
if not child.is_dir():
continue
verify = child / "verify.sh"
if verify.exists():
out.append((child.name, verify))
return out
def _every_target_with_build():
if not TARGETS_DIR.exists():
return []
out = []
for child in TARGETS_DIR.iterdir():
if not child.is_dir():
continue
build = child / "build.sh"
if build.exists():
out.append((child.name, build))
return out
@pytest.mark.parametrize(
"name,path",
_every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_uses_restrict_on(name, path: Path) -> None:
"""Every verify.sh MUST boot the produced image with no upstream
egress. The verifier proves the image satisfies its spec under the
SAME containment posture the orchestrator will use at episode time
— a target that needs internet access during verification can't be
trusted to behave under §4.13."""
src = path.read_text()
qemu_inv = re.search(r"qemu-system-x86_64\b[^`]*?(?=\n[A-Z#]|\Z)", src,
re.DOTALL)
assert qemu_inv, f"{name}: no qemu-system-x86_64 invocation in verify.sh"
qemu_text = qemu_inv.group(0)
assert "restrict=on" in qemu_text, (
f"{name}: verify.sh qemu invocation must include "
f"`restrict=on` on its netdev (§4.13). Got:\n{qemu_text}"
)
@pytest.mark.parametrize(
"name,path",
_every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_no_shared_filesystem(name, path: Path) -> None:
"""Targets MUST NOT have host-shared mounts during verification."""
src = _strip_bash_comments(path.read_text())
for forbidden in ("-virtfs", "-fsdev", "9pfs", "9p,trans"):
assert forbidden not in src, (
f"{name}: verify.sh contains `{forbidden}` — host-shared "
f"filesystem is a §4.13 containment regression."
)
@pytest.mark.parametrize(
"name,path",
_every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_uses_snapshot_on(name, path: Path) -> None:
"""`snapshot=on` so verification doesn't mutate the build
artifact's bytes — the sha256 must be stable from build to
publish to dataset-time use."""
src = path.read_text()
# Search for the boot drive line; allow flexible spacing/quoting.
drive_match = re.search(r'-drive[^\n]*file=[^\n]*\.qcow2[^\n]*', src)
if drive_match:
assert "snapshot=on" in drive_match.group(0), (
f"{name}: verify.sh boot drive must use snapshot=on (§4.13)"
)
@pytest.mark.parametrize(
"name,path",
_every_target_with_verify() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_verify_no_sudo_qemu(name, path: Path) -> None:
"""QEMU MUST run as the unprivileged caller. No sudo / setuid
wrappers (§4.13)."""
src = path.read_text()
assert "sudo qemu-system" not in src, (
f"{name}: verify.sh wraps qemu-system in sudo — privileged "
f"QEMU is a §4.13 containment regression."
)
@pytest.mark.parametrize(
"name,path",
_every_target_with_build() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_build_no_shared_filesystem(name, path: Path) -> None:
"""Even during build (when egress is permitted for package fetch),
no host-shared filesystem mount."""
src = _strip_bash_comments(path.read_text())
for forbidden in ("-virtfs", "-fsdev", "9pfs", "9p,trans"):
assert forbidden not in src, (
f"{name}: build.sh contains `{forbidden}` — host-shared "
f"filesystem is a §4.13 containment regression."
)
@pytest.mark.parametrize(
"name,path",
_every_target_with_build() or [pytest.param("none", None, marks=pytest.mark.skip(reason="no targets yet"))],
)
def test_build_no_sudo_qemu(name, path: Path) -> None:
src = path.read_text()
assert "sudo qemu-system" not in src, (
f"{name}: build.sh wraps qemu-system in sudo (§4.13)"
)
# ---------------------------------------------------------------------
# Cross-cutting: every target with a build.sh must also have verify.sh
# and vice versa. Half-built targets are a §1 violation.
# ---------------------------------------------------------------------
def test_every_target_has_complete_trio() -> None:
"""spec.toml + build.sh + verify.sh — all three or none. A target
with a spec but no build script is a stub; a build with no verify
bypasses §4.2 acceptance. PIPELINE.md §1 default-to-removal: if
you can't ship the trio, ship nothing."""
if not TARGETS_DIR.exists():
return
for child in TARGETS_DIR.iterdir():
if not child.is_dir():
continue
files = {p.name for p in child.iterdir() if p.is_file()}
# At least one of {spec.toml, build.sh, verify.sh} present means
# all three required.
relevant = {"spec.toml", "build.sh", "verify.sh"} & files
if not relevant:
continue
missing = {"spec.toml", "build.sh", "verify.sh"} - files
assert not missing, (
f"target {child.name}: incomplete trio, missing {missing}. "
f"§1 default-to-removal: complete the trio or remove the dir."
)