PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml

The experiment is now defined by a single version-pinned file: manifest.toml at the repo root (PIPELINE.md §4.1 / §13 / §16). Every lab host loads THIS exact file; per-host overrides of experiment shape are forbidden.

Drops the following per-host CLI overrides that previously violated the canonical-manifest principle:

* --manifest, --modules-dir (paths now derived)
* --ram-per-vm-mib (in manifest.experiment)
* --max-concurrent (manifest.experiment.fleet.max_concurrent_ceiling)
* --max-tier3-slots (manifest.experiment.fleet.max_tier3_slots)
* --force-tier2 (not a §14 sanctioned override knob; ship an empty catalog to disable Tier-3)
* --require-real-samples (sample-side concern; out of fleet scope)
* tools/run_*_demo.py --manifest (samples path now from canonical)

New surface:

* manifest.toml: the single source of truth
* orchestrator/manifest.py: load_canonical() + Manifest dataclass with strict validation; raises ManifestError on any failure
* EpisodeConfig.experiment_meta: populated by run_*_demo.py from the canonical manifest; stamped into every episode's meta.json under the "experiment" key for provenance
* cis490-orchestrator.service: RestartPreventExitStatus=78 so manifest-load failures stay stuck-and-loud (§9, §4.7)
* install-lab-host.sh: validates manifest.toml at install time; missing or invalid = die with a clear message

Catalog admission semantics: only modules whose name appears in manifest.catalog get loaded into the runtime catalog (§4.3 in miniature; will tighten further in step 4 when verified_against / last_verified actually gate admission). A missing toml for an admitted name is a sysadmin error → exit 78.

Renames cfg.manifest → cfg.samples and adds cfg.experiment to disambiguate the sample manifest from the experiment manifest.

Rewrites the test_fleet.py fixture to construct synthetic Manifest objects so test outcomes don't depend on the on-disk manifest.toml content.

12 new tests in tests/test_manifest.py: schema-version mismatch, unknown collector, duplicate collector, unknown phase, negative phase seconds, negative ram, missing catalog fields, json round-trip.

Local run: `python tools/run_fleet.py --capacity` correctly logs the loaded manifest and prints capacity.

241 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 01:25:01 -05:00
commit 207a902c3e
parent dca6144a4a
11 changed files with 971 additions and 105 deletions
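The catalog admission rule from the commit message (only names listed in the manifest catalog are loaded, and an admitted name with no module toml is a sysadmin error) can be sketched roughly as below. This is an illustration under assumptions, not the code from this commit: `admit_modules`, `CatalogError`, `demo`, and the flat `<name>.toml` directory layout are hypothetical names chosen for the example.

```python
import tempfile
from pathlib import Path

EX_CONFIG = 78  # exit status the unit file treats as "do not restart"


class CatalogError(ValueError):
    """Hypothetical error: an admitted module has no config on disk."""


def admit_modules(admitted: list[str], modules_dir: Path) -> dict[str, Path]:
    """Return {name: toml_path} for admitted names only.

    Files on disk that are not named in `admitted` are ignored; an
    admitted name without a file raises CatalogError.
    """
    loaded: dict[str, Path] = {}
    for name in admitted:
        path = modules_dir / f"{name}.toml"
        if not path.is_file():
            raise CatalogError(f"admitted module {name!r} has no config at {path}")
        loaded[name] = path
    return loaded


def demo() -> tuple[list[str], int]:
    """Admit one of two on-disk modules, then try an admitted-but-missing one."""
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        (d / "alpha.toml").write_text("", encoding="utf-8")
        (d / "beta.toml").write_text("", encoding="utf-8")  # on disk, not admitted
        names = sorted(admit_modules(["alpha"], d))
        try:
            admit_modules(["alpha", "gamma"], d)
            code = 0
        except CatalogError:
            code = EX_CONFIG  # caller would pass this to sys.exit()
        return names, code
```

In this sketch an un-admitted file (`beta.toml`) is silently skipped, while an admitted name with no file (`gamma`) is a hard failure, which matches the asymmetry the commit message describes.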
--- a/etc/cis490-orchestrator.service
+++ b/etc/cis490-orchestrator.service
@@ -18,17 +18,25 @@ EnvironmentFile=/etc/cis490/lab-host.env
 # unit still starts on Tier-2-only hosts where msfrpcd isn't installed.
 EnvironmentFile=-/etc/cis490/msfrpc.env
 # Fleet mode: detect host capacity, run that many concurrent episodes
-# per wave with samples drawn from the manifest. Each invocation runs
-# one wave and exits; systemd respawns per Restart= below, giving us
-# a continuous stream of fresh-sample episodes per host. The shipper
-# picks them up as `done.marker` files appear.
+# per wave with samples + experiment shape drawn from the canonical
+# manifest at /opt/cis490/manifest.toml. Each invocation runs one wave
+# and exits; systemd respawns per Restart= below.
+#
+# Per PIPELINE.md §4.1 there are no --manifest, --max-tier3-slots,
+# --ram-per-vm-mib, --max-concurrent, --force-tier2, or
+# --require-real-samples flags. Experiment-shape parameters live in
+# manifest.toml. Per-host overrides are forbidden.
+#
+# Exit 78 (sysadmin error) when the canonical manifest fails to load
+# or when the host can't run the experiment. RestartPreventExitStatus=78
+# keeps the unit stuck-and-loud rather than respawning into the same
+# broken state — operator notices and fixes.
 ExecStart=/opt/cis490/.venv/bin/python /opt/cis490/tools/run_fleet.py \
    --data-root /var/lib/cis490/data \
-    --manifest /opt/cis490/samples/manifest.toml \
-    --waves 1 \
-    --max-tier3-slots 4
+    --waves 1
 Restart=always
 RestartSec=15
+RestartPreventExitStatus=78

 # Hardening — explicitly grant CAP_NET_RAW for tcpdump (source 4) and
 # CAP_SYS_ADMIN / CAP_PERFMON for perf (source 3) when the operator
--- a/manifest.toml
+++ b/manifest.toml
@@ -0,0 +1,145 @@
+# CIS490 canonical experiment manifest.
+#
+# This file is the single source of truth for what the experiment IS:
+# which collectors run, at what cadence, against what target images,
+# with which exploit modules in rotation, walking which phase budget,
+# under what concurrency cap. Every lab host loads THIS exact file.
+# Per-host overrides are forbidden — PIPELINE.md §4.1 / §13.
+#
+# Hosts that cannot run the canonical experiment exit 78 at
+# orchestrator startup (PIPELINE.md §4.7). They produce zero episodes
+# and say so loudly. There is no "ship what we can" fallback.
+#
+# Substantive amendments to this file follow PIPELINE.md §16:
+# operator sign-off (§15), §8 decision tests, lands in the same merge
+# as the code change it justifies. "This is just config" is not a
+# free pass — the manifest is admission scope (§13).
+
+schema_version = 1
+name = "cis490-spectral-v1"
+
+# ---------------------------------------------------------------------
+# [experiment] — episode-shape parameters
+# ---------------------------------------------------------------------
+[experiment]
+# Per-VM RAM in mebibytes. The capacity detector divides available
+# host memory by this number to compute max_concurrent. Every slot
+# gets the same RAM — no per-host fuzz.
+ram_per_vm_mib = 320
+
+# ---------------------------------------------------------------------
+# [experiment.schedule] — phase budget (§4.5)
+# ---------------------------------------------------------------------
+# Each phase carries a duration in seconds — the MAXIMUM time the
+# orchestrator will wait in that phase. Phase TRANSITIONS are written
+# from observed events (PIPELINE.md §4.5), not from this clock. The
+# clock is just an upper bound: if no event fires within the budget,
+# the orchestrator advances and records the failure.
+[[experiment.schedule.phases]]
+name = "clean"
+seconds = 10.0
+
+[[experiment.schedule.phases]]
+name = "armed"
+seconds = 3.0
+
+[[experiment.schedule.phases]]
+name = "infecting"
+seconds = 5.0
+
+[[experiment.schedule.phases]]
+name = "infected_running"
+seconds = 25.0
+
+[[experiment.schedule.phases]]
+name = "dormant"
+seconds = 15.0
+
+[[experiment.schedule.phases]]
+name = "infected_running"
+seconds = 20.0
+
+[[experiment.schedule.phases]]
+name = "dormant"
+seconds = 5.0
+
+[[experiment.schedule.phases]]
+name = "clean"
+seconds = 5.0
+
+# ---------------------------------------------------------------------
+# [experiment.fleet] — concurrency policy
+# ---------------------------------------------------------------------
+[experiment.fleet]
+# Hard ceiling on per-wave slots regardless of host capacity. 0 = no
+# ceiling (capacity detector decides). Hosts whose capacity exceeds
+# this still run only `max_concurrent_ceiling` slots, so the dataset
+# isn't dominated by the largest host.
+max_concurrent_ceiling = 0
+
+# Cap of Tier-3 (real-exploit) slots per wave. Slots beyond this fall
+# back to Tier-2. 0 = no cap. Useful when msfrpcd contention starts
+# to matter; left at 0 by default.
+max_tier3_slots = 0
+
+# ---------------------------------------------------------------------
+# [collectors] — active set (§4.4)
+# ---------------------------------------------------------------------
+# A collector listed here MUST emit ≥1 row when run against the
+# canonical-manifest experiment (§4.4 admission). A collector that
+# can't pass admission is REMOVED from this list — never silently
+# included with zero rows.
+[collectors]
+active = [
+    "proc",
+    "qmp",
+    "perf",
+    "guest_agent",
+    "pcap",
+    "netflow",
+]
+
+[collectors.intervals]
+proc_ms = 100
+qmp_ms = 1000
+perf_ms = 100
+guest_agent_ms = 100
+pcap_snaplen = 256
+netflow_bucket_ms = 100
+
+# ---------------------------------------------------------------------
+# [catalog] — Tier-3 module catalog (§4.3)
+# ---------------------------------------------------------------------
+# Each entry references a module config in `exploits/modules/<name>.toml`.
+# Every entry MUST carry `verified_against` and `last_verified`. The
+# absence of either drops the module from the active catalog.
+#
+# Empty as of 2026-05-04: PIPELINE.md §3 found 0/67 session_open on
+# samba_usermap_script against the SourceForge Metasploitable2 image,
+# and no in-house verified target exists yet. §5 step 3 builds the
+# target VM, step 4 re-admits modules with verification recorded.
+# Until then, Tier-3 episodes do not run — the dataset is honest
+# Tier-2 only.
+[catalog]
+modules = []
+
+# ---------------------------------------------------------------------
+# [targets] — target VM images (§4.2)
+# ---------------------------------------------------------------------
+# Each entry pins (image_name, sha256, build_script). The image MUST
+# have been produced by `build_script` and verified per §4.2. Hosts
+# that don't have the expected image at the expected sha256 fail
+# preflight (§4.7) and produce zero episodes.
+#
+# Empty until §5 step 3 lands declarative target builds.
+[targets]
+images = []
+
+# ---------------------------------------------------------------------
+# [samples] — pointer to the malware-sample manifest
+# ---------------------------------------------------------------------
+# Samples roll forward independently of experiment shape (new sample
+# = manifest entry; doesn't change collector set or schedule). Path
+# is relative to repo root.
+[samples]
+manifest_path = "samples/manifest.toml"
--- a/orchestrator/episode.py
+++ b/orchestrator/episode.py
@@ -160,6 +160,13 @@ class EpisodeConfig:
    # doesn't need to import exploits.modules; populated by callers
    # that have a ModuleConfig in hand.
    exploit_meta: dict | None = None
+    # Canonical experiment manifest provenance (PIPELINE.md §4.1).
+    # Plain dict (output of Manifest.to_meta()) so the runner doesn't
+    # need to import orchestrator.manifest; populated by run_fleet /
+    # run_*_demo via load_canonical(). When None, meta.experiment is
+    # null — useful for unit tests that don't go through the canonical
+    # path; production episodes always carry it.
+    experiment_meta: dict | None = None
    # Snapshot/revert (Tier 0+):
    #   revert_at_start — before any phase walks, loadvm <snapshot_name>.
    #     Use this to drop the guest back to a known-good baseline at
@@ -466,6 +473,7 @@ class EpisodeRunner:
            },
            "exploit": self.cfg.exploit_meta,
            "sample": sample_meta,
+            "experiment": self.cfg.experiment_meta,
            "schedule": {
                "baseline_seconds": self.cfg.duration_s,
                "interval_ms": self.cfg.interval_ms,
--- a/orchestrator/fleet.py
+++ b/orchestrator/fleet.py
@@ -43,6 +43,8 @@ from exploits.modules import (
 )
 from samples.manifest import Sample, SampleManifest

+from .manifest import Manifest
+

 log = logging.getLogger("cis490.fleet")

@@ -93,32 +95,51 @@ class FleetCapacity:

 @dataclass
 class FleetConfig:
+    """Per-host config that combines (a) the canonical experiment
+    manifest, (b) per-host identity (host_id), and (c) host-local
+    paths (data_root, repo_root). Experiment-shape parameters live
+    in `experiment` (the canonical Manifest). Per-host overrides of
+    experiment shape are forbidden — see PIPELINE.md §4.1."""
    host_id: str
    repo_root: Path
    data_root: Path
-    manifest: SampleManifest
+    # Canonical experiment manifest — drives schedule, ram_per_vm_mib,
+    # collector active set, intervals, fleet ceilings, catalog, targets.
+    experiment: "Manifest"
+    # Sample manifest (separate file at samples/manifest.toml). Holds
+    # the per-malware-family sample list; rolls forward independently
+    # of experiment shape so a new sample is one-line, not a
+    # canonical-manifest amendment.
+    samples: SampleManifest
    # Module catalog for Tier-3 dispatch. Required for fleet-driven
-    # exploit-fire variety; empty catalog forces Tier-2 fallback.
+    # exploit-fire variety; empty catalog forces Tier-2 fallback. The
+    # canonical manifest's catalog admission gates which modules end
+    # up here (only those whose toml exists AND whose name is in
+    # experiment.catalog get loaded).
    modules: dict[str, ModuleConfig] = field(default_factory=dict)
-    # VM resource shape — must match what the launcher requests.
-    ram_per_vm_mib: int = 320
-    # Cap concurrency below the calculated max (e.g. for a smoke test).
-    max_concurrent_override: int | None = None
-    # Skip episodes whose sample requires a real binary that's not present.
-    require_real_samples: bool = False
-    # Force Tier-2 even when msfrpcd is up; used by tests + dev runs
-    # that want a no-exploit baseline.
-    force_tier2: bool = False
-    # Limit how many slots per wave run as Tier-3. Slots 0..N-1 get
-    # Tier-3; the rest fall back to Tier-2. Metasploitable2 boot is IO-
-    # bound: running >~6 concurrent target VMs saturates disk and causes
-    # all slots to timeout waiting for the guest service to come up.
-    # None = no cap (all eligible slots use Tier-3).
-    max_tier3_slots: int | None = None
-    # msfrpcd connectivity (read by tier-3 driver via env).
+    # msfrpcd connectivity (read by tier-3 driver via env). Per-host
+    # because msfrpcd may be on a non-loopback bind on some hosts;
+    # NOT an experiment-shape parameter.
    msfrpcd_host: str = "127.0.0.1"
    msfrpcd_port: int = 55553

+    # ---- experiment-shape derived properties (manifest-driven) -----
+
+    @property
+    def ram_per_vm_mib(self) -> int:
+        return self.experiment.ram_per_vm_mib
+
+    @property
+    def max_concurrent_ceiling(self) -> int:
+        """Hard ceiling on per-wave slots; 0 = no ceiling (capacity
+        detector decides)."""
+        return self.experiment.fleet.max_concurrent_ceiling
+
+    @property
+    def max_tier3_slots(self) -> int:
+        """Cap of Tier-3 slots per wave; 0 = no cap."""
+        return self.experiment.fleet.max_tier3_slots
+

 def _read_meminfo() -> dict[str, int]:
    out: dict[str, int] = {}
@@ -263,11 +284,16 @@ def _run_slot(
        {k: v for k, v in cfg.modules.items() if not v.requires_bridge}
        if cfg.modules else {}
    )
+    # Tier-3 dispatch requires (a) at least one verified module loaded
+    # AND not filtered out by requires_bridge, (b) msfrpcd reachable,
+    # (c) the slot index is below the manifest's max_tier3_slots cap
+    # (0 = no cap). force_tier2 is no longer a sanctioned override knob
+    # per §14: to disable Tier-3 globally, ship an empty catalog (the
+    # current state); to limit it to slots 0..N-1, set the cap to N.
    tier3_ready = (
-        not cfg.force_tier2
-        and bool(usable_modules)
+        bool(usable_modules)
        and _msfrpcd_available(cfg.msfrpcd_host, cfg.msfrpcd_port)
-        and (cfg.max_tier3_slots is None or slot < cfg.max_tier3_slots)
+        and (cfg.max_tier3_slots == 0 or slot < cfg.max_tier3_slots)
    )

    env = os.environ.copy()
@@ -345,13 +371,13 @@ def _run_slot(
        ]
        tier = "tier2"
        module_name = None
-        if not cfg.force_tier2 and not cfg.modules:
+        if not cfg.modules:
            log.warning("slot=%d falling back to Tier 2: empty module catalog", slot)
-        elif not cfg.force_tier2 and not usable_modules:
+        elif not usable_modules:
            log.warning("slot=%d falling back to Tier 2: no non-bridge modules available", slot)
-        elif not cfg.force_tier2 and cfg.max_tier3_slots is not None and slot >= cfg.max_tier3_slots:
+        elif cfg.max_tier3_slots > 0 and slot >= cfg.max_tier3_slots:
            log.debug("slot=%d Tier 2 by max_tier3_slots=%d cap", slot, cfg.max_tier3_slots)
-        elif not cfg.force_tier2:
+        else:
            log.warning("slot=%d falling back to Tier 2: msfrpcd unreachable at %s:%d",
                        slot, cfg.msfrpcd_host, cfg.msfrpcd_port)

@@ -422,8 +448,12 @@ class FleetRunner:
            ram_per_vm_mib=self.cfg.ram_per_vm_mib,
        )
        n_slots = capacity.max_concurrent
-        if self.cfg.max_concurrent_override is not None:
-            n_slots = min(n_slots, self.cfg.max_concurrent_override)
+        # Manifest ceiling: 0 = no ceiling. The capacity detector
+        # decides per-host based on hardware; the manifest caps that
+        # decision to keep the dataset balanced across hosts of
+        # different sizes (PIPELINE.md §4.1).
+        if self.cfg.max_concurrent_ceiling > 0:
+            n_slots = min(n_slots, self.cfg.max_concurrent_ceiling)
        if n_slots <= 0:
            log.warning(
                "fleet capacity is zero (%s); cannot run", capacity.rationale,
@@ -433,8 +463,9 @@ class FleetRunner:
            )

        log.info(
-            "fleet host=%s slots=%d episodes=%d manifest_size=%d",
-            self.cfg.host_id, n_slots, episodes, len(self.cfg.manifest),
+            "fleet host=%s slots=%d episodes=%d samples=%d experiment=%s",
+            self.cfg.host_id, n_slots, episodes, len(self.cfg.samples),
+            self.cfg.experiment.name,
        )

        all_results: list[SlotResult] = []
@@ -444,18 +475,13 @@ class FleetRunner:
                break
            episode_index = episode_index_base + ep
            slot_samples = [
-                self.cfg.manifest.select(
+                self.cfg.samples.select(
                    host_id=self.cfg.host_id,
                    slot=slot,
                    episode_index=episode_index,
                )
                for slot in range(n_slots)
            ]
-            if self.cfg.require_real_samples:
-                slot_samples = [s for s in slot_samples if s.kind == "real"]
-                if not slot_samples:
-                    log.warning("require_real_samples: no real samples in manifest; skipping wave")
-                    continue

            log.info(
                "wave %d/%d: %s",
@@ -491,8 +517,8 @@ class FleetRunner:
 # ---------------------------------------------------------------------------


-def capacity_report() -> str:
-    c = detect_capacity()
+def capacity_report(*, ram_per_vm_mib: int = 320) -> str:
+    c = detect_capacity(ram_per_vm_mib=ram_per_vm_mib)
    return (
        f"cores: {c.cores_total} (reserve {c.cores_reserved})\n"
        f"ram:   {c.ram_total_mib} MiB total, {c.ram_available_mib} MiB available "
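The two manifest-driven caps in this file share one convention: 0 means "no cap", and any positive value is an upper bound. The standalone sketch below restates that logic outside `FleetRunner` and `_run_slot` so the edge cases are easy to see. `effective_slots` and `tier3_allowed` are hypothetical helper names for illustration; the diff above inlines these checks rather than defining such functions.

```python
def effective_slots(capacity_slots: int, max_concurrent_ceiling: int) -> int:
    """Slots per wave after applying the manifest ceiling (0 = no ceiling)."""
    if max_concurrent_ceiling > 0:
        return min(capacity_slots, max_concurrent_ceiling)
    return capacity_slots


def tier3_allowed(slot: int, max_tier3_slots: int) -> bool:
    """True when the slot index is under the Tier-3 cap (0 = no cap).

    This covers only the cap condition; the real dispatch also needs a
    usable module and a reachable msfrpcd.
    """
    return max_tier3_slots == 0 or slot < max_tier3_slots


# A 12-slot host under three different ceilings.
uncapped = effective_slots(12, 0)
capped = effective_slots(12, 4)
small_host = effective_slots(3, 4)

# With max_tier3_slots = 2, only slots 0 and 1 are Tier-3 eligible.
eligible = [s for s in range(4) if tier3_allowed(s, 2)]
```

One consequence of the 0-means-unlimited convention is that the cap alone cannot express "zero Tier-3 slots"; per the comments in the diff, that case is handled by shipping an empty catalog.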
--- a/orchestrator/manifest.py
+++ b/orchestrator/manifest.py
@@ -0,0 +1,371 @@
+"""Canonical experiment manifest (PIPELINE.md §4.1 / §13).
+
+The manifest at `<repo_root>/manifest.toml` is the single source of
+truth for what the experiment is: which collectors run, at what
+cadence, against what targets, with which exploit modules in rotation,
+walking which phase budget. Every lab host loads THIS file. There is
+no per-host override flag, no `--manifest <path>` argument, no
+fallback. A host that can't load and validate the canonical manifest
+must exit 78 and ship zero episodes.
+
+`load_canonical(repo_root)` reads from the fixed path and validates.
+On any failure it raises `ManifestError`; callers translate that into
+exit 78. `Manifest` is a frozen dataclass — once loaded the values
+don't move under us mid-run.
+
+Substantive amendments follow PIPELINE.md §16: operator sign-off,
+landed in the same merge as the code change, with §8 decision tests
+applied to the amendment itself.
+"""
+
+from __future__ import annotations
+
+import tomllib
+from dataclasses import dataclass, field
+from pathlib import Path
+
+
+CANONICAL_FILENAME = "manifest.toml"
+
+# Closed enums — keep in sync with the corresponding code that
+# implements each name. A name not in these sets means the manifest
+# is asking for something the orchestrator doesn't know how to do.
+KNOWN_COLLECTORS: frozenset[str] = frozenset({
+    "proc",
+    "qmp",
+    "perf",
+    "guest_agent",
+    "pcap",
+    "netflow",
+})
+
+KNOWN_PHASES: frozenset[str] = frozenset({
+    "clean",
+    "armed",
+    "infecting",
+    "infected_running",
+    "dormant",
+    "failed",
+})
+
+
+class ManifestError(ValueError):
+    """Raised when the canonical manifest is missing, unreadable, or
+    fails validation. The orchestrator translates this into exit 78
+    (PIPELINE.md §4.7 / §9)."""
+
+
+@dataclass(frozen=True)
+class Phase:
+    name: str
+    seconds: float
+
+
+@dataclass(frozen=True)
+class CollectorIntervals:
+    proc_ms: int
+    qmp_ms: int
+    perf_ms: int
+    guest_agent_ms: int
+    pcap_snaplen: int
+    netflow_bucket_ms: int
+
+
+@dataclass(frozen=True)
+class FleetPolicy:
+    max_concurrent_ceiling: int
+    max_tier3_slots: int
+
+
+@dataclass(frozen=True)
+class CatalogEntry:
+    name: str
+    verified_against: str
+    last_verified: str
+
+
+@dataclass(frozen=True)
+class TargetSpec:
+    image_name: str
+    sha256: str
+    build_script: str
+
+
+@dataclass(frozen=True)
+class Manifest:
+    schema_version: int
+    name: str
+    ram_per_vm_mib: int
+    schedule: tuple[Phase, ...]
+    fleet: FleetPolicy
+    collectors_active: tuple[str, ...]
+    intervals: CollectorIntervals
+    catalog: tuple[CatalogEntry, ...]
+    targets: tuple[TargetSpec, ...]
+    samples_manifest_path: str
+    # Resolved repo root + manifest path so callers can stamp them
+    # into meta.json for provenance without re-deriving.
+    repo_root: Path = field(repr=False)
+    manifest_path: Path = field(repr=False)
+
+    def to_meta(self) -> dict:
+        """Lightweight representation suitable for embedding in
+        meta.json so episodes carry their experiment provenance.
+        Excludes the resolved Path fields (host-specific paths don't
+        belong in the wire-format)."""
+        return {
+            "schema_version": self.schema_version,
+            "name": self.name,
+            "ram_per_vm_mib": self.ram_per_vm_mib,
+            "phases": [
+                {"name": p.name, "seconds": p.seconds} for p in self.schedule
+            ],
+            "fleet": {
+                "max_concurrent_ceiling": self.fleet.max_concurrent_ceiling,
+                "max_tier3_slots": self.fleet.max_tier3_slots,
+            },
+            "collectors_active": list(self.collectors_active),
+            "intervals": {
+                "proc_ms": self.intervals.proc_ms,
+                "qmp_ms": self.intervals.qmp_ms,
+                "perf_ms": self.intervals.perf_ms,
+                "guest_agent_ms": self.intervals.guest_agent_ms,
+                "pcap_snaplen": self.intervals.pcap_snaplen,
+                "netflow_bucket_ms": self.intervals.netflow_bucket_ms,
+            },
+            "catalog": [
+                {"name": c.name, "verified_against": c.verified_against,
+                 "last_verified": c.last_verified}
+                for c in self.catalog
+            ],
+            "targets": [
+                {"image_name": t.image_name, "sha256": t.sha256,
+                 "build_script": t.build_script}
+                for t in self.targets
+            ],
+        }
+
+
+def load_canonical(repo_root: Path | str) -> Manifest:
+    """Load + validate `<repo_root>/manifest.toml`. There is no
+    `manifest_path` parameter on purpose — per §4.1 the canonical
+    manifest lives at exactly one path. Callers that pass the path
+    directly are off the runbook.
+
+    Raises ManifestError on any failure. Successful return guarantees
+    every field is present, every collector name is known, every
+    phase name is known, and every catalog entry has both
+    verified_against and last_verified.
+    """
+    repo_root = Path(repo_root).resolve()
+    path = repo_root / CANONICAL_FILENAME
+    if not path.exists():
+        raise ManifestError(
+            f"canonical manifest not found at {path}. "
+            f"PIPELINE.md §4.1 requires exactly one manifest at the "
+            f"repo root; this host cannot run the experiment without it."
+        )
+    try:
+        raw = tomllib.loads(path.read_text(encoding="utf-8"))
+    except (OSError, UnicodeDecodeError, tomllib.TOMLDecodeError) as e:
+        raise ManifestError(f"cannot read or parse {path}: {e}") from e
+
+    return _validate(raw, repo_root, path)
+
+
+def _validate(raw: dict, repo_root: Path, path: Path) -> Manifest:
+    schema_version = _require_int(raw, "schema_version")
+    if schema_version != 1:
+        raise ManifestError(
+            f"manifest schema_version={schema_version} not supported; "
+            f"this orchestrator handles version 1 only. "
+            f"Upgrade orchestrator or downgrade manifest to match."
+        )
+    name = _require_str(raw, "name")
+
+    experiment = _require_dict(raw, "experiment")
+    ram_per_vm_mib = _require_int(experiment, "ram_per_vm_mib")
+    if ram_per_vm_mib <= 0:
+        raise ManifestError(
+            f"experiment.ram_per_vm_mib must be positive, got {ram_per_vm_mib}"
+        )
+
+    schedule_block = _require_dict(experiment, "schedule")
+    phases_raw = schedule_block.get("phases")
+    if not isinstance(phases_raw, list) or not phases_raw:
+        raise ManifestError(
+            "experiment.schedule.phases must be a non-empty array"
+        )
+    phases: list[Phase] = []
+    for i, p in enumerate(phases_raw):
+        if not isinstance(p, dict):
+            raise ManifestError(
+                f"experiment.schedule.phases[{i}] must be a table"
+            )
+        pname = _require_str(p, "name", ctx=f"phases[{i}]")
+        if pname not in KNOWN_PHASES:
+            raise ManifestError(
+                f"experiment.schedule.phases[{i}].name={pname!r} not in "
+                f"KNOWN_PHASES {sorted(KNOWN_PHASES)}"
+            )
+        secs = _require_float(p, "seconds", ctx=f"phases[{i}]")
+        if secs <= 0:
+            raise ManifestError(
+                f"experiment.schedule.phases[{i}].seconds must be > 0, "
+                f"got {secs}"
+            )
+        phases.append(Phase(name=pname, seconds=secs))
+
+    fleet_block = _require_dict(experiment, "fleet")
+    fleet = FleetPolicy(
+        max_concurrent_ceiling=_require_int(fleet_block, "max_concurrent_ceiling"),
+        max_tier3_slots=_require_int(fleet_block, "max_tier3_slots"),
+    )
+    if fleet.max_concurrent_ceiling < 0 or fleet.max_tier3_slots < 0:
+        raise ManifestError(
+            "experiment.fleet ceilings must be >= 0 (0 = no cap)"
+        )
+
+    collectors_block = _require_dict(raw, "collectors")
+    active_raw = collectors_block.get("active")
+    if not isinstance(active_raw, list):
+        raise ManifestError("collectors.active must be an array")
+    if len(set(active_raw)) != len(active_raw):
+        raise ManifestError(
+            f"collectors.active contains duplicates: {active_raw}"
+        )
+    for c in active_raw:
+        if c not in KNOWN_COLLECTORS:
+            raise ManifestError(
+                f"collectors.active references unknown collector "
+                f"{c!r}; known: {sorted(KNOWN_COLLECTORS)}"
+            )
+    collectors_active = tuple(active_raw)
+
+    intervals_block = _require_dict(collectors_block, "intervals")
+    intervals = CollectorIntervals(
+        proc_ms=_require_int(intervals_block, "proc_ms"),
+        qmp_ms=_require_int(intervals_block, "qmp_ms"),
+        perf_ms=_require_int(intervals_block, "perf_ms"),
+        guest_agent_ms=_require_int(intervals_block, "guest_agent_ms"),
+        pcap_snaplen=_require_int(intervals_block, "pcap_snaplen"),
+        netflow_bucket_ms=_require_int(intervals_block, "netflow_bucket_ms"),
+    )
+    for fname, fval in (
+        ("proc_ms", intervals.proc_ms),
+        ("qmp_ms", intervals.qmp_ms),
+        ("perf_ms", intervals.perf_ms),
+        ("guest_agent_ms", intervals.guest_agent_ms),
+        ("pcap_snaplen", intervals.pcap_snaplen),
+        ("netflow_bucket_ms", intervals.netflow_bucket_ms),
+    ):
+        if fval <= 0:
+            raise ManifestError(
+                f"collectors.intervals.{fname} must be > 0, got {fval}"
+            )
+
+    catalog_block = _require_dict(raw, "catalog")
+    modules_raw = catalog_block.get("modules")
+    if not isinstance(modules_raw, list):
+        raise ManifestError("catalog.modules must be an array")
+    catalog: list[CatalogEntry] = []
+    for i, entry in enumerate(modules_raw):
+        if not isinstance(entry, dict):
+            raise ManifestError(f"catalog.modules[{i}] must be a table")
+        cname = _require_str(entry, "name", ctx=f"catalog[{i}]")
+        verified_against = _require_str(
+            entry, "verified_against", ctx=f"catalog[{i}]"
+        )
+        last_verified = _require_str(
+            entry, "last_verified", ctx=f"catalog[{i}]"
+        )
+        catalog.append(CatalogEntry(
+            name=cname,
+            verified_against=verified_against,
+            last_verified=last_verified,
+        ))
+
+    targets_block = _require_dict(raw, "targets")
+    images_raw = targets_block.get("images")
+    if not isinstance(images_raw, list):
+        raise ManifestError("targets.images must be an array")
+    targets: list[TargetSpec] = []
+    for i, t in enumerate(images_raw):
+        if not isinstance(t, dict):
+            raise ManifestError(f"targets.images[{i}] must be a table")
+        targets.append(TargetSpec(
+            image_name=_require_str(t, "image_name", ctx=f"targets[{i}]"),
+            sha256=_require_str(t, "sha256", ctx=f"targets[{i}]"),
+            build_script=_require_str(t, "build_script", ctx=f"targets[{i}]"),
+        ))
+
+    samples_block = _require_dict(raw, "samples")
+    samples_manifest_path = _require_str(samples_block, "manifest_path")
+
+    return Manifest(
+        schema_version=schema_version,
+        name=name,
+        ram_per_vm_mib=ram_per_vm_mib,
+        schedule=tuple(phases),
+        fleet=fleet,
+        collectors_active=collectors_active,
+        intervals=intervals,
+        catalog=tuple(catalog),
+        targets=tuple(targets),
+        samples_manifest_path=samples_manifest_path,
+        repo_root=repo_root,
+        manifest_path=path,
+    )
+
+
+# ---------- helpers --------------------------------------------------
+
+
+def _require(d: dict, key: str, kind: type, *, ctx: str = "") -> object:
+    where = f"{ctx}." if ctx else ""
+    if key not in d:
+        raise ManifestError(f"missing required field {where}{key}")
+    v = d[key]
+    if not isinstance(v, kind):
+        raise ManifestError(
+            f"field {where}{key} must be {kind.__name__}, got {type(v).__name__}"
+        )
+    return v
+
+
+def _require_str(d: dict, key: str, *, ctx: str = "") -> str:
+    return _require(d, key, str, ctx=ctx)  # type: ignore[return-value]
+
+
+def _require_int(d: dict, key: str, *, ctx: str = "") -> int:
+    # tomllib parses both ints and floats; require a strict int here.
+    # Floats (even int-valued ones such as 320.0) and bools are rejected.
+    where = f"{ctx}." if ctx else ""
+    if key not in d:
+        raise ManifestError(f"missing required field {where}{key}")
+    v = d[key]
+    if isinstance(v, bool):  # bool is a subclass of int — reject explicitly
+        raise ManifestError(f"field {where}{key} must be int, got bool")
+    if isinstance(v, int):
+        return v
+    raise ManifestError(
+        f"field {where}{key} must be int, got {type(v).__name__}"
+    )
+
+
+def _require_float(d: dict, key: str, *, ctx: str = "") -> float:
+    where = f"{ctx}." if ctx else ""
+    if key not in d:
+        raise ManifestError(f"missing required field {where}{key}")
+    v = d[key]
+    if isinstance(v, bool):
+        raise ManifestError(f"field {where}{key} must be number, got bool")
+    if isinstance(v, (int, float)):
+        return float(v)
+    raise ManifestError(
+        f"field {where}{key} must be number, got {type(v).__name__}"
+    )
+
+
+def _require_dict(d: dict, key: str, *, ctx: str = "") -> dict:
+    return _require(d, key, dict, ctx=ctx)  # type: ignore[return-value]
--- a/scripts/install-lab-host.sh
+++ b/scripts/install-lab-host.sh
@@ -124,6 +124,24 @@ command -v perf >/dev/null || die \
 command -v tcpdump >/dev/null || die \
    "tcpdump not on PATH after install attempt — collector source 4 (bridge pcap + netflow) requires it. See ensure_collector_packages above for what was tried."

+# Canonical experiment manifest (PIPELINE.md §4.1). Validate at install
+# time so a host that cloned a repo missing manifest.toml fails the
+# install loudly, not at first orchestrator startup. The orchestrator
+# also validates and exits 78 on bad manifest, but install-time is the
+# earliest point we can fail.
+[[ -f "$REPO_ROOT/manifest.toml" ]] || die \
+    "$REPO_ROOT/manifest.toml not found — every lab host must ship the canonical experiment manifest (§4.1). Did the git pull complete?"
+"$REPO_ROOT/.venv/bin/python" -c '
+import sys, pathlib
+sys.path.insert(0, sys.argv[1])
+from orchestrator.manifest import load_canonical, ManifestError
+try:
+    load_canonical(pathlib.Path(sys.argv[1]))
+except ManifestError as e:
+    print(f"manifest.toml failed validation: {e}", file=sys.stderr)
+    sys.exit(1)
+' "$REPO_ROOT" || die "manifest.toml validation failed; see error above (§4.1)"
+
 # uv is preferred (lockfile-driven). Fall back to system pip if absent.
 USE_UV=0
 if command -v uv >/dev/null; then USE_UV=1; fi
--- a/tests/test_fleet.py
+++ b/tests/test_fleet.py
@@ -245,7 +245,50 @@ def _fixture_modules() -> dict:
    }


-def _fleet_cfg_with_modules(tmp_path: Path, *, force_tier2: bool = False):
+def _fixture_manifest(*, max_tier3_slots: int = 0,
+                      max_concurrent_ceiling: int = 0):
+    """Synthetic canonical Manifest for fleet tests.
+
+    Mirrors the production manifest.toml shape but constructed in-memory
+    so test outcomes don't depend on what the on-disk manifest happens
+    to say. Per-test parameterization (ceilings, future schedule
+    variants) goes through this builder, not through CLI overrides
+    that don't exist anymore (PIPELINE.md §4.1)."""
+    from orchestrator.manifest import (
+        CollectorIntervals, FleetPolicy, Manifest, Phase,
+    )
+    return Manifest(
+        schema_version=1,
+        name="test-fixture",
+        ram_per_vm_mib=320,
+        schedule=(
+            Phase("clean", 10.0),
+            Phase("armed", 3.0),
+            Phase("infecting", 5.0),
+            Phase("infected_running", 25.0),
+            Phase("dormant", 15.0),
+            Phase("clean", 5.0),
+        ),
+        fleet=FleetPolicy(
+            max_concurrent_ceiling=max_concurrent_ceiling,
+            max_tier3_slots=max_tier3_slots,
+        ),
+        collectors_active=("proc", "qmp", "perf", "guest_agent",
+                           "pcap", "netflow"),
+        intervals=CollectorIntervals(
+            proc_ms=100, qmp_ms=1000, perf_ms=100,
+            guest_agent_ms=100, pcap_snaplen=256, netflow_bucket_ms=100,
+        ),
+        catalog=(),
+        targets=(),
+        samples_manifest_path="samples/manifest.toml",
+        repo_root=REPO_ROOT,
+        manifest_path=REPO_ROOT / "manifest.toml",
+    )
+
+
+def _fleet_cfg_with_modules(tmp_path: Path, *, max_tier3_slots: int = 0,
+                            max_concurrent_ceiling: int = 0):
    from orchestrator import fleet
    from samples.manifest import SampleManifest

@@ -254,9 +297,12 @@ def _fleet_cfg_with_modules(tmp_path: Path, *, force_tier2: bool = False):
        host_id="test-host",
        repo_root=repo_root,
        data_root=tmp_path,
-        manifest=SampleManifest.load(repo_root / "samples" / "manifest.toml"),
+        experiment=_fixture_manifest(
+            max_tier3_slots=max_tier3_slots,
+            max_concurrent_ceiling=max_concurrent_ceiling,
+        ),
+        samples=SampleManifest.load(repo_root / "samples" / "manifest.toml"),
        modules=_fixture_modules(),
-        force_tier2=force_tier2,
    )


@@ -273,7 +319,7 @@ def test_fleet_dispatches_to_tier3_when_msfrpcd_listening(monkeypatch, tmp_path)
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()

-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)

    assert res.tier == "tier3", res
@@ -293,7 +339,7 @@ def test_fleet_falls_back_to_tier2_when_msfrpcd_down(monkeypatch, tmp_path) -> N
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()

-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)

    assert res.tier == "tier2"
@@ -309,26 +355,40 @@ def test_fleet_falls_back_to_tier2_when_module_catalog_empty(monkeypatch, tmp_pa
        host_id="test-host",
        repo_root=REPO_ROOT,
        data_root=tmp_path,
-        manifest=SampleManifest.load(REPO_ROOT / "samples" / "manifest.toml"),
+        experiment=_fixture_manifest(),
+        samples=SampleManifest.load(REPO_ROOT / "samples" / "manifest.toml"),
        modules={},  # explicitly empty
    )
    monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()

-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
    assert res.tier == "tier2"


-def test_fleet_force_tier2_overrides_msfrpcd(monkeypatch, tmp_path) -> None:
+def test_fleet_empty_module_catalog_falls_back_to_tier2(monkeypatch, tmp_path) -> None:
+    """An empty module catalog forces Tier-2 fallback even when msfrpcd
+    is reachable. This replaces the former force_tier2 override knob:
+    per PIPELINE.md §14 the closed override list contains only
+    CIS490_ALLOW_DIRTY, and per §1 the right way to disable Tier-3 is
+    to ship no admitted modules — not to flag-flip the orchestrator."""
    from orchestrator import fleet
-    cfg = _fleet_cfg_with_modules(tmp_path, force_tier2=True)
+    from samples.manifest import SampleManifest
+    cfg = fleet.FleetConfig(
+        host_id="test-host",
+        repo_root=REPO_ROOT,
+        data_root=tmp_path,
+        experiment=_fixture_manifest(),
+        samples=SampleManifest.load(REPO_ROOT / "samples" / "manifest.toml"),
+        modules={},  # empty catalog → no Tier-3
+    )
    monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()

-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
    assert res.tier == "tier2"

@@ -344,7 +404,7 @@ def test_fleet_skips_requires_bridge_modules_when_no_bridge(monkeypatch, tmp_pat
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()

-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    seen_modules = set()
    for ep in range(20):
        res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=ep, capacity=capacity)
@@ -379,7 +439,7 @@ def test_tier3_strips_bridge_env_even_when_set(monkeypatch, tmp_path) -> None:
    monkeypatch.setenv("BRIDGE", "br-malware")
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()
-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
    assert "BRIDGE" not in _RecordingPopen.calls[-1]["env"]

@@ -399,7 +459,7 @@ def test_tier3_drops_requires_bridge_modules_unconditionally(monkeypatch, tmp_pa
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()
    slirp_friendly = {k for k, v in cfg.modules.items() if not v.requires_bridge}
-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    seen = set()
    for ep in range(40):
        res = fleet._run_slot(cfg, slot=0, sample=sample,
@@ -422,7 +482,7 @@ def test_fleet_assigns_unique_port_base_per_slot(monkeypatch, tmp_path) -> None:
    _patch_subprocess(monkeypatch)
    capacity = fleet.detect_capacity()

-    sample = cfg.manifest.samples[0]
+    sample = cfg.samples.samples[0]
    fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
    fleet._run_slot(cfg, slot=1, sample=sample, episode_index=0, capacity=capacity)
    fleet._run_slot(cfg, slot=2, sample=sample, episode_index=0, capacity=capacity)
--- a/tests/test_manifest.py
+++ b/tests/test_manifest.py
@@ -0,0 +1,168 @@
+"""Tests for orchestrator/manifest.py — the canonical experiment
+manifest loader (PIPELINE.md §4.1)."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from orchestrator.manifest import (
+    KNOWN_COLLECTORS, KNOWN_PHASES, ManifestError, load_canonical,
+)
+
+
+REPO_ROOT = Path(__file__).resolve().parent.parent
+
+MINIMAL_VALID = """
+schema_version = 1
+name = "test-experiment"
+
+[experiment]
+ram_per_vm_mib = 320
+
+[[experiment.schedule.phases]]
+name = "clean"
+seconds = 1.0
+
+[[experiment.schedule.phases]]
+name = "armed"
+seconds = 1.0
+
+[experiment.fleet]
+max_concurrent_ceiling = 0
+max_tier3_slots = 0
+
+[collectors]
+active = ["proc"]
+
+[collectors.intervals]
+proc_ms = 100
+qmp_ms = 1000
+perf_ms = 100
+guest_agent_ms = 100
+pcap_snaplen = 256
+netflow_bucket_ms = 100
+
+[catalog]
+modules = []
+
+[targets]
+images = []
+
+[samples]
+manifest_path = "samples/manifest.toml"
+"""
+
+
+def _write_manifest(repo: Path, body: str) -> None:
+    (repo / "manifest.toml").write_text(body)
+
+
+def test_canonical_manifest_in_repo_loads() -> None:
+    """The actual manifest.toml shipped in the repo MUST load and
+    validate. If it doesn't, every lab host fails preflight."""
+    m = load_canonical(REPO_ROOT)
+    assert m.schema_version == 1
+    assert m.name == "cis490-spectral-v1"
+    # Every active collector must be in KNOWN_COLLECTORS.
+    for c in m.collectors_active:
+        assert c in KNOWN_COLLECTORS
+    # Every scheduled phase name must be in KNOWN_PHASES.
+    for p in m.schedule:
+        assert p.name in KNOWN_PHASES
+
+
+def test_loads_minimal_valid(tmp_path: Path) -> None:
+    _write_manifest(tmp_path, MINIMAL_VALID)
+    m = load_canonical(tmp_path)
+    assert m.name == "test-experiment"
+    assert len(m.schedule) == 2
+    assert m.fleet.max_concurrent_ceiling == 0
+    assert m.collectors_active == ("proc",)
+
+
+def test_missing_file_raises_manifest_error(tmp_path: Path) -> None:
+    with pytest.raises(ManifestError, match="not found"):
+        load_canonical(tmp_path)
+
+
+def test_unsupported_schema_version_raises(tmp_path: Path) -> None:
+    _write_manifest(tmp_path, MINIMAL_VALID.replace(
+        "schema_version = 1", "schema_version = 2"
+    ))
+    with pytest.raises(ManifestError, match="schema_version=2"):
+        load_canonical(tmp_path)
+
+
+def test_unknown_collector_in_active_raises(tmp_path: Path) -> None:
+    _write_manifest(tmp_path, MINIMAL_VALID.replace(
+        'active = ["proc"]', 'active = ["proc", "totally_not_real"]'
+    ))
+    with pytest.raises(ManifestError, match="totally_not_real"):
+        load_canonical(tmp_path)
+
+
+def test_duplicate_collector_in_active_raises(tmp_path: Path) -> None:
+    _write_manifest(tmp_path, MINIMAL_VALID.replace(
+        'active = ["proc"]', 'active = ["proc", "proc"]'
+    ))
+    with pytest.raises(ManifestError, match="duplicate"):
+        load_canonical(tmp_path)
+
+
+def test_unknown_phase_name_raises(tmp_path: Path) -> None:
+    bad = MINIMAL_VALID.replace('name = "armed"', 'name = "magical"', 1)
+    _write_manifest(tmp_path, bad)
+    with pytest.raises(ManifestError, match="magical"):
+        load_canonical(tmp_path)
+
+
+def test_negative_phase_seconds_raises(tmp_path: Path) -> None:
+    bad = MINIMAL_VALID.replace('seconds = 1.0\n\n[[experiment.schedule.phases]]\nname = "armed"\nseconds = 1.0',
+                                'seconds = 1.0\n\n[[experiment.schedule.phases]]\nname = "armed"\nseconds = -5.0')
+    _write_manifest(tmp_path, bad)
+    with pytest.raises(ManifestError, match="must be > 0"):
+        load_canonical(tmp_path)
+
+
+def test_negative_ram_raises(tmp_path: Path) -> None:
+    bad = MINIMAL_VALID.replace("ram_per_vm_mib = 320", "ram_per_vm_mib = -1")
+    _write_manifest(tmp_path, bad)
+    with pytest.raises(ManifestError, match="positive"):
+        load_canonical(tmp_path)
+
+
+def test_catalog_entry_missing_verified_against_raises(tmp_path: Path) -> None:
+    bad = MINIMAL_VALID.replace(
+        "[catalog]\nmodules = []",
+        '[catalog]\n[[catalog.modules]]\nname = "fixture"\nlast_verified = "abc"\n',
+    )
+    _write_manifest(tmp_path, bad)
+    with pytest.raises(ManifestError, match="verified_against"):
+        load_canonical(tmp_path)
+
+
+def test_catalog_entry_with_both_fields_loads(tmp_path: Path) -> None:
+    valid = MINIMAL_VALID.replace(
+        "[catalog]\nmodules = []",
+        ('[catalog]\n[[catalog.modules]]\nname = "fixture"\n'
+         'verified_against = "test-target"\nlast_verified = "abc"\n'),
+    )
+    _write_manifest(tmp_path, valid)
+    m = load_canonical(tmp_path)
+    assert len(m.catalog) == 1
+    assert m.catalog[0].name == "fixture"
+    assert m.catalog[0].verified_against == "test-target"
+    assert m.catalog[0].last_verified == "abc"
+
+
+def test_to_meta_round_trips_to_json_safe(tmp_path: Path) -> None:
+    """meta.json embedding requires the to_meta dict be json-encodable."""
+    import json
+    _write_manifest(tmp_path, MINIMAL_VALID)
+    m = load_canonical(tmp_path)
+    encoded = json.dumps(m.to_meta())
+    decoded = json.loads(encoded)
+    assert decoded["schema_version"] == 1
+    assert decoded["name"] == "test-experiment"
--- a/tools/run_fleet.py
+++ b/tools/run_fleet.py
@@ -1,13 +1,20 @@
 """``cis490-fleet`` — run as many concurrent labeled episodes as the
-host can handle, drawing samples from the manifest.
+host can handle, drawing samples and experiment shape from the
+canonical manifest at <repo_root>/manifest.toml.
+
+Per PIPELINE.md §4.1 the experiment manifest is the single source of
+truth for what the experiment is. There are NO `--manifest`,
+`--modules-dir`, `--ram-per-vm-mib`, `--max-concurrent`,
+`--max-tier3-slots`, `--force-tier2`, or `--require-real-samples`
+flags — those would all be per-host overrides of the canonical
+experiment shape, which §4.1 forbids. A host that can't load the
+canonical manifest exits 78 (§9, §4.7).

 Modes:

  --capacity     Print the resource calculation and exit. No VMs spawned.
  --waves N      Run N waves of episodes (one wave = max_concurrent
                 episodes, each in its own slot). Default: 1.
-  --max-concurrent N
-                 Cap concurrency below the auto-detected ceiling.
 """

 from __future__ import annotations
@@ -27,26 +34,26 @@ from exploits.modules import load_module_configs  # noqa: E402
 from orchestrator.fleet import (  # noqa: E402
    FleetConfig, FleetRunner, capacity_report, detect_capacity,
 )
+from orchestrator.manifest import ManifestError, load_canonical  # noqa: E402
 from samples.manifest import SampleManifest  # noqa: E402


+# sysexits.h EX_CONFIG (configuration error): do NOT respawn (paired with
+# RestartPreventExitStatus=78 on cis490-orchestrator.service).
+EXIT_SYSADMIN_ERROR = 78
+
+
 def main(argv: list[str] | None = None) -> int:
    p = argparse.ArgumentParser(prog="cis490-fleet")
-    p.add_argument("--capacity", action="store_true")
-    p.add_argument("--waves", type=int, default=1)
-    p.add_argument("--max-concurrent", type=int, default=None)
-    p.add_argument("--manifest",
-                   default=str(Path(__file__).resolve().parent.parent / "samples" / "manifest.toml"))
-    p.add_argument("--modules-dir",
-                   default=str(Path(__file__).resolve().parent.parent / "exploits" / "modules"))
-    p.add_argument("--data-root", default="data")
-    p.add_argument("--host-id", default=os.environ.get("FLEET_HOST_ID") or os.uname().nodename)
-    p.add_argument("--ram-per-vm-mib", type=int, default=320)
-    p.add_argument("--require-real-samples", action="store_true")
-    p.add_argument("--force-tier2", action="store_true",
-                   help="Skip Tier 3 even when msfrpcd is reachable")
-    p.add_argument("--max-tier3-slots", type=int, default=None,
-                   help="Cap concurrent Tier-3 slots; slots >= N fall back to Tier-2")
+    p.add_argument("--capacity", action="store_true",
+                   help="Print capacity calculation and exit")
+    p.add_argument("--waves", type=int, default=1,
+                   help="Number of episode waves to run")
+    p.add_argument("--data-root", default="data",
+                   help="Per-host writable directory for episode dirs")
+    p.add_argument("--host-id",
+                   default=os.environ.get("FLEET_HOST_ID") or os.uname().nodename,
+                   help="Stable identity for this host (per-host)")
    p.add_argument("--log-level", default="INFO")
    args = p.parse_args(argv)

@@ -54,27 +61,64 @@ def main(argv: list[str] | None = None) -> int:
        level=getattr(logging, args.log_level.upper(), logging.INFO),
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
+    log = logging.getLogger("cis490.fleet")
+
+    repo_root = Path(__file__).resolve().parent.parent
+
+    # Canonical manifest. PIPELINE.md §4.1 / §4.7 — failure to load is
+    # a sysadmin error, not a runtime warning. Exit 78 so systemd's
+    # RestartPreventExitStatus=78 keeps the unit stuck-and-loud rather
+    # than respawning into the same broken state.
+    try:
+        experiment = load_canonical(repo_root)
+    except ManifestError as e:
+        log.error("canonical manifest failed to load: %s", e)
+        log.error(
+            "this host cannot run the experiment without a valid "
+            "manifest.toml at %s. Per §4.1 there is no fallback. "
+            "Exiting %d (sysadmin error).",
+            repo_root / "manifest.toml", EXIT_SYSADMIN_ERROR,
+        )
+        return EXIT_SYSADMIN_ERROR
+
+    log.info("loaded experiment manifest: %s (schema_version=%d)",
+             experiment.name, experiment.schema_version)

    if args.capacity:
-        print(capacity_report())
+        print(capacity_report(ram_per_vm_mib=experiment.ram_per_vm_mib))
        return 0

-    manifest = SampleManifest.load(args.manifest)
-    repo_root = Path(__file__).resolve().parent.parent
-    modules_dir = Path(args.modules_dir)
-    modules = load_module_configs(modules_dir) if modules_dir.exists() else {}
+    samples_path = repo_root / experiment.samples_manifest_path
+    samples = SampleManifest.load(samples_path)
+
+    # Module catalog admission: only load modules whose name appears in
+    # experiment.catalog. Modules without a catalog entry don't get
+    # loaded into the runtime catalog regardless of whether their toml
+    # is on disk. PIPELINE.md §4.3.
+    modules_dir = repo_root / "exploits" / "modules"
+    on_disk_modules = (
+        load_module_configs(modules_dir) if modules_dir.exists() else {}
+    )
+    admitted_names = {entry.name for entry in experiment.catalog}
+    modules = {
+        name: cfg for name, cfg in on_disk_modules.items()
+        if name in admitted_names
+    }
+    missing = sorted(admitted_names - set(modules))
+    if missing:
+        log.error(
+            "manifest.catalog references modules whose toml is missing "
+            "from %s: %s. Refusing to start (§4.3).", modules_dir, missing,
+        )
+        return EXIT_SYSADMIN_ERROR

    cfg = FleetConfig(
        host_id=args.host_id,
        repo_root=repo_root,
        data_root=Path(args.data_root).resolve(),
-        manifest=manifest,
+        experiment=experiment,
+        samples=samples,
        modules=modules,
-        ram_per_vm_mib=args.ram_per_vm_mib,
-        max_concurrent_override=args.max_concurrent,
-        require_real_samples=args.require_real_samples,
-        force_tier2=args.force_tier2,
-        max_tier3_slots=args.max_tier3_slots,
    )

    runner = FleetRunner(cfg)
@@ -88,6 +132,7 @@ def main(argv: list[str] | None = None) -> int:

    print(json.dumps({
        "host_id": args.host_id,
+        "experiment": experiment.name,
        "capacity": result.capacity.to_dict(),
        "modules_loaded": sorted(modules.keys()),
        "slots": [
--- a/tools/run_real_vm_demo.py
+++ b/tools/run_real_vm_demo.py
@@ -29,6 +29,7 @@ sys.path.insert(0, str(Path(__file__).resolve().parent))

 from collectors import qmp  # noqa: E402
 from orchestrator.episode import EpisodeConfig, EpisodeRunner  # noqa: E402
+from orchestrator.manifest import ManifestError, load_canonical  # noqa: E402
 from samples.manifest import SampleManifest  # noqa: E402
 from vm_load_controller import VMLoadController  # noqa: E402
 from vm_serial import SerialClient  # noqa: E402
@@ -98,12 +99,9 @@ def main() -> int:
    parser.add_argument(
        "--sample",
        default=os.environ.get("SAMPLE_NAME"),
-        help="Pick a workload profile from the manifest by name. Fleet runner "
-        "passes this via SAMPLE_NAME env. If unset, runs the v1 yes-loop.",
-    )
-    parser.add_argument(
-        "--manifest",
-        default=str(Path(__file__).resolve().parent.parent / "samples" / "manifest.toml"),
+        help="Pick a workload profile from the samples manifest by name. "
+        "Fleet runner passes this via SAMPLE_NAME env. If unset, runs the "
+        "v1 yes-loop.",
    )
    args = parser.parse_args()

@@ -116,13 +114,23 @@ def main() -> int:
    repo_root = Path(__file__).resolve().parent.parent
    launcher = repo_root / "vm" / "launch_demo.sh"

+    # Canonical experiment manifest (PIPELINE.md §4.1). The samples-manifest
+    # path comes from here too; no per-call override.
+    try:
+        experiment = load_canonical(repo_root)
+    except ManifestError as e:
+        log.error("canonical manifest failed to load: %s", e)
+        return 78  # EX_CONFIG: sysadmin error, same code as run_fleet.py
+    samples_path = repo_root / experiment.samples_manifest_path
+
    # Resolve sample if requested.
    sample = None
    if args.sample:
-        manifest = SampleManifest.load(args.manifest)
-        sample = next((s for s in manifest.samples if s.name == args.sample), None)
+        samples = SampleManifest.load(samples_path)
+        sample = next((s for s in samples.samples if s.name == args.sample), None)
        if sample is None:
-            log.error("sample %r not in manifest %s", args.sample, args.manifest)
+            log.error("sample %r not in samples manifest %s",
+                      args.sample, samples_path)
            return 2
        log.info("using sample=%s profile=%s kind=%s",
                 sample.name, sample.profile, sample.kind)
@@ -217,6 +225,7 @@ def main() -> int:
            qmp_socket=qmp_sock if qmp_sock.exists() else None,
            guest_agent_socket=agent_sock if agent_sock.exists() else None,
            bridge_iface=os.environ.get("BRIDGE") or None,
+            experiment_meta=experiment.to_meta(),
            # Source 3 (oracle): perf-stat sampling of the qemu PID.
            # Now that the stdout/stderr + event-name parser bugs are
            # fixed, perf produces real rows. Per PIPELINE.md §4.4
--- a/tools/run_tier3_demo.py
+++ b/tools/run_tier3_demo.py
@@ -38,6 +38,7 @@ from exploits.driver import DriverConfig, MSFExploitDriver  # noqa: E402
 from exploits.modules import load_module_config  # noqa: E402
 from exploits.msfrpc import MSFRpcClient, MSFRpcConfig  # noqa: E402
 from orchestrator.episode import EpisodeConfig, EpisodeRunner  # noqa: E402
+from orchestrator.manifest import ManifestError, load_canonical  # noqa: E402
 from samples.manifest import SampleManifest  # noqa: E402


@@ -170,12 +171,9 @@ def main() -> int:
    parser.add_argument(
        "--sample",
        default=os.environ.get("SAMPLE_NAME"),
-        help="Pick a workload profile from the manifest by name. Fleet runner "
-        "passes this via SAMPLE_NAME env. Without it, falls back to the v1 yes-loop.",
-    )
-    parser.add_argument(
-        "--manifest",
-        default=str(Path(__file__).resolve().parent.parent / "samples" / "manifest.toml"),
+        help="Pick a workload profile from the samples manifest by name. "
+        "Fleet runner passes this via SAMPLE_NAME env. Without it, falls "
+        "back to the v1 yes-loop.",
    )
    args = parser.parse_args()

@@ -198,15 +196,24 @@ def main() -> int:
        log.error("no module config at %s", module_path)
        return 2

+    # Canonical experiment manifest (PIPELINE.md §4.1).
+    try:
+        experiment = load_canonical(repo_root)
+    except ManifestError as e:
+        log.error("canonical manifest failed to load: %s", e)
+        return 78  # EX_CONFIG: sysadmin error, same code as run_fleet.py
+    samples_path = repo_root / experiment.samples_manifest_path
+
    module = load_module_config(module_path)
    log.info("module loaded: %s (%s)", module.name, module.module_path)

    sample = None
    if args.sample:
-        manifest = SampleManifest.load(args.manifest)
-        sample = next((s for s in manifest.samples if s.name == args.sample), None)
+        samples = SampleManifest.load(samples_path)
+        sample = next((s for s in samples.samples if s.name == args.sample), None)
        if sample is None:
-            log.error("sample %r not in manifest %s", args.sample, args.manifest)
+            log.error("sample %r not in samples manifest %s",
+                      args.sample, samples_path)
            return 2
        log.info("sample=%s profile=%s kind=%s",
                 sample.name, sample.profile, sample.kind)
@@ -308,6 +315,7 @@ def main() -> int:
            qmp_socket=qmp_sock if qmp_sock.exists() else None,
            guest_agent_socket=agent_sock if agent_sock.exists() else None,
            bridge_iface=os.environ.get("BRIDGE") or None,
+            experiment_meta=experiment.to_meta(),
            # Source 3 (oracle): see equivalent in run_real_vm_demo.py.
            # Now that the perf collector parser is fixed, turn it on
            # in production rather than leave it silently disabled.