PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml

The experiment is now defined by a single version-pinned file,
manifest.toml at the repo root (PIPELINE.md §4.1 / §13 / §16). Every
lab host loads THIS exact file; per-host overrides of experiment
shape are forbidden.

Drops the following per-host CLI overrides that previously violated
the canonical-manifest principle:
  * --manifest, --modules-dir (paths now derived)
  * --ram-per-vm-mib (manifest.experiment.ram_per_vm_mib)
  * --max-concurrent (manifest.experiment.fleet.max_concurrent_ceiling)
  * --max-tier3-slots (manifest.experiment.fleet.max_tier3_slots)
  * --force-tier2 (not a §14 sanctioned override knob; ship an empty
    catalog to disable Tier-3)
  * --require-real-samples (sample-side concern; out of fleet scope)
  * tools/run_*_demo.py --manifest (samples path now from canonical)

New surface:
  * manifest.toml: the single source of truth
  * orchestrator/manifest.py: load_canonical() + Manifest dataclass
    with strict validation; raises ManifestError on any failure
  * EpisodeConfig.experiment_meta: populated by run_*_demo.py from
    the canonical manifest; stamped into every episode's meta.json
    under the "experiment" key for provenance
  * cis490-orchestrator.service: RestartPreventExitStatus=78 so
    manifest-load failures stay stuck-and-loud (§9, §4.7)
  * install-lab-host.sh: validates manifest.toml at install time;
    missing or invalid = die with clear message
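
The stuck-and-loud path can be illustrated with a thin entry-point wrapper. This is only a minimal sketch, not the orchestrator's actual code: `main`, `load`, and `run_wave` are hypothetical names, and `ManifestError` here is a local stand-in for the class in orchestrator/manifest.py. The value 78 matches EX_CONFIG in sysexits.h.

```python
import sys

EXIT_CONFIG = 78  # same value as EX_CONFIG in sysexits.h


class ManifestError(ValueError):
    """Local stand-in for orchestrator.manifest.ManifestError."""


def main(load, run_wave) -> int:
    """Return a process exit code. `load` and `run_wave` are hypothetical callables."""
    try:
        manifest = load()
    except ManifestError as e:
        # Configuration problem: report it and refuse to run any episode.
        print(f"manifest load failed: {e}", file=sys.stderr)
        return EXIT_CONFIG
    run_wave(manifest)
    return 0


def _broken_load():
    raise ManifestError("missing required field schema_version")


bad_code = main(_broken_load, lambda m: None)
ok_code = main(lambda: {"name": "cis490-spectral-v1"}, lambda m: None)
```

With Restart=always, systemd respawns the unit after a normal exit, while RestartPreventExitStatus=78 leaves it failed after the configuration-error code, so the operator sees the breakage instead of a respawn loop.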

Catalog admission semantics: only modules whose name appears in
manifest.catalog are loaded into the runtime catalog (§4.3 in
miniature; this will tighten further in step 4, when verified_against /
last_verified actually gate admission). A missing toml for an admitted
name is a sysadmin error → exit 78.
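
The admission rule can be sketched as a filter over the modules directory. This is an illustrative sketch under stated assumptions: `admit_modules` and `AdmissionError` are hypothetical names, the module names are made up, and only the `<name>.toml` file layout is taken from the manifest comments.

```python
import tempfile
from pathlib import Path


class AdmissionError(Exception):
    """Hypothetical error; the real code would map this to exit 78."""


def admit_modules(admitted_names, modules_dir: Path) -> dict[str, Path]:
    """Return {name: toml_path} for admitted names only.

    Tomls that exist on disk but are not named in the catalog are ignored;
    an admitted name without a toml is a hard error.
    """
    loaded: dict[str, Path] = {}
    for name in admitted_names:
        toml_path = modules_dir / f"{name}.toml"
        if not toml_path.is_file():
            raise AdmissionError(f"catalog admits {name!r} but {toml_path} is missing")
        loaded[name] = toml_path
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "alpha.toml").write_text('name = "alpha"\n')
    (d / "beta.toml").write_text('name = "beta"\n')  # on disk, not admitted
    admitted_names = sorted(admit_modules(["alpha"], d))
    try:
        admit_modules(["gamma"], d)
        missing_raised = False
    except AdmissionError:
        missing_raised = True
```

An empty admitted list yields an empty runtime catalog, which is how the current manifest keeps every slot on Tier-2.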

Renames cfg.manifest → cfg.samples and adds cfg.experiment to
disambiguate the sample manifest from the experiment manifest. Rewrites
the test_fleet.py fixture to construct synthetic Manifest objects so
test outcomes don't depend on the on-disk manifest.toml content.

12 new tests in tests/test_manifest.py, covering schema-version mismatch,
unknown collector, duplicate collector, unknown phase, negative
phase seconds, negative ram, missing catalog fields, and json round-trip.

Local run: `python tools/run_fleet.py --capacity` correctly logs the
loaded manifest and prints capacity. 241 tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

parent dca6144a4a
commit 207a902c3e
11 changed files with 971 additions and 105 deletions

@@ -18,17 +18,25 @@ EnvironmentFile=/etc/cis490/lab-host.env
 # unit still starts on Tier-2-only hosts where msfrpcd isn't installed.
 EnvironmentFile=-/etc/cis490/msfrpc.env
 # Fleet mode: detect host capacity, run that many concurrent episodes
-# per wave with samples drawn from the manifest. Each invocation runs
-# one wave and exits; systemd respawns per Restart= below, giving us
-# a continuous stream of fresh-sample episodes per host. The shipper
-# picks them up as `done.marker` files appear.
+# per wave with samples + experiment shape drawn from the canonical
+# manifest at /opt/cis490/manifest.toml. Each invocation runs one wave
+# and exits; systemd respawns per Restart= below.
+#
+# Per PIPELINE.md §4.1 there are no --manifest, --max-tier3-slots,
+# --ram-per-vm-mib, --max-concurrent, --force-tier2, or
+# --require-real-samples flags. Experiment-shape parameters live in
+# manifest.toml. Per-host overrides are forbidden.
+#
+# Exit 78 (sysadmin error) when the canonical manifest fails to load
+# or when the host can't run the experiment. RestartPreventExitStatus=78
+# keeps the unit stuck-and-loud rather than respawning into the same
+# broken state — operator notices and fixes.
 ExecStart=/opt/cis490/.venv/bin/python /opt/cis490/tools/run_fleet.py \
     --data-root /var/lib/cis490/data \
-    --manifest /opt/cis490/samples/manifest.toml \
-    --waves 1 \
-    --max-tier3-slots 4
+    --waves 1
 Restart=always
 RestartSec=15
+RestartPreventExitStatus=78

 # Hardening — explicitly grant CAP_NET_RAW for tcpdump (source 4) and
 # CAP_SYS_ADMIN / CAP_PERFMON for perf (source 3) when the operator

manifest.toml (normal file, 145 lines added)
@@ -0,0 +1,145 @@
+# CIS490 canonical experiment manifest.
+#
+# This file is the single source of truth for what the experiment IS:
+# which collectors run, at what cadence, against what target images,
+# with which exploit modules in rotation, walking which phase budget,
+# under what concurrency cap. Every lab host loads THIS exact file.
+# Per-host overrides are forbidden — PIPELINE.md §4.1 / §13.
+#
+# Hosts that cannot run the canonical experiment exit 78 at
+# orchestrator startup (PIPELINE.md §4.7). They produce zero episodes
+# and say so loudly. There is no "ship what we can" fallback.
+#
+# Substantive amendments to this file follow PIPELINE.md §16:
+# operator sign-off (§15), §8 decision tests, lands in the same merge
+# as the code change it justifies. "This is just config" is not a
+# free pass — the manifest is admission scope (§13).
+
+schema_version = 1
+name = "cis490-spectral-v1"
+
+# ---------------------------------------------------------------------
+# [experiment] — episode-shape parameters
+# ---------------------------------------------------------------------
+[experiment]
+# Per-VM RAM in mebibytes. The capacity detector divides available
+# host memory by this number to compute max_concurrent. Every slot
+# gets the same RAM — no per-host fuzz.
+ram_per_vm_mib = 320
+
+# ---------------------------------------------------------------------
+# [experiment.schedule] — phase budget (§4.5)
+# ---------------------------------------------------------------------
+# Each phase carries a duration in seconds — the MAXIMUM time the
+# orchestrator will wait in that phase. Phase TRANSITIONS are written
+# from observed events (PIPELINE.md §4.5), not from this clock. The
+# clock is just an upper bound: if no event fires within the budget,
+# the orchestrator advances and records the failure.
+[[experiment.schedule.phases]]
+name = "clean"
+seconds = 10.0
+
+[[experiment.schedule.phases]]
+name = "armed"
+seconds = 3.0
+
+[[experiment.schedule.phases]]
+name = "infecting"
+seconds = 5.0
+
+[[experiment.schedule.phases]]
+name = "infected_running"
+seconds = 25.0
+
+[[experiment.schedule.phases]]
+name = "dormant"
+seconds = 15.0
+
+[[experiment.schedule.phases]]
+name = "infected_running"
+seconds = 20.0
+
+[[experiment.schedule.phases]]
+name = "dormant"
+seconds = 5.0
+
+[[experiment.schedule.phases]]
+name = "clean"
+seconds = 5.0
+
+# ---------------------------------------------------------------------
+# [experiment.fleet] — concurrency policy
+# ---------------------------------------------------------------------
+[experiment.fleet]
+# Hard ceiling on per-wave slots regardless of host capacity. 0 = no
+# ceiling (capacity detector decides). Hosts whose capacity exceeds
+# this still run only `max_concurrent_ceiling` slots, so the dataset
+# isn't dominated by the largest host.
+max_concurrent_ceiling = 0
+
+# Cap of Tier-3 (real-exploit) slots per wave. Slots beyond this fall
+# back to Tier-2. 0 = no cap. Useful when msfrpcd contention starts
+# to matter; left at 0 by default.
+max_tier3_slots = 0
+
+# ---------------------------------------------------------------------
+# [collectors] — active set (§4.4)
+# ---------------------------------------------------------------------
+# A collector listed here MUST emit ≥1 row when run against the
+# canonical-manifest experiment (§4.4 admission). A collector that
+# can't pass admission is REMOVED from this list — never silently
+# included with zero rows.
+[collectors]
+active = [
+    "proc",
+    "qmp",
+    "perf",
+    "guest_agent",
+    "pcap",
+    "netflow",
+]
+
+[collectors.intervals]
+proc_ms = 100
+qmp_ms = 1000
+perf_ms = 100
+guest_agent_ms = 100
+pcap_snaplen = 256
+netflow_bucket_ms = 100
+
+# ---------------------------------------------------------------------
+# [catalog] — Tier-3 module catalog (§4.3)
+# ---------------------------------------------------------------------
+# Each entry references a module config in `exploits/modules/<name>.toml`.
+# Every entry MUST carry `verified_against` and `last_verified`. The
+# absence of either drops the module from the active catalog.
+#
+# Empty as of 2026-05-04: PIPELINE.md §3 found 0/67 session_open on
+# samba_usermap_script against the SourceForge Metasploitable2 image,
+# and no in-house verified target exists yet. §5 step 3 builds the
+# target VM, step 4 re-admits modules with verification recorded.
+# Until then, Tier-3 episodes do not run — the dataset is honest
+# Tier-2 only.
+[catalog]
+modules = []
+
+# ---------------------------------------------------------------------
+# [targets] — target VM images (§4.2)
+# ---------------------------------------------------------------------
+# Each entry pins (image_name, sha256, build_script). The image MUST
+# have been produced by `build_script` and verified per §4.2. Hosts
+# that don't have the expected image at the expected sha256 fail
+# preflight (§4.7) and produce zero episodes.
+#
+# Empty until §5 step 3 lands declarative target builds.
+[targets]
+images = []
+
+# ---------------------------------------------------------------------
+# [samples] — pointer to the malware-sample manifest
+# ---------------------------------------------------------------------
+# Samples roll forward independently of experiment shape (new sample
+# = manifest entry; doesn't change collector set or schedule). Path
+# is relative to repo root.
+[samples]
+manifest_path = "samples/manifest.toml"

@@ -160,6 +160,13 @@ class EpisodeConfig:
     # doesn't need to import exploits.modules; populated by callers
     # that have a ModuleConfig in hand.
     exploit_meta: dict | None = None
+    # Canonical experiment manifest provenance (PIPELINE.md §4.1).
+    # Plain dict (output of Manifest.to_meta()) so the runner doesn't
+    # need to import orchestrator.manifest; populated by run_fleet /
+    # run_*_demo via load_canonical(). When None, meta.experiment is
+    # null — useful for unit tests that don't go through the canonical
+    # path; production episodes always carry it.
+    experiment_meta: dict | None = None
     # Snapshot/revert (Tier 0+):
     # revert_at_start — before any phase walks, loadvm <snapshot_name>.
     # Use this to drop the guest back to a known-good baseline at

@@ -466,6 +473,7 @@ class EpisodeRunner:
             },
             "exploit": self.cfg.exploit_meta,
             "sample": sample_meta,
+            "experiment": self.cfg.experiment_meta,
             "schedule": {
                 "baseline_seconds": self.cfg.duration_s,
                 "interval_ms": self.cfg.interval_ms,

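
Passing a plain dict rather than the `Manifest` object keeps the runner decoupled from the manifest module and makes the provenance block trivially JSON-serialisable. The following is a minimal sketch of that idea, assuming a cut-down `Phase` dataclass and a hand-built `meta` dict; it is not the project's `to_meta()` implementation.

```python
import json
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Phase:
    """Cut-down stand-in for the manifest's Phase dataclass."""
    name: str
    seconds: float


schedule = (Phase("clean", 10.0), Phase("armed", 3.0))

# Flatten to JSON-friendly builtins, mirroring the shape of a to_meta() dict.
meta = {"phases": [{"name": p.name, "seconds": p.seconds} for p in schedule]}
round_tripped = json.loads(json.dumps(meta))

# Frozen dataclasses reject attribute assignment after construction.
try:
    schedule[0].seconds = 99.0
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The round trip returns an equal dict, and the attempted mutation raises instead of silently changing the schedule mid-run.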
@@ -43,6 +43,8 @@ from exploits.modules import (
 )
 from samples.manifest import Sample, SampleManifest

+from .manifest import Manifest
+

 log = logging.getLogger("cis490.fleet")

@@ -93,32 +95,51 @@ class FleetCapacity:

 @dataclass
 class FleetConfig:
+    """Per-host config that combines (a) the canonical experiment
+    manifest, (b) per-host identity (host_id), and (c) host-local
+    paths (data_root, repo_root). Experiment-shape parameters live
+    in `experiment` (the canonical Manifest). Per-host overrides of
+    experiment shape are forbidden — see PIPELINE.md §4.1."""
     host_id: str
     repo_root: Path
     data_root: Path
-    manifest: SampleManifest
+    # Canonical experiment manifest — drives schedule, ram_per_vm_mib,
+    # collector active set, intervals, fleet ceilings, catalog, targets.
+    experiment: "Manifest"
+    # Sample manifest (separate file at samples/manifest.toml). Holds
+    # the per-malware-family sample list; rolls forward independently
+    # of experiment shape so a new sample is one-line, not a
+    # canonical-manifest amendment.
+    samples: SampleManifest
     # Module catalog for Tier-3 dispatch. Required for fleet-driven
-    # exploit-fire variety; empty catalog forces Tier-2 fallback.
+    # exploit-fire variety; empty catalog forces Tier-2 fallback. The
+    # canonical manifest's catalog admission gates which modules end
+    # up here (only those whose toml exists AND whose name is in
+    # experiment.catalog get loaded).
     modules: dict[str, ModuleConfig] = field(default_factory=dict)
-    # VM resource shape — must match what the launcher requests.
-    ram_per_vm_mib: int = 320
-    # Cap concurrency below the calculated max (e.g. for a smoke test).
-    max_concurrent_override: int | None = None
-    # Skip episodes whose sample requires a real binary that's not present.
-    require_real_samples: bool = False
-    # Force Tier-2 even when msfrpcd is up; used by tests + dev runs
-    # that want a no-exploit baseline.
-    force_tier2: bool = False
-    # Limit how many slots per wave run as Tier-3. Slots 0..N-1 get
-    # Tier-3; the rest fall back to Tier-2. Metasploitable2 boot is IO-
-    # bound: running >~6 concurrent target VMs saturates disk and causes
-    # all slots to timeout waiting for the guest service to come up.
-    # None = no cap (all eligible slots use Tier-3).
-    max_tier3_slots: int | None = None
-    # msfrpcd connectivity (read by tier-3 driver via env).
+    # msfrpcd connectivity (read by tier-3 driver via env). Per-host
+    # because msfrpcd may be on a non-loopback bind on some hosts;
+    # NOT an experiment-shape parameter.
     msfrpcd_host: str = "127.0.0.1"
     msfrpcd_port: int = 55553

+    # ---- experiment-shape derived properties (manifest-driven) -----
+
+    @property
+    def ram_per_vm_mib(self) -> int:
+        return self.experiment.ram_per_vm_mib
+
+    @property
+    def max_concurrent_ceiling(self) -> int:
+        """Hard ceiling on per-wave slots; 0 = no ceiling (capacity
+        detector decides)."""
+        return self.experiment.fleet.max_concurrent_ceiling
+
+    @property
+    def max_tier3_slots(self) -> int:
+        """Cap of Tier-3 slots per wave; 0 = no cap."""
+        return self.experiment.fleet.max_tier3_slots
+

 def _read_meminfo() -> dict[str, int]:
     out: dict[str, int] = {}

@@ -263,11 +284,16 @@ def _run_slot(
         {k: v for k, v in cfg.modules.items() if not v.requires_bridge}
         if cfg.modules else {}
     )
+    # Tier-3 dispatch requires (a) at least one verified module loaded
+    # AND not filtered out by requires_bridge, (b) msfrpcd reachable,
+    # (c) the slot index is below the manifest's max_tier3_slots cap
+    # (0 = no cap). force_tier2 is no longer a sanctioned override knob
+    # per §14 — to disable Tier-3 globally, ship an empty catalog (the
+    # current state); to limit it to the first N slots, set the cap to N.
     tier3_ready = (
-        not cfg.force_tier2
-        and bool(usable_modules)
+        bool(usable_modules)
         and _msfrpcd_available(cfg.msfrpcd_host, cfg.msfrpcd_port)
-        and (cfg.max_tier3_slots is None or slot < cfg.max_tier3_slots)
+        and (cfg.max_tier3_slots == 0 or slot < cfg.max_tier3_slots)
     )

     env = os.environ.copy()

@@ -345,13 +371,13 @@ def _run_slot(
         ]
         tier = "tier2"
         module_name = None
-        if not cfg.force_tier2 and not cfg.modules:
+        if not cfg.modules:
             log.warning("slot=%d falling back to Tier 2: empty module catalog", slot)
-        elif not cfg.force_tier2 and not usable_modules:
+        elif not usable_modules:
             log.warning("slot=%d falling back to Tier 2: no non-bridge modules available", slot)
-        elif not cfg.force_tier2 and cfg.max_tier3_slots is not None and slot >= cfg.max_tier3_slots:
+        elif cfg.max_tier3_slots > 0 and slot >= cfg.max_tier3_slots:
             log.debug("slot=%d Tier 2 by max_tier3_slots=%d cap", slot, cfg.max_tier3_slots)
-        elif not cfg.force_tier2:
+        else:
             log.warning("slot=%d falling back to Tier 2: msfrpcd unreachable at %s:%d",
                         slot, cfg.msfrpcd_host, cfg.msfrpcd_port)

@@ -422,8 +448,12 @@ class FleetRunner:
             ram_per_vm_mib=self.cfg.ram_per_vm_mib,
         )
         n_slots = capacity.max_concurrent
-        if self.cfg.max_concurrent_override is not None:
-            n_slots = min(n_slots, self.cfg.max_concurrent_override)
+        # Manifest ceiling: 0 = no ceiling. The capacity detector
+        # decides per-host based on hardware; the manifest caps that
+        # decision to keep the dataset balanced across hosts of
+        # different sizes (PIPELINE.md §4.1).
+        if self.cfg.max_concurrent_ceiling > 0:
+            n_slots = min(n_slots, self.cfg.max_concurrent_ceiling)
         if n_slots <= 0:
             log.warning(
                 "fleet capacity is zero (%s); cannot run", capacity.rationale,

@@ -433,8 +463,9 @@ class FleetRunner:
         )

         log.info(
-            "fleet host=%s slots=%d episodes=%d manifest_size=%d",
-            self.cfg.host_id, n_slots, episodes, len(self.cfg.manifest),
+            "fleet host=%s slots=%d episodes=%d samples=%d experiment=%s",
+            self.cfg.host_id, n_slots, episodes, len(self.cfg.samples),
+            self.cfg.experiment.name,
         )

         all_results: list[SlotResult] = []

@@ -444,18 +475,13 @@ class FleetRunner:
                 break
             episode_index = episode_index_base + ep
             slot_samples = [
-                self.cfg.manifest.select(
+                self.cfg.samples.select(
                     host_id=self.cfg.host_id,
                     slot=slot,
                     episode_index=episode_index,
                 )
                 for slot in range(n_slots)
             ]
-            if self.cfg.require_real_samples:
-                slot_samples = [s for s in slot_samples if s.kind == "real"]
-                if not slot_samples:
-                    log.warning("require_real_samples: no real samples in manifest; skipping wave")
-                    continue

             log.info(
                 "wave %d/%d: %s",

@@ -491,8 +517,8 @@ class FleetRunner:
 # ---------------------------------------------------------------------------


-def capacity_report() -> str:
-    c = detect_capacity()
+def capacity_report(*, ram_per_vm_mib: int = 320) -> str:
+    c = detect_capacity(ram_per_vm_mib=ram_per_vm_mib)
     return (
         f"cores: {c.cores_total} (reserve {c.cores_reserved})\n"
         f"ram: {c.ram_total_mib} MiB total, {c.ram_available_mib} MiB available "

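
Both fleet knobs now share one convention: an integer where 0 means "no cap", replacing the earlier `None` sentinel. A small sketch of the two decisions, using hypothetical helper names (`effective_slots`, `tier3_allowed`) that do not exist in the fleet module:

```python
def effective_slots(capacity_slots: int, ceiling: int) -> int:
    """Slots to run this wave: host capacity, clamped by a manifest ceiling.

    ceiling == 0 means the capacity detector's answer is used unchanged.
    """
    if ceiling > 0:
        return min(capacity_slots, ceiling)
    return capacity_slots


def tier3_allowed(slot: int, max_tier3_slots: int) -> bool:
    """True when this slot index may run Tier-3 under the per-wave cap.

    max_tier3_slots == 0 means every slot is eligible.
    """
    return max_tier3_slots == 0 or slot < max_tier3_slots


uncapped = effective_slots(12, 0)
capped = effective_slots(12, 4)
small_host = effective_slots(2, 4)
tier3_by_slot = [tier3_allowed(s, 2) for s in range(4)]
```

One consequence of the 0 sentinel is that the cap cannot express "no Tier-3 slots at all"; per the commit message, that case is handled by shipping an empty catalog.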
orchestrator/manifest.py (normal file, 371 lines added)
@@ -0,0 +1,371 @@
+"""Canonical experiment manifest (PIPELINE.md §4.1 / §13).
+
+The manifest at `<repo_root>/manifest.toml` is the single source of
+truth for what the experiment is: which collectors run, at what
+cadence, against what targets, with which exploit modules in rotation,
+walking which phase budget. Every lab host loads THIS file. There is
+no per-host override flag, no `--manifest <path>` argument, no
+fallback. A host that can't load and validate the canonical manifest
+must exit 78 and ship zero episodes.
+
+`load_canonical(repo_root)` reads from the fixed path and validates.
+On any failure it raises `ManifestError`; callers translate that into
+exit 78. `Manifest` is a frozen dataclass — once loaded the values
+don't move under us mid-run.
+
+Substantive amendments follow PIPELINE.md §16: operator sign-off,
+landed in the same merge as the code change, with §8 decision tests
+applied to the amendment itself.
+"""
+
+from __future__ import annotations
+
+import tomllib
+from dataclasses import dataclass, field
+from pathlib import Path
+
+
+CANONICAL_FILENAME = "manifest.toml"
+
+# Closed enums — keep in sync with the corresponding code that
+# implements each name. A name not in these sets means the manifest
+# is asking for something the orchestrator doesn't know how to do.
+KNOWN_COLLECTORS: frozenset[str] = frozenset({
+    "proc",
+    "qmp",
+    "perf",
+    "guest_agent",
+    "pcap",
+    "netflow",
+})
+
+KNOWN_PHASES: frozenset[str] = frozenset({
+    "clean",
+    "armed",
+    "infecting",
+    "infected_running",
+    "dormant",
+    "failed",
+})
+
+
+class ManifestError(ValueError):
+    """Raised when the canonical manifest is missing, unreadable, or
+    fails validation. The orchestrator translates this into exit 78
+    (PIPELINE.md §4.7 / §9)."""
+
+
+@dataclass(frozen=True)
+class Phase:
+    name: str
+    seconds: float
+
+
+@dataclass(frozen=True)
+class CollectorIntervals:
+    proc_ms: int
+    qmp_ms: int
+    perf_ms: int
+    guest_agent_ms: int
+    pcap_snaplen: int
+    netflow_bucket_ms: int
+
+
+@dataclass(frozen=True)
+class FleetPolicy:
+    max_concurrent_ceiling: int
+    max_tier3_slots: int
+
+
+@dataclass(frozen=True)
+class CatalogEntry:
+    name: str
+    verified_against: str
+    last_verified: str
+
+
+@dataclass(frozen=True)
+class TargetSpec:
+    image_name: str
+    sha256: str
+    build_script: str
+
+
+@dataclass(frozen=True)
+class Manifest:
+    schema_version: int
+    name: str
+    ram_per_vm_mib: int
+    schedule: tuple[Phase, ...]
+    fleet: FleetPolicy
+    collectors_active: tuple[str, ...]
+    intervals: CollectorIntervals
+    catalog: tuple[CatalogEntry, ...]
+    targets: tuple[TargetSpec, ...]
+    samples_manifest_path: str
+    # Resolved repo root + manifest path so callers can stamp them
+    # into meta.json for provenance without re-deriving.
+    repo_root: Path = field(repr=False)
+    manifest_path: Path = field(repr=False)
+
+    def to_meta(self) -> dict:
+        """Lightweight representation suitable for embedding in
+        meta.json so episodes carry their experiment provenance.
+        Excludes the resolved Path fields (host-specific paths don't
+        belong in the wire-format)."""
+        return {
+            "schema_version": self.schema_version,
+            "name": self.name,
+            "ram_per_vm_mib": self.ram_per_vm_mib,
+            "phases": [
+                {"name": p.name, "seconds": p.seconds} for p in self.schedule
+            ],
+            "fleet": {
+                "max_concurrent_ceiling": self.fleet.max_concurrent_ceiling,
+                "max_tier3_slots": self.fleet.max_tier3_slots,
+            },
+            "collectors_active": list(self.collectors_active),
+            "intervals": {
+                "proc_ms": self.intervals.proc_ms,
+                "qmp_ms": self.intervals.qmp_ms,
+                "perf_ms": self.intervals.perf_ms,
+                "guest_agent_ms": self.intervals.guest_agent_ms,
+                "pcap_snaplen": self.intervals.pcap_snaplen,
+                "netflow_bucket_ms": self.intervals.netflow_bucket_ms,
+            },
+            "catalog": [
+                {"name": c.name, "verified_against": c.verified_against,
+                 "last_verified": c.last_verified}
+                for c in self.catalog
+            ],
+            "targets": [
+                {"image_name": t.image_name, "sha256": t.sha256,
+                 "build_script": t.build_script}
+                for t in self.targets
+            ],
+        }
+
+
+def load_canonical(repo_root: Path | str) -> Manifest:
+    """Load + validate `<repo_root>/manifest.toml`. There is no
+    `manifest_path` parameter on purpose — per §4.1 the canonical
+    manifest lives at exactly one path. Callers that pass the path
+    directly are off the runbook.
+
+    Raises ManifestError on any failure. Successful return guarantees
+    every field is present, every collector name is known, every
+    phase name is known, and every catalog entry has both
+    verified_against and last_verified.
+    """
+    repo_root = Path(repo_root).resolve()
+    path = repo_root / CANONICAL_FILENAME
+    if not path.exists():
+        raise ManifestError(
+            f"canonical manifest not found at {path}. "
+            f"PIPELINE.md §4.1 requires exactly one manifest at the "
+            f"repo root; this host cannot run the experiment without it."
+        )
+    try:
+        raw = tomllib.loads(path.read_text())
+    except (OSError, tomllib.TOMLDecodeError) as e:
+        raise ManifestError(f"cannot parse {path}: {e}") from e
+
+    return _validate(raw, repo_root, path)
+
+
+def _validate(raw: dict, repo_root: Path, path: Path) -> Manifest:
+    schema_version = _require_int(raw, "schema_version")
+    if schema_version != 1:
+        raise ManifestError(
+            f"manifest schema_version={schema_version} not supported; "
+            f"this orchestrator handles version 1 only. "
+            f"Upgrade orchestrator or downgrade manifest to match."
+        )
+    name = _require_str(raw, "name")
+
+    experiment = _require_dict(raw, "experiment")
+    ram_per_vm_mib = _require_int(experiment, "ram_per_vm_mib")
+    if ram_per_vm_mib <= 0:
+        raise ManifestError(
+            f"experiment.ram_per_vm_mib must be positive, got {ram_per_vm_mib}"
+        )
+
+    schedule_block = _require_dict(experiment, "schedule")
+    phases_raw = schedule_block.get("phases")
+    if not isinstance(phases_raw, list) or not phases_raw:
+        raise ManifestError(
+            "experiment.schedule.phases must be a non-empty array"
+        )
+    phases: list[Phase] = []
+    for i, p in enumerate(phases_raw):
+        if not isinstance(p, dict):
+            raise ManifestError(
+                f"experiment.schedule.phases[{i}] must be a table"
+            )
+        pname = _require_str(p, "name", ctx=f"phases[{i}]")
+        if pname not in KNOWN_PHASES:
+            raise ManifestError(
+                f"experiment.schedule.phases[{i}].name={pname!r} not in "
+                f"KNOWN_PHASES {sorted(KNOWN_PHASES)}"
+            )
+        secs = _require_float(p, "seconds", ctx=f"phases[{i}]")
+        if secs <= 0:
+            raise ManifestError(
+                f"experiment.schedule.phases[{i}].seconds must be > 0, "
+                f"got {secs}"
+            )
+        phases.append(Phase(name=pname, seconds=secs))
+
+    fleet_block = _require_dict(experiment, "fleet")
+    fleet = FleetPolicy(
+        max_concurrent_ceiling=_require_int(fleet_block, "max_concurrent_ceiling"),
+        max_tier3_slots=_require_int(fleet_block, "max_tier3_slots"),
+    )
+    if fleet.max_concurrent_ceiling < 0 or fleet.max_tier3_slots < 0:
+        raise ManifestError(
+            "experiment.fleet ceilings must be >= 0 (0 = no cap)"
+        )
+
+    collectors_block = _require_dict(raw, "collectors")
+    active_raw = collectors_block.get("active")
+    if not isinstance(active_raw, list):
+        raise ManifestError("collectors.active must be an array")
+    if len(set(active_raw)) != len(active_raw):
+        raise ManifestError(
+            f"collectors.active contains duplicates: {active_raw}"
+        )
+    for c in active_raw:
+        if c not in KNOWN_COLLECTORS:
+            raise ManifestError(
+                f"collectors.active references unknown collector "
+                f"{c!r}; known: {sorted(KNOWN_COLLECTORS)}"
+            )
+    collectors_active = tuple(active_raw)
+
+    intervals_block = _require_dict(collectors_block, "intervals")
+    intervals = CollectorIntervals(
+        proc_ms=_require_int(intervals_block, "proc_ms"),
+        qmp_ms=_require_int(intervals_block, "qmp_ms"),
+        perf_ms=_require_int(intervals_block, "perf_ms"),
+        guest_agent_ms=_require_int(intervals_block, "guest_agent_ms"),
+        pcap_snaplen=_require_int(intervals_block, "pcap_snaplen"),
+        netflow_bucket_ms=_require_int(intervals_block, "netflow_bucket_ms"),
+    )
+    for fname, fval in (
+        ("proc_ms", intervals.proc_ms),
+        ("qmp_ms", intervals.qmp_ms),
+        ("perf_ms", intervals.perf_ms),
+        ("guest_agent_ms", intervals.guest_agent_ms),
+        ("pcap_snaplen", intervals.pcap_snaplen),
+        ("netflow_bucket_ms", intervals.netflow_bucket_ms),
+    ):
+        if fval <= 0:
+            raise ManifestError(
+                f"collectors.intervals.{fname} must be > 0, got {fval}"
+            )
+
+    catalog_block = _require_dict(raw, "catalog")
+    modules_raw = catalog_block.get("modules")
+    if not isinstance(modules_raw, list):
+        raise ManifestError("catalog.modules must be an array")
+    catalog: list[CatalogEntry] = []
+    for i, entry in enumerate(modules_raw):
+        if not isinstance(entry, dict):
+            raise ManifestError(f"catalog.modules[{i}] must be a table")
+        cname = _require_str(entry, "name", ctx=f"catalog[{i}]")
+        verified_against = _require_str(
+            entry, "verified_against", ctx=f"catalog[{i}]"
+        )
+        last_verified = _require_str(
+            entry, "last_verified", ctx=f"catalog[{i}]"
+        )
+        catalog.append(CatalogEntry(
+            name=cname,
+            verified_against=verified_against,
+            last_verified=last_verified,
+        ))
+
+    targets_block = _require_dict(raw, "targets")
+    images_raw = targets_block.get("images")
+    if not isinstance(images_raw, list):
+        raise ManifestError("targets.images must be an array")
+    targets: list[TargetSpec] = []
+    for i, t in enumerate(images_raw):
+        if not isinstance(t, dict):
+            raise ManifestError(f"targets.images[{i}] must be a table")
+        targets.append(TargetSpec(
+            image_name=_require_str(t, "image_name", ctx=f"targets[{i}]"),
+            sha256=_require_str(t, "sha256", ctx=f"targets[{i}]"),
+            build_script=_require_str(t, "build_script", ctx=f"targets[{i}]"),
+        ))
+
+    samples_block = _require_dict(raw, "samples")
+    samples_manifest_path = _require_str(samples_block, "manifest_path")
+
+    return Manifest(
+        schema_version=schema_version,
+        name=name,
+        ram_per_vm_mib=ram_per_vm_mib,
+        schedule=tuple(phases),
+        fleet=fleet,
+        collectors_active=collectors_active,
+        intervals=intervals,
+        catalog=tuple(catalog),
+        targets=tuple(targets),
+        samples_manifest_path=samples_manifest_path,
+        repo_root=repo_root,
+        manifest_path=path,
+    )
+
+
# ---------- helpers --------------------------------------------------
|
||||
|
||||
|
||||
def _require(d: dict, key: str, kind: type, *, ctx: str = "") -> object:
|
||||
where = f"{ctx}." if ctx else ""
|
||||
if key not in d:
|
||||
raise ManifestError(f"missing required field {where}{key}")
|
||||
v = d[key]
|
||||
if not isinstance(v, kind):
|
||||
raise ManifestError(
|
||||
f"field {where}{key} must be {kind.__name__}, got {type(v).__name__}"
|
||||
)
|
||||
return v
|
||||
|
||||
|
||||
def _require_str(d: dict, key: str, *, ctx: str = "") -> str:
|
||||
return _require(d, key, str, ctx=ctx) # type: ignore[return-value]
|
||||
|
||||
|
||||
def _require_int(d: dict, key: str, *, ctx: str = "") -> int:
|
||||
# tomllib parses both ints and floats; fields that should be ints
# require a strict int here. Floats (even int-valued ones) and bools
# are rejected.
|
||||
where = f"{ctx}." if ctx else ""
|
||||
if key not in d:
|
||||
raise ManifestError(f"missing required field {where}{key}")
|
||||
v = d[key]
|
||||
if isinstance(v, bool): # bool is a subclass of int — reject explicitly
|
||||
raise ManifestError(f"field {where}{key} must be int, got bool")
|
||||
if isinstance(v, int):
|
||||
return v
|
||||
raise ManifestError(
|
||||
f"field {where}{key} must be int, got {type(v).__name__}"
|
||||
)
|
||||
|
||||
|
||||
def _require_float(d: dict, key: str, *, ctx: str = "") -> float:
|
||||
where = f"{ctx}." if ctx else ""
|
||||
if key not in d:
|
||||
raise ManifestError(f"missing required field {where}{key}")
|
||||
v = d[key]
|
||||
if isinstance(v, bool):
|
||||
raise ManifestError(f"field {where}{key} must be number, got bool")
|
||||
if isinstance(v, (int, float)):
|
||||
return float(v)
|
||||
raise ManifestError(
|
||||
f"field {where}{key} must be number, got {type(v).__name__}"
|
||||
)
|
||||
|
||||
|
||||
def _require_dict(d: dict, key: str, *, ctx: str = "") -> dict:
|
||||
return _require(d, key, dict, ctx=ctx) # type: ignore[return-value]
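The explicit bool check in `_require_int` matters because `bool` is a subclass of `int` in Python, so a TOML `true` would otherwise pass an `isinstance(v, int)` test. Below is a minimal standalone sketch of the same ordering. `ManifestError` here is a stand-in class, and `require_int` and `rejects` are hypothetical names used for illustration, not the module's private helpers.

```python
class ManifestError(ValueError):
    """Stand-in for orchestrator.manifest.ManifestError (assumed to be an Exception subclass)."""


def require_int(d: dict, key: str, *, ctx: str = "") -> int:
    """Return d[key] only if it is a real int; reject bools, floats, and strings."""
    where = f"{ctx}." if ctx else ""
    if key not in d:
        raise ManifestError(f"missing required field {where}{key}")
    v = d[key]
    # bool must be tested first: isinstance(True, int) is True.
    if isinstance(v, bool) or not isinstance(v, int):
        raise ManifestError(
            f"field {where}{key} must be int, got {type(v).__name__}"
        )
    return v


def rejects(value: object) -> bool:
    """True when require_int refuses the given value."""
    try:
        require_int({"n": value}, "n", ctx="experiment")
    except ManifestError:
        return True
    return False
```

Under these assumptions `rejects(True)` and `rejects(320.0)` both hold, while `require_int({"n": 320}, "n")` returns the value unchanged.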
|
||||
|
|
@ -124,6 +124,24 @@ command -v perf >/dev/null || die \
|
|||
command -v tcpdump >/dev/null || die \
|
||||
"tcpdump not on PATH after install attempt — collector source 4 (bridge pcap + netflow) requires it. See ensure_collector_packages above for what was tried."
|
||||
|
||||
# Canonical experiment manifest (PIPELINE.md §4.1). Validate at install
|
||||
# time so a host that cloned a repo missing manifest.toml fails the
|
||||
# install loudly, not at first orchestrator startup. The orchestrator
|
||||
# also validates and exits 78 on bad manifest, but install-time is the
|
||||
# earliest point we can fail.
|
||||
[[ -f "$REPO_ROOT/manifest.toml" ]] || die \
|
||||
"$REPO_ROOT/manifest.toml not found — every lab host must ship the canonical experiment manifest (§4.1). Did the git pull complete?"
|
||||
"$REPO_ROOT/.venv/bin/python" -c "
import sys, pathlib
# Repo root arrives as argv[1], so a path containing quotes cannot break this snippet.
repo_root = sys.argv[1]
sys.path.insert(0, repo_root)
from orchestrator.manifest import load_canonical, ManifestError
try:
    load_canonical(pathlib.Path(repo_root))
except ManifestError as e:
    print(f'manifest.toml failed validation: {e}', file=sys.stderr)
    sys.exit(1)
" "$REPO_ROOT" || die "manifest.toml validation failed; see error above (§4.1)"
|
||||
|
||||
# uv is preferred (lockfile-driven). Fall back to system pip if absent.
|
||||
USE_UV=0
|
||||
if command -v uv >/dev/null; then USE_UV=1; fi
|
||||
|
|
|
|||
|
|
@ -245,7 +245,50 @@ def _fixture_modules() -> dict:
|
|||
}
|
||||
|
||||
|
||||
- def _fleet_cfg_with_modules(tmp_path: Path, *, force_tier2: bool = False):
|
||||
def _fixture_manifest(*, max_tier3_slots: int = 0,
|
||||
max_concurrent_ceiling: int = 0):
|
||||
"""Synthetic canonical Manifest for fleet tests.
|
||||
|
||||
Mirrors the production manifest.toml shape but constructed in-memory
|
||||
so test outcomes don't depend on what the on-disk manifest happens
|
||||
to say. Per-test parameterization (ceilings, future schedule
|
||||
variants) goes through this builder, not through CLI overrides
|
||||
that don't exist anymore (PIPELINE.md §4.1)."""
|
||||
from orchestrator.manifest import (
|
||||
CollectorIntervals, FleetPolicy, Manifest, Phase,
|
||||
)
|
||||
return Manifest(
|
||||
schema_version=1,
|
||||
name="test-fixture",
|
||||
ram_per_vm_mib=320,
|
||||
schedule=(
|
||||
Phase("clean", 10.0),
|
||||
Phase("armed", 3.0),
|
||||
Phase("infecting", 5.0),
|
||||
Phase("infected_running", 25.0),
|
||||
Phase("dormant", 15.0),
|
||||
Phase("clean", 5.0),
|
||||
),
|
||||
fleet=FleetPolicy(
|
||||
max_concurrent_ceiling=max_concurrent_ceiling,
|
||||
max_tier3_slots=max_tier3_slots,
|
||||
),
|
||||
collectors_active=("proc", "qmp", "perf", "guest_agent",
|
||||
"pcap", "netflow"),
|
||||
intervals=CollectorIntervals(
|
||||
proc_ms=100, qmp_ms=1000, perf_ms=100,
|
||||
guest_agent_ms=100, pcap_snaplen=256, netflow_bucket_ms=100,
|
||||
),
|
||||
catalog=(),
|
||||
targets=(),
|
||||
samples_manifest_path="samples/manifest.toml",
|
||||
repo_root=REPO_ROOT,
|
||||
manifest_path=REPO_ROOT / "manifest.toml",
|
||||
)
|
||||
|
||||
|
||||
def _fleet_cfg_with_modules(tmp_path: Path, *, max_tier3_slots: int = 0,
|
||||
max_concurrent_ceiling: int = 0):
|
||||
from orchestrator import fleet
|
||||
from samples.manifest import SampleManifest
|
||||
|
||||
|
|
@ -254,9 +297,12 @@ def _fleet_cfg_with_modules(tmp_path: Path, *, force_tier2: bool = False):
|
|||
host_id="test-host",
|
||||
repo_root=repo_root,
|
||||
data_root=tmp_path,
|
||||
- manifest=SampleManifest.load(repo_root / "samples" / "manifest.toml"),
+ experiment=_fixture_manifest(
+ max_tier3_slots=max_tier3_slots,
+ max_concurrent_ceiling=max_concurrent_ceiling,
+ ),
+ samples=SampleManifest.load(repo_root / "samples" / "manifest.toml"),
|
||||
modules=_fixture_modules(),
|
||||
- force_tier2=force_tier2,
|
||||
)
|
||||
|
||||
|
||||
|
|
@ -273,7 +319,7 @@ def test_fleet_dispatches_to_tier3_when_msfrpcd_listening(monkeypatch, tmp_path)
|
|||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
|
||||
|
||||
assert res.tier == "tier3", res
|
||||
|
|
@ -293,7 +339,7 @@ def test_fleet_falls_back_to_tier2_when_msfrpcd_down(monkeypatch, tmp_path) -> N
|
|||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
|
||||
|
||||
assert res.tier == "tier2"
|
||||
|
|
@ -309,26 +355,40 @@ def test_fleet_falls_back_to_tier2_when_module_catalog_empty(monkeypatch, tmp_pa
|
|||
host_id="test-host",
|
||||
repo_root=REPO_ROOT,
|
||||
data_root=tmp_path,
|
||||
- manifest=SampleManifest.load(REPO_ROOT / "samples" / "manifest.toml"),
+ experiment=_fixture_manifest(),
+ samples=SampleManifest.load(REPO_ROOT / "samples" / "manifest.toml"),
|
||||
modules={}, # explicitly empty
|
||||
)
|
||||
monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
|
||||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
|
||||
assert res.tier == "tier2"
|
||||
|
||||
|
||||
- def test_fleet_force_tier2_overrides_msfrpcd(monkeypatch, tmp_path) -> None:
+ def test_fleet_empty_module_catalog_falls_back_to_tier2(monkeypatch, tmp_path) -> None:
|
||||
"""An empty module catalog forces Tier-2 fallback even when msfrpcd
|
||||
is reachable. This replaces the former force_tier2 override knob:
|
||||
per PIPELINE.md §14 the closed override list contains only
|
||||
CIS490_ALLOW_DIRTY, and per §1 the right way to disable Tier-3 is
|
||||
to ship no admitted modules — not to flag-flip the orchestrator."""
|
||||
from orchestrator import fleet
|
||||
- cfg = _fleet_cfg_with_modules(tmp_path, force_tier2=True)
|
||||
from samples.manifest import SampleManifest
|
||||
cfg = fleet.FleetConfig(
|
||||
host_id="test-host",
|
||||
repo_root=REPO_ROOT,
|
||||
data_root=tmp_path,
|
||||
experiment=_fixture_manifest(),
|
||||
samples=SampleManifest.load(REPO_ROOT / "samples" / "manifest.toml"),
|
||||
modules={}, # empty catalog → no Tier-3
|
||||
)
|
||||
monkeypatch.setattr(fleet, "_msfrpcd_available", lambda *a, **kw: True)
|
||||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
|
||||
assert res.tier == "tier2"
|
||||
|
||||
|
|
@ -344,7 +404,7 @@ def test_fleet_skips_requires_bridge_modules_when_no_bridge(monkeypatch, tmp_pat
|
|||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
seen_modules = set()
|
||||
for ep in range(20):
|
||||
res = fleet._run_slot(cfg, slot=0, sample=sample, episode_index=ep, capacity=capacity)
|
||||
|
|
@ -379,7 +439,7 @@ def test_tier3_strips_bridge_env_even_when_set(monkeypatch, tmp_path) -> None:
|
|||
monkeypatch.setenv("BRIDGE", "br-malware")
|
||||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
|
||||
assert "BRIDGE" not in _RecordingPopen.calls[-1]["env"]
|
||||
|
||||
|
|
@ -399,7 +459,7 @@ def test_tier3_drops_requires_bridge_modules_unconditionally(monkeypatch, tmp_pa
|
|||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
slirp_friendly = {k for k, v in cfg.modules.items() if not v.requires_bridge}
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
seen = set()
|
||||
for ep in range(40):
|
||||
res = fleet._run_slot(cfg, slot=0, sample=sample,
|
||||
|
|
@ -422,7 +482,7 @@ def test_fleet_assigns_unique_port_base_per_slot(monkeypatch, tmp_path) -> None:
|
|||
_patch_subprocess(monkeypatch)
|
||||
capacity = fleet.detect_capacity()
|
||||
|
||||
- sample = cfg.manifest.samples[0]
+ sample = cfg.samples.samples[0]
|
||||
fleet._run_slot(cfg, slot=0, sample=sample, episode_index=0, capacity=capacity)
|
||||
fleet._run_slot(cfg, slot=1, sample=sample, episode_index=0, capacity=capacity)
|
||||
fleet._run_slot(cfg, slot=2, sample=sample, episode_index=0, capacity=capacity)
|
||||
|
|
|
|||
168
tests/test_manifest.py
Normal file
|
|
@ -0,0 +1,168 @@
|
|||
"""Tests for orchestrator/manifest.py — the canonical experiment
|
||||
manifest loader (PIPELINE.md §4.1)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
|
||||
from orchestrator.manifest import (
|
||||
KNOWN_COLLECTORS, KNOWN_PHASES, ManifestError, load_canonical,
|
||||
)
|
||||
|
||||
|
||||
REPO_ROOT = Path(__file__).resolve().parent.parent
|
||||
|
||||
MINIMAL_VALID = """
|
||||
schema_version = 1
|
||||
name = "test-experiment"
|
||||
|
||||
[experiment]
|
||||
ram_per_vm_mib = 320
|
||||
|
||||
[[experiment.schedule.phases]]
|
||||
name = "clean"
|
||||
seconds = 1.0
|
||||
|
||||
[[experiment.schedule.phases]]
|
||||
name = "armed"
|
||||
seconds = 1.0
|
||||
|
||||
[experiment.fleet]
|
||||
max_concurrent_ceiling = 0
|
||||
max_tier3_slots = 0
|
||||
|
||||
[collectors]
|
||||
active = ["proc"]
|
||||
|
||||
[collectors.intervals]
|
||||
proc_ms = 100
|
||||
qmp_ms = 1000
|
||||
perf_ms = 100
|
||||
guest_agent_ms = 100
|
||||
pcap_snaplen = 256
|
||||
netflow_bucket_ms = 100
|
||||
|
||||
[catalog]
|
||||
modules = []
|
||||
|
||||
[targets]
|
||||
images = []
|
||||
|
||||
[samples]
|
||||
manifest_path = "samples/manifest.toml"
|
||||
"""
|
||||
|
||||
|
||||
def _write_manifest(repo: Path, body: str) -> None:
|
||||
(repo / "manifest.toml").write_text(body)
|
||||
|
||||
|
||||
def test_canonical_manifest_in_repo_loads() -> None:
|
||||
"""The actual manifest.toml shipped in the repo MUST load and
|
||||
validate. If it doesn't, every lab host fails preflight."""
|
||||
m = load_canonical(REPO_ROOT)
|
||||
assert m.schema_version == 1
|
||||
assert m.name == "cis490-spectral-v1"
|
||||
# Every active collector must be in KNOWN_COLLECTORS.
|
||||
for c in m.collectors_active:
|
||||
assert c in KNOWN_COLLECTORS
|
||||
# Every scheduled phase name must be in KNOWN_PHASES.
|
||||
for p in m.schedule:
|
||||
assert p.name in KNOWN_PHASES
|
||||
|
||||
|
||||
def test_loads_minimal_valid(tmp_path: Path) -> None:
|
||||
_write_manifest(tmp_path, MINIMAL_VALID)
|
||||
m = load_canonical(tmp_path)
|
||||
assert m.name == "test-experiment"
|
||||
assert len(m.schedule) == 2
|
||||
assert m.fleet.max_concurrent_ceiling == 0
|
||||
assert m.collectors_active == ("proc",)
|
||||
|
||||
|
||||
def test_missing_file_raises_manifest_error(tmp_path: Path) -> None:
|
||||
with pytest.raises(ManifestError, match="not found"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_unsupported_schema_version_raises(tmp_path: Path) -> None:
|
||||
_write_manifest(tmp_path, MINIMAL_VALID.replace(
|
||||
"schema_version = 1", "schema_version = 2"
|
||||
))
|
||||
with pytest.raises(ManifestError, match="schema_version=2"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_unknown_collector_in_active_raises(tmp_path: Path) -> None:
|
||||
_write_manifest(tmp_path, MINIMAL_VALID.replace(
|
||||
'active = ["proc"]', 'active = ["proc", "totally_not_real"]'
|
||||
))
|
||||
with pytest.raises(ManifestError, match="totally_not_real"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_duplicate_collector_in_active_raises(tmp_path: Path) -> None:
|
||||
_write_manifest(tmp_path, MINIMAL_VALID.replace(
|
||||
'active = ["proc"]', 'active = ["proc", "proc"]'
|
||||
))
|
||||
with pytest.raises(ManifestError, match="duplicate"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_unknown_phase_name_raises(tmp_path: Path) -> None:
|
||||
bad = MINIMAL_VALID.replace('name = "armed"', 'name = "magical"', 1)
|
||||
_write_manifest(tmp_path, bad)
|
||||
with pytest.raises(ManifestError, match="magical"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_negative_phase_seconds_raises(tmp_path: Path) -> None:
|
||||
bad = MINIMAL_VALID.replace('seconds = 1.0\n\n[[experiment.schedule.phases]]\nname = "armed"\nseconds = 1.0',
|
||||
'seconds = 1.0\n\n[[experiment.schedule.phases]]\nname = "armed"\nseconds = -5.0')
|
||||
_write_manifest(tmp_path, bad)
|
||||
with pytest.raises(ManifestError, match="must be > 0"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_negative_ram_raises(tmp_path: Path) -> None:
|
||||
bad = MINIMAL_VALID.replace("ram_per_vm_mib = 320", "ram_per_vm_mib = -1")
|
||||
_write_manifest(tmp_path, bad)
|
||||
with pytest.raises(ManifestError, match="positive"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_catalog_entry_missing_verified_against_raises(tmp_path: Path) -> None:
|
||||
bad = MINIMAL_VALID.replace(
|
||||
"[catalog]\nmodules = []",
|
||||
'[catalog]\n[[catalog.modules]]\nname = "fixture"\nlast_verified = "abc"\n',
|
||||
)
|
||||
_write_manifest(tmp_path, bad)
|
||||
with pytest.raises(ManifestError, match="verified_against"):
|
||||
load_canonical(tmp_path)
|
||||
|
||||
|
||||
def test_catalog_entry_with_both_fields_loads(tmp_path: Path) -> None:
|
||||
valid = MINIMAL_VALID.replace(
|
||||
"[catalog]\nmodules = []",
|
||||
('[catalog]\n[[catalog.modules]]\nname = "fixture"\n'
|
||||
'verified_against = "test-target"\nlast_verified = "abc"\n'),
|
||||
)
|
||||
_write_manifest(tmp_path, valid)
|
||||
m = load_canonical(tmp_path)
|
||||
assert len(m.catalog) == 1
|
||||
assert m.catalog[0].name == "fixture"
|
||||
assert m.catalog[0].verified_against == "test-target"
|
||||
assert m.catalog[0].last_verified == "abc"
|
||||
|
||||
|
||||
def test_to_meta_round_trips_to_json_safe(tmp_path: Path) -> None:
|
||||
"""meta.json embedding requires the to_meta dict be json-encodable."""
|
||||
import json
|
||||
_write_manifest(tmp_path, MINIMAL_VALID)
|
||||
m = load_canonical(tmp_path)
|
||||
encoded = json.dumps(m.to_meta())
|
||||
decoded = json.loads(encoded)
|
||||
assert decoded["schema_version"] == 1
|
||||
assert decoded["name"] == "test-experiment"
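The round-trip test above only passes if `to_meta()` emits JSON-native types. The real `Manifest.to_meta` is not shown in this diff, so the following is a sketch under assumptions: `MiniManifest` is a hypothetical cut-down dataclass, and the conversion of `Path` to `str` and of the phase tuple to a list of dicts is one plausible way to satisfy the test, not necessarily the author's.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Phase:
    name: str
    seconds: float


@dataclass(frozen=True)
class MiniManifest:
    """Hypothetical cut-down manifest; the real Manifest has more fields."""
    schema_version: int
    name: str
    schedule: tuple[Phase, ...]
    manifest_path: Path

    def to_meta(self) -> dict:
        # Only JSON-native types leave this method: Path becomes str and
        # the tuple of dataclasses becomes a list of plain dicts.
        return {
            "schema_version": self.schema_version,
            "name": self.name,
            "schedule": [
                {"name": p.name, "seconds": p.seconds} for p in self.schedule
            ],
            "manifest_path": str(self.manifest_path),
        }


m = MiniManifest(1, "test-experiment", (Phase("clean", 1.0),), Path("manifest.toml"))
decoded = json.loads(json.dumps(m.to_meta()))
```

Passing the raw dataclass or a `Path` straight to `json.dumps` would raise `TypeError`, which is the failure the test guards against.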
|
||||
|
|
@ -1,13 +1,20 @@
|
|||
"""``cis490-fleet`` — run as many concurrent labeled episodes as the
|
||||
- host can handle, drawing samples from the manifest.
+ host can handle, drawing samples and experiment shape from the
+ canonical manifest at <repo_root>/manifest.toml.
|
||||
|
||||
Per PIPELINE.md §4.1 the experiment manifest is the single source of
|
||||
truth for what the experiment is. There are NO `--manifest`,
|
||||
`--modules-dir`, `--ram-per-vm-mib`, `--max-concurrent`,
|
||||
`--max-tier3-slots`, `--force-tier2`, or `--require-real-samples`
|
||||
flags — those would all be per-host overrides of the canonical
|
||||
experiment shape, which §4.1 forbids. A host that can't load the
|
||||
canonical manifest exits 78 (§9, §4.7).
|
||||
|
||||
Modes:
|
||||
|
||||
--capacity Print the resource calculation and exit. No VMs spawned.
|
||||
--waves N Run N waves of episodes (one wave = max_concurrent
|
||||
episodes, each in its own slot). Default: 1.
|
||||
- --max-concurrent N
- Cap concurrency below the auto-detected ceiling.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
|
@ -27,26 +34,26 @@ from exploits.modules import load_module_configs # noqa: E402
|
|||
from orchestrator.fleet import ( # noqa: E402
|
||||
FleetConfig, FleetRunner, capacity_report, detect_capacity,
|
||||
)
|
||||
from orchestrator.manifest import ManifestError, load_canonical # noqa: E402
|
||||
from samples.manifest import SampleManifest # noqa: E402
|
||||
|
||||
|
||||
# sysexits.h-style configuration-error exit code (EX_CONFIG = 78): do
# NOT respawn (paired with RestartPreventExitStatus=78 on
# cis490-orchestrator.service).
|
||||
EXIT_SYSADMIN_ERROR = 78
|
||||
|
||||
|
||||
def main(argv: list[str] | None = None) -> int:
|
||||
p = argparse.ArgumentParser(prog="cis490-fleet")
|
||||
- p.add_argument("--capacity", action="store_true")
- p.add_argument("--waves", type=int, default=1)
- p.add_argument("--max-concurrent", type=int, default=None)
- p.add_argument("--manifest",
- default=str(Path(__file__).resolve().parent.parent / "samples" / "manifest.toml"))
- p.add_argument("--modules-dir",
- default=str(Path(__file__).resolve().parent.parent / "exploits" / "modules"))
- p.add_argument("--data-root", default="data")
- p.add_argument("--host-id", default=os.environ.get("FLEET_HOST_ID") or os.uname().nodename)
- p.add_argument("--ram-per-vm-mib", type=int, default=320)
- p.add_argument("--require-real-samples", action="store_true")
- p.add_argument("--force-tier2", action="store_true",
- help="Skip Tier 3 even when msfrpcd is reachable")
- p.add_argument("--max-tier3-slots", type=int, default=None,
- help="Cap concurrent Tier-3 slots; slots >= N fall back to Tier-2")
+ p.add_argument("--capacity", action="store_true",
+ help="Print capacity calculation and exit")
+ p.add_argument("--waves", type=int, default=1,
+ help="Number of episode waves to run")
+ p.add_argument("--data-root", default="data",
+ help="Per-host writable directory for episode dirs")
+ p.add_argument("--host-id",
+ default=os.environ.get("FLEET_HOST_ID") or os.uname().nodename,
+ help="Stable identity for this host (per-host)")
|
||||
p.add_argument("--log-level", default="INFO")
|
||||
args = p.parse_args(argv)
|
||||
|
||||
|
|
@ -54,27 +61,64 @@ def main(argv: list[str] | None = None) -> int:
|
|||
level=getattr(logging, args.log_level.upper(), logging.INFO),
|
||||
format="%(asctime)s %(levelname)s %(name)s %(message)s",
|
||||
)
|
||||
log = logging.getLogger("cis490.fleet")
|
||||
|
||||
repo_root = Path(__file__).resolve().parent.parent
|
||||
|
||||
# Canonical manifest. PIPELINE.md §4.1 / §4.7 — failure to load is
|
||||
# a sysadmin error, not a runtime warning. Exit 78 so systemd's
|
||||
# RestartPreventExitStatus=78 keeps the unit stuck-and-loud rather
|
||||
# than respawning into the same broken state.
|
||||
try:
|
||||
experiment = load_canonical(repo_root)
|
||||
except ManifestError as e:
|
||||
log.error("canonical manifest failed to load: %s", e)
|
||||
log.error(
|
||||
"this host cannot run the experiment without a valid "
|
||||
"manifest.toml at %s. Per §4.1 there is no fallback. "
|
||||
"Exiting %d (sysadmin error).",
|
||||
repo_root / "manifest.toml", EXIT_SYSADMIN_ERROR,
|
||||
)
|
||||
return EXIT_SYSADMIN_ERROR
|
||||
|
||||
log.info("loaded experiment manifest: %s (schema_version=%d)",
experiment.name, experiment.schema_version)
|
||||
|
||||
if args.capacity:
|
||||
- print(capacity_report())
+ print(capacity_report(ram_per_vm_mib=experiment.ram_per_vm_mib))
|
||||
return 0
|
||||
|
||||
- manifest = SampleManifest.load(args.manifest)
- repo_root = Path(__file__).resolve().parent.parent
- modules_dir = Path(args.modules_dir)
- modules = load_module_configs(modules_dir) if modules_dir.exists() else {}
|
||||
samples_path = repo_root / experiment.samples_manifest_path
|
||||
samples = SampleManifest.load(samples_path)
|
||||
|
||||
# Module catalog admission: only load modules whose name appears in
|
||||
# experiment.catalog. Modules without a catalog entry don't get
|
||||
# loaded into the runtime catalog regardless of whether their toml
|
||||
# is on disk. PIPELINE.md §4.3.
|
||||
modules_dir = repo_root / "exploits" / "modules"
|
||||
on_disk_modules = (
|
||||
load_module_configs(modules_dir) if modules_dir.exists() else {}
|
||||
)
|
||||
admitted_names = {entry.name for entry in experiment.catalog}
|
||||
modules = {
|
||||
name: cfg for name, cfg in on_disk_modules.items()
|
||||
if name in admitted_names
|
||||
}
|
||||
missing = sorted(admitted_names - set(modules))
if missing:
|
||||
log.error(
|
||||
"manifest.catalog references modules whose toml is missing "
|
||||
"from %s: %s. Refusing to start (§4.3).", modules_dir, missing,
|
||||
)
|
||||
return EXIT_SYSADMIN_ERROR
|
||||
|
||||
cfg = FleetConfig(
|
||||
host_id=args.host_id,
|
||||
repo_root=repo_root,
|
||||
data_root=Path(args.data_root).resolve(),
|
||||
- manifest=manifest,
+ experiment=experiment,
+ samples=samples,
|
||||
modules=modules,
|
||||
- ram_per_vm_mib=args.ram_per_vm_mib,
- max_concurrent_override=args.max_concurrent,
- require_real_samples=args.require_real_samples,
- force_tier2=args.force_tier2,
- max_tier3_slots=args.max_tier3_slots,
|
||||
)
|
||||
|
||||
runner = FleetRunner(cfg)
|
||||
|
|
@ -88,6 +132,7 @@ def main(argv: list[str] | None = None) -> int:
|
|||
|
||||
print(json.dumps({
|
||||
"host_id": args.host_id,
|
||||
"experiment": experiment.name,
|
||||
"capacity": result.capacity.to_dict(),
|
||||
"modules_loaded": sorted(modules.keys()),
|
||||
"slots": [
|
||||
|
|
|
|||
|
|
@ -29,6 +29,7 @@ sys.path.insert(0, str(Path(__file__).resolve().parent))
|
|||
|
||||
from collectors import qmp # noqa: E402
|
||||
from orchestrator.episode import EpisodeConfig, EpisodeRunner # noqa: E402
|
||||
from orchestrator.manifest import ManifestError, load_canonical # noqa: E402
|
||||
from samples.manifest import SampleManifest # noqa: E402
|
||||
from vm_load_controller import VMLoadController # noqa: E402
|
||||
from vm_serial import SerialClient # noqa: E402
|
||||
|
|
@ -98,12 +99,9 @@ def main() -> int:
|
|||
parser.add_argument(
|
||||
"--sample",
|
||||
default=os.environ.get("SAMPLE_NAME"),
|
||||
- help="Pick a workload profile from the manifest by name. Fleet runner "
- "passes this via SAMPLE_NAME env. If unset, runs the v1 yes-loop.",
- )
- parser.add_argument(
- "--manifest",
- default=str(Path(__file__).resolve().parent.parent / "samples" / "manifest.toml"),
+ help="Pick a workload profile from the samples manifest by name. "
+ "Fleet runner passes this via SAMPLE_NAME env. If unset, runs the "
+ "v1 yes-loop.",
|
||||
)
|
||||
args = parser.parse_args()
|
||||
|
||||
|
|
@ -116,13 +114,23 @@ def main() -> int:
|
|||
repo_root = Path(__file__).resolve().parent.parent
|
||||
launcher = repo_root / "vm" / "launch_demo.sh"
|
||||
|
||||
# Canonical experiment manifest (PIPELINE.md §4.1). The samples-manifest
|
||||
# path comes from here too; no per-call override.
|
||||
try:
|
||||
experiment = load_canonical(repo_root)
|
||||
except ManifestError as e:
|
||||
log.error("canonical manifest failed to load: %s", e)
|
||||
return 78
|
||||
samples_path = repo_root / experiment.samples_manifest_path
|
||||
|
||||
# Resolve sample if requested.
|
||||
sample = None
|
||||
if args.sample:
|
||||
- manifest = SampleManifest.load(args.manifest)
- sample = next((s for s in manifest.samples if s.name == args.sample), None)
+ samples = SampleManifest.load(samples_path)
+ sample = next((s for s in samples.samples if s.name == args.sample), None)
|
||||
if sample is None:
|
||||
- log.error("sample %r not in manifest %s", args.sample, args.manifest)
+ log.error("sample %r not in samples manifest %s",
+ args.sample, samples_path)
|
||||
return 2
|
||||
log.info("using sample=%s profile=%s kind=%s",
|
||||
sample.name, sample.profile, sample.kind)
|
||||
|
|
@ -217,6 +225,7 @@ def main() -> int:
|
|||
qmp_socket=qmp_sock if qmp_sock.exists() else None,
|
||||
guest_agent_socket=agent_sock if agent_sock.exists() else None,
|
||||
bridge_iface=os.environ.get("BRIDGE") or None,
|
||||
experiment_meta=experiment.to_meta(),
|
||||
# Source 3 (oracle): perf-stat sampling of the qemu PID.
|
||||
# Now that the stdout/stderr + event-name parser bugs are
|
||||
# fixed, perf produces real rows. Per PIPELINE.md §4.4
|
||||
|
|
|
|||
|
|
@ -38,6 +38,7 @@ from exploits.driver import DriverConfig, MSFExploitDriver # noqa: E402
|
|||
from exploits.modules import load_module_config # noqa: E402
|
||||
from exploits.msfrpc import MSFRpcClient, MSFRpcConfig # noqa: E402
|
||||
from orchestrator.episode import EpisodeConfig, EpisodeRunner # noqa: E402
|
||||
from orchestrator.manifest import ManifestError, load_canonical # noqa: E402
|
||||
from samples.manifest import SampleManifest # noqa: E402
|
||||
|
||||
|
||||
|
|
@ -170,12 +171,9 @@ def main() -> int:
|
|||
parser.add_argument(
|
||||
"--sample",
|
||||
default=os.environ.get("SAMPLE_NAME"),
|
||||
- help="Pick a workload profile from the manifest by name. Fleet runner "
- "passes this via SAMPLE_NAME env. Without it, falls back to the v1 yes-loop.",
- )
- parser.add_argument(
- "--manifest",
- default=str(Path(__file__).resolve().parent.parent / "samples" / "manifest.toml"),
+ help="Pick a workload profile from the samples manifest by name. "
+ "Fleet runner passes this via SAMPLE_NAME env. Without it, falls "
+ "back to the v1 yes-loop.",
|
||||
)
|
||||
args = parser.parse_args()
|
||||
|
||||
|
|
@ -198,15 +196,24 @@ def main() -> int:
|
|||
log.error("no module config at %s", module_path)
|
||||
return 2
|
||||
|
||||
# Canonical experiment manifest (PIPELINE.md §4.1).
|
||||
try:
|
||||
experiment = load_canonical(repo_root)
|
||||
except ManifestError as e:
|
||||
log.error("canonical manifest failed to load: %s", e)
|
||||
return 78
|
||||
samples_path = repo_root / experiment.samples_manifest_path
|
||||
|
||||
module = load_module_config(module_path)
|
||||
log.info("module loaded: %s (%s)", module.name, module.module_path)
|
||||
|
||||
sample = None
|
||||
if args.sample:
|
||||
- manifest = SampleManifest.load(args.manifest)
- sample = next((s for s in manifest.samples if s.name == args.sample), None)
+ samples = SampleManifest.load(samples_path)
+ sample = next((s for s in samples.samples if s.name == args.sample), None)
|
||||
if sample is None:
|
||||
- log.error("sample %r not in manifest %s", args.sample, args.manifest)
+ log.error("sample %r not in samples manifest %s",
+ args.sample, samples_path)
|
||||
return 2
|
||||
log.info("sample=%s profile=%s kind=%s",
|
||||
sample.name, sample.profile, sample.kind)
|
||||
|
|
@ -308,6 +315,7 @@ def main() -> int:
|
|||
qmp_socket=qmp_sock if qmp_sock.exists() else None,
|
||||
guest_agent_socket=agent_sock if agent_sock.exists() else None,
|
||||
bridge_iface=os.environ.get("BRIDGE") or None,
|
||||
experiment_meta=experiment.to_meta(),
|
||||
# Source 3 (oracle): see equivalent in run_real_vm_demo.py.
|
||||
# Now that the perf collector parser is fixed, turn it on
|
||||
# in production rather than leave it silently disabled.
|
||||
|
|
|
|||