Stops out-of-date lab hosts from polluting the dataset with episodes
generated by buggy code. The valid-commits set mirrors the maintainer's
working clone on the Pi automatically — when the maintainer pulls or
pushes a new commit, the receiver picks it up within the 5-second
cache TTL with no service restart.

Receiver changes:
- receiver/version_gate.py (new): VersionGate(repo_path, window).
  Each check() consults a frozenset of the last `window` commit
  hashes from `git -C <repo> log --format=%H -n <window>`, refreshed
  every 5s under a lock. Resilient to transient git failure (keeps
  prior cache so a flaky `git` doesn't lock out every shipper).
- receiver/app.py: PUT extracts X-Cis490-Code-Commit; gate.check()
  before ingest. Rejects with:
    400 + remediation if header missing or malformed
    412 + remediation + your_commit + head_commit if not in window
  Remediation block is verbatim copy-pasteable into the lab-host
  shell:
    cd /opt/cis490 && sudo -u cis490 git pull origin main
    sudo /opt/cis490/scripts/install-lab-host.sh
    sudo systemctl restart cis490-orchestrator
- receiver/store.py: ingest_stream takes commit kwarg, stamps it on
  the index.jsonl row (new optional field). Backfilled rows from
  index_backfill.py also pull commit out of meta.json.
- receiver/config.py + etc/receiver.toml.example: new [version_gate]
  section. enabled=true, repo_path=/home/max/cis490, window=100 by
  default. Enabled toggle exists for emergency disable-and-collect.
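
A minimal sketch of the caching behaviour described for VersionGate
above; it is not the actual receiver/version_gate.py. The class name
VersionGateSketch, the `ttl`, `lister` and `clock` parameters, and the
choice to restart the TTL after a failed refresh are assumptions made
for illustration. Only the git command, the 5-second refresh, the lock
and the keep-prior-cache-on-failure rule come from the description.

```python
from __future__ import annotations

import subprocess
import threading
import time
from typing import Callable


def git_recent_commits(repo_path: str, window: int) -> frozenset[str]:
    """Return the last `window` commit hashes of the clone at repo_path."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%H", "-n", str(window)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return frozenset(result.stdout.split())


class VersionGateSketch:
    """Illustrative allow-list of recent commits with a TTL cache."""

    def __init__(
        self,
        repo_path: str,
        window: int,
        ttl: float = 5.0,  # assumed parameter; the commit hard-codes 5s
        lister: Callable[[str, int], frozenset[str]] = git_recent_commits,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._repo_path = repo_path
        self._window = window
        self._ttl = ttl
        self._lister = lister
        self._clock = clock
        self._lock = threading.Lock()
        self._cache: frozenset[str] = frozenset()
        self._fetched_at: float | None = None

    def _valid_commits(self) -> frozenset[str]:
        with self._lock:
            now = self._clock()
            stale = self._fetched_at is None or now - self._fetched_at >= self._ttl
            if stale:
                try:
                    self._cache = self._lister(self._repo_path, self._window)
                except (OSError, subprocess.SubprocessError):
                    # Transient git failure: keep serving the prior cache.
                    pass
                # Assumption: a failed refresh also restarts the TTL.
                self._fetched_at = now
            return self._cache

    def check(self, commit: str) -> bool:
        """True if `commit` is among the last `window` commits."""
        return commit in self._valid_commits()
```

Injecting `lister` and `clock` is only a convenience for exercising the
mid-cache and git-failure cases without a real clone or real sleeps.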

Shipper changes:
- shipper/transport.py: ship_tarball() takes commit kwarg, sends
  X-Cis490-Code-Commit header. 412 maps to status='fatal' so the
  queue doesn't infinite-retry; operator must pull and reinstall
  before the next ship will succeed.
- shipper/queue.py: reads meta.json::code_version.commit per
  episode, passes through. On 412, logs the receiver's full
  remediation block at ERROR level so journalctl on the lab host
  shows exactly what to run.
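
A rough sketch of the 412 handling described above. The function name
classify_ship_response, the logger name, and the 'ok' and 'retry'
statuses are hypothetical; the only mapping taken from this commit is
412 to 'fatal', plus logging the remediation block at ERROR level.

```python
import logging

log = logging.getLogger("shipper.sketch")  # hypothetical logger name


def classify_ship_response(status_code: int, body: str) -> str:
    """Map a receiver response to a queue status (illustrative only)."""
    if 200 <= status_code < 300:
        return "ok"  # assumed success status
    if status_code == 412:
        # Stale code on this lab host: retrying cannot succeed until the
        # operator pulls and reinstalls, so surface the fix and stop.
        log.error("receiver rejected code commit; remediation:\n%s", body)
        return "fatal"
    return "retry"  # assumed default for other failures
```

Treating 412 as terminal keeps the queue from resubmitting an episode
that the receiver will keep rejecting until the host is updated.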

Tests: 9 in test_version_gate (including 2 end-to-end via
starlette.testclient); 2 cover the boundary cases where new commits
land mid-cache and where a missing repo gracefully keeps the prior
cache. 157/157 total.

Index schema: existing rows stay valid (commit field is optional
on read). New rows from receiver-direct AND from index_backfill.py
include commit.
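
To illustrate "optional on read", a small sketch of a reader that
tolerates rows written before the field existed. The helper name
row_commit and the `episode_id` key are hypothetical; only the
`commit` field name comes from this commit.

```python
from __future__ import annotations

import json


def row_commit(line: str) -> str | None:
    """Return the row's commit, or None for rows that predate the field."""
    return json.loads(line).get("commit")


# Hypothetical index.jsonl rows: one old, one stamped with a commit.
old_row = '{"episode_id": "ep-0001"}'
new_row = '{"episode_id": "ep-0002", "commit": "' + "a" * 40 + '"}'
```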
50 lines
1.7 KiB
Python
from __future__ import annotations

import tomllib
from dataclasses import dataclass
from pathlib import Path


DEFAULT_MAX_EPISODE_BYTES = 256 * 1024 * 1024


@dataclass(frozen=True)
class ReceiverConfig:
    listen_host: str
    listen_port: int
    store_root: Path
    incoming_root: Path
    index_path: Path
    max_episode_bytes: int
    bearer_token: str | None
    # Path to the maintainer's working clone; the receiver consults its
    # `git log` for the commit-allow-list. Default mirrors the
    # canonical Pi setup.
    version_gate_repo: Path
    version_gate_window: int
    version_gate_enabled: bool

    @classmethod
    def load(cls, path: str | Path) -> "ReceiverConfig":
        with open(path, "rb") as f:
            data = tomllib.load(f)

        listen_addr = data.get("listen_addr", "127.0.0.1:8443")
        host, _, port = listen_addr.rpartition(":")
        version_gate = data.get("version_gate", {})
        return cls(
            listen_host=host or "127.0.0.1",
            listen_port=int(port),
            store_root=Path(data["store_root"]).resolve(),
            incoming_root=Path(data["incoming_root"]).resolve(),
            index_path=Path(data["index_path"]).resolve(),
            max_episode_bytes=int(
                data.get("limits", {}).get("max_episode_bytes", DEFAULT_MAX_EPISODE_BYTES)
            ),
            bearer_token=data.get("auth", {}).get("bearer_token"),
            version_gate_repo=Path(
                version_gate.get("repo_path", "/home/max/cis490")
            ).resolve(),
            version_gate_window=int(version_gate.get("window", 100)),
            version_gate_enabled=bool(version_gate.get("enabled", True)),
        )