CIS490/scripts
Max c1c8e98180 scripts/train-pi-cpu-models.sh — sequential Pi-side trainer chain
Pi has 4 cores; only KNN and tree-based models are realistic to train
here without GPU. While Lambda runs the full 16-job manifest in
parallel (~1.7h), this chain trains the CPU-friendly subset on the
Pi (~30 min) so scenes 8 & 12 populate with multi-model numbers
within minutes instead of waiting on Lambda's full cycle.

Order: gbt-realistic, knn-realistic, knn-oracle, knn_semi-realistic,
knn_semi-oracle. Skips models whose .ckpt.json already exists
(idempotent restart). Each is a subprocess of training/trainer/run.py
so XGBoost/numpy/sklearn don't fight each other for cores.

Caller is expected to start gbt-oracle separately (it's the longest
single training and we kicked it off before invoking this script).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 14:12:34 -05:00
..
auto-update.sh lab-host: cis490-autoupdate.timer for self-healing on push 2026-05-01 16:59:31 -05:00
build-lambda-bundle.sh training: lambda-cloud one-shot training integration 2026-05-08 12:32:04 -05:00
fetch-alpine-baseline.sh Close out the deployment-readiness gaps 2026-04-30 00:31:55 -05:00
fetch-lab-host-cert.sh lab-host: cis490-cert-fetch.timer for automatic mTLS bootstrap retry 2026-05-02 13:30:16 -05:00
fetch-metasploitable2.sh Tier 3 + Tier 4 auto-deploy: zero operator interaction 2026-04-30 23:12:08 -05:00
install-lab-host.sh PIPELINE §5 step 2: canonical manifest at <repo>/manifest.toml 2026-05-04 01:25:01 -05:00
install-msfrpcd.sh Tier-3 bring-up: 9 bugs fixed on elliott-ThinkPad (2026-05-01) 2026-05-02 12:26:19 -06:00
install-receiver.sh bootstrap: auto-issue mTLS leaves to enrolled lab hosts (closes #9, refs #3) 2026-04-30 01:30:29 -05:00
install-tier-3-4.sh Tier-3: fix QEMU boot, catalog admission, verify module 2026-05-05 16:41:41 -06:00
install-training-worker-windows.ps1 training/fleet: distributed multi-host trainer with capability gating 2026-05-08 01:20:20 -05:00
install-training-worker.sh training/fleet: distributed multi-host trainer with capability gating 2026-05-08 01:20:20 -05:00
issue-cis490-client-cert-wrapper.sh bootstrap: auto-issue mTLS leaves to enrolled lab hosts (closes #9, refs #3) 2026-04-30 01:30:29 -05:00
lambda-bootstrap.sh training: parallelize lambda bootstrap (2 jobs at a time on the A100) 2026-05-08 12:37:03 -05:00
run-on-lambda.sh training: lambda-cloud one-shot training integration 2026-05-08 12:32:04 -05:00
sync-training-data.sh training: validator, feature/tensor extractors, 6 supervised models, schema-hashed checkpoints, eval suite, dashboard producers 2026-05-08 01:19:00 -05:00
train-pi-cpu-models.sh scripts/train-pi-cpu-models.sh — sequential Pi-side trainer chain 2026-05-08 14:12:34 -05:00