
Maximus Gorog d622cdb330		2026-05-24 17:18:41 -06:00

Verify debugging functions work as functional contracts

Per your directive to make sure debug surfaces actually work:

test/scenarios/bench-pass-cost.yaml: Declarative sweep that runs each bench config (all on / no shafts / no post / both off) with 8s settling between, screenshots each config, and asserts frame_dt_ms is positive. The Playwright env runs SwiftShader so absolute deltas are noisy, but the scenario is structurally correct for real hardware where the differences will read clean.

Functional contract verification (run interactively):
- All 11 wasm exports exist on window.voxel_game
- FPS HUD renders and updates frame-to-frame
- Telemetry getters return finite values
- set_scene_time(42) round-trips through tick to get_scene_time
- teleport(x,y,z) round-trips (modulo expected gravity drop)
- look_at(yaw,pitch) round-trips
- bench_set_disable_post measurably changes frame_dt (915 → 820 ms in my software environment)

18 checks, 0 failures. The debugging substrate is verified end-to-end; remaining "local slower than deployment" gap is most likely a WebGPU-vs-WebGL2 backend selection issue rather than a code-path bug.
scenarios	Verify debugging functions work as functional contracts	2026-05-24 17:18:41 -06:00
.gitignore	Test harness: declarative Playwright scenarios + wasm state bindings	2026-05-24 10:51:17 -06:00
launch.py	test harness: default to localhost, not the prod deploy	2026-05-24 11:29:55 -06:00
peek.py	Test harness: declarative Playwright scenarios + wasm state bindings	2026-05-24 10:51:17 -06:00
README.md	test harness: default to localhost, not the prod deploy	2026-05-24 11:29:55 -06:00
requirements.txt	Test harness: declarative Playwright scenarios + wasm state bindings	2026-05-24 10:51:17 -06:00
run.py	Tick/toc instrumentation across build + test + mesh phases	2026-05-24 11:49:08 -06:00

README.md

Test harness — declarative visual + behavioral scenarios

Dev-only. By default everything runs on your machine against a local build, and nothing here touches the production deploy; that is a release target, not a test surface.

Mirrors the cucucaracha (lacucarachanews) toolkit pattern: launch.py opens a Chromium with persistent profile + CDP on port 9222; small attach-only tools drive the same session via Playwright.

What's different: the game exposes a small set of wasm bindings so scenarios can declaratively set scene state and read back telemetry, not just click DOM elements. See src/bridges.rs::wasm_api for the exports.

Setup (once per machine)

# In the repo root, install the Python harness deps.
cd test
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
playwright install chromium

The dev loop

Three terminals. All of them run on your machine, never on the production VPS.

Terminal 1 — local game server:

cd /home/maximus/.env/web/voxel-game
./run.sh --no-tunnel
# Builds wasm + server, serves on http://localhost:8080
# (Or `docker compose up` if you prefer the same container we deploy.)

Leave that running. Edit Rust / WGSL / JS → re-run ./run.sh --no-tunnel → refresh the browser tab.

Terminal 2 — Playwright session against the local server:

cd /home/maximus/.env/web/voxel-game/test
. .venv/bin/activate
python3 launch.py                          # default: http://localhost:8080/

That opens a Chromium window pointed at your local game with CDP on port 9222. launch.py then sits idle and the browser stays up so you can attach tools to it. If nothing is listening on :8080 you'll get a clear error message; start the dev server first.
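The "is anything listening on :8080" check can be sketched with the standard library alone. This is an illustration of the idea, not launch.py's actual code; the function names `server_is_listening` and `require_server` are hypothetical.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 8080,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def require_server(host: str = "localhost", port: int = 8080) -> None:
    """Raise with a readable message when the dev server is not up."""
    if not server_is_listening(host, port):
        raise RuntimeError(
            f"Nothing is listening on http://{host}:{port}/; "
            "start the dev server first (./run.sh --no-tunnel)."
        )
```

Calling `require_server()` before opening the browser turns a blank tab into an immediate, explicit failure.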

Terminal 3 — drive scenarios + take screenshots:

python3 peek.py                                       # snapshot + telemetry
python3 run.py scenarios/lighting-times-of-day.yaml   # 6 screenshots
python3 run.py scenarios/god-rays-look-at-sun.yaml
python3 run.py scenarios/voxel-construction-darkness.yaml

Screenshots land in test/screenshots/. Diff them against your baseline (visually or with magick compare) to catch regressions.
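A coarse first pass over a batch of screenshots is to flag every file that is not byte-identical to its baseline, then inspect only those with magick compare. The sketch below assumes a hypothetical baseline directory holding reference PNGs with the same file names; the harness itself does not define one. Byte equality is strict, so renderer noise will flag a file even when it looks the same.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_screenshots(baseline_dir, current_dir):
    """Names of PNGs in current_dir that are missing from, or differ
    byte-for-byte from, the file of the same name in baseline_dir."""
    baseline_dir, current_dir = Path(baseline_dir), Path(current_dir)
    changed = []
    for shot in sorted(current_dir.glob("*.png")):
        ref = baseline_dir / shot.name
        if not ref.exists() or file_digest(ref) != file_digest(shot):
            changed.append(shot.name)
    return changed
```

Anything this returns is a candidate for a visual or perceptual diff; an empty list means the run reproduced the baseline exactly.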

Pointing at the deployed build (rarely)

You almost never want this. The deploy lags your local code, so a bug you fixed locally still appears there until you push + rebuild the container. But if you want a one-off sanity check:

python3 launch.py --url https://voxel.mxvs.art/

The harness will happily attach; just remember you're looking at whatever tag is currently deployed, not your in-progress work.

Either way, the browser stays open and its profile lives at ./.browser_profile/. CDP listens on localhost:9222 so the other tools can attach.

One-shot inspection / screenshot

python3 peek.py            # screenshot + dump game telemetry
python3 peek.py --json     # machine-readable

Writes a PNG to screenshots/<ts>_peek.png.

Run a scenario

python3 run.py scenarios/lighting-times-of-day.yaml

Scenarios are YAML lists of steps. Each step is one of:

step	meaning
`wait_for: <js_expr>`	block until `js_expr` evaluates truthy
`wait: <ms>`	sleep that many ms
`eval: <js>`	run JS in the page (state setters, etc.)
`key: <key> [hold: ms]`	press a key (optionally hold)
`mouse_move: [dx, dy]`	relative mouse motion
`mouse: <down|up|click>`	mouse button events
`screenshot: <name>.png`	save canvas screenshot to `screenshots/`
`assert: <js_expr>`	fail scenario if `js_expr` is falsy
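To make the step table concrete, here is a scenario written as the Python list a YAML loader would produce, plus a small validator that checks each step names exactly one known step kind. This is a sketch and not run.py's actual logic: the names `STEP_KINDS`, `MODIFIERS`, `validate_scenario`, and the screenshot name are hypothetical, and it assumes `hold` sits beside `key` in the same mapping.

```python
STEP_KINDS = {"wait_for", "wait", "eval", "key", "mouse_move",
              "mouse", "screenshot", "assert"}
# Optional sibling keys allowed per step kind (assumed shape for `hold`).
MODIFIERS = {"key": {"hold"}}


def validate_scenario(steps):
    """Return the list of step kinds, or raise ValueError on a malformed step."""
    if not isinstance(steps, list):
        raise ValueError("scenario must be a list of steps")
    kinds = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            raise ValueError(f"step {i}: expected a mapping")
        found = [k for k in step if k in STEP_KINDS]
        if len(found) != 1:
            raise ValueError(f"step {i}: need exactly one step kind, got {found}")
        kind = found[0]
        extra = set(step) - {kind} - MODIFIERS.get(kind, set())
        if extra:
            raise ValueError(f"step {i}: unexpected keys {sorted(extra)}")
        kinds.append(kind)
    return kinds


# Equivalent of a short YAML scenario: freeze time, jump to t=42, capture.
EXAMPLE = [
    {"wait_for": "window.voxel_game !== undefined"},
    {"eval": "window.voxel_game.set_time_scale(0)"},
    {"eval": "window.voxel_game.set_scene_time(42)"},
    {"wait": 200},
    {"key": "w", "hold": 500},
    {"screenshot": "t42.png"},
    {"assert": "Number.isFinite(window.voxel_game.get_scene_time())"},
]
```

Validating before driving the browser means a typo in a step name fails immediately instead of halfway through a run.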

Available game-state JS bindings

All exported by the wasm module (see src/bridges.rs::wasm_api). After wait_for: "window.voxel_game !== undefined", call as window.voxel_game.<fn>(...).

Setter	Effect
`set_scene_time(t: f32)`	jump shader time to `t` seconds
`set_time_scale(s: f32)`	freeze (0) / fast-forward time
`teleport(x, y, z)`	move player feet to (x,y,z)
`look_at(yaw, pitch)`	set camera angles (radians)
`set_paused(b)`	pause input + physics
`set_fov(deg)` / `set_mouse_sens(s)` / `set_render_distance(blocks)`	settings
`respawn()`	one-shot respawn request

Getter	Returns
`get_scene_time()`	`f32`
`get_position()`	`[x, y, z]`
`get_camera_angles()`	`[yaw, pitch]`
`get_hp()` / `is_alive()`	from death/respawn state
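Outside a scenario, the same getters can be read from Python by attaching to the running browser over CDP. The sketch below is not peek.py's code; `read_telemetry` and `attach_first_page` are hypothetical names, and it assumes the game tab is the first page of the first browser context. The exact Python shape of array-valued getters depends on how the wasm module exposes them, so treat the returned values as opaque until checked.

```python
GETTERS = ["get_scene_time", "get_position", "get_camera_angles",
           "get_hp", "is_alive"]


def read_telemetry(page):
    """Call every getter on window.voxel_game and collect the results.

    `page` is anything with Playwright's page.evaluate(expression) method.
    """
    return {name: page.evaluate(f"window.voxel_game.{name}()")
            for name in GETTERS}


def attach_first_page(cdp_url: str = "http://localhost:9222"):
    """Attach to the browser started by launch.py; the caller must call
    pw.stop() when finished. Stopping detaches without closing the browser."""
    from playwright.sync_api import sync_playwright  # imported lazily

    pw = sync_playwright().start()
    browser = pw.chromium.connect_over_cdp(cdp_url)
    page = browser.contexts[0].pages[0]  # assumption: game is the first tab
    return pw, page
```

Typical use is `pw, page = attach_first_page()`, then `read_telemetry(page)`, then `pw.stop()`.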

Why this exists

When something looks wrong (e.g. "tops of blocks don't react to sunset"), the dev loop without this is: deploy, open browser, fiddle the time slider, compare to baseline by memory. With this, the loop is: write a scenario that screenshots the same view at noon / sunset / midnight, run it, diff the screenshots against a baseline. Bugs become visible in one command.

For non-visual behaviors (e.g. "is the joystick releasing correctly?") the same harness sends keyboard events and reads back position telemetry to assert "after release, player stops moving within N ms".
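The "stops moving within N ms" check reduces to a small computation over position samples. The sketch below assumes samples were collected as (t_ms, (x, y, z)) pairs, for example by polling get_position(); the function name `stop_time_ms` and the `eps` threshold are hypothetical. A player who is still falling under gravity counts as moving here, so sample on flat ground or compare only x and z.

```python
import math


def stop_time_ms(samples, release_ms, eps=1e-3):
    """How long after release_ms the player kept moving.

    samples: list of (t_ms, (x, y, z)) sorted by time.
    Returns 0.0 if no movement was seen after release, None if there is
    too little data or the player was still moving at the last sample.
    """
    after = [(t, p) for t, p in samples if t >= release_ms]
    if len(after) < 2:
        return None
    last_moving_end = None
    for (_, p0), (t1, p1) in zip(after, after[1:]):
        if math.dist(p0, p1) > eps:  # moved between consecutive samples
            last_moving_end = t1
    if last_moving_end is None:
        return 0.0
    if last_moving_end == after[-1][0]:
        return None  # never observed at rest
    return last_moving_end - release_ms
```

A test then asserts that the result is not None and is below the chosen N; the resolution of the answer is the polling interval.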

Sanity guarantees

Persistent profile means cookies, settings, and game state survive across launch.py runs. The first launch is "fresh"; subsequent ones resume where you left off.
launch.py never touches the page beyond the initial navigation. Scenarios drive the session; the launcher just hosts the browser.
All tools attach via CDP — closing them doesn't close the browser.