The render-distance slider has step=16 in the HTML; setting .value
to 120 snaps to the nearest valid value (128). The test was asserting
the displayed text contained "120" — never true. Changed to 128
(actually on a step boundary). Pure test correctness; UI behavior
was right all along (snap-to-step is the slider's intended behavior).
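For reference, a minimal sketch of the snap-to-step arithmetic, assuming the slider's min is 0 and its max is 256 (both hypothetical; only step=16 comes from the HTML). It models how a range input rounds an off-step value to the nearest step, with ties going up; it is not the harness code.

```python
import math

def snap_to_step(value, minimum, maximum, step):
    """Model of how <input type="range"> rounds an off-step value."""
    # Clamp into [minimum, maximum] first.
    value = min(max(value, minimum), maximum)
    # Round to the nearest step counted from the minimum; ties round up.
    steps = math.floor((value - minimum) / step + 0.5)
    snapped = minimum + steps * step
    # If rounding overshot the maximum, use the last step at or below it.
    if snapped > maximum:
        snapped = minimum + math.floor((maximum - minimum) / step) * step
    return snapped
```

With these assumed bounds, snap_to_step(120, 0, 256, 16) gives 128, which is why a test asserting "120" could never pass.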
All four UI scenarios now green on hardware Chromium:
  ui-menu-open-close    13 steps  2.44s ✓
  ui-hotbar             10 steps  1.07s ✓
  ui-respawn            14 steps  3.38s ✓
  ui-settings-sliders   15 steps  1.52s ✓
FPS on real hardware: 61.9 fps (16.2ms median) — matches the
deployment, confirms the game itself is fast. The earlier 3 fps
was Playwright's SwiftShader software path.
Per your direction: tests must be able to debug UI/UX behaviors and
must be performant. Playwright's bundled Chromium falls back to
SwiftShader on Linux, which is fine for visual scenarios but tanks
anything where fps matters. The new attach mode lets us drive YOUR
Chrome (hardware GPU) without needing Playwright to spawn its own.
test/attach.py:
- One-shot health check that connects to localhost:9222 (Chrome
already running with --remote-debugging-port). Doesn't spawn,
doesn't close. Just confirms attach + reports the FPS HUD value.
- peek.py and run.py already attach via CDP, so they work as-is
once Chrome is started with the debug port.
test/README.md:
- New "Two modes" section up front: attach (your real Chrome,
hardware) vs launch (Playwright Chromium, software). Each has a
legitimate use; perf-sensitive work goes through attach.
- Workflow:
    google-chrome --remote-debugging-port=9222 \
      --user-data-dir=/tmp/voxel-dev-chrome http://localhost:8080/
    python3 attach.py   # health check
    python3 run.py scenarios/ui-menu-open-close.yaml
New UI scenarios that drive interactions via DOM events / wasm calls,
not pixel screenshots. Render-independent, fast on any backend:
  ui-menu-open-close.yaml   Click ≡ → assert menu-open class →
                            click resume → assert closed.
  ui-hotbar.yaml            pointerdown on slot 4 → assert .active
                            moved. Digit1 keypress → assert .active
                            back to slot 0.
  ui-respawn.yaml           teleport into void → wait → assert
                            is_alive()===false + body.dead class +
                            death screen visible. Click respawn-btn
                            → assert hp===20, alive===true.
  ui-settings-sliders.yaml  Slider .value = N + dispatch 'input' →
                            assert displayed value updates → unwind
                            so the page isn't left frozen.
README updates list all scenarios. No code in the game changed —
this is pure test-harness additions.
Per your directive to make sure debug surfaces actually work:
test/scenarios/bench-pass-cost.yaml:
Declarative sweep that runs each bench config (all on / no shafts /
no post / both off) with 8s settling between, screenshots each
config, and asserts frame_dt_ms is positive. The Playwright env
runs SwiftShader so absolute deltas are noisy, but the scenario
is structurally correct for real hardware where the differences
will read clean.
Functional contract verification (run interactively):
- All 11 wasm exports exist on window.voxel_game
- FPS HUD renders and updates frame-to-frame
- Telemetry getters return finite values
- set_scene_time(42) round-trips through tick to get_scene_time
- teleport(x,y,z) round-trips (modulo expected gravity drop)
- look_at(yaw,pitch) round-trips
- bench_set_disable_post measurably changes frame_dt
(915 → 820 ms in my software environment)
18 checks, 0 failures. The debugging substrate is verified
end-to-end; remaining "local slower than deployment" gap is most
likely a WebGPU-vs-WebGL2 backend selection issue rather than a
code-path bug.
run.sh:
- phase() wrapper logs elapsed seconds per build step.
- Tracks total build+startup at the end.
- Output is "==> phase / [Ns] phase" so the slow steps are obvious.
test/run.py:
- Per-step time.perf_counter() around each scenario step.
- "slowest steps" summary printed at the end so the worst
offenders are immediately visible.
- Total wall-clock time at scenario end.
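A minimal sketch of the per-step timing described for test/run.py, assuming each scenario step can be represented as a (name, callable) pair. The function names are hypothetical and the real runner's step representation may differ.

```python
import time
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]

def run_timed(steps: List[Step]) -> Tuple[List[Tuple[str, float]], float]:
    """Run each step; return (name, seconds) per step plus total seconds."""
    timings = []
    start = time.perf_counter()
    for name, fn in steps:
        t0 = time.perf_counter()
        fn()
        timings.append((name, time.perf_counter() - t0))
    return timings, time.perf_counter() - start

def slowest_steps(timings, top_n=3):
    """Return the top_n steps sorted by elapsed time, slowest first."""
    return sorted(timings, key=lambda item: item[1], reverse=True)[:top_n]

def print_summary(timings, total, top_n=3):
    """Print the 'slowest steps' block followed by total wall-clock time."""
    print("slowest steps:")
    for name, seconds in slowest_steps(timings, top_n):
        print(f"  {seconds:7.3f}s  {name}")
    print(f"total: {total:.3f}s")
```

time.perf_counter() is used rather than time.time() because it is monotonic and has the highest available resolution for short intervals.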
src/render/mod.rs:
- browser_now() helper: web_sys::performance().now() on wasm,
Instant-based on native. Monotonic ms timestamps for tick/toc.
- Renderer::rebuild_chunk wraps build_chunk_mesh in a t0/t1
measurement and logs anything over 5ms with vertex/index counts.
Surfaces sky_visibility cost in the browser console.
web/main.js:
- Exposes window.voxel_game = wasm after init so the test
harness can drive scenarios declaratively (set_scene_time,
teleport, look_at, get_position, etc.).
src/shader.wgsl:
- Fix duplicate `let to_eye` declaration introduced in Round D
(specular's normalized to_eye conflicted with fog's raw version).
Renamed fog's local to_eye_raw. The test harness caught this
immediately — first WGSL compile error, first scenario run.
Findings from running scenarios/lighting-times-of-day.yaml:
- 289 chunks × ~100ms avg = ~29s mesh-build on main thread.
- Page-ready latency dominated by this. window.voxel_game appears
almost immediately (init resolves before chunks build), but
the world is invisible until meshes are uploaded.
- sky_visibility (8 cosine rays × HashMap voxel lookups) is the
hot path inside build_chunk_mesh.
Next: make chunk-mesh build progressive (one or two chunks per tick
instead of all up-front), so the world becomes visible immediately
and pops in over a few seconds.
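A sketch of the progressive build idea in Python (the real change would live in the Rust renderer). The class and parameter names are hypothetical; build_mesh stands in for build_chunk_mesh.

```python
from collections import deque

class ProgressiveMesher:
    """Builds at most chunks_per_tick meshes per call to tick()."""

    def __init__(self, chunk_ids, build_mesh, chunks_per_tick=2):
        self.pending = deque(chunk_ids)      # chunks still waiting for a mesh
        self.build_mesh = build_mesh         # stand-in for build_chunk_mesh
        self.chunks_per_tick = chunks_per_tick
        self.built = {}                      # chunk id -> finished mesh

    def tick(self):
        """Build up to the per-tick budget; return how many were built."""
        done = 0
        while self.pending and done < self.chunks_per_tick:
            chunk_id = self.pending.popleft()
            self.built[chunk_id] = self.build_mesh(chunk_id)
            done += 1
        return done
```

With 289 chunks and a budget of two per tick the queue drains in 145 ticks. At the ~100ms average measured above, a single chunk still costs more than one 16ms frame, so a count-based budget only spreads the stall; it does not remove it.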
Testing is a dev process: point the harness at a local build you can
edit, rebuild, and screenshot in seconds. Pointing it at the prod
deploy by default was wrong: the deploy lags local code by a deploy
cycle, so visual changes you make won't appear there until it is
rebuilt on the Linode.
test/launch.py:
- DEFAULT_URL is now http://localhost:8080/.
- Friendly pre-flight check: if --url is localhost and nothing is
listening on the port, print a clear "start ./run.sh --no-tunnel"
message and exit 1. Avoids the silent ERR_CONNECTION_REFUSED
failure mode.
- --url https://voxel.mxvs.art/ still works for one-off remote
sanity checks.
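A minimal sketch of the pre-flight check, using only the standard library. The function names and the exact message wording are assumptions; launch.py may structure this differently.

```python
import socket
from urllib.parse import urlparse

LOCAL_HOSTS = ("localhost", "127.0.0.1")

def is_listening(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def preflight(url):
    """Return 0 if the URL looks reachable or is remote, 1 otherwise."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if host not in LOCAL_HOSTS:
        return 0  # remote URLs are not probed here
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    if is_listening(host, port):
        return 0
    print(f"Nothing is listening on {host}:{port}. "
          "Start ./run.sh --no-tunnel first.")
    return 1
```

The caller would pass the return value to sys.exit() so a missing local server fails fast instead of surfacing later as ERR_CONNECTION_REFUSED.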
test/README.md:
- Lead with the dev-loop instruction: terminal 1 runs the local
server, terminal 2 runs launch.py, terminal 3 drives scenarios.
- Note the "pointing at deploy" path as a rarely-used escape hatch.
Mirrors the cucucaracha (lacucarachanews) toolkit pattern adapted for
the voxel game:
bridges.rs adds TestCommand + Telemetry plumbing:
- thread_local TEST_COMMANDS queue + TELEMETRY snapshot.
- drain_test_commands() called by App::tick at frame start.
- publish_telemetry(t) called at frame end.
- wasm_api exports: set_scene_time, teleport, look_at, plus
getters get_scene_time / get_position / get_camera_angles.
app.rs:
- drain_test_commands() applies SetSceneTime / Teleport / LookAt
before physics integrates. Teleport zeroes velocity and syncs the
camera to feet+EYE_HEIGHT.
- publish_telemetry() at end of tick exposes scene state to JS.
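The drain/publish ordering can be modelled in a few lines of Python. This is a sketch of the flow only, not the Rust code: the EYE_HEIGHT value, the field names, and the telemetry keys are all assumptions, and physics is left as a placeholder.

```python
from collections import deque
from dataclasses import dataclass, field

EYE_HEIGHT = 1.6  # assumed value; the real constant lives in the game code

@dataclass
class App:
    scene_time: float = 0.0
    feet: tuple = (0.0, 0.0, 0.0)
    velocity: tuple = (0.0, 0.0, 0.0)
    camera: tuple = (0.0, EYE_HEIGHT, 0.0)
    yaw: float = 0.0
    pitch: float = 0.0
    commands: deque = field(default_factory=deque)   # queued test commands
    telemetry: dict = field(default_factory=dict)    # last published snapshot

    def drain_test_commands(self):
        """Apply every queued command before physics runs."""
        while self.commands:
            name, args = self.commands.popleft()
            if name == "set_scene_time":
                self.scene_time = args[0]
            elif name == "teleport":
                x, y, z = args
                self.feet = (x, y, z)
                self.velocity = (0.0, 0.0, 0.0)       # teleport zeroes velocity
                self.camera = (x, y + EYE_HEIGHT, z)  # camera at feet + eye height
            elif name == "look_at":
                self.yaw, self.pitch = args

    def tick(self):
        self.drain_test_commands()   # frame start
        # physics integration would run here
        self.telemetry = {           # frame end: publish a snapshot
            "scene_time": self.scene_time,
            "position": self.feet,
            "camera_angles": (self.yaw, self.pitch),
        }
```

Draining before physics means a teleport issued from JS takes effect on the same frame that integrates from the new position.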
test/:
  launch.py         Open Chromium with persistent profile + CDP:9222.
                    Navigates to https://voxel.mxvs.art by default;
                    --url for local dev.
  peek.py           Attach via CDP, screenshot canvas, dump telemetry
                    (scene_time, position, camera angles, hp). Read-only.
  run.py            Execute a YAML scenario:
                      wait_for, wait, eval, key, mouse_move, mouse,
                      screenshot, assert
                    Key allowlist prevents stray scenarios from sending
                    arbitrary input.
  requirements.txt  playwright + PyYAML.
  README.md         Setup, grammar, available bindings, why this
                    exists.
  scenarios/
    lighting-times-of-day.yaml
                    Screenshots at noon / afternoon / sunset / civil
                    twilight / midnight / sunrise. Verifies the Round A
                    sunset fixes by visual diff.
    god-rays-look-at-sun.yaml
                    Pointed at the sun at four altitudes to inspect the
                    Round B shafts.
    voxel-construction-darkness.yaml
                    Visual baseline for the sky_vis bake from Round D.
  .gitignore        Excludes the browser profile + screenshots directory.
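A sketch of how the run.py key allowlist could be enforced before any step executes. The step verbs come from the list above; the allowlist contents and the one-verb-per-mapping step shape are assumptions for illustration, not the actual grammar.

```python
# Step verbs listed for run.py.
STEP_KINDS = {"wait_for", "wait", "eval", "key", "mouse_move", "mouse",
              "screenshot", "assert"}

# Hypothetical allowlist; the real set is defined in run.py.
ALLOWED_KEYS = {"KeyW", "KeyA", "KeyS", "KeyD", "Space", "Escape", "Digit1"}

def validate_steps(steps):
    """Raise ValueError on an unknown verb or a key outside the allowlist."""
    for index, step in enumerate(steps):
        # Assumed shape: each step is a mapping with exactly one verb.
        if not isinstance(step, dict) or len(step) != 1:
            raise ValueError(f"step {index}: expected a single-verb mapping")
        (kind, arg), = step.items()
        if kind not in STEP_KINDS:
            raise ValueError(f"step {index}: unknown verb {kind!r}")
        if kind == "key" and arg not in ALLOWED_KEYS:
            raise ValueError(f"step {index}: key {arg!r} is not allowlisted")
```

Validating the whole scenario up front means a bad key fails the run before any input reaches the page.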
Visual regression workflow:
1. python3 launch.py
2. (separate terminal) python3 run.py scenarios/lighting-times-of-day.yaml
3. Compare screenshots/lighting-times-of-day_*.png against baseline.
Tests still 63 passing. Native + wasm release clean.