terainia/test/README.md
Maximus Gorog 3ed16c2aaf Performant UI test harness — attach to real Chrome
Per your direction: tests must be able to debug UI/UX behaviors and
must be performant. Playwright's bundled Chromium falls to SwiftShader
on Linux which is fine for visual scenarios but tanks anything where
fps matters. New attach-mode lets us drive YOUR Chrome (hardware GPU)
without needing Playwright to spawn its own.

test/attach.py:
  - One-shot health check that connects to localhost:9222 (Chrome
    already running with --remote-debugging-port). Doesn't spawn,
    doesn't close. Just confirms attach + reports the FPS HUD value.
  - peek.py and run.py already attach via CDP, so they work as-is
    once Chrome is started with the debug port.

test/README.md:
  - New "Two modes" section up front: attach (your real Chrome,
    hardware) vs launch (Playwright Chromium, software). Each has a
    legitimate use; perf-sensitive work goes through attach.
  - Workflow:
      google-chrome --remote-debugging-port=9222 \
        --user-data-dir=/tmp/voxel-dev-chrome http://localhost:8080/
      python3 attach.py        # health check
      python3 run.py scenarios/ui-menu-open-close.yaml

New UI scenarios that drive interactions via DOM events / wasm calls,
not pixel screenshots. Render-independent, fast on any backend:

  ui-menu-open-close.yaml    Click ≡ → assert menu-open class →
                              click resume → assert closed.
  ui-hotbar.yaml             pointerdown on slot 4 → assert .active
                              moved. Digit1 keypress → assert .active
                              back to slot 0.
  ui-respawn.yaml            teleport into void → wait → assert
                              is_alive()===false + body.dead class +
                              death screen visible. Click respawn-btn
                              → assert hp===20, alive===true.
  ui-settings-sliders.yaml   Slider .value = N + dispatch 'input' →
                              assert displayed value updates → unwind
                              so the page isn't left frozen.

README updates list all scenarios. No code in the game changed —
this is pure test-harness additions.
2026-05-24 17:41:05 -06:00


# Test harness — declarative visual + behavioral scenarios
**Dev-only. Runs entirely on your machine against a local build.**
Nothing here ever touches the production deploy — that's a release
target, not a test surface.
## Two modes
| When you want | Use | GPU |
|-------------------------------------|----------------|-----------|
| Functional UI + game-state tests | `attach.py` to your *real* Chrome | hardware |
| Visual regression screenshots only | `launch.py` (Playwright Chromium) | software |
Playwright's bundled Chromium falls back to SwiftShader (software
rasterization on the CPU) on Linux, so it's fine for "did this menu
open?" but useless for "is this fast enough?". For perf-sensitive
scenarios, attach to your normal Chrome instead.
### Attach mode (recommended for any perf / UI test)
Start Chrome yourself, once, with debug port + a separate profile:
```sh
google-chrome \
--remote-debugging-port=9222 \
--user-data-dir=/tmp/voxel-dev-chrome \
http://localhost:8080/
```
(Use `chromium`, `google-chrome-stable`, or whichever Chrome binary
your distro has — the flags are the same.) Keep that window open.
Then from this directory:
```sh
python3 attach.py # health check
python3 peek.py # screenshot + telemetry
python3 run.py scenarios/ui-menu-open-close.yaml # drive a scenario
```
The `--user-data-dir` keeps the debug-port Chrome separate from
your normal browsing session so cookies / history don't leak either way.
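Before running any attach-mode tool, it can help to confirm that Chrome really is exposing its debug port. The following is a minimal standard-library sketch, assuming the default `localhost:9222` from the command above. `probe_cdp` is a hypothetical helper, not part of `attach.py`. It queries `/json/version`, the DevTools HTTP endpoint Chrome serves when started with `--remote-debugging-port`:

```python
import http.client
import json
import urllib.request


def probe_cdp(host="localhost", port=9222, timeout=2.0):
    """Return Chrome's /json/version payload as a dict, or None if unreachable.

    Hypothetical helper for illustration; not part of the harness.
    """
    url = f"http://{host}:{port}/json/version"
    # Bypass any HTTP proxy set in the environment; the target is a local port.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
```

`probe_cdp()` returns `None` when nothing is listening. Otherwise the payload normally includes a `Browser` string and a `webSocketDebuggerUrl`, which is enough to tell which Chrome build you attached to.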
Mirrors the cucucaracha (lacucarachanews) toolkit pattern: `launch.py`
opens a Chromium with persistent profile + CDP on port 9222; small
attach-only tools drive the *same* session via Playwright.
What's different: the game exposes a small set of wasm bindings so
scenarios can declaratively set scene state and read back telemetry,
not just click DOM elements. See `wasm_api` in `src/bridges.rs` for the
exports.
## Setup (once per machine)
```sh
# In the repo root, install the Python harness deps.
cd test
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
playwright install chromium
```
## The dev loop
Three terminals. All run on **your machine**, never on the production VPS.
**Terminal 1 — local game server:**
```sh
cd /home/maximus/.env/web/voxel-game
./run.sh --no-tunnel
# Builds wasm + server, serves on http://localhost:8080
# (Or `docker compose up` if you prefer the same container we deploy.)
```
Leave that running. Edit Rust / WGSL / JS → re-run
`./run.sh --no-tunnel` → refresh the browser tab.
**Terminal 2 — Playwright session against the local server:**
```sh
cd /home/maximus/.env/web/voxel-game/test
. .venv/bin/activate
python3 launch.py # default: http://localhost:8080/
```
That opens a Chromium window pointed at your local game with CDP on
port 9222. After the initial navigation `launch.py` just idles; the
browser stays up so you can attach tools to it. If nothing's listening
on `:8080` you'll get a clear error message; start the dev server first.
**Terminal 3 — drive scenarios + take screenshots:**
```sh
python3 peek.py # snapshot + telemetry
python3 run.py scenarios/lighting-times-of-day.yaml # 6 screenshots
python3 run.py scenarios/god-rays-look-at-sun.yaml
python3 run.py scenarios/voxel-construction-darkness.yaml
```
Screenshots land in `test/screenshots/`. Diff them against your
baseline (visually or with `magick compare`) to catch regressions.
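As a cheap first pass before a visual diff, you can list which screenshots changed at all. This sketch assumes a baseline directory with the same file names as `test/screenshots/`; the baseline location and the function names are hypothetical. It compares files byte-for-byte via SHA-256, which is stricter than a perceptual comparison: a single differing pixel, or differing PNG metadata, counts as a change.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_screenshots(current_dir, baseline_dir):
    """Names of PNGs in current_dir that are missing from, or differ from, baseline_dir."""
    current_dir, baseline_dir = Path(current_dir), Path(baseline_dir)
    changed = []
    for shot in sorted(current_dir.glob("*.png")):
        ref = baseline_dir / shot.name
        # A screenshot with no baseline counterpart is reported as changed.
        if not ref.is_file() or sha256_of(shot) != sha256_of(ref):
            changed.append(shot.name)
    return changed
```

Anything this reports still needs a proper look (by eye or with `magick compare`). Rendering on real hardware is not guaranteed to be bit-identical between runs, so treat a hit as "inspect this", not as a confirmed regression.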
## Pointing at the deployed build (rarely)
You almost never want this. The deploy lags your local code, so a bug
you fixed locally still appears there until you push + rebuild the
container. But if you want a one-off sanity check:
```sh
python3 launch.py --url https://voxel.mxvs.art/
```
The harness will happily attach; just remember you're looking at
whatever tag is currently deployed, not your in-progress work.
The browser stays open; profile lives at `./.browser_profile/`. CDP
listens on `localhost:9222` so the other tools can attach.
## One-shot inspection / screenshot
```sh
python3 peek.py # screenshot + dump game telemetry
python3 peek.py --json # machine-readable
```
Writes a PNG to `screenshots/<ts>_peek.png`.
## Run a scenario
```sh
python3 run.py scenarios/lighting-times-of-day.yaml
```
Scenarios are YAML lists of `steps`. Each step is one of:
| step | meaning |
|-----------------------------|---------|
| `wait_for: <js_expr>` | block until `js_expr` evaluates truthy |
| `wait: <ms>` | sleep that many ms |
| `eval: <js>` | run JS in the page (state setters, etc.) |
| `key: <key> [hold: ms]` | press a key (optionally hold) |
| `mouse_move: [dx, dy]` | relative mouse motion |
| `mouse: <down\|up\|click>` | mouse button events |
| `screenshot: <name>.png` | save canvas screenshot to `screenshots/` |
| `assert: <js_expr>` | fail scenario if `js_expr` is falsy |
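To make the step table concrete, here is a sketch of how a scenario might look once the YAML has been parsed into Python data, plus a small validator for the step kinds listed above. The top-level `steps` key, the `hold` modifier as a sibling key of `key`, and the names `validate_steps` and `SCENARIO` are all assumptions made for illustration; `run.py` may structure this differently.

```python
# Step kinds taken from the table above.
STEP_KINDS = {
    "wait_for", "wait", "eval", "key",
    "mouse_move", "mouse", "screenshot", "assert",
}
# Assumed optional modifier that only makes sense on a `key` step.
MODIFIERS = {"hold"}


def validate_steps(steps):
    """Return a list of problems; an empty list means the steps look well-formed."""
    problems = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"step {i}: expected a mapping, got {type(step).__name__}")
            continue
        kinds = [k for k in step if k in STEP_KINDS]
        unknown = [k for k in step if k not in STEP_KINDS and k not in MODIFIERS]
        if len(kinds) != 1:
            problems.append(f"step {i}: expected exactly one step kind, found {kinds}")
        if unknown:
            problems.append(f"step {i}: unknown keys {unknown}")
        if "hold" in step and "key" not in step:
            problems.append(f"step {i}: 'hold' is only valid on a 'key' step")
    return problems


# Hypothetical scenario, shaped like the result of parsing a scenario file.
SCENARIO = {
    "steps": [
        {"wait_for": "window.voxel_game !== undefined"},
        {"eval": "window.voxel_game.set_time_scale(0)"},   # freeze time
        {"eval": "window.voxel_game.set_scene_time(0)"},
        {"key": "w", "hold": 500},
        {"wait": 200},
        {"screenshot": "frozen-time.png"},
    ]
}
```

Running `validate_steps(SCENARIO["steps"])` on a well-formed scenario returns an empty list. A typo such as `{"click": "x"}` yields one message for the missing step kind and one for the unknown key.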
## UI test scenarios
These exercise interactions via DOM events + wasm calls — no
pixel-clicking, no reliance on render perf, fast on any backend.
| Scenario | Asserts |
|-----------------------------------------|---------|
| `ui-menu-open-close.yaml` | Settings menu opens via the ≡ button, closes via Resume |
| `ui-hotbar.yaml` | Hotbar slot selection via DOM click + keyboard digit |
| `ui-respawn.yaml` | Void-death triggers death screen; respawn button restores HP |
| `ui-settings-sliders.yaml` | FOV/render-dist/time-scale slider input round-trips to displayed value |
Plus the visual / perf scenarios that also work in attach mode:
| Scenario | Asserts |
|-----------------------------------------|---------|
| `lighting-times-of-day.yaml` | Visual sweep of noon → sunset → midnight → sunrise |
| `god-rays-look-at-sun.yaml` | Shafts visible at four sun altitudes |
| `voxel-construction-darkness.yaml` | sky_vis bake responds to surrounding voxels |
| `bench-pass-cost.yaml` | Sweeps bench-flag configs; meaningful only on hardware |
## Available game-state JS bindings
All exported by the wasm module (see `src/bridges.rs::wasm_api`).
After `wait_for: "window.voxel_game !== undefined"`, call as
`window.voxel_game.<fn>(...)`.
| Setter | Effect |
|--------|--------|
| `set_scene_time(t: f32)` | jump shader time to `t` seconds |
| `set_time_scale(s: f32)` | freeze (0) / fast-forward time |
| `teleport(x, y, z)` | move player feet to (x,y,z) |
| `look_at(yaw, pitch)` | set camera angles (radians) |
| `set_paused(b)` | pause input + physics |
| `set_fov(deg)` / `set_mouse_sens(s)` / `set_render_distance(blocks)` | settings |
| `respawn()` | one-shot respawn request |
| Getter | Returns |
|--------|---------|
| `get_scene_time()` | `f32` |
| `get_position()` | `[x, y, z]` |
| `get_camera_angles()` | `[yaw, pitch]` |
| `get_hp()` / `is_alive()` | from death/respawn state |
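When scenarios are generated from Python rather than written by hand, a small helper can build the JS expressions for `eval`, `wait_for`, and `assert` steps so that arguments are quoted consistently. This is a sketch: `js_call` is a hypothetical helper, the binding names come from the tables above, and the teleport coordinates are placeholders rather than a known void location. The value `20` for full HP is taken from the description of the `ui-respawn.yaml` scenario.

```python
import json

NAMESPACE = "window.voxel_game"


def js_call(fn, *args):
    """Render a call to a game binding as a JS expression string.

    json.dumps gives JS-compatible literals: True -> true, "a" -> "a", 1.5 -> 1.5.
    """
    rendered = ", ".join(json.dumps(a) for a in args)
    return f"{NAMESPACE}.{fn}({rendered})"


# Hypothetical respawn check built from the bindings listed above.
respawn_steps = [
    {"wait_for": f"{NAMESPACE} !== undefined"},
    {"eval": js_call("teleport", 0, -100, 0)},        # placeholder coordinates
    {"wait_for": js_call("is_alive") + " === false"},
    {"eval": js_call("respawn")},
    {"wait_for": js_call("is_alive") + " === true"},
    {"assert": js_call("get_hp") + " === 20"},
]
```

For example, `js_call("set_paused", True)` produces `window.voxel_game.set_paused(true)`, so Python booleans do not leak into the page as `True`.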
## Why this exists
When something looks wrong (e.g. "tops of blocks don't react to
sunset"), the dev loop without this is: deploy, open browser, fiddle
the time slider, compare to baseline by memory. With this, the loop
is: write a scenario that screenshots the same view at noon / sunset /
midnight, run it, diff the screenshots against a baseline. Bugs become
visible in one command.
For non-visual behaviors (e.g. "is the joystick releasing correctly?")
the same harness sends keyboard events and reads back position
telemetry to assert "after release, player stops moving within N ms".
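The "stops moving within N ms" check can be expressed over a list of timestamped position samples, such as repeated `get_position()` readings. This is a sketch under stated assumptions: the sample format, the tolerance `eps`, and the name `stopped_within` are all hypothetical, and it treats "stopped" as "no sample after the deadline moves more than `eps` from the previous one".

```python
import math


def stopped_within(samples, release_ms, n_ms, eps=1e-3):
    """True if the player is at rest from release_ms + n_ms onward.

    samples: list of (t_ms, (x, y, z)) tuples sorted by time.
    """
    deadline = release_ms + n_ms
    settled = [pos for t, pos in samples if t >= deadline]
    if len(settled) < 2:
        raise ValueError("need at least two samples at or after the deadline")
    # Every consecutive pair after the deadline must be (almost) the same point.
    return all(math.dist(a, b) <= eps for a, b in zip(settled, settled[1:]))


# Synthetic telemetry: key released at t=100, player coasts until t=150.
samples = [
    (0, (0.0, 0.0, 0.0)),
    (50, (0.5, 0.0, 0.0)),
    (100, (1.0, 0.0, 0.0)),
    (150, (1.2, 0.0, 0.0)),
    (200, (1.2, 0.0, 0.0)),
    (250, (1.2, 0.0, 0.0)),
]
```

With this data the player has stopped within 50 ms of release but not within 0 ms. If the player can be falling at release time, compare only the horizontal components instead of the full 3D distance.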
## Sanity guarantees
- Persistent profile means cookies, settings, and game state survive
across `launch.py` runs. The first launch is "fresh"; subsequent
ones resume where you left off.
- `launch.py` never touches the page beyond the initial navigation.
Scenarios drive the session; the launcher just hosts the browser.
- All tools attach via CDP — closing them doesn't close the browser.
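The attach pattern these guarantees describe can be sketched with Playwright's `connect_over_cdp`. This is an illustration, not the code in `attach.py` or `peek.py`: `pick_game_page` and `read_telemetry` are hypothetical names, and the URL prefix and CDP address are the defaults used elsewhere in this README. Playwright is imported lazily so the page-selection helper works without it installed.

```python
def pick_game_page(pages, url_prefix="http://localhost:8080/"):
    """Return the first page whose URL starts with url_prefix, or None."""
    for page in pages:
        if page.url.startswith(url_prefix):
            return page
    return None


def read_telemetry(cdp_url="http://localhost:9222"):
    """Attach to the running browser, read a few bindings, then detach."""
    # Lazy import: only needed when actually attaching.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Connects to the existing browser; does not launch a new one.
        browser = p.chromium.connect_over_cdp(cdp_url)
        pages = [pg for ctx in browser.contexts for pg in ctx.pages]
        page = pick_game_page(pages)
        if page is None:
            raise RuntimeError("no open tab is showing the local game")
        # Array.from() guards against get_position() returning a typed array.
        return page.evaluate(
            "({"
            " position: Array.from(window.voxel_game.get_position()),"
            " hp: window.voxel_game.get_hp(),"
            " alive: window.voxel_game.is_alive()"
            "})"
        )
```

The function never calls anything that navigates or closes the page, which matches the division of labour above: the launcher (or your own Chrome) hosts the browser, and attach-only tools read from and drive the existing session.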