lean4-htt/doc/dev/testing.md
Garmelon 08eb78a5b2
chore: switch to new test/bench suite (#12590)
This PR sets up the new integrated test/bench suite. It then migrates
all benchmarks and some related tests to the new suite. There's also
some documentation and some linting.

For now, a lot of the old tests are left alone so this PR doesn't become
even larger than it already is. Eventually, all tests should be migrated
to the new suite though so there isn't a confusing mix of two systems.
2026-02-25 13:51:53 +00:00

142 lines
6.4 KiB
Markdown

# Test Suite
**Warning:** This document is partially outdated.
It describes the old test suite, which is currently in the process of being replaced.
The new test suite's documentation can be found at [`tests/README.md`](../../tests/README.md).
After [building Lean](../make/index.md) you can run all the tests using
```
cd build/release
make test ARGS=-j4
```
Change the 4 to the maximum number of parallel tests you want to
allow. The best choice is the number of CPU cores on your machine as
the tests are mostly CPU bound. You can find the number of processors
on linux using `nproc` and on Windows it is the `NUMBER_OF_PROCESSORS`
environment variable.
You can run tests after [building a specific stage](bootstrap.md) by
adding the `-C stageN` argument. The default when run as above is stage 1. The
Lean tests will automatically use that stage's corresponding Lean
executables
Running `make test` will not pick up new test files; run
```bash
cmake build/release/stage1
```
to update the list of tests.
You can also use `ctest` directly if you are in the right folder. So
to run stage1 tests with a 300 second timeout run this:
```bash
cd build/release/stage1
ctest -j 4 --output-on-failure --timeout 300
```
Useful `ctest` flags are `-R <name of test>` to run a single test, and
`--rerun-failed` to run all tests that failed during the last run.
You can also pass `ctest` flags via `make test ARGS="--rerun-failed"`.
To get verbose output from ctest pass the `--verbose` command line
option. Test output is normally suppressed and only summary
information is displayed. This option will show all test output.
## Test Suite Organization
All these tests are included by [src/shell/CMakeLists.txt](https://github.com/leanprover/lean4/blob/master/src/shell/CMakeLists.txt):
- [`tests/lean`](https://github.com/leanprover/lean4/tree/master/tests/lean/): contains tests that come equipped with a
.lean.expected.out file. The driver script [`test_single.sh`](https://github.com/leanprover/lean4/tree/master/tests/lean/test_single.sh) runs
each test and checks the actual output (*.produced.out) with the
checked in expected output.
- [`tests/lean/run`](https://github.com/leanprover/lean4/tree/master/tests/lean/run/): contains tests that are run through the lean
command line one file at a time. These tests only look for error
codes and do not check the expected output even though output is
produced, it is ignored.
**Note:** Tests in this directory run with `-Dlinter.all=false` to reduce noise.
If your test needs to verify linter behavior (e.g., deprecation warnings),
explicitly enable the relevant linter with `set_option linter.<name> true`.
- [`tests/lean/interactive`](https://github.com/leanprover/lean4/tree/master/tests/lean/interactive/): are designed to test server requests at a
given position in the input file. Each .lean file contains comments
that indicate how to simulate a client request at that position.
using a `--^` point to the line position. Example:
```lean,ignore
open Foo in
theorem tst2 (h : a ≤ b) : a + 2 ≤ b + 2 :=
Bla.
--^ completion
```
In this example, the test driver [`test_single.sh`](https://github.com/leanprover/lean4/tree/master/tests/lean/interactive/test_single.sh) will simulate an
auto-completion request at `Bla.`. The expected output is stored in
a .lean.expected.out in the json format that is part of the
[Language Server
Protocol](https://microsoft.github.io/language-server-protocol/).
This can also be used to test the following additional requests:
```
--^ textDocument/hover
--^ textDocument/typeDefinition
--^ textDocument/definition
--^ $/lean/plainGoal
--^ $/lean/plainTermGoal
--^ insert: ...
--^ collectDiagnostics
```
- [`tests/lean/server`](https://github.com/leanprover/lean4/tree/master/tests/lean/server/): Tests more of the Lean `--server` protocol.
There are just a few of them, and it uses .log files containing
JSON.
- [`tests/compiler`](https://github.com/leanprover/lean4/tree/master/tests/compiler/): contains tests that will run the Lean compiler and
build an executable that is executed and the output is compared to
the .lean.expected.out file. This test also contains a subfolder
[`foreign`](https://github.com/leanprover/lean4/tree/master/tests/compiler/foreign/) which shows how to extend Lean using C++.
- [`tests/lean/trust0`](https://github.com/leanprover/lean4/tree/master/tests/lean/trust0): tests that run Lean in a mode that Lean doesn't
even trust the .olean files (i.e., trust 0).
- [`tests/bench`](https://github.com/leanprover/lean4/tree/master/tests/bench/): contains performance tests.
- [`tests/plugin`](https://github.com/leanprover/lean4/tree/master/tests/plugin/): tests that compiled Lean code can be loaded into
`lean` via the `--plugin` command line option.
## Writing Good Tests
Every test file should contain:
* an initial `/-! -/` module docstring summarizing the test's purpose
* a module docstring for each test section that describes what is tested
and, if not 100% clear, why that is the desirable behavior
At the time of writing, most tests do not follow these new guidelines yet.
For an example of a conforming test, see [`tests/lean/1971.lean`](https://github.com/leanprover/lean4/tree/master/tests/lean/1971.lean).
## Fixing Tests
When the Lean source code or the standard library are modified, some of the
tests break because the produced output is slightly different, and we have
to reflect the changes in the `.lean.expected.out` files.
We should not blindly copy the new produced output since we may accidentally
miss a bug introduced by recent changes.
The test suite contains commands that allow us to see what changed in a convenient way.
First, we must install [meld](http://meldmerge.org/). On Ubuntu, we can do it by simply executing
```
sudo apt-get install meld
```
Now, suppose `bad_class.lean` test is broken. We can see the problem by going to [`tests/lean`](https://github.com/leanprover/lean4/tree/master/tests/lean) directory and
executing
```
./test_single.sh -i bad_class.lean
```
When the `-i` option is provided, `meld` is automatically invoked
whenever there is discrepancy between the produced and expected
outputs. `meld` can also be used to repair the problems.
In Emacs, we can also execute `M-x lean4-diff-test-file` to check/diff the file of the current buffer.
To mass-copy all `.produced.out` files to the respective `.expected.out` file, use `tests/lean/copy-produced`.