lean4-htt/tests/bench
Sebastian Graf 00c1f0d3a9
test: create aux theorems for backward rules in SymM-based mvcgen (#12295)
This PR improves and simplifies the SymM-based mvcgen prototype by
creating `BackwardRule.apply`-ready auxiliary theorems for spec
theorems. These auxiliary theorems have types that have reducible
definitions unfolded and shared, just like the rest of the SymM world
assumes. Furthermore, in order to aid kernel checking times,
definitional reductions leave behind expected type hints. With #12290,
we get the following numbers:

```
goal_100: 100.671964 ms, kernel: 34.104676 ms
goal_200: 152.650808 ms, kernel: 70.653251 ms
goal_300: 222.973242 ms, kernel: 105.874266 ms
goal_400: 294.032333 ms, kernel: 150.025106 ms
goal_500: 366.748098 ms, kernel: 193.483843 ms
goal_600: 442.509542 ms, kernel: 236.845115 ms
goal_700: 517.527685 ms, kernel: 268.804230 ms
goal_800: 601.657910 ms, kernel: 310.765606 ms
goal_900: 681.020759 ms, kernel: 357.428032 ms
goal_1000: 762.212989 ms, kernel: 403.789517 ms
```

The baseline is `shallow_add_sub_cancel`:

```
goal_100: 62.721757 ms, kernel: 22.109237 ms
goal_200: 140.118652 ms, kernel: 45.219512 ms
goal_300: 241.077690 ms, kernel: 78.779379 ms
goal_400: 363.274462 ms, kernel: 128.951250 ms
goal_500: 517.350791 ms, kernel: 155.498217 ms
goal_600: 678.291416 ms, kernel: 212.325487 ms
goal_700: 881.479043 ms, kernel: 258.690695 ms
goal_800: 1092.357375 ms, kernel: 351.996079 ms
goal_900: 1247.759480 ms, kernel: 319.197608 ms
goal_1000: 1497.203628 ms, kernel: 364.532560 ms
```

The latter is with the main solving loop in interpreter mode, but the
kernel checking times are still representative.
Earlier experiments suggest that the precompiled baseline performs at
roughly 650ms for `goal_1000`, so the new mvcgen is getting close.
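
To make the scaling behaviour in the two tables easier to compare, here is a minimal Python sketch that works only from the total times quoted above. The helper names `growth_factor` and `first_size_where_faster` are made up for illustration and are not part of the benchmark scripts. Keep in mind that the baseline numbers were taken with the main solving loop in interpreter mode, so the comparison of total times is indicative at best.

```python
# Total wall-clock times in ms, copied from the two tables above.
sizes = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
mvcgen_ms = [
    100.671964, 152.650808, 222.973242, 294.032333, 366.748098,
    442.509542, 517.527685, 601.657910, 681.020759, 762.212989,
]
baseline_ms = [
    62.721757, 140.118652, 241.077690, 363.274462, 517.350791,
    678.291416, 881.479043, 1092.357375, 1247.759480, 1497.203628,
]


def growth_factor(times):
    """Ratio of the time for the largest goal to the time for the smallest goal."""
    return times[-1] / times[0]


def first_size_where_faster(goal_sizes, candidate, reference):
    """Smallest goal size at which `candidate` is faster than `reference`, or None."""
    for n, c, r in zip(goal_sizes, candidate, reference):
        if c < r:
            return n
    return None


if __name__ == "__main__":
    print(f"mvcgen growth, goal_100 -> goal_1000:   {growth_factor(mvcgen_ms):.2f}x")
    print(f"baseline growth, goal_100 -> goal_1000: {growth_factor(baseline_ms):.2f}x")
    crossover = first_size_where_faster(sizes, mvcgen_ms, baseline_ms)
    print(f"first goal size where mvcgen is faster: {crossover}")
```

For a tenfold increase in goal size, a growth factor near 10 would indicate roughly linear scaling, and a much larger factor would indicate superlinear scaling.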
2026-02-03 17:43:33 +00:00
inundation refactor: migrate to new ranges (#8841) 2025-07-07 12:41:53 +00:00
mergeSort chore: deprecate List.iota (#6708) 2025-01-21 02:32:35 +00:00
mvcgen test: create aux theorems for backward rules in SymM-based mvcgen (#12295) 2026-02-03 17:43:33 +00:00
qsort feat: remove runtime bounds checks and partial from qsort (#6241) 2024-12-01 06:26:00 +00:00
sym feat: improves simpArrowTelescope simproc (#12153) 2026-01-25 22:29:38 +00:00
.gitignore chore: add .dSYM files (Mac debug symbols) to tests .gitignore files (#8771) 2025-06-13 15:27:46 +00:00
accumulate_profile.py chore: add lakeprof benchmarks (#9709) 2025-08-06 11:25:45 +00:00
arith_eval.ml
big_beq.lean perf: mkNoConfusionCtors: cheaper inferType (#10455) 2025-09-19 10:51:17 +00:00
big_beq_rec.lean chore: benchmark for deriving BEq on large inductive (#10028) 2025-08-21 15:50:12 +00:00
big_deceq.lean chore: benchmarks for deriving DecidableEq on large inductives (#10149) 2025-08-27 12:05:04 +00:00
big_deceq_rec.lean chore: benchmarks for deriving DecidableEq on large inductives (#10149) 2025-08-27 12:05:04 +00:00
big_do.lean test: add a benchmark that is slow to elaborate (#5656) 2024-10-23 08:20:15 +00:00
big_match.lean chore: large match statement benchmark (#9665) 2025-08-01 15:25:07 +00:00
big_match_nat.lean test: add big match on nat lit benchmarks (#11502) 2025-12-04 08:21:56 +00:00
big_match_nat_split.lean test: add big match on nat lit benchmarks (#11502) 2025-12-04 08:21:56 +00:00
big_match_partial.lean test: benchmark for large partial match (#11199) 2025-11-16 11:20:31 +00:00
big_omega.lean test: big_omega benchmark (#5817) 2024-10-24 07:26:29 +00:00
big_struct.lean perf: add introSubstEq shortcut (#12190) 2026-01-28 12:33:14 +00:00
big_struct_dep.lean test: add a big dependent struct test (#12061) 2026-01-20 12:00:25 +00:00
big_struct_dep1.lean test: add big_struct_dep1 benchmark (#12191) 2026-01-27 14:36:09 +00:00
binarytrees.ghc-6.hs doc: fix typos 2021-03-07 15:06:02 +01:00
binarytrees.lean test: clean up binarytrees.lean 2023-01-19 14:44:20 +01:00
binarytrees.lean.args chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
binarytrees.lean.expected.out chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
binarytrees.ocaml-2.ml
binarytrees.st.hs test: add binarytrees.st benchmark 2023-01-19 14:44:20 +01:00
binarytrees.st.lean refactor: migrate to new ranges (#8841) 2025-07-07 12:41:53 +00:00
binarytrees.st.mlton-2.sml test: add binarytrees.st benchmark 2023-01-19 14:44:20 +01:00
binarytrees.st.sml test: add binarytrees.st benchmark 2023-01-19 14:44:20 +01:00
binarytrees.st.swift test: add binarytrees.st benchmark 2023-01-19 14:44:20 +01:00
binarytrees.swift feat(tests/bench): add safe binarytrees.swift from https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/binarytrees-swift-1.html 2019-05-30 19:33:38 +02:00
binarytrees5.ml test: add binarytrees.st benchmark 2023-01-19 14:44:20 +01:00
binarytrees5_multicore.ml chore: more benchmarking setup 2023-01-17 13:28:05 +01:00
bv_decide_inequality.lean fix: bv_decide benchmarks (#6017) 2024-11-09 11:18:33 +00:00
bv_decide_large_aig.lean perf: add a large AIG benchmark for bv_decide (#7721) 2025-03-29 16:04:25 +00:00
bv_decide_mod.lean fix: bv_decide benchmarks (#6017) 2024-11-09 11:18:33 +00:00
bv_decide_mul.lean feat: add bv_decide benchmarks (#5203) 2024-08-29 12:45:58 +00:00
bv_decide_realworld.lean chore: notation ^^ for Bool.xor (#5332) 2024-09-18 08:59:11 +00:00
bv_decide_rewriter.lean perf: bv_decide rewriting benchmark (#9231) 2025-07-07 10:24:08 +00:00
channel.lean refactor: migrate to new ranges (#8841) 2025-07-07 12:41:53 +00:00
charactersIn.lean chore: benchmark for charactersIn (#11643) 2025-12-12 22:23:51 +00:00
compile.sh feat: LLVM backend (#1837) 2022-12-30 12:45:30 +01:00
const_fold.hs chore(tests/bench): rename benchmarks 2019-05-30 16:25:41 +02:00
const_fold.lean chore: remove command universes 2021-06-29 17:01:07 -07:00
const_fold.lean.args chore: lower const_fold inputs again to prevent stack overflow in sanitized build 2020-02-28 13:23:39 +01:00
const_fold.lean.expected.out chore: lower const_fold inputs again to prevent stack overflow in sanitized build 2020-02-28 13:23:39 +01:00
const_fold.ml chore(tests/bench): rename benchmarks 2019-05-30 16:25:41 +02:00
const_fold.sml chore(tests/bench): rename benchmarks 2019-05-30 16:25:41 +02:00
const_fold.swift chore(tests/bench): rename benchmarks 2019-05-30 16:25:41 +02:00
cross.yaml chore: fix more typos in comments 2023-10-08 14:37:34 -07:00
dag_hassorry_issue.lean chore: re-enable tests (#10923) 2025-10-23 08:38:57 +00:00
dag_hassorry_issue.lean.args chore: reduce stack space usage at instantiate_mvars_fn (#4931) 2024-08-06 17:38:59 +00:00
dag_hassorry_issue.lean.expected.out chore: re-enable tests (#10923) 2025-10-23 08:38:57 +00:00
delayed_assign.lean test: delayed assignment performance issue (#12201) 2026-01-28 02:08:39 +00:00
deriv.hs
deriv.lean chore: remove command universes 2021-06-29 17:01:07 -07:00
deriv.lean.args chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
deriv.lean.expected.out chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
deriv.ml
deriv.sml
deriv.swift test(tests/bench): add deriv.swift 2019-05-30 11:34:58 -07:00
ex-50-50-1.leq test: new linear solver benchmark by Marc 2021-12-02 17:03:35 +01:00
flake.lock chore: update cross-bench setup 2024-04-15 10:59:07 +02:00
flake.nix chore: robustify Nix shell (#8141) 2025-04-28 15:08:32 +00:00
full-stdlib.exec.yaml feat: separate benchmark for profiling the stdlib per-file 2020-10-29 11:53:03 +01:00
ghc-gc.py
hashmap.lean refactor: remove last appearances of allowNontermination (#12211) 2026-01-29 07:22:19 +00:00
identifier_completion.lean perf: do not export opaque bodies (#10119) 2025-08-27 20:59:59 +00:00
identifier_completion_didOpen.log perf: do not export opaque bodies (#10119) 2025-08-27 20:59:59 +00:00
identifier_completion_initialization.log test: identifier completion benchmark (#6796) 2025-01-27 19:31:32 +00:00
identifier_completion_runner.lean test: improve language server test coverage (#10574) 2025-09-30 11:15:03 +00:00
ilean_roundtrip.lean feat: reduce server memory consumption (#11162) 2025-12-01 10:53:23 +00:00
iterators.lean refactor: move Iter and others from Std.Iterators to Std (#11446) 2025-12-15 08:24:12 +00:00
lean-gc.py
liasolver.lean feat: verify all and any for hash maps (#10765) 2025-11-15 16:59:37 +00:00
liasolver.lean.args test: new linear solver benchmark by Marc 2021-12-02 17:03:35 +01:00
liasolver.lean.expected.out chore: re-enable tests (#10923) 2025-10-23 08:38:57 +00:00
Makefile chore: update cross-bench setup 2024-04-15 10:59:07 +02:00
mlkit-gc.py
mut_rec_wf.lean chore: add #9598 as benchmark (#9642) 2025-07-31 15:32:54 +00:00
nat_repr.lean refactor: migrate to new ranges (#8841) 2025-07-07 12:41:53 +00:00
nat_repr.lean.args chore: Nat.repr microbenchmark (#3888) 2024-04-17 18:10:32 +00:00
nat_repr.lean.expected.out chore: Nat.repr microbenchmark (#3888) 2024-04-17 18:10:32 +00:00
ocaml-gc.py chore: more benchmarking setup 2023-01-17 13:28:05 +01:00
omega_stress.lean perf: optimize sorry detection in unused variables linter (#7129) 2025-02-22 16:43:39 +00:00
parser.lean refactor: migrate to new ranges (#8841) 2025-07-07 12:41:53 +00:00
perf.py chore: update benchmark suite 2022-05-25 18:26:36 +02:00
phashmap.lean refactor: remove last appearances of allowNontermination (#12211) 2026-01-29 07:22:19 +00:00
qsort.hs chore: update benchmark suite 2022-05-25 18:26:36 +02:00
qsort.lean chore: remove >6 month old deprecations (#10446) 2025-09-22 12:47:11 +00:00
qsort.lean.args chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
qsort.lean.expected.out test(tests/bench): add benchmarks as regular ctests with lowered inputs 2019-09-02 10:52:24 +02:00
qsort.ml test: more fair qsort.ml benchmark 2022-10-12 20:22:55 +02:00
qsort.sml
qsort.swift test: more fair qsort.ml benchmark 2022-10-12 20:22:55 +02:00
rbmap.hs chore: make rbmap.hs more similar to other implementations 2022-09-24 14:16:48 +02:00
rbmap.lean chore: modernize rbmap benchmarks a bit 2022-09-24 14:16:48 +02:00
rbmap.lean.args chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
rbmap.lean.expected.out chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
rbmap.ml
rbmap.sml
rbmap.swift tests(tests/bench): add rbmap.swift 2019-05-30 14:47:06 -07:00
rbmap2.lean chore: remove command universes 2021-06-29 17:01:07 -07:00
rbmap3.lean chore: remove command universes 2021-06-29 17:01:07 -07:00
rbmap500k.lean chore: remove command universes 2021-06-29 17:01:07 -07:00
rbmap_checkpoint.hs chore: make rbmap.hs more similar to other implementations 2022-09-24 14:16:48 +02:00
rbmap_checkpoint.lean chore: modernize rbmap benchmarks a bit 2022-09-24 14:16:48 +02:00
rbmap_checkpoint.lean.args chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
rbmap_checkpoint.lean.expected.out chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
rbmap_checkpoint.ml test(tests/bench/rbmap_checkpoint): OCaml version using myLen 2019-05-30 07:40:53 -07:00
rbmap_checkpoint.sml chore(tests/bench/rbmap_checkpoint): use myLean 2019-05-30 07:30:07 -07:00
rbmap_checkpoint.swift test(tests/bench/rbmap_checkpoint): add swift version 2019-05-30 14:35:58 -07:00
rbmap_checkpoint2.lean chore: remove command universes 2021-06-29 17:01:07 -07:00
rbmap_checkpoint2.sml
rbmap_checkpoint_cpp_lean3.cpp test(tests/bench): add C++ versions of rbmap benchmarks 2019-06-22 06:58:27 -07:00
rbmap_checkpoint_cpp_std.cpp test(tests/bench): add C++ versions of rbmap benchmarks 2019-06-22 06:58:27 -07:00
rbmap_cpp_lean3.cpp test(tests/bench): add C++ versions of rbmap benchmarks 2019-06-22 06:58:27 -07:00
rbmap_cpp_std.cpp test(tests/bench): add C++ versions of rbmap benchmarks 2019-06-22 06:58:27 -07:00
rbmap_fbip.lean feat: add rbmap_fbip benchmark 2022-10-06 17:26:43 -07:00
rbmap_library.lean chore: more RBMap cleanup 2022-10-06 17:26:43 -07:00
README.md chore: update cross-bench setup 2024-04-15 10:59:07 +02:00
reduceMatch.lean perf: sparse case splitting in match compilation (#10823) 2025-11-06 13:46:35 +00:00
report.py chore: safer bench script 2023-07-19 08:31:39 +02:00
riscv-ast.lean chore: update bench/riskv-ast.lean (#10505) 2025-09-24 11:46:26 +00:00
run.sh
server_startup.lean feat: reduce server memory consumption (#11162) 2025-12-01 10:53:23 +00:00
server_startup.log test: add language server startup benchmark (#3558) 2024-03-04 09:01:51 +00:00
sigmaIterator.lean fix: update naming of FinitenessRelation fields in the sigmaIterator.lean benchmark (#11836) 2025-12-29 23:13:13 +00:00
simp_arith1.lean chore: minimize benchmark imports so we don't spend a majority in importing (#9513) 2025-07-24 12:14:12 +00:00
simp_bubblesort_256.lean doc: correct typos in documentation and comments (#11465) 2025-12-02 06:38:05 +00:00
simp_congr.lean perf: add benchmark for congruence reasoning in simp (#9511) 2025-07-24 10:47:37 +00:00
simp_local.lean refactor: module-ize Lean (#9330) 2025-07-25 12:02:51 +00:00
simp_subexpr.lean perf: simp subexpr benchmark (#9404) 2025-07-17 11:53:48 +00:00
speedcenter.exec.velcom.yaml test: add big_struct_dep1 benchmark (#12191) 2026-01-27 14:36:09 +00:00
speedcenter.yaml chore: try refining some benchmark settings (#8377) 2025-05-16 11:24:11 +00:00
states35.lean chore: move states35 to bench directory 2022-04-09 15:46:28 -07:00
test_single.sh feat: LLVM backend (#1837) 2022-12-30 12:45:30 +01:00
treemap.lean refactor: remove last appearances of allowNontermination (#12211) 2026-01-29 07:22:19 +00:00
unionfind.lean feat: change Array.get to take a Nat and a proof (#6032) 2024-11-12 03:30:46 +00:00
unionfind.lean.args chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
unionfind.lean.expected.out chore: adjust "small" bench/ inputs to be reasonable for interpreter 2020-02-28 10:04:13 +01:00
unionfind_clean.lean chore(frontends/lean): use => instead of := in match-expressions 2019-07-04 11:38:38 -07:00
watchdogRss.lean feat: reduce server memory consumption (#11162) 2025-12-01 10:53:23 +00:00
workspaceSymbols.lean chore: make workspaceSymbols benchmark independent of sorry search (#11642) 2025-12-12 20:10:27 +00:00
workspaceSymbolsNewRanges.lean chore: make workspaceSymbol benchmarks modules (#11094) 2025-11-05 18:40:39 +00:00

Lean Benchmark Suites

This folder contains multiple small Lean programs that are used by two separate benchmark suites, both based on the temci benchmarking tool:

  • The light-weight "Speedcenter" suite benchmarks the current build of Lean. It can be used for quick comparisons on the cmdline and powers the Lean Speedcenter website.
  • The heavy-weight "Cross" suite benchmarks multiple Lean configurations and other functional compilers against each other and generates CSV and HTML reports from that. It was created for the paper "Counting Immutable Beans - Reference Counting Optimized for Purely Functional Programming" (IFL19).

Speedcenter Suite

Requirements:

  • A local Lean build in ../../build/release. Build at least the bin target.
  • temci. Using Nix, open a nix-shell in the project root directory to add a compatible version to your PATH. Alternatively, try pip3 install git+https://github.com/parttimenerd/temci.git.

To execute the suite and save the results in base.yaml, run the following in this folder:

temci exec --config speedcenter.yaml --out base.yaml

Other interesting exec flags:

  • use --runs N to override the default of 10 runs per benchmark
  • use --included_blocks fast to exclude slow benchmarks like the stdlib benchmark. You can replace fast with any benchmark name or label in speedcenter.exec.yaml.

If you have multiple saved result files, you can compare them with

temci report --config speedcenter.yaml report1.yaml report2.yaml ...
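
A typical before/after comparison strings these commands together. The following Python sketch only assembles the command lines, using the flags mentioned above. The helper names and the file name `patched.yaml` are hypothetical, and nothing here runs temci itself.

```python
import shlex


def temci_exec_cmd(out, runs=None, included_blocks=None, config="speedcenter.yaml"):
    """Build the argv for `temci exec` with the optional flags described above."""
    cmd = ["temci", "exec", "--config", config, "--out", out]
    if runs is not None:
        cmd += ["--runs", str(runs)]
    if included_blocks is not None:
        cmd += ["--included_blocks", included_blocks]
    return cmd


def temci_report_cmd(result_files, config="speedcenter.yaml"):
    """Build the argv for `temci report` over one or more saved result files."""
    return ["temci", "report", "--config", config, *result_files]


if __name__ == "__main__":
    # Record a baseline, then a shorter run of only the "fast" block for a
    # (hypothetical) patched build, and finally compare both result files.
    print(shlex.join(temci_exec_cmd("base.yaml")))
    print(shlex.join(temci_exec_cmd("patched.yaml", runs=5, included_blocks="fast")))
    print(shlex.join(temci_report_cmd(["base.yaml", "patched.yaml"])))
```

Each returned list can be passed to `subprocess.run` from this folder once temci is on your PATH.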

Cross Suite

We recommend using Nix to build or obtain all Lean variants and the other compilers in a reproducible way. After installing Nix, running the benchmarks is as easy as

nix develop
make

This will record 50 runs for each benchmark configuration (this can be changed with runs in cross.yaml), generate results in report_lean.csv and report_cross.csv, and print them to stdout in a tabulated format. It will also generate HTML reports in report/ comparing the time-based benchmarks.

In order to reduce noise in the benchmarking data, you may instead want to try calling make inside a temci shell:

temci short shell --sudo --preset usable --cpuset_active make

Using root powers, this will temporarily configure your machine similarly to the LLVM benchmarking recommendations and move all your other processes to a single CPU core.