History

Kyle Miller 047f6aaf89 feat: efficient tactic configuration elaborators and configurability (#13651 ) This PR replaces the previous tactic configuration system with a significantly more efficient one that supports custom configuration syntaxes and processing. On a simple benchmark, configuration evaluation takes 6.2% of the time it used to. The `declare_config_elab` command generates a configuration elaborator that now directly constructs configuration objects; previously it relied on `Meta.evalExpr'`, which involved running a configuration through the full term elaboration, compilation, and evaluation processes. The generated configuration elaborators now also have the capability to do direct `Syntax` evaluation in common cases, skipping term elaboration. Furthermore, the elaborator accepts configurations more liberally: any user-defined syntax that has the form of an `optConfig`-style configuration or configuration item (including, e.g., `namedArgument`s) is accepted. Import `Lean.Elab.ConfigEval` to use the system; see this module for some documentation in addition to the docstrings in `Lean.Elab.ConfigEval.Commands`. Furthermore, the `simp` tactic now also has `(user.optionName := ...)` user configuration options, which can be declared using a global `tactic.simp.user.optionName` option; use `getUserConfigOption` and `withUserConfig` to access and set these in metaprograms. Other features: - `declare_config_elab` creates a function that exposes an `init` parameter for the configuration that will be modified. It also now has a `where` clause, enabling defining custom handlers for specific keys. - Elaborators can be given additional binders, to make parameterized elaborators. This is used by `simp` and `grind ` to support multiple default configurations with the correct expected type for `(config := ...)` elaboration. - The `EvalTerm` class supports direct evaluation of `Syntax`, skipping term elaboration. The system will attempt to automatically derive this class when generating the elaborator. - In case `EvalTerm` does not recognize the term, then the syntax is elaborated to an expression and an `EvalExpr` instance is applied to evaluate the expression. The system will similarly automatically derive these instances if possible. - Automatic derivation is transitive. It is able to seek instances through other instances; e.g. if it needs an `EvalTerm (List T)` instance it will be able to reduce this to seeking an `EvalTerm T` instance. - The system is designed to be flexible, and the various components can be combined to construct a configuration elaborator. There are also now `declare_core_config_elab` and `declare_term_config_elab` for conveniently generating elaborators for `CoreM` and `TermElabM`. The difference is that the first takes an explicit flag for whether to log exceptions, and the second uses the current `errToSorry` state. Warning: if you use the `TermElabM` one from `TacticM`, it will be unaware of the current `recover` state. The only differences between these macros are the ways error recovery is handled per monad. Other changes: - `#reduce` tactic configuration now makes use of this system and has more options - The module `Lean.Elab.Tactic.ConfigSetter` is removed; the `declare_config_elab`-family macros subsume its functionality. - The module `Lean.Elab.Tactic.Config` is deprecated and will be removed; migration notes appear in the module docs there. Import `Lean.Elab.ConfigEval` instead. - One of the mvcgen benchmarks got significantly slower, but it turned out to be caused by the new tactic configuration elaboration no longer resetting the MetaM caches. Adding an explicit `resetCache` into the test driver fixed the benchmark. ### Notes for metaprogram authors If you are using the module system, you just need a `meta import Lean.Elab.ConfigEval` to use the macros, and it should serve as a drop-in replacement to the previous system so long as 1. your configuration type is a `structure` with no parameters, indices, or universes (only `Type` is supported); 2. default values are self-contained and not dependent on other fields; and 3. all fields have types that are composed from `Option`, `List`, `Array`, `String`, `DataValue`, and inductive types in `Type` with no parameters or indices, whose fields are similarly composed. The macros synthesize a self-contained configuration elaboration procedure, analyzing the `EvalTerm` and `EvalExpr` instances that are currently available or can be automatically derived. These are the components of the system: - `EvalTerm` instances provide `Term → TermElabM α` functions for evaluation of raw syntax; these handle the common cases where an option value is a identifier, application, or other simple expression. They are responsible for adding TermInfo when info is enabled, to support hovers. One can invoke derivation of a `EvalTerm T` instance with the `ensure_eval_term_instance T` command (after `open scoped Lean.Elab.ConfigEval`). - `EvalExpr` instances provide `Expr → TermElabM α` functions for evaluation of elaborated expressions; these handle cases where an option value may require reduction to evaluate. Similarly, one can invoke derivation of an `EvalExpr T` instance with the `ensure_eval_expr_instance T` command. If needed, there's also `derive_eval_expr_instance_using_meta_eval T` for creating a `Meta.evalExpr'`-based evaluator. - Functions like `ConfigEval.evalExprWithElab` compose `EvalTerm` and `EvalExpr` instances into a single procedure that first attempts `EvalTerm`, and, if that fails, applies the standard term elaborator and then attempts `EvalExpr`. This way term elaboration can be skipped in all but uncommon cases. - Configuration item interpretation is through `ConfigEval.foldConfigM`, which is a procedure with a lax specification for what counts as a configuration item, calling the provided function on each recognized configuration item. The idea is: - Null nodes are lists of configurations - One-argument nodes are considered to be wrappers like `optConfig` or `configItem` - Two-argument nodes of the form `("+"<\|>"-") (atom<\|>ident)` are considered to be boolean options - Five-argument nodes of the form `"(" (atom<\|>ident) ":=" syntax ")"` are considered to be general configuration items. (It only checks for the presence of `(` and that there are two-to-five arguments.) - Bare atoms are considered to be positive boolean options - Configuration evaluation then uses `EvalConfigItem.set` on each item produced by the fold, for an `EvalConfigItem` structure defined for the given configuration type. The `def_eval_config_item` command can be used to generate this structure. It analyzes which `EvalTerm` and `EvalExpr` instances exist and derives missing ones, then builds an efficient procedure to process configuration items and apply evaluators. - Lastly, there are the `declare_core_config_elab`, `declare_term_config_elab`, `declare_config_elab`, and `declare_command_config_elab` macros for conveniently running the `def_eval_config_item` command and constructing a self-contained elaboration function. The derivation procedures analyze which `EvalTerm`/`EvalExpr` instances already exist and only derive the "leaf" instances that are necessary to construct `EvalTerm` and `EvalExpr` instances. The derived instances are made `private local` — since configuration elaborators are meant to be self-contained, we decided not to let the additional instances be a side effect of the macros. The instances can be globally added by manually using the `ensure_*` commands. The macros support making parameterized elaborators with arbitrary additional binders. See `make_elab_grind_config` and `make_elab_simp_config` in core Lean for examples of generating a single elaborator that's used with multiple default value configurations. To see how to create a key handler that matches all configuration keys with a given prefix, see `make_elab_simp_config`. There is a todo item at `Lean.Elab.ConfigEval.ReflectConfigItems` for reflecting configurations back to syntax, which is not yet supported. ### Performance evaluation A legacy configuration parser was temporarily added to `Lean.Elab.Tactic.Grind.Config` using `declare_term_config_elab_legacy elabGrindConfigLegacy Grind.Config`, and then this file was used for measuring elaboration time: ```lean import Lean open Lean Elab Meta Tactic Parser def cfgs : Array Syntax := Unhygienic.run do return #[ ← `(Tactic.optConfig\| ), ← `(Tactic.optConfig\| +clean), ← `(Tactic.optConfig\| +trace +markInstances -lookahead -useSorry), ← `(Tactic.optConfig\| (trace := true) (markInstances := true) (lookahead := false) (useSorry := false)), ← `(Tactic.optConfig\| -trace (splits := 20) +revert (maxSuggestions := some 3) (ematch := 2)), ← `(Tactic.optConfig\| (gen := 5) -reducible +splitImp -funCC), ← `(Tactic.optConfig\| (config := { trace := true, lookahead := false, maxSuggestions := some 3 })), ] def testGrindElab (cfgs : Array Syntax) (n : Nat) : TacticM Unit := do profileitM Exception "test grind elab" (← getOptions) do let mut ematch := 0 for _ in [0:n] do for cfg in cfgs do let c ← Tactic.elabGrindConfig cfg ematch := ematch + c.ematch logInfo m!"sum = {ematch}" def testGrindElabLegacy (cfgs : Array Syntax) (n : Nat) : TacticM Unit := do profileitM Exception "test grind elab legacy" (← getOptions) do let mut ematch := 0 for _ in [0:n] do for cfg in cfgs do let c ← Tactic.elabGrindConfigLegacy cfg ematch := ematch + c.ematch logInfo m!"sum = {ematch}" def runTest (info : Bool) (test : TacticM Unit) : TermElabM Unit := do withEnableInfoTree info do let mvar ← mkFreshExprMVar none discard <\| Tactic.run mvar.mvarId! test set_option maxHeartbeats 0 set_option profiler true set_option profiler.threshold 1 def iters : Nat := 1000 #eval runTest false <\| testGrindElab cfgs iters #eval runTest true <\| testGrindElab cfgs iters #eval runTest false <\| testGrindElabLegacy cfgs iters #eval runTest true <\| testGrindElabLegacy cfgs iters ``` A representative output is ``` test grind elab took 315ms test grind elab took 333ms test grind elab legacy took 5.22s test grind elab legacy took 5.33s ``` Computing `(315.0 + 333.0) / (5220 + 5330)` and rounding up to the nearest tenth gives the 6.2% figure. --- The #13426 draft PR includes some LSP modifications to support completions for `simp` user configuration options.		2026-05-14 17:20:57 +00:00
..
build	chore: don't fail on running build bench on built stage3 (#13467 )	2026-04-18 22:07:21 +00:00
mergeSort	feat: `Array.mergeSort` (#12385 )	2026-03-06 13:18:13 +00:00
mvcgen	feat: efficient tactic configuration elaborators and configurability (#13651 )	2026-05-14 17:20:57 +00:00
qsort	chore: relative lean-toolchains (#12652 )	2026-02-25 10:23:35 +00:00
size	chore: improve how test suite interacts with stages (#12913 )	2026-03-16 15:20:03 +00:00
sym	chore: minor tweaks to `Sym.simp` test and benchmark (#13468 )	2026-04-18 21:11:30 +00:00
.gitignore	chore: switch to new test/bench suite (#12590 )	2026-02-25 13:51:53 +00:00
accumulate_profile.py	chore: add `lakeprof` benchmarks (#9709 )	2025-08-06 11:25:45 +00:00
arith_eval.ml
binarytrees.ghc-6.hs
binarytrees.ocaml-2.ml
binarytrees.st.hs
binarytrees.st.mlton-2.sml
binarytrees.st.sml
binarytrees.st.swift
binarytrees.swift
binarytrees5.ml
binarytrees5_multicore.ml
compile.sh
const_fold.hs
const_fold.ml
const_fold.sml
const_fold.swift
cross.yaml
dag_hassorry_issue.lean	chore: deprecate levelZero and levelOne (#12720 )	2026-03-04 01:03:08 +00:00
dag_hassorry_issue.lean.args	chore: reduce stack space usage at `instantiate_mvars_fn` (#4931 )	2024-08-06 17:38:59 +00:00
dag_hassorry_issue.lean.out.expected	chore: improve how test suite interacts with stages (#12913 )	2026-03-16 15:20:03 +00:00
delayed_assign.lean	test: delayed assignment performance issue (#12201 )	2026-01-28 02:08:39 +00:00
deriv.hs
deriv.ml
deriv.sml
deriv.swift
flake.lock	chore: update cross-bench setup	2024-04-15 10:59:07 +02:00
flake.nix	chore: robustify Nix shell (#8141 )	2025-04-28 15:08:32 +00:00
full-stdlib.exec.yaml
ghc-gc.py
lean-gc.py
Makefile	chore: update cross-bench setup	2024-04-15 10:59:07 +02:00
mlkit-gc.py
ocaml-gc.py
perf.py
qsort.hs
qsort.ml
qsort.sml
qsort.swift
rbmap.hs
rbmap.ml
rbmap.sml
rbmap.swift
rbmap2.lean
rbmap3.lean
rbmap500k.lean
rbmap_checkpoint.hs
rbmap_checkpoint.ml
rbmap_checkpoint.sml
rbmap_checkpoint.swift
rbmap_checkpoint2.lean
rbmap_checkpoint2.sml
rbmap_checkpoint_cpp_lean3.cpp
rbmap_checkpoint_cpp_std.cpp
rbmap_cpp_lean3.cpp
rbmap_cpp_std.cpp
README.md	chore: update cross-bench setup	2024-04-15 10:59:07 +02:00
report.py
run.sh
speedcenter.yaml	chore: try refining some benchmark settings (#8377 )	2025-05-16 11:24:11 +00:00
states35.lean
test_single.sh
unionfind_clean.lean

README.md

Lean Benchmark Suites

This folder contains multiple small Lean programs for benchmarking used by two separate benchmark suites based on the temci benchmarking tool:

The light-weight "Speedcenter" suite benchmarks the current build of Lean. It can be used for quick comparisons on the cmdline and powers the Lean Speedcenter website.
The heavy-weight "Cross" suite benchmarks multiple Lean configurations and other functional compilers against each other and generates CSV and HTML reports from that. It was created for the paper "Counting Immutable Beans - Reference Counting Optimized for Purely Functional Programming" (IFL19).

Speedcenter Suite

Requirements:

A local Lean build in ../../build/release. Build at least the bin target.
temci. Using Nix, open a nix-shell in the project root directory to add a compatible version to your PATH. Alternatively, try pip3 install git+https://github.com/parttimenerd/temci.git.

To execute the suite and save the results in base.yaml, run (in this folder)

temci exec --config speedcenter.yaml --out base.yaml

Other interesting exec flags:

use --runs N to modify the default number of 10 runs per benchmark
use --included_blocks fast to excluded slow benchmarks like the stdlib benchmark. You can replace fast with any benchmark name or label in speedcenter.exec.yaml.

If you have multiple saved result files, you can compare them with

temci report --config speedcenter.yaml report1.yaml report2.yaml ...

Cross Suite

We recommend using Nix for building/obtaining all Lean variants and used compilers in a reproducible way. After installing Nix, running the benchmarks is as easy as

nix develop
make

This will record 50 runs for each benchmark configuration (this can be changed with runs in cross.yaml), generate results in report_lean.csv and report_cross.csv, and print them to stdout in a tabulated format. It will also generate HTML reports in report/ comparing the time-based benchmarks.

In order to reduce noise in the benchmarking data, you may instead want to try calling make inside a temci shell:

temci short shell --sudo --preset usable --cpuset_active make

Using root powers, this will temporarily configure your machine similarly to the LLVM benchmarking recommendations and move all your other processes to a single CPU core.