lean4-htt/tests
Leonardo de Moura fa40491c78
test: benchmark MetaM vs SymM (#12100)
This PR adds a comparison between `MetaM` and `SymM` for a benchmark was
proposed during the Lean@Google Hackathon.

### Benchmark description

In this benchmark, we define the semantics of a very simple imperative
language using an inductive predicate

```
Exec prog events mem lctx post
```

The predicate holds if, when executing the program `prog` with an
initial list of events `events`, memory `mem`, and local context `lctx`,
the postcondition `post` holds.

We then consider the following program:

```
input b
a := b
a := a + a
a := a - b
...
a := a + a
a := a - b
```

That is, after reading an input value `b`, the program repeatedly
updates the variable `a` by doubling it and then subtracting `b`.

We prove that, for any initial memory `m` and local context `l`, and
starting from the empty list of events, the following postcondition
holds:

```
fun t' m' l' =>
  m' = m ∧                      -- memory did not change
  ∃ v : Word,
    t' = [IOEvent.IN v] ∧       -- exactly one input event
    l'.get "a" = some v         -- `a` contains the input value
```

In other words, executing the program produces exactly one input event,
leaves the memory unchanged, and ensures that the final value of `a` is
equal to the input value.

### Symbolic simulation benchmark (problem size `n`, with `2·n + 2`
instructions)

| Problem size (n) | MetaM time (ms) | MetaM kernel (ms) | SymM time
(ms) | SymM kernel (ms) | Total speedup |

|------------------|------------------|-------------------|----------------|------------------|---------------|
| 10  | 94.83  | 6.60  | 7.04  | 6.18  | ~13.5× |
| 20  | 218.92 | 13.33 | 14.15 | 13.02 | ~15.5× |
| 30  | 375.10 | 22.95 | 26.51 | 19.81 | ~14.2× |
| 40  | 563.82 | 34.99 | 40.42 | 29.55 | ~14.0× |
| 50  | 815.89 | 53.78 | 60.84 | 42.25 | ~13.4× |
| 60  | 1081.09 | 73.46 | 80.99 | 53.52 | ~13.3× | 
| 70  | 1400.80 | 102.70 | 106.02 | 68.61 | ~13.2× | 
| 80  | 1772.19 | 126.65 | 134.23 | 87.64 | ~13.2× |
| 90  | 2203.41 | 161.68 | 168.26 | 115.52 | ~13.1× | 
| 100 | 2474.09 | 191.23 | 209.13 | 143.86 | ~11.8× |

<img width="580" height="455" alt="image"
src="https://github.com/user-attachments/assets/bc7058fa-e71a-4c2c-be28-860f39166965"
/>

 ### Symbolic simulation with extra simplification (SymM)

Problem size `n` corresponds to a program with `2·n + 2` instructions.

| n   | Total time (ms) | Kernel time (ms) | Non-kernel time (ms) |
|-----|------------------|------------------|----------------------|
| 10  | 6.33  | 3.97 | 2.36 |
| 20  | 10.30 | 5.59 | 4.71 |
| 30  | 13.72 | 7.38 | 6.34 |
| 40  | 17.85 | 8.84 | 9.01 |
| 50  | 21.90 | 10.63 | 11.27 |
| 60  | 27.00 | 12.56 | 14.44 |
| 70  | 32.02 | 14.04 | 17.98 |
| 80  | 37.25 | 15.76 | 21.49 |
| 90  | 42.55 | 17.95 | 24.60 |
| 100 | 49.30 | 20.03 | 29.27 |
| 200 | 125.56 | 38.21 | 87.36 |
| 300 | 293.58 | 66.79 | 226.79 |
| 400 | 361.87 | 78.96 | 282.91 |
| 500 | 518.51 | 102.51 | 416.00 |
| 600 | 716.63 | 122.81 | 593.82 |
2026-01-22 01:38:56 +00:00
..
bench test: benchmark MetaM vs SymM (#12100) 2026-01-22 01:38:56 +00:00
bench-radar chore: add size/install benchmark (#12015) 2026-01-15 14:43:47 +00:00
compiler
elabissues
ir
lake fix: lake: --no-build failure w/ optional release fetch (#12086) 2026-01-21 23:14:54 +00:00
lean fix: symbol name for native boxed declarations in the interpreter (#12095) 2026-01-21 20:38:29 +00:00
pkg fix: split ngen on async elab (#12000) 2026-01-14 12:35:25 +00:00
playground
plugin
simpperf
.gitignore
common.sh feat: re-integrate lean4checker as leanchecker (#11887) 2026-01-08 09:41:33 +00:00
lakefile.toml
lean-toolchain