lean4-htt/tests/lean/run/wf_preprocess_leak.lean
Joachim Breitner 2441bf1f76
perf: check simp cache in simpLoop (#8880)
This PR makes `simp` consult its own cache more often, to avoid
replicating work.

Before, the simp cache was checked upon entry of `simpImpl` only, which
then calls `simpLoop`, which recursively iterates the `pre`-lemmas,
without checking the cache again.

Now, `simpLoop` itself checks the cache. This seems more principled,
given that `simpLoop` is actually putting entries into the cache for
each of its calls, so it’s more uniform if it checks the cache itself.

This avoids repeated rewrites. For example given
```
theorem ab : a = b := testSorry
theorem bc : b = c := testSorry
example (h : P c) : P b ∧ P a := by simp [ab, bc, h]
```
simp would rewrite `b ==> c` twice (once as part of `b ==> c` and then
again as part of `a ==> b ==> c`). And it’d be order dependent: With
```
example (h : P c) : P a ∧ P b := by simp [ab, bc, h]
```
the `a ==> b ==> c` chain would insert `b ==> c` into the cache, and
picked up by `simpImpl` when rewriting `P b`.

With this change, `b ==> c` is performed only once in both examples.

Instruction counts on stdlib and mathlib both show a mild improvement
across the board (0.5%), with individual modules improving by up to 4%
in stdlib and even more in mathlib.


(This does not check the cache before applying `post`, which explains
where there are still some repeated rewrites in the trace logs. But I’m
less sure about inserting a cache check here and so I am treading
carefully here. It’s also going to be at most one `post` application
that’s duplicated, because if `post` returns `.visit`, we go back to
`pre` and thus a cache check.)
2025-06-21 17:58:05 +00:00

107 lines
3.4 KiB
Text
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

/-! This tests demonstrates where and how wf preprocessing leaks to the user -/
structure Tree (α : Type) where
cs : List (Tree α)
def Tree.isLeaf (t : Tree α) := t.cs.isEmpty
-- The `cs.map` in the outer call to `revrev` gets the `attach`-attaching treatment and shows up in
-- the proof state:
/--
trace: α : Type
n : Nat
cs : List (Tree α)
x✝ :
(y : (_ : Nat) ×' Tree α) →
(invImage (fun x => PSigma.casesOn x fun n t => (n, t)) Prod.instWellFoundedRelation).1 y ⟨n.succ, { cs := cs }⟩ →
Tree α
⊢ Prod.Lex (fun x1 x2 => x1 < x2) (fun a₁ a₂ => sizeOf a₁ < sizeOf a₂)
(n, { cs := List.map (fun x => x✝ ⟨n + 1, x.val⟩ ⋯) cs.attach }) (n.succ, { cs := cs })
-/
#guard_msgs(trace) in
def Tree.revrev : (n : Nat) → (t : Tree α) → Tree α
| 0, t => t
| n + 1, Tree.mk cs => revrev n (Tree.mk (cs.map (·.revrev (n + 1))))
termination_by n t => (n, t)
decreasing_by
· apply Prod.Lex.right
simp
have := List.sizeOf_lt_of_mem _ ∈ _
omega
· trace_state
apply Prod.Lex.left
decreasing_tactic
-- as well as in the induction principle:
-- set_option trace.Meta.FunInd true
/--
info: Tree.revrev.induct {α : Type} (motive : Nat → Tree α → Prop) (case1 : ∀ (t : Tree α), motive 0 t)
(case2 :
∀ (n : Nat) (cs : List (Tree α)),
(∀ (x : Tree α), x ∈ cs → motive (n + 1) x) →
(∀ (x : Subtype (Membership.mem cs)), motive (n + 1) x.val) →
motive n
{
cs :=
List.map
(fun x =>
match x with
| ⟨x, h⟩ => Tree.revrev (n + 1) x)
cs.attach } →
motive n.succ { cs := cs })
(n : Nat) (t : Tree α) : motive n t
-/
#guard_msgs in
#check Tree.revrev.induct
-- Tangent: Why three IHs here? Because in the termination proof, the `
-- match x with | ⟨x, h⟩ => Tree.revrev (n + 1) x)
-- was replaced by
-- Tree.revrev (n + 1) ↑x
-- (maybe due to split/simpMatch) and funind picks up that recursive call as a separate one.
-- See
-- set_option pp.proofs true in #print Tree.revrev._unary
-- set_option pp.proofs true in #print Tree.revrev._unary.proof_3
-- It does not show up in the equational theorems:
/--
info: equations:
theorem Tree.revrev.eq_1 : ∀ {α : Type} (x : Tree α), Tree.revrev 0 x = x
theorem Tree.revrev.eq_2 : ∀ {α : Type} (n : Nat) (cs : List (Tree α)),
Tree.revrev n.succ { cs := cs } = Tree.revrev n { cs := List.map (fun x => Tree.revrev (n + 1) x) cs }
-/
#guard_msgs in
#print equations Tree.revrev
theorem sizeOf_map {α β : Type} [SizeOf α] [SizeOf β]
(f : α → β) (xs : List α) (hf : ∀ x, x ∈ xs → sizeOf (f x) = sizeOf x) :
sizeOf (List.map f xs) = sizeOf xs := by
induction xs with
| nil =>
simp
| cons x xs ih =>
simp [List.map]
simp [hf]
apply ih
intro x hx
apply hf
apply List.mem_cons.2
exact Or.inr hx
-- Lets see how tedious it is to use the functional induction principle:
example (n : Nat) (t : Tree α) : sizeOf (Tree.revrev n t) = sizeOf t := by
induction n, t using Tree.revrev.induct with
| case1 =>
simp [Tree.revrev]
| case2 n cs ih1 ih2 ih3 =>
simp [Tree.revrev]
simp only [Subtype.forall, List.map_subtype, List.unattach_attach, Tree.mk.sizeOf_spec] at *
rw [ih3]; clear ih3
rw [sizeOf_map]
· assumption