From 57915af218086affceebcb9135811df05e5af083 Mon Sep 17 00:00:00 2001 From: Paul Reichert <6992158+datokrat@users.noreply.github.com> Date: Wed, 23 Apr 2025 15:55:20 +0200 Subject: [PATCH] fix: reducing Nat.pow, kernel interprets constant as Nat literal (#8060) This PR fixes a bug in the Lean kernel. During reduction of `Nat.pow`, the kernel did not validate that the WHNF of the first argument is a `Nat` literal before interpreting it as an `mpz` number. This PR adds the missing check. ### Explanation In `type_checker::reduce_pow`, an expression was interpreted as a `Nat` literal without previously validating that it actually was a `Nat` literal. We (@TwoFX and me) noticed this while fuzzing the Lean kernel with GMP and Mimalloc disabled. Until now, the fuzzer found one crash, leading us to this issue. What are the consequences? If GMP is disabled, the Lean kernel will crash on some inputs after the memory allocator returns `null`. (MPZ tries to clone the `.const` expression in disguise of a `Nat` literal which accidentally has a size field indicating that the number has 88 trillion `mpz` digits. This is too much for every allocator.) If GMP is enabled, it is possible to [prove `False`](https://live.lean-lang.org/#codez=JYWwDg9gTgLgBAGQKYEMB2AoDATJAzOAcwDoBvAVwF84AuOABQFU1gYyq5AkwjgDkV4aAXjh5yaOAH04ggHxwIYJOIDCAGxQBnDcADGKVXGDjgBACoBPRdLgWrMABZK4ABjhJVGpC6wBaH3AApcg14AHcoViNCOAADPjZIULgACnjiRLgARlcADgBKOABWZwAmGLhQiHJVbDh9DQgK6ABrAG5YtIzU/nSIJOy4fKLnAGZyyursNAByGBx8G1pefi4GKAVaYVFxAA9pORM4PeFXBycAMXqvd08bKHIkX39TRzhmpCg0dzgoJGxyHRIDRwBzAYEwRooOBocggABGHzgCJgoSQTmyAD0cqMSnU0LVMdiACw5eYEegAeQA6gBCTbLBJ9FIkUjOaiAC/JAJfkBUyWHcKDhcAARABiIwAKyQOhgEjhKGwEjA6wgeCFSx0EBAIHQtVkcGwEAwcDgqiQ8FwOgMdGQ6GIABEpeooPxgBBxEI4MRcHg0A7LXBSEbjcG0CgQF4PTEQOYHCAADRB4NwexGGDAj3EX6EaooKAuBNJ40aFB4M3menEYulguFmCWCPCZLEFBgMApYgatAhWKmOAAbQAugUm53uzFKbT+0O8jWk6aAG7uei5sPp4SD2fB+f6B70kduseme5IYip9ZTvKJuCUIM2tDEACi6jhxGUmu1+OIqhMMDfvwAsikd7Ntg2B+gYFqqJeGB+HAyjOhojjAocADi/70IYwLYGCAqmtgGBimgkrSrK8qKsqeBYGc0BICARASEgACOEgAF4fI0dAshw4gnPScLmEGYh4BANREEGGhgN+8AADytHIUB4KoVGODRdGIX0Eh4FcSyXB4DZIgJxo6PY6CEF4vbdIySSuJkl7GlASR9oACYT0UxrHsQOQZIDsKDSnA0axhgQA) because the kernel doesn't crash on a memory allocation and instead just happily interprets the `.const` expression as a GMP number. Importantly, this is _not_ a flaw in Lean's type theory. It is an implementation bug in the built-in kernel, related to the efficient reduction of `Nat.pow`, that will be fixed with this PR; see the test file. Because Lean's kernel is relatively small, there are third-party kernel implementations such as `lean4lean` and `nanoda`. `lean4lean` catches the bogus proof, and looking at its code `nanoda` will, too, but I haven't tried it yet. --- src/kernel/type_checker.cpp | 1 + tests/lean/run/pow_exploit.lean | 43 +++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+) create mode 100644 tests/lean/run/pow_exploit.lean diff --git a/src/kernel/type_checker.cpp b/src/kernel/type_checker.cpp index a58daf3824..4d7147576f 100644 --- a/src/kernel/type_checker.cpp +++ b/src/kernel/type_checker.cpp @@ -599,6 +599,7 @@ template optional type_checker::reduce_bin_nat_op(F const & f, optional type_checker::reduce_pow(expr const & e) { expr arg1 = whnf(app_arg(app_fn(e))); + if (!is_nat_lit_ext(arg1)) return none_expr(); expr arg2 = whnf(app_arg(e)); if (!is_nat_lit_ext(arg2)) return none_expr(); nat v1 = get_nat_val(arg1); diff --git a/tests/lean/run/pow_exploit.lean b/tests/lean/run/pow_exploit.lean new file mode 100644 index 0000000000..de5d38c261 --- /dev/null +++ b/tests/lean/run/pow_exploit.lean @@ -0,0 +1,43 @@ +import Lean + +/-! +This was once an exploit of a bug in the Lean kernel relying on the Lean kernel +falsely interpreting an expression as a Nat literal. +-/ + +def g.{u} : PUnit.{u} → Nat := fun _ => open Classical in if Type = Type then 0 else 0 + +-- Just writing `Nat.pow (Nat.pow 10 8) 502` would also work; `Nat.pow (Nat.pow 10 8) 503` wouldn't +def T : Nat → Prop := fun x => if x = 0 then False else True + +-- The kernel reduces this to a number between 10^8032 and 10^8048 +def POW! := Nat.pow (g.{0} ⟨⟩) 1 + +elab "#inject_bad_proof" : command => do + let decl : Lean.Declaration := .defnDecl { + name := `mythm, + hints := .regular 0, + safety := .safe, + type := (.app (.const `T []) (.const `POW! [])), + levelParams := [], + value := (.const `True.intro []) + } + Lean.Elab.Command.liftCoreM (Lean.addDecl decl) + +/-- +error: (kernel) declaration type mismatch, 'mythm' has type + True +but it is expected to have type + T POW! +-/ +#guard_msgs in +#inject_bad_proof + +theorem g_eq_zero : g.{u} n = 0 := by + unfold g + split <;> rfl + +theorem show_false : False := by + change T (Nat.pow 0 1) + rw [← g_eq_zero] + exact mythm