- [x] Depends on: #3958
- [x] Depends on: #3960
This makes the UTF-8 encode and decode functions have lean definitions,
so that we can prove properties about them downstream.
This reduces the number of reimplemented functions which complicate
proofs. After inlining it ends up the same as before.
`ltOfOrd` is also changed to use `compare a b = .lt` instead of
`(compare a b == .lt) = true`, for consistency with the normal form in
std.
Previously, there was a function `opaque fromUTF8Unchecked : ByteArray
-> String` which would convert a list of bytes into a string, but as the
name implies it does not validate that the string is UTF-8 before doing
so and as a result it produces unsound results in the compiler (because
the lean model of `String` indirectly asserts UTF-8 validity). This PR
replaces that function by
```lean
opaque validateUTF8 (a : @& ByteArray) : Bool
opaque fromUTF8 (a : @& ByteArray) (h : validateUTF8 a) : String
```
so that while the function is still "unchecked", we have a proof witness
that the string is valid. To recover the original, actually unchecked
version, use `lcProof` or other unsafe methods to produce the proof
witness.
Because this was the only `ByteArray -> String` conversion function, it
was used in several places in an unsound way (e.g. reading untrusted
input from IO and treating it as UTF-8). These have been replaced by
`fromUTF8?` or `fromUTF8!` as appropriate.
Adds the ability to show a diff when `guard_msgs` fails, using the
histogram diff algorithm pioneered in jgit. This algorithm tends to
produce more user-friendly diffs, but it can be quadratic in the worst
case. Empirically, the quadratic case of this implementation doesn't
seem to be slow enough to matter for messages smaller than hundreds of
megabytes, but if it's ever a problem, we can mitigate it the same way
jgit does by falling back to Myers diff.
See lean/run/guard_msgs.lean in the tests directory for some examples of
its output.
Adds `IO.FS.Handle.isTty` to check whether a handle is a Windows console
or Unix terminal. Also adds an `isTty` field to `IO.FS.Stream`, so that
this can be checked on, e.g., `stdout`.
`Nat.repr` was implemented by generating a list of `Chars`, each created
by a 10-way if-then-else. This can cause significant slow down in some
particular use cases.
Now `Nat.repr` is `implemented_by` a faster implementation that uses
C++’s `std::to_string` on small numbers (< USize.size) and maintains an
array of pre-allocated strings for the first 128 numbers.
The handling of big numbers (≥ USize.size) remains as before.
Just a lemma that we noticed is missing when working on #3880 at the
retreat. We also noticed that there are naming inconsistencies in the
lemmas for `bmod` and `emod`, we should fix that in the future.
while trying to help a user who was facing an unhelpful
```
omega did not find a contradiction:
[0, 0, 0, 0, 1, -1] ∈ [1, ∞)
[0, 0, 0, 0, 0, 1] ∈ [0, ∞)
[0, 0, 0, 0, 1] ∈ [0, ∞)
[1, -1] ∈ [1, ∞)
[0, 0, 0, 1] ∈ [0, ∞)
[0, 1] ∈ [0, ∞)
[1] ∈ [0, ∞)
[0, 0, 0, 1, 1] ∈ [-1, ∞)
```
I couldn’t resist and wrote a pretty-printer for these problem that
shows the linear combination as such, and includes the recognized atoms.
This is especially useful since oftem `omega` failures stem from failure
to recognize atoms as equal. In this case, we now get:
```
omega-failure.lean:19:2-19:7: error: omega could not prove the goal:
a possible counterexample may satisfy the constraints
d - e ≥ 1
e ≥ 0
d ≥ 0
a - b ≥ 1
c ≥ 0
b ≥ 0
a ≥ 0
c + d ≥ -1
where
a := ↑(sizeOf xs)
b := ↑(sizeOf x)
c := ↑(sizeOf x.fst)
d := ↑(sizeOf x.snd)
e := ↑(sizeOf xs)
```
and this might help the user make progress (e.g. by using `case x`
first, and investingating why `sizeOf xs` shows up twice)
Implements a new method to generate instance names for anonymous
instances that uses a heuristic that tends to produce shorter names. A
design goal is to make them relatively unique within projects and
definitely unique across projects, while also using accessible names so
that they can be referred to as needed, both in Lean code and in
discussions.
The new method also takes into account binders provided to the instance,
and it adds project-based suffixes. Despite this, a median new name is
73% its original auto-generated length. (Compare: [old generated
names](https://gist.github.com/kmill/b72bb43f5b01dafef41eb1d2e57a8237)
and [new generated
names](https://gist.github.com/kmill/393acc82e7a8d67fc7387829f4ed547e).)
Some notes:
* The naming is sensitive to what is explicitly provided as a binder vs
what is provided via a `variable`. It does not make use of `variable`s
since, when names are generated, it is not yet known which variables are
used in the body of the instance.
* If the instance name refers to declarations in the current "project"
(given by the root module), then it does not add a suffix. Otherwise, it
adds the project name as a suffix to protect against cross-project
collisions.
* `set_option trace.Elab.instance.mkInstanceName true` can be used to
see what name the auto-generator would give, even if the instance
already has an explicit name.
There were a number of instances that were referred to explicitly in
meta code, and these have been given explicit names.
Removes the unused `Lean.Elab.mkFreshInstanceName` along with the
Command state's `nextInstIdx`.
Fixes#2343
Adds options to control whitespace normalization and message ordering in
`#guard_msgs`.
Examples:
1. `#guard_msgs (whitespace := lax)` ignores differences in whitespace
completely.
2. `#guard_msgs (whitespace := exact)` requires an exact match for
whitespace (after trimming).
3. `#guard_msgs (ordering := sorted)` sorts the list of messages, to
make it insensitive to message order.
Part of the retreat Hackathon.
---------
Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
Co-authored-by: Mario Carneiro <di.gama@gmail.com>
This adds some basic lemmas to support commuting ofInt/toInt and
add/mul.
It also removes the simp annotation on `ofNat_add_ofNat` as in some
contexts the other direction or conversion to Int may be desired.
This removes simp attributes from `Nat.succ.injEq` and
`Nat.succ_sub_succ_eq_sub` to replace them with simprocs. This is
because any reductions involving `Nat.succ` has a high risk of leading
proof performance problems when dealing with even moderately large
numbers.
Here are a couple examples that will both report a maximum recursion
depth error currently. These examples are fixed by this PR.
```
example : (123456: Nat) = 12345667 := by
simp
example (x : Nat) (p : x = 0) : 1000 - (x + 1000) = 0 := by
simp
```
The `conv` tactic tries to close “trivial” goals after itself. As of
now, it uses
`try rfl`, which means it can close goals that are only trivial after
reducing with
default transparency. This is suboptimal
* this can require a fair amount of unfolding, and possibly slow down
the proof
a lot. And the user cannot even prevent it.
* it does not match what `rw` does, and a user might expect the two to
behave the
same.
So this PR changes it to `with_reducible rfl`, matching `rw`’s behavior.
I considered `with_reducible eq_refl` to only solve trivial goals that
involve equality,
but not other relations (e.g. `Perm xs xs`), but a discussion on mathlib
pointed out
that it’s expected and desirable to solve more general reflexive goals:
https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/Closing.20after.20.60rw.60.2C.20.60conv.60.3A.20.60eq_refl.60.20instead.20of.20.60rfl.60/near/429851605
This makes changes to the `GetElem` class so that it does not lead to
unnecessary overhead in container like `RBMap`.
The changes are to:
1. Make `getElem?` and `getElem!` part of the `GetElem` class so they
can be overridden in instances.
2. Introduce a `LawfulGetElem` class that contains correctness theorems
for `getElem?` and `getElem!` using the original definitions.
3. Reorganize definitions (e.g, by moving `GetElem` out of
`Init.Prelude`) so that the `GetElem` changes are feasible.
4. Provide `LawfulGetElem` instances to complement all existing
`GetElem` instances in Lean core.
To reduce the size of the PR, this doesn't do the work of providing new
`GetElem` instances for `RBMap`, `HashMap` etc. That will be done in a
separate PR (#3688) that depends on this.
---------
Co-authored-by: Mac Malone <tydeu@hatpress.net>
fixes#3657
These functions are mostly not used by lean itself, but it does affect
two occurrences of `ByteArray.toUInt64LE! <$> IO.getRandomBytes 8` which
I left as is instead of switching them to use `toUInt64BE!` to preserve
behavior; but they are random bytes anyway seeded by the OS so it's
unlikely any use of them depending on particular values was sound to
begin with.
Co-authored-by: Scott Morrison <scott.morrison@gmail.com>
Previously:
If the `rfl` macro was going to fail, it would:
1. expand to `eq_refl`, which is implemented by
`Lean.Elab.Tactic.evalRefl`, and call `Lean.MVarId.refl` which would:
* either try kernel defeq (if in `.default` or `.all` transparency mode)
* otherwise try `IsDefEq`
* then fail.
2. Next expand to the `apply_rfl` tactic, which is implemented by
`Lean.Elab.Tactic.Rfl.evalApplyRfl`, and call `Lean.MVarId.applyRefl`
which would look for lemmas labelled `@[refl]`, and unfortunately in
Mathlib find `Eq.refl`, so try applying that (resulting in another
`IsDefEq`)
3. Because of an accidental duplication, if `Lean.Elab.Tactic.Rfl` was
imported, it would *again* expand to `apply_rfl`.
Now:
1. Same behaviour in `eq_refl`.
2. The `@[refl]` attribute will reject `Eq.refl`, and `MVarId.applyRefl`
will fail when applied to equality goals.
3. The duplication has been removed.
[Before](https://github.com/leanprover/lean4/files/14772220/oi.pdf) and
[after](https://github.com/leanprover/lean4/files/14772226/oi2.pdf).
This gets `ByteArray`, `String.Extra`, `ToString.Macro` and `RCases` out
of the imports of `omega`. I'd hoped to get `Array.Subarray` too, but
it's tangled up in the list literal syntax. Further progress could come
from make `split` use available `Decidable` instances, so we could pull
out `Classical` (and possibly some of `PropLemmas`).
fixes#3770
Also start `rfl` with a `fail` message that is hopefully more helpful
than what we get now (see updated test output). This would be a cheaper
way to address #3302 without changing the implementation of rfl (as
tried in #3714).
This shortens `Array.findIdx?` code, by using termination_by (and
well-founded recursion) instead of a structural recursion trick, with
the intent to make it more proof friendly.
One motivation is that it makes it easier to write a proof that
`Array.findIdx?` and `List.findIdx?` are equivalent. Furthermore, this
will be useful to prove that more complex functions are equivalent.
Closes#3646
System.FilePath.parent did not return the correct parent path in the
case of absolute file paths
Example of previous behavior
```
(FilePath.mk "/foo").parent -> some (FilePath.mk "")
(System.FilePath.mk "/").parent -> some (FilePath.mk "")
```
The new behavior is based on rust's std::path::Path::parent function (as
previously described in comment in System.FilePath)
Example of updated behavior
```
(System.FilePath.mk "/foo").parent -> some (FilePath.mk "/")
(System.FilePath.mk "/").parent -> none
```
Behavior for relative file paths is unchanged
Closes#3618
This updates the rw? tactic from Mathlib to use lazy discriminator trees
and upstreams it.
---------
Co-authored-by: Scott Morrison <scott.morrison@gmail.com>
Sets the default value to `pp.fieldNotation.generalized` to `true`.
Updates tests, and fixes some minor flaws in the implementation of the
generalized field notation pretty printer.
Now generalized field notation won't be used for any function that has a
`motive` argument. This is intended to prevent recursors from pretty
printing using it as (1) recursors are more like control flow structures
than actual functions and (2) generalized field notation tends to cause
elaboration problems for recursors.
Note: be sure functions that have an `@[app_unexpander]` use
`@[pp_nodot]` if applicable. For example, `List.toArray` needs
`@[pp_nodot]` to ensure the unexpander prints it using `#[...]`
notation.