This PR enforces rules around arithmetic of `String.Pos.Raw`.
Specifically, it adopts the following conventions:
- Byte indices ("ordinals") in strings should be represented using
`String.Pos.Raw`
- Amounts of bytes ("cardinals") in strings should be represented using
`Nat`.
For example, `String.Slice.utf8ByteSize` now returns `Nat` instead of
`String.Pos.Raw`, and there is a new function `String.Slice.rawEndPos`.
Finally, the `HAdd` and `HSub` instances for `String.Pos.Raw` are
reorganized. This is a **breaking change**.
The `HAdd/HSub String.Pos.Raw String.Pos.Raw String.Pos.Raw` instances
have been removed. For the use case of tracking positions relative to
some other position, we instead provide `offsetBy` and `unoffsetBy`
functions. For the use case of advancing/unadvancing a position by an
arbitrary number of bytes, we instead provide `increaseBy` and
`decreaseBy` functions. For
offsetting/unoffsetting/advancing/unadvancing a position `p` by the size
of a string `s` (resp. character `c`), use `s + p`/`p - s`/`p + s`/`p -
s` (resp. `c + p`/`p - c`/`p + c`/`p - c`).
This PR adds lemmas `forall_fin_zero` and `exists_fin_zero`. It also
marks lemmas `forall_fin_zero`, `forall_fin_one`, `forall_fin_two`,
`exists_fin_zero`, `exists_fin_one`, `exists_fin_two` with `simp`
attribute.
Closes#10629
This PR renames `Nat.and_distrib_right` to `Nat.and_or_distrib_right`.
This is to make the name consistent with other theorems in the same file
(e.g. `Nat.and_or_distrib_left`).
This PR changes how Lean proves the equational theorems for structural
recursion. The core idea is to let-bind the `f` argument to `brecOn` and
rewriting `.brecOn` with an unfolding theorem. This means no extra case
split for the `.rec` in `.brecOn` is needed, and `simp` doesn't change
the `f` argument which can break the definitional equality with the
defined function. With this, we can prove the unfolding theorem first,
and derive the equational theorems from that, like for all other ways of
defining recursive functions.
Backs out the changes from #10415, the old strategy works well with the
new goals.
Fixes#5667Fixes#10431Fixes#10195Fixes#2962
This PR renames some declarations in the range API for better
consistency and readability. For example,
`UpwardEnumerable.succMany?_succ?` is now called `succMany?_add_one`, in
order to (a) correct the erroneous use of `succ?` instead of `succ`
(=`Nat.succ`) and (b) distinguish the successor of natural numbers
(`add_one`) from the successor of the upward-enumerable type (`succ?` or
`succ`).
This PR "monomorphizes" the structure `Std.PRange shape α`, replacing it
with nine distinct structures `Std.Rcc`, `Std.Rco`, `Std.Rci` etc., one
for each possible shape of a range's bounds. This change was necessary
because the shape polymorphism is detrimental to attempts of automation.
**BREAKING CHANGE:** While range/slice notation itself is unchanged,
this essentially breaks the entire remaining (polymorphic) range and
slice API except for the dot-notation(`toList`, `iter`, ...). It is not
possible to deprecate old declarations that were formulated in a
shape-polymorphic way that is not available anymore.
This PR exposes the definitions about `Int*`. The main reason is that
the `SInt` simprocs require many of them to be exposed. Furthermore,
`decide` now works with `Int*` operations. This fixes#10631.
This PR defines `ByteArray.validateUTF8`, uses it to show that
`ByteArray.IsValidUtf8` is decidable and redefines `String.fromUTF8` and
friends to use it.
The functions `String.validateUTF8` and `String.utf8DecodeChar?` are
deprecated in favor of the identically named functions in the
`ByteArray` namespace.
This PR cuts some edges from the import graph.
Specifically:
- `TreeMap` and `HashMap` no longer depend on `String`, so now the
expensive things are all in parallel instead of partially in sequence
- `Omega` no longer relies on `List` lemmas
- The section of the import graph between `Init.Omega` and
`Init.Data.Bitvec.Lemmas` is cleaned up a bit
* Wrap proof subterms in `by exact` so dependencies can be demoted to
private `import`s
* Remove trivial instance re-definitions that may cause name collisions
on import changes
* Remove unused `open`s that may fail on import removals
This PR simplifies the `grind order` module, and internalizes the order
constraints. It removes the `Offset` type class because it introduced
too much complexity. We now cover the same use cases with a simpler
approach:
- Any type that implements at least `Std.IsPreorder`
- Arbitrary ordered rings.
- `Nat` by the `Nat.ToInt` adapter.
This PR ensures that `Substring.beq` is reflexive, and in particular
satisfies the equivalence `ss1 == ss2 <-> ss1.toString = ss2.toString`.
Closes#10511.
Note: I also fixed a strange line in the `String.extract` documentation
which looks like it may have been a copypasta, and added another example
to show how invalid UTF8 positions work, but the doc also makes a point
of saying that it is unspecified so maybe it would be better not to have
the example? 🤷
This PR introduces safe alternatives to `String.Pos` and `Substring`
that can only represent valid positions/slices.
Specifically, the PR
- introduces the predicate `String.Pos.IsValid`;
- proves several nontrivial equivalent conditions for
`String.Pos.IsValid`;
- introduces `String.ValidPos`, which is a `String.Pos` with an
`IsValid` proof;
- introduces `String.Slice`, which is like `Substring` but made from
`String.ValidPos` instead of `Pos`;
- introduces `String.Pos.IsValidForSlice`, which is like
`String.Pos.IsValid` but for slices;
- introduces `String.Slice.Pos`, which is like `String.ValidPos` but for
slices;
- introduces various functions for converting between the two types of
positions.
The API added in this PR is not complete. It will be expanded in future
PRs with addional operations and verification.
This PR removes some `grind` annotations for `Array.attach` and related
functions. These lemmas introduce lambda on the right hand side which
`grind` can't do much with. I've added a test file that verifies that
the theorems with removed annotations can actually be proved already by
grind. Removing the annotations will help with excessive instantiation.
Hi, the doc of `String.fromUTF8` previously said invalid characters are
replaced with 'A'. But the parameter `h : validateUTF8 a` guarantees
there are no invalid characters, so that explanation doesn't make sense
to me. This PR deletes that explanation (and fixes some unrelated
typos).
I also have a patch that uses `h` to prove each of the characters is
valid, eliminating the need for a default character
([pr/chore-String-fromUTF8-prove-valid](27f1ff36b2)),
would you be interested in merging that?
<details>
<summary>Notes on invalid characters from unchecked C++</summary>
I don't know if this function may be called from unchecked C++ with
invalid characters. If it may, I'm not sure what would happen with my
patched function... I'm not familiar with Lean's safety model, but it
seems like a bad idea to have a Lean function that takes a proof of a
proposition but is expected to operate in a certain way even if the
proposition is false. I think the safe approach is to have two functions
-- one that takes a proof and is only called from Lean, and another that
doesn't take a proof and replaces invalid chars (for use from C++, not
sure whether it's useful from Lean); I'd prefer to go even further and
report an error instead of silently replacing invalid characters (I'm
not sure if there is any easy way to report errors/panic in Lean code
called from C++).
</details>
This PR resolves a potential bad interaction between the compiler and
the module system where references to declarations not imported are
brought into scope by inlining or specializing. We now proactively check
that declarations to be inlined/specialized only reference public
imports. The intention is to later resolve this limitation by moving out
compilation into a separate build step with its own import/incremental
system.
This PR fixes a potential miscompilation when using non-exposed type
definitions using the module system by turning it into a static error. A
future revision may lift the restriction by making the compiler metadata
independent of the current module.
This PR redefines `String` to be the type of byte arrays `b` for which
`b.IsValidUtf8`.
This moves the data model of strings much closer to the actual data
representation at runtime.
In the near future, we will
- provide variants of `String.Pos` and `Substring` that only allow for
valid positions
- redefine all `String` functions to be much closer to their C++
implementations
In the near-to-medium future we will then provide comprehensive
verification of `String` based on these refactors.
This PR adds support the Count Trailing Zeros operation `BitVec.ctz` to
the bitvector library and to `bv_decide`, relying on the existing `clz`
circuit. We also build some theory around `BitVec.ctz` (analogous to the
theory existing for `BitVec.clz`) and introduce lemmas
`BitVec.[ctz_eq_reverse_clz, clz_eq_reverse_ctz, ctz_lt_iff_ne_zero,
getLsbD_false_of_lt_ctz, getLsbD_true_ctz_of_ne_zero,
two_pow_ctz_le_toNat_of_ne_zero, reverse_reverse_eq,
reverse_eq_zero_iff]`.
`ctz` operation is common in numerous compiler intrinsics (see
[here](https://clang.llvm.org/docs/LanguageExtensions.html#intrinsics-support-within-constant-expressions))
and architectures (see
[here](https://en.wikipedia.org/wiki/Find_first_set)).
---------
Co-authored-by: Siddharth <siddu.druid@gmail.com>
This PR enables the new E-matching pattern inference heuristic for
`grind`, implemented in PR #10422.
**Important**: Users can still use the old pattern inference heuristic
by setting:
```lean
set_option backward.grind.inferPattern true
```
In PR #10422, we introduced the new modifier `@[grind!]` for enabling
the minimal indexable subexpression condition. This option can now also
be set in `grind` parameters. Example:
```lean
opaque f : Nat → Nat
opaque fInv : Nat → Nat
axiom fInv_f : fInv (f x) = x
/-- trace: [grind.ematch.pattern] fInv_f: [f #0] -/
#guard_msgs in
set_option trace.grind.ematch.pattern true in
example {x y} : f x = f y → x = y := by
/-
The modifier `!` instructs `grind` to use the minimal indexable subexpression
(i.e., `f x` in this case).
-/
grind [!fInv_f]
```
This PR fixes a few bugs in the `rw` tactic: it could "steal" goals
because they appear in the type of the rewrite, it did not do an occurs
check, and new proof goals would not be synthetic opaque. This PR also
lets the `rfl` tactic assign synthetic opaque metavariables so that it
is equivalent to `exact rfl`.
Implementation note: filtering old vs new is not sufficient. This PR
partially addresses the bug where the rw tactic creates natural
metavariables for each of the goals; now new proof goals are synthetic
opaque.
Metaprogramming API: Instead of `Lean.MVarId.rewrite` prefer
`Lean.Elab.Tactic.elabRewrite` for elaborating rewrite theorems and
applying rewrites to expressions.
Closes#10172
This PR adds range support to`BitVec` and the `UInt*` types. This means
that it is now possible to write, for example, `for i in (1 : UInt8)...5
do`, in order to loop over the values 1, 2, 3 and 4 of type `UInt8`.
This PR adds more lemmas about the `toList` and `toArray` functions on
ranges and iterators. It also renames `Array.mem_toArray` into
`List.mem_toArray`.