Commit graph

118 commits

Author SHA1 Message Date
Kim Morrison
b4ff5455ba
feat: lemmas about lexicographic order on Array and Vector (#6399)
This PR adds basic lemmas about lexicographic order on Array and Vector,
achieving parity with List.

Many lemmas are still missing for all three, particularly about how
order interacts with `++`.
2024-12-19 10:36:50 +00:00
Kim Morrison
6893913683
feat: replace List.lt with List.Lex (#6379)
This PR replaces `List.lt` with `List.Lex`, from Mathlib, and adds the
new `Bool` valued lexicographic comparatory function `List.lex`. This
subtly changes the definition of `<` on Lists in some situations.

`List.lt` was a weaker relation: in particular if `l₁ < l₂`, then
`a :: l₁ < b :: l₂` may hold according to `List.lt` even if `a` and `b`
are merely incomparable
(either neither `a < b` nor `b < a`), whereas according to `List.Lex`
this would require `a = b`.

When `<` is total, in the sense that `¬ · < ·` is antisymmetric, then
the two relations coincide.

Mathlib was already overriding the order instances for `List α`,
so this change should not be noticed by anyone already using Mathlib.

We simultaneously add the boolean valued `List.lex` function,
parameterised by a `BEq` typeclass
and an arbitrary `lt` function. This will support the flexibility
previously provided for `List.lt`,
via a `==` function which is weaker than strict equality.
2024-12-15 08:22:39 +00:00
Kim Morrison
1c30c76e72
chore: remove >6 month old deprecations (#6057) 2024-11-13 23:21:23 +00:00
Kim Morrison
3a408e0e54
feat: change Array.get to take a Nat and a proof (#6032)
This PR changes the signature of `Array.get` to take a Nat and a proof,
rather than a `Fin`, for consistency with the rest of the (planned)
Array API. Note that because of bootstrapping issues we can't provide
`get_elem_tactic` as an autoparameter for the proof. As users will
mostly use the `xs[i]` notation provided by `GetElem`, this hopefully
isn't a problem.

We may restore `Fin` based versions, either here or downstream, as
needed, but they won't be the "main" functions.

---------

Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
2024-11-12 03:30:46 +00:00
Kim Morrison
09e1a05ee9
chore: cleanup imports (#5825) 2024-10-23 23:51:13 +00:00
Kim Morrison
e8970463d1
fix: change String.dropPrefix? signature (#5745) 2024-10-17 03:51:45 +00:00
Kim Morrison
ef05bdc449
chore: rename List.bind and Array.concatMap to flatMap (#5731) 2024-10-16 11:30:49 +00:00
Henrik Böving
19e06acc65
refactor: redefine unsigned fixed width integers in terms of BitVec (#5323)
I made a few choices so far that can probably be discussed:
- got rid of `modn` on `UInt`, nobody seems to use it apart from the
definition of `shift` which can use normal `mod`
- removed the previous defeq optimized definition of `USize.size` in
favor for a normal one. The motivation was to allow `OfNat` to work
which doesn't seem to be necessary anymore afaict.
- Minimized uses of `.val`, should we maybe mark it deprecated?
- Mostly got rid of `.val` in basically all theorems as the proper next
level of API would now be `.toBitVec`. We could probably re-prove them
but it would be more annoying given the change of definition.
- Did not yet redefine `log2` in terms of `BitVec` as this would require
a `log2` in `BitVec` as well, do we want this?
- I added a couple of theorems around the relation of `<` on `UInt` and
`Nat`. These were previously not needed because defeq was used all over
the place to save us. I did not yet generalize these to all types as I
wasn't sure if they are the appropriate lemma that we want to have.
2024-10-16 07:28:23 +00:00
Kim Morrison
831fa0899f
chore: upstream String.dropPrefix? (#5728)
Useful String helper functions widely used in tactic implementations.
2024-10-16 02:41:17 +00:00
Kim Morrison
aa2360a41d chore: rename List.join to List.flatten
one more

one more

one more

fix test
2024-10-14 22:28:12 +11:00
Kim Morrison
e3811fd838
chore: cleanup unused variables (#5579)
This pulls changes to the standard library out of #5338.
2024-10-02 01:51:22 +00:00
Kim Morrison
7c364543a3
chore: review of List API (#5260) 2024-09-05 06:27:08 +00:00
Kim Morrison
d08051cf0b
chore: variables appearing on both sides of an iff should be implicit (#5254) 2024-09-04 08:33:24 +00:00
Marc Huisinga
9009c1ac91
fix: ilean loading performance (#4900)
This PR roughly halves the time needed to load the .ilean files by
optimizing the JSON parser and the conversion from JSON to Lean data
structures.

The code is optimized roughly as follows:
- String operations are inlined more aggressively
- Parsers are changed to use new `String.Iterator` functions `curr'` and
`next'` that receive a proof and hence do not need to perform an
additional check
- The `RefIdent` of .ilean files now uses a `String` instead of a `Name`
to avoid the expensive parse step from `String` to `Name` (despite the
fact that we only very rarely actually need a `Name` in downstream code)
- Instead of `List`s and `Subarray`s, the JSON to Lean conversion now
directly passes around arrays and array indices to avoid redundant
boxing
- Parsec's `peek?` sometimes generates redundant `Option` wrappers
because the generation of basic blocks interferes with the ctor-match
optimization, so it is changed to use an `isEof` check where possible
- Early returns and inline-do-blocks cause the code generator to
generate new functions, which then interfere with optimizations, so they
are now avoided
- Mutual defs are used instead of unspecialized passing of higher-order
functions to generate faster code
- The object parser is made tail-recursive

This PR also fixes a stack overflow in `Lean.Json.compress` that would
occur with long lists and adds a benchmark for the .ilean roundtrip
(compressed pretty-printing -> parsing).
2024-08-29 11:51:48 +00:00
Henrik Böving
95549f17da
feat: LeanSAT's LRAT parsers + SAT solver interface (#5100)
Step 5/6 in upstreaming LeanSAT.

---------

Co-authored-by: Markus Himmel <markus@lean-fro.org>
2024-08-20 11:42:26 +00:00
Mario Carneiro
0a1a855ba8
fix: validate UTF-8 at C++ -> Lean boundary (#3963)
Continuation of #3958. To ensure that lean code is able to uphold the
invariant that `String`s are valid UTF-8 (which is assumed by the lean
model), we have to make sure that no lean objects are created with
invalid UTF-8. #3958 covers the case of lean code creating strings via
`fromUTF8Unchecked`, but there are still many cases where C++ code
constructs strings from a `const char *` or `std::string` with unclear
UTF-8 status.

To address this and minimize accidental missed validation, the
`(lean_)mk_string` function is modified to validate UTF-8. The original
function is renamed to `mk_string_unchecked`, with several other
variants depending on whether we know the string is UTF-8 or ASCII and
whether we have the length and/or utf8 char count on hand. I reviewed
every function which leads to `mk_string` or its variants in the C code,
and used the appropriate validation function, defaulting to `mk_string`
if the provenance is unclear.

This PR adds no new error handling paths, meaning that incorrect UTF-8
will still produce incorrect results in e.g. IO functions, they are just
not causing unsound behavior anymore. A subsequent PR will handle adding
better error reporting for bad UTF-8.
2024-06-19 14:05:48 +00:00
Kim Morrison
2a2b276ede
chore: unify String.csize : Nat and Char.utf8Size : UInt32 as Char.size : Nat (#4357)
It seems:
* there was no actual need for the UInt32 valued version
* downstream we were getting duplicative lemmas about both
* so lets reduce the API surface area!

If anyone would prefer the remaining function is still called
`Char.utf8Size` I will happily change it. (`size` is hopefully still
unambiguous, and it's helpful to rename here so we can give a
deprecation warning that explains the type signature change.)

---------

Co-authored-by: Mac Malone <tydeu@hatpress.net>
2024-06-11 02:51:18 +00:00
Kim Morrison
56adfb856d
chore: upstream basic String lemmas (#4354) 2024-06-05 21:28:43 +00:00
Austin Letson
644c1d4e36
doc: add docstrings and examples for String functions (#4332)
Add docstrings, usage examples, and doctests for `String.get'`,
`String.next'`, `String.posOf`, `String.revPosOf`.
2024-06-05 05:16:56 +00:00
Sebastian Ullrich
f97a7d4234
feat: incremental elaboration of definition headers, bodies, and tactics (#3940)
Extends Lean's incremental reporting and reuse between commands into
various steps inside declarations:
* headers and bodies of each (mutual) definition/theorem
* `theorem ... := by` for each contained tactic step, including
recursively inside supported combinators currently consisting of
  * `·` (cdot), `case`, `next`
  * `induction`, `cases`
  * macros such as `next` unfolding to the above

![Recording 2024-05-10 at 11 07
32](https://github.com/leanprover/lean4/assets/109126/c9d67b6f-c131-4bc3-a0de-7d63eaf1bfc9)

*Incremental reuse* means not recomputing any such steps if they are not
affected by a document change. *Incremental reporting* includes the
parts seen in the recording above: the progress bar and messages. Other
language server features such as hover etc. are *not yet* supported
incrementally, i.e. they are shown only when the declaration has been
fully processed as before.

---------

Co-authored-by: Scott Morrison <scott.morrison@gmail.com>
2024-05-22 13:23:30 +00:00
Leonardo de Moura
8c03650359
feat: some Char, UInt, and Fin theorems (#4231)
for SSFT24 summer school: https://github.com/david-christiansen/ssft24

---------

Co-authored-by: Kim Morrison <kim@tqft.net>
Co-authored-by: Kim Morrison <scott.morrison@gmail.com>
Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
2024-05-21 06:11:23 +00:00
Austin Letson
2faa81d41f
doc: add docstrings and examples for String functions (#4166)
Add docstrings, usage examples, and doc tests for `String.prev`,
`.front`, `.back`, `.atEnd`.

Improve docstring examples for `String.next` based on discussion
examples for `String.prev`.

---------

Co-authored-by: Kim Morrison <kim@tqft.net>
2024-05-21 04:27:40 +00:00
Leonardo de Moura
f3ccd6b023
feat: some string simprocs (#4233)
For the SSFT24 summer school.
2024-05-20 22:53:10 +00:00
Kyle Miller
a7338c5ad8
feat: make frontend normalize line endings to LF (#3903)
To eliminate parsing differences between Windows and other platforms,
the frontend now normalizes all CRLF line endings to LF, like [in
Rust](https://github.com/rust-lang/rust/issues/62865).

Effects:
- This makes Lake hashes be faithful to what Lean sees (Lake already
normalizes line endings before computing hashes).
- Docstrings now have normalized line endings. In particular, this fixes
`#guard_msgs` failing multiline tests for Windows users using CRLF.
- Now strings don't have different lengths depending on the platform.
Before this PR, the following theorem is true for LF and false for CRLF
files.
```lean
example : "
".length = 1 := rfl
```

Note: the normalization will take `\r\r\n` and turn it into `\r\n`. In
the elaborator, we reject loose `\r`'s that appear in whitespace. Rust
instead takes the approach of making the normalization routine fail.
They do this so that there's no downstream confusion about any `\r\n`
that appears.

Implementation note: the LSP maintains its own copy of a source file
that it updates when edit operations are applied. We are assuming that
edit operations never split or join CRLFs. If this assumption is not
correct, then the LSP copy of a source file can become slightly out of
sync. If this is an issue, there is some discussion
[here](https://github.com/leanprover/lean4/pull/3903#discussion_r1592930085).
2024-05-20 17:13:08 +00:00
Kim Morrison
799923d145
chore: move have to decreasing_by in substrEq.loop (#4143)
Currently this causes linter warnings downstream in proofs that unfold
substrEq.loop.
2024-05-13 06:18:44 +00:00
Austin Letson
b8e67d87a8
doc: add docstrings and usage examples in Init.Data.String.Basic (#4001)
Add docstrings and usage examples for `String.length`, `.push`,
`.append`, `.get?`, `.set`, `.modyify`, and `.next`. Update docstrings
and add usage examples for `String.toList`, `.get`, and `.get!`.

---------

Co-authored-by: Joachim Breitner <mail@joachim-breitner.de>
Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
2024-05-07 23:49:43 +00:00
Joachim Breitner
e2983e44ef
perf: use with_reducible in special-purpose decreasing_trivial macros (#3991)
Because of the last-added-tried-first rule for macros, all the special
purpose `decreasing_trivial` rules are tried for most recursive
definitions out there, and because they use `apply` and `assumption`
with default transparency may cause some definitoins to be unfolded over
and over again.

A quick test with one of the functions in the leansat project shows that
elaboration time goes down from 600ms to 375ms when using
```
decreasing_by all_goals decreasing_with with_reducible decreasing_trivial
```
instead of
```
decreasing_by all_goals decreasing_with decreasing_trivial
```

This change uses `with_reducible` in most of these macros.

This means that these tactics will no longer work when the
relations/definitions they look for is hidden behind a definition.
This affected in particular `Array.sizeOf_get`, which now has a
companion `sizeOf_getElem`.

In addition, there were three tactics using `apply` to apply Nat-related
lemmas
that we now expect `omega` to solve. We still need them when building
`Init` modules
that don’t have access to `omega`, but they now live in
`decreasing_trivial_pre_omega`,
meant to be only used internally.
2024-04-29 15:12:27 +00:00
Mario Carneiro
70a23945bf
feat: add model implementation for UTF8 enc/dec (#3961)
- [x] Depends on: #3958 
- [x] Depends on: #3960

This makes the UTF-8 encode and decode functions have lean definitions,
so that we can prove properties about them downstream.
2024-04-22 10:24:53 +00:00
Mario Carneiro
62cdb51ed5
feat: UTF-8 string validation (#3958)
Previously, there was a function `opaque fromUTF8Unchecked : ByteArray
-> String` which would convert a list of bytes into a string, but as the
name implies it does not validate that the string is UTF-8 before doing
so and as a result it produces unsound results in the compiler (because
the lean model of `String` indirectly asserts UTF-8 validity). This PR
replaces that function by
```lean
opaque validateUTF8 (a : @& ByteArray) : Bool

opaque fromUTF8 (a : @& ByteArray) (h : validateUTF8 a) : String
```
so that while the function is still "unchecked", we have a proof witness
that the string is valid. To recover the original, actually unchecked
version, use `lcProof` or other unsafe methods to produce the proof
witness.

Because this was the only `ByteArray -> String` conversion function, it
was used in several places in an unsound way (e.g. reading untrusted
input from IO and treating it as UTF-8). These have been replaced by
`fromUTF8?` or `fromUTF8!` as appropriate.
2024-04-20 18:36:37 +00:00
Mario Carneiro
aeacb7b69e
feat: String.Pos.isValid (#3959)
This adds a function that can be used to check whether a position is on
a UTF-8 byte boundary.
2024-04-20 14:57:35 +00:00
Mario Carneiro
e41cd310e9
fix: String.splitOn bug (#3832)
Fixes #3829. As reported on Zulip (both
[recently](https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/current.20definition.20of.20.60String.2EsplitOn.60.20is.20incorrect/near/430930535)
and [a year
ago](https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/should.20we.20redefine.20.60String.2EsplitOnAux.60.3F/near/365899332)),
`String.splitOn` has a bug when dealing with separators of more than one
character (which are luckily rare). The code change here is very small,
replacing a `i` with `i - j`, but it makes termination more complex so
that's where the rest of the line count goes.
2024-04-04 09:30:53 +00:00
Scott Morrison
94d6286e5a
chore: reorganising to reduce imports (#3790)
[Before](https://github.com/leanprover/lean4/files/14772220/oi.pdf) and
[after](https://github.com/leanprover/lean4/files/14772226/oi2.pdf).

This gets `ByteArray`, `String.Extra`, `ToString.Macro` and `RCases` out
of the imports of `omega`. I'd hoped to get `Array.Subarray` too, but
it's tangled up in the list literal syntax. Further progress could come
from make `split` use available `Decidable` instances, so we could pull
out `Classical` (and possibly some of `PropLemmas`).
2024-03-27 11:15:01 +00:00
Joachim Breitner
23d3ac4760
refactor: reduced unsed imports (#3464) 2024-02-22 18:12:57 +00:00
Joe Hendrix
29244f32f6
chore: upstream solve_by_elim (#3408)
This upstreams the solve_by_elim tactic from Std.

It is a key tactic needed by library_search.
2024-02-21 01:16:04 +00:00
Adrien Champion
a898aa18f3
chore: add documentation for the String.iterator API (#3300)
Adds documentation to the `String.Iterator` API, mentored by
@eric-wieser and @david-christiansen

---------

Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
2024-02-20 13:31:27 +00:00
Scott Morrison
904239ae61
feat: upstream some Syntax/Position helper functions used in code actions in Std (#3260)
Co-authored-by: David Thrane Christiansen <david@davidchristiansen.dk>
2024-02-09 10:50:19 +00:00
Joachim Breitner
368ead54b2 refactor: termination_by changes in stdlib 2024-01-10 17:27:35 +01:00
int-y1
ce4ae37c19 chore: fix more typos in comments 2023-10-08 14:37:34 -07:00
Joachim Breitner
b2d668c340
perf: Use flat ByteArrays in Trie (#2529) 2023-09-20 13:22:37 +02:00
Bulhwi Cha
367b38701f
refactor: simplify String.splitOnAux (#2271) 2023-07-19 11:50:27 +00:00
Mario Carneiro
bc841809c2 chore: remove intermediate 2023-06-05 15:50:11 -07:00
Mario Carneiro
2ae78f3c45 fix: tail-recursive String.foldr 2023-06-05 15:50:11 -07:00
Mario Carneiro
e68554b854 fix: use empty string instead of mk 2023-06-05 15:50:11 -07:00
Mario Carneiro
fd72fdf8f8 fix: incorrect utf8 in splitAux 2023-06-05 15:50:11 -07:00
Mario Carneiro
aa60791db3 feat: remove partial in Init.Data.String.Basic 2023-06-05 15:50:11 -07:00
Bulhwi Cha
8d0504b3b7 doc: add docstring to String.next' 2023-05-28 17:32:08 -07:00
Mario Carneiro
7f84bf07ba fix: bug in reference implementation of String.get? 2023-05-15 08:35:20 -07:00
Mario Carneiro
c9e84a6ad6 fix: remove private from string defs 2023-05-05 12:09:38 -07:00
Leonardo de Moura
3e33fcc4f8 chore: use lean_string_utf8_next_fast 2022-11-09 12:06:37 -08:00
Leonardo de Moura
92c03c0050 perf: prepare do add String.next' 2022-11-09 12:00:31 -08:00