To eliminate parsing differences between Windows and other platforms, the frontend now normalizes all CRLF line endings to LF, like [in Rust](https://github.com/rust-lang/rust/issues/62865).

Effects:
- This makes Lake hashes faithful to what Lean sees (Lake already normalizes line endings before computing hashes).
- Docstrings now have normalized line endings. In particular, this fixes `#guard_msgs` failing multiline tests for Windows users using CRLF.
- Strings no longer have different lengths depending on the platform. Before this PR, the following theorem was true for LF files and false for CRLF files.
```lean
example : "
".length = 1 := rfl
```

Note: the normalization will take `\r\r\n` and turn it into `\r\n`. In the elaborator, we reject loose `\r`'s that appear in whitespace. Rust instead takes the approach of making the normalization routine fail. They do this so that there's no downstream confusion about any `\r\n` that appears.

Implementation note: the LSP maintains its own copy of a source file, which it updates when edit operations are applied. We are assuming that edit operations never split or join CRLFs. If this assumption is not correct, then the LSP copy of a source file can become slightly out of sync. If this is an issue, there is some discussion [here](https://github.com/leanprover/lean4/pull/3903#discussion_r1592930085).
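As a rough illustration of the note about `\r\r\n`, the normalization can be modeled as a plain left-to-right replacement of `\r\n` pairs. This is only a sketch: `crlfToLfModel` is a hypothetical name, it is not the implementation in this PR, and it assumes `String.replace` replaces non-overlapping occurrences from left to right.

```lean
/-- Hypothetical reference model, not the implementation from this PR:
rewrites each `\r\n` pair to `\n` and leaves every other `\r` untouched. -/
def crlfToLfModel (s : String) : String :=
  s.replace "\r\n" "\n"

-- Only the final `\r\n` of `\r\r\n` is rewritten, so one `\r` survives.
#eval crlfToLfModel "hello\r\r\nworld" == "hello\r\nworld"  -- true

-- After normalization, the length no longer depends on the line-ending style.
#eval (crlfToLfModel "a\r\nb").length == "a\nb".length  -- true
```

Because the model never rescans its own output, a surviving `\r` followed by a freshly produced `\n` is left as is, which is why loose `\r`'s still have to be rejected separately in whitespace.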
/-!
# Test `String.crlfToLf`
-/

/-!
Leaves single `\n`'s alone.
-/
/-- info: "hello\nworld" -/
#guard_msgs in #eval String.crlfToLf "hello\nworld"

/-!
Turns `\r\n` into `\n`.
-/
/-- info: "hello\nworld" -/
#guard_msgs in #eval String.crlfToLf "hello\r\nworld"

/-!
In a string of `\r...\r\n`, only normalizes the last `\r\n`.
-/
/-- info: "hello\x0d\nworld" -/
#guard_msgs in #eval String.crlfToLf "hello\r\r\nworld"

/-!
Two in a row.
-/
/-- info: "hello\n\nworld" -/
#guard_msgs in #eval String.crlfToLf "hello\r\n\r\nworld"

/-!
Normalizes at the end.
-/
/-- info: "hello\nworld\n" -/
#guard_msgs in #eval String.crlfToLf "hello\r\nworld\r\n"

/-!
Can handle a loose `\r` as the last character.
-/
/-- info: "hello\nworld\x0d" -/
#guard_msgs in #eval String.crlfToLf "hello\r\nworld\r"