lean4-htt/tests
Mario Carneiro 0a1a855ba8
fix: validate UTF-8 at C++ -> Lean boundary (#3963)
Continuation of #3958. To ensure that lean code is able to uphold the
invariant that `String`s are valid UTF-8 (which is assumed by the lean
model), we have to make sure that no lean objects are created with
invalid UTF-8. #3958 covers the case of lean code creating strings via
`fromUTF8Unchecked`, but there are still many cases where C++ code
constructs strings from a `const char *` or `std::string` with unclear
UTF-8 status.

To address this and minimize accidental missed validation, the
`(lean_)mk_string` function is modified to validate UTF-8. The original
function is renamed to `mk_string_unchecked`, with several other
variants depending on whether we know the string is UTF-8 or ASCII and
whether we have the length and/or utf8 char count on hand. I reviewed
every function which leads to `mk_string` or its variants in the C code,
and used the appropriate validation function, defaulting to `mk_string`
if the provenance is unclear.

This PR adds no new error handling paths, meaning that incorrect UTF-8
will still produce incorrect results in e.g. IO functions, they are just
not causing unsound behavior anymore. A subsequent PR will handle adding
better error reporting for bad UTF-8.
2024-06-19 14:05:48 +00:00
..
bench fix: make watchdog more resilient against badly behaving clients (#4443) 2024-06-13 13:48:36 +00:00
compiler feat: support Lake for building Lean core oleans (#3886) 2024-06-13 16:18:24 +00:00
elabissues
ir
lean fix: validate UTF-8 at C++ -> Lean boundary (#3963) 2024-06-19 14:05:48 +00:00
pkg feat: incremental elaboration of definition headers, bodies, and tactics (#3940) 2024-05-22 13:23:30 +00:00
playground
plugin
simpperf
.gitignore
common.sh
lean-toolchain