remove language-level UB for non-UTF-8 str #71033
Closed
Linked: rust-lang/reference#792
Labels: A-Unicode, C-enhancement, T-lang, disposition-merge, finished-final-comment-period
This is the Rust-side issue for rust-lang/reference#792 just so that we can use fcpbot. The change description follows.
Ever since Rust 1.0, the reference said that a non-UTF-8 `str` causes immediate UB. In terms of today's terminology, that means that `str` has a validity invariant of being valid UTF-8.

However, that seems unnecessary: the compiler does not actually exploit this, nor is there any clear way it could exploit this. Making UTF-8 a library-level safety invariant is more than enough for everything `str` does. Most likely, it was made a validity invariant because we had not yet properly teased apart those two concepts when the document was initially written.

This is also the conclusion that the UCG WG arrived at in rust-lang/unsafe-code-guidelines#78.

I therefore propose we remove the UTF-8 clause from the language spec, so that `str` will have the same validity invariant as `[u8]`.
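
To illustrate what "library-level safety invariant" means in practice, here is a minimal sketch. The helper names `checked` and `trusted` are hypothetical and exist only for this example. It shows the standard library enforcing UTF-8 at the API boundary (`std::str::from_utf8`) and the unchecked constructor shifting that obligation onto the caller. The example deliberately never constructs a non-UTF-8 `str`: even with the validity invariant removed, doing so would still violate the safety contract of `from_utf8_unchecked`.

```rust
use std::str;

/// Checked conversion: the library enforces its UTF-8 invariant at runtime.
/// (`checked` is a hypothetical helper, not a std API.)
fn checked(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// Unchecked conversion: nothing is verified here, so the UTF-8 requirement
/// becomes part of this function's own safety contract.
/// (`trusted` is a hypothetical helper, not a std API.)
///
/// # Safety
/// `bytes` must be valid UTF-8.
unsafe fn trusted(bytes: &[u8]) -> &str {
    // SAFETY: the caller guarantees that `bytes` is valid UTF-8.
    unsafe { str::from_utf8_unchecked(bytes) }
}

fn main() {
    let valid: &[u8] = b"hello";
    let invalid: &[u8] = &[0xFF, 0xFE]; // 0xFF can never appear in UTF-8

    // The safe API rejects bytes that break the library invariant.
    assert_eq!(checked(valid), Some("hello"));
    assert_eq!(checked(invalid), None);

    // SAFETY: `valid` is ASCII, which is always valid UTF-8.
    let s = unsafe { trusted(valid) };
    assert_eq!(s.as_bytes(), valid);

    // `&str` and `&[u8]` are both pointer-plus-length references.
    assert_eq!(
        std::mem::size_of::<&str>(),
        std::mem::size_of::<&[u8]>()
    );
}
```

Under the proposal, the distinction is about who is responsible: the compiler would assume nothing about the bytes behind a `str`, while `str` methods and other library code may still rely on them being UTF-8.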