Make retags an implicit part of typed copies #154341

Open
RalfJung wants to merge 1 commit into rust-lang:main from RalfJung:retag-on-typed-copy

@RalfJung
Member

@RalfJung RalfJung commented Mar 24, 2026

Ever since Stacked Borrows was first implemented in Miri, retagging has been done with Retag statements: given a place (usually a local variable), those statements find all references stored inside the place and refresh their tags to ensure the aliasing requirements are upheld. However, this is a somewhat unsatisfying approach for multiple reasons:

  • It leaves open the question of where to even put Retag statements. Over time, the AddRetag pass settled on one possible answer to this, but it wasn't very canonical.
  • For assignments of the form *ptr = expr, if the assignment involves copying a reference, we probably want to do a retag -- but if we do a Retag(*ptr) as the next instruction, it can be non-trivial to argue that this even retags the right value, so we refrained from doing retags in that case. This has come up as a potential issue for Rust making better use of LLVM "captures" annotations. (That said, there might be other ways to obtain this desired optimization.)
  • Normal compilation avoids generating retags, but we still generate LLVM IR with noalias. What does that even mean? How do MIR optimization passes interact with retags? These are questions we have to figure out to make better use of aliasing information, but currently we can't even really ask such questions.

I think we should resolve all that by making retags part of what happens during a typed copy (a concept and interpreter infrastructure that did not exist yet when retags were initially introduced). Under this proposal, when executing a MIR assignment statement, what conceptually happens is as follows:

  • We evaluate the LHS to a place.
  • We evaluate the RHS to a value. This does a typed load from memory if needed, raising UB if memory does not contain a valid representation of the assignment's type.
  • We walk that value, identify all references inside of it, and retag them. If this happens as part of passing a function argument, this is a protecting retag.
  • We store (a representation of) the value into the place.
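
The four steps above can be sketched with a toy interpreter. This is only an illustration under simplifying assumptions: `Value`, `Machine`, `retag`, and `assign` are hypothetical names, not rustc or Miri types, memory is a flat vector of slots, and the validity check of the typed load and protecting retags are not modelled.

```rust
// Toy model only; these are not rustc's or Miri's actual data structures.

/// A value is an integer, a reference carrying a borrow tag, or a tuple of values.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Ref { addr: usize, tag: u64 },
    Tuple(Vec<Value>),
}

/// Hypothetical machine state: memory slots plus a counter for fresh tags.
struct Machine {
    memory: Vec<Value>,
    next_tag: u64,
}

impl Machine {
    /// Step 3: walk the value and give every reference inside it a fresh tag.
    fn retag(&mut self, v: Value) -> Value {
        match v {
            Value::Int(i) => Value::Int(i),
            Value::Ref { addr, .. } => {
                let tag = self.next_tag;
                self.next_tag += 1;
                Value::Ref { addr, tag }
            }
            Value::Tuple(fields) => {
                Value::Tuple(fields.into_iter().map(|f| self.retag(f)).collect())
            }
        }
    }

    /// `dest = copy src`, modelled as a single typed copy.
    fn assign(&mut self, dest: usize, src: usize) {
        // Step 1: the LHS is already evaluated to a place (here, a slot index).
        // Step 2: load the RHS value (this toy model has no validity check).
        let value = self.memory[src].clone();
        // Step 3: retag all references inside the value.
        let value = self.retag(value);
        // Step 4: store the value into the destination place.
        self.memory[dest] = value;
    }
}
```

In this model the value is traversed once per copy, and the retag cannot be separated from the copy it belongs to, which is the property the `*ptr = expr` case needs.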

However, this semantics doesn't fully work: there's a mandatory MIR pass that turns expressions like &mut ***ptr into a sequence of intermediate derefs. Those must not do any retags. So far this worked because the AddRetag pass did not add retags for assignments to deref temporaries, but that information is not recorded in cross-crate MIR. Therefore I instead added a field to Rvalue::Use to indicate whether this value should be retagged or not. A non-retagging copy seems like a sufficiently canonical primitive that we should be able to express it. Dealing with the fallout from that is a large chunk of the overall diff. (I also considered adding this field to StatementKind::Assign instead, but decided against that as we only actually need it for Rvalue::Use. I am not sure if this was the right call...)
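
As a rough sketch of what a per-copy retag flag means, consider the toy evaluator below. The name `WithRetag::No` appears later in this thread; `WithRetag::Yes`, `Ref`, `Use`, and `eval_use` are assumptions made for illustration, and the real `Rvalue::Use` in rustc has a different shape.

```rust
// Toy stand-ins; the real rustc MIR types look different.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WithRetag {
    Yes,
    No,
}

/// Hypothetical reference value: an address plus a borrow tag.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ref {
    addr: usize,
    tag: u64,
}

/// Hypothetical simplified rvalue: copy a reference, with a retag flag.
struct Use {
    source: Ref,
    retag: WithRetag,
}

/// Evaluate the copy; only a retagging copy consumes a fresh tag.
fn eval_use(rv: &Use, next_tag: &mut u64) -> Ref {
    match rv.retag {
        WithRetag::Yes => {
            let tag = *next_tag;
            *next_tag += 1;
            Ref { addr: rv.source.addr, tag }
        }
        // For example a deref temporary: the tag is passed through unchanged.
        WithRetag::No => rv.source,
    }
}
```

Storing the flag on the rvalue itself means the information travels with the MIR, including across crates, instead of depending on which statements a separate pass chose to annotate.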

This neatly answers the question of when retags should occur, and handles cases like *ptr = expr. It avoids traversing values twice in Miri. It makes codegen's use of noalias sound with respect to the actual MIR that it is working on. It also gives us a target semantics to evaluate MIR opts against. However, I did not carefully check all MIR opts -- in particular, GVN needs a thorough look under the new semantics; it currently can turn alias-correct code into alias-incorrect code. (But this PR doesn't make things any worse for normal compilation, where the retag indicator is ignored anyway.)

Another side-effect of this PR is that -Zmiri-disable-validation now also disables alias checking. It'd be nicer to keep them orthogonal but I find this an acceptable price to pay.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 24, 2026
@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from dbabc07 to c5a3e40 Compare March 24, 2026 22:18
@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from c5a3e40 to df515dd Compare March 24, 2026 22:44
@rustbot rustbot added the T-clippy Relevant to the Clippy team. label Mar 24, 2026
@RalfJung
Member Author

@bors try
@rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 24, 2026
rust-bors Bot pushed a commit that referenced this pull request Mar 24, 2026
Make retags an implicit part of typed copies
@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from df515dd to 76c8c9d Compare March 24, 2026 22:52
@rust-bors
Contributor

rust-bors Bot commented Mar 25, 2026

☀️ Try build successful (CI)
Build commit: 82d9903 (82d99031f4626ac962af0c7f6d78d1f7173d7145, parent: 362211dc29abc4e8f8cfc384740237f144929b03)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Mar 25, 2026
@RalfJung
Member Author

Looks like enabling validation of references just to keep retags working in const-eval was not a good idea...

@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from 76c8c9d to d79e607 Compare March 25, 2026 07:23
@RalfJung
Member Author

@bors try
@rust-timer queue

rust-bors Bot pushed a commit that referenced this pull request Mar 25, 2026
Make retags an implicit part of typed copies
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 25, 2026
@rust-bors
Contributor

rust-bors Bot commented Mar 25, 2026

☀️ Try build successful (CI)
Build commit: 5bbea76 (5bbea7620d94ef1e4dd2e6617ed840cde1cf87f3, parent: 8a703520e80d87d4423c01f9d4fbc9e5f6533a02)

@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from ce09dd9 to 3e548e6 Compare April 14, 2026 15:41
@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from 3e548e6 to 9886c61 Compare April 19, 2026 10:32
@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from 9886c61 to d72567a Compare April 23, 2026 15:12
@rustbot
Collaborator

rustbot commented Apr 23, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Contributor

@oli-obk oli-obk left a comment

r=me with an explanation of the mutable reference instead of raw pointer (the comment talks about a raw pointer, but the code uses a mutable reference coerced to a raw pointer now)

Contributor

I'll investigate this after this PR lands. There are some easy cases and some annoying ones, so I'll start by keeping track of whether the copied operands had retags or not.

Comment thread library/alloc/src/boxed.rs Outdated
// operation for its alias tracking. It would be wrong for `into_raw_with_allocator` to
// do the same as that would induce uniqueness assumptions that we only want with
// the default allocator.
&mut **b
Contributor

this now also goes through an intermediate mutable reference, why is that?

Member Author

It is so that in MIR there ends up being a reference-to-raw-ptr-cast, and that is what Stacked Borrows recognizes. I'll extend the comment.
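
To make the distinction concrete, here is a minimal sketch of the two ways to get a raw pointer to a box's contents. The helper names `via_reference` and `via_raw_borrow` are hypothetical and are not the code in `library/alloc/src/boxed.rs`; the sketch only shows the source-level shapes being discussed.

```rust
/// Goes through an intermediate `&mut T`, then casts it to `*mut T`.
/// Per the reply above, this reference-to-raw-pointer cast is the shape
/// that Stacked Borrows recognizes.
fn via_reference<T>(b: &mut Box<T>) -> *mut T {
    let r: &mut T = &mut **b;
    r as *mut T
}

/// Creates the raw pointer directly, with no intermediate reference.
fn via_raw_borrow<T>(b: &mut Box<T>) -> *mut T {
    std::ptr::addr_of_mut!(**b)
}
```

Both helpers return the same address; they differ only in whether a reference is created along the way, which is what matters for alias tracking.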

@oli-obk oli-obk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2026
@dianqk
Member

dianqk commented Apr 28, 2026

I have a small concern. After this patch, must we be careful about whether to retag or not when creating a new copy? Can a transformation always create a non-retagging copy? IIUC, the answer is no. This will make Miri much less useful. It can also make a transformation pass more confusing.

In LLVM, we can just drop UB via dropUBImplyingAttrsAndMetadata.

@RalfJung
Member Author

It's always safe to turn a retagging copy into a non-retag one. So dropUBImplyingAttrsAndMetadata would choose WithRetag::No. (But note that most rvalues don't have a way of opting-out from the retag at the moment.)
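
A hedged sketch of such a UB-dropping transformation on a toy statement list follows. `CopyStmt` and `drop_retags` are hypothetical names, `WithRetag::Yes` is assumed as the counterpart of `WithRetag::No`, and real MIR statements are considerably richer.

```rust
// Toy stand-ins; not the real rustc MIR types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum WithRetag {
    Yes,
    No,
}

/// Hypothetical, simplified statement: `dest = copy src` with a retag flag.
#[derive(Clone, Debug, PartialEq)]
struct CopyStmt {
    dest: usize,
    src: usize,
    retag: WithRetag,
}

/// Turn every retagging copy into a non-retagging one.
/// This only removes potential UB, so it never makes a program less defined.
fn drop_retags(body: &mut [CopyStmt]) {
    for stmt in body.iter_mut() {
        stmt.retag = WithRetag::No;
    }
}
```

The opposite direction, turning `No` into `Yes`, would add aliasing requirements and is therefore not a generally valid transformation.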

> This will make Miri much less useful.

I think the opposite is true. Currently the optimizations and Miri live in a different world, making it impossible to actually reason about what optimizations do in a way that is coherent with Miri. We'll need to either turn on retags by default or make them implicit in MIR to move mir-opts and Miri into the same world; only then can Miri be useful to check concrete optimization examples.

@RalfJung RalfJung force-pushed the retag-on-typed-copy branch from d72567a to f790c9c Compare April 28, 2026 13:42
@RalfJung
Member Author

@bors try jobs=x86_64-gnu-aux

rust-bors Bot pushed a commit that referenced this pull request Apr 28, 2026
Make retags an implicit part of typed copies


try-job: x86_64-gnu-aux
@rust-bors
Contributor

rust-bors Bot commented Apr 28, 2026

☀️ Try build successful (CI)
Build commit: 2592ffc (2592ffc844ff395407ce625242df6399451d7890, parent: 4ddb0b7f8ecda9dbdbcbc0519c9988badbd65d1c)

@dianqk
Member

dianqk commented Apr 29, 2026

> > This will make Miri much less useful.
>
> I think the opposite is true. Currently the optimizations and Miri live in a different world, making it impossible to actually reason about what optimizations do in a way that is coherent with Miri. We'll need to either turn on retags by default or make them implicit in MIR to move mir-opts and Miri into the same world; only then can Miri be useful to check concrete optimization examples.

Hmm, I agree with this PR, but what I mean is, for example, if I write a pass that turns all retags into non-retags, then Miri no longer has any retags. Or do I misunderstand the semantics of retag? I haven't read through Miri, but I think the retag information is important to Miri. To avoid this, we have to be careful when turning a retag into a non-retag.

@RalfJung
Member Author

RalfJung commented Apr 29, 2026

Such a pass would remove UB. So, the pass is valid, but it may mean that later optimization passes work less well because they need the retags.

The retag information is important to Miri to detect UB. But it's also normal that optimization passes sometimes remove UB. For instance if the code contains a let _x = *ptr; and that gets removed, this may remove UB. That's perfectly allowed; whether it is a good idea depends on the concrete case. Miri disables optimizations by default to avoid this loss of UB detection capabilities.
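
The `let _x = *ptr;` case can be illustrated with a pair of hypothetical functions (the names `with_dead_load` and `without_dead_load` are made up for this sketch). The first performs a load whose result is unused; the second is what remains after an optimization deletes that load.

```rust
/// Reads through `ptr` and discards the result.
///
/// # Safety
/// `ptr` must be valid for reads and properly aligned.
unsafe fn with_dead_load(ptr: *const i32) -> i32 {
    // This load is UB if `ptr` is dangling, even though `_x` is never used.
    let _x = unsafe { *ptr };
    0
}

/// The same function after the dead load has been removed:
/// it no longer has any UB, whatever `_ptr` is.
fn without_dead_load(_ptr: *const i32) -> i32 {
    0
}
```

For every input on which the first function is defined, the second returns the same result, so the transformation is valid. It also means a checker running on the second version can no longer report the dangling-pointer bug.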

@RalfJung
Member Author

@bors r=oli-obk
Based on the review above. We can always refine the semantics later.
Excited to see this land, thanks all for the help and especially thanks Oli for reviewing this monster. :)

@rust-bors
Contributor

rust-bors Bot commented Apr 29, 2026

📌 Commit f790c9c has been approved by oli-obk

It is now in the queue for this repository.

@rust-bors rust-bors Bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 29, 2026
@rust-bors
Contributor

rust-bors Bot commented Apr 29, 2026

⌛ Testing commit f790c9c with merge 4b76db7...

Workflow: https://github.com/rust-lang/rust/actions/runs/25135311557

rust-bors Bot pushed a commit that referenced this pull request Apr 29, 2026
Make retags an implicit part of typed copies



- [rustc benchmark results](#154341 (comment))
- [miri benchmark results](#154341 (comment))
@JonathanBrouwer
Contributor

JonathanBrouwer commented Apr 29, 2026

@bors yield
Yielding to a big rollup first, since it contains a p=1 job and this only just started

@rust-bors
Contributor

rust-bors Bot commented Apr 29, 2026

Auto build was cancelled. Cancelled workflows:

The next pull request likely to be tested is #155979.

Labels

S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-clippy Relevant to the Clippy team. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

10 participants