Remove fewer Storage calls in CopyProp and GVN #142531

Open
ohadravid wants to merge 1 commit into rust-lang:main from ohadravid:better-storage-calls-copy-prop

Conversation

@ohadravid
Contributor

@ohadravid ohadravid commented Jun 15, 2025

Modify the CopyProp and GVN MIR optimization passes to remove fewer Storage{Live,Dead} calls, allowing for better optimizations by LLVM - see #141649.

Details

The idea is to use a new MaybeUninitializedLocals analysis and remove only the storage calls of locals that are maybe-uninit when accessed in a new location.
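
To make the idea concrete, here is a minimal sketch of a forward "maybe-uninitialized" dataflow analysis over a toy control-flow graph. It is not the `MaybeUninitializedLocals` implementation from this PR: the `Stmt` and `Block` types, the transfer rules, and the function name `maybe_uninit_on_entry` are all assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified statements; these are not rustc's MIR types.
#[derive(Clone, Copy)]
pub enum Stmt {
    StorageLive(usize),
    StorageDead(usize),
    Assign(usize),
}

/// A basic block: its statements plus the indices of its successor blocks.
pub struct Block {
    pub stmts: Vec<Stmt>,
    pub succs: Vec<usize>,
}

/// Apply one statement to the set of maybe-uninitialized locals.
fn transfer(state: &mut BTreeSet<usize>, stmt: Stmt) {
    match stmt {
        // Assumption: fresh or dead storage holds no valid value.
        Stmt::StorageLive(l) | Stmt::StorageDead(l) => {
            state.insert(l);
        }
        // A write initializes the local on this path.
        Stmt::Assign(l) => {
            state.remove(&l);
        }
    }
}

/// Forward "may" analysis: returns the maybe-uninit set at the entry of each block.
/// Block 0 is the entry block, where every local starts out maybe-uninit.
pub fn maybe_uninit_on_entry(blocks: &[Block], num_locals: usize) -> Vec<BTreeSet<usize>> {
    let mut entry: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); blocks.len()];
    if blocks.is_empty() {
        return entry;
    }
    entry[0] = (0..num_locals).collect();
    let mut visited = vec![false; blocks.len()];
    let mut worklist = vec![0usize];
    while let Some(b) = worklist.pop() {
        visited[b] = true;
        let mut state = entry[b].clone();
        for &stmt in &blocks[b].stmts {
            transfer(&mut state, stmt);
        }
        for &succ in &blocks[b].succs {
            // Join is set union: uninit on any incoming path means maybe-uninit.
            let before = entry[succ].len();
            entry[succ].extend(state.iter().copied());
            if entry[succ].len() != before || !visited[succ] {
                worklist.push(succ);
            }
        }
    }
    entry
}
```

In this toy model, a local that is assigned on only one arm of a branch is still maybe-uninit at the join point, which is the kind of location where a storage statement could not be kept safely.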

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 15, 2025
@rustbot
Collaborator

rustbot commented Jun 15, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@matthiaskrgr
Member

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 15, 2025
bors added a commit that referenced this pull request Jun 15, 2025
…try>

Remove fewer Storage calls in `copy_prop`

Modify the `copy_prop` MIR optimization pass to remove fewer `Storage{Live,Dead}` calls, allowing for better optimizations by LLVM - see #141649.

### Details

This is my attempt to fix the mentioned issue (this is the first part, I also implemented a similar solution for GVN in [this branch](https://github.com/rust-lang/rust/compare/master...ohadravid:rust:better-storage-calls-gvn-v2?expand=1)).

The idea is to use the `MaybeStorageDead` analysis and remove only the storage calls of `head`s that are maybe-storage-dead when the associated `local` is accessed (or, conversely, keep the storage of `head`s that are for-sure alive in _every_ relevant access).

When combined with the GVN change, the final example in the issue (#141649 (comment)) is optimized as expected by LLVM. I also measured the effect on a few functions in `rav1d` (where I originally saw the issue) and observed reduced stack usage in several of them.

This is my first attempt at working with MIR optimizations, so it's possible this isn't the right approach — but all tests pass, and the resulting diffs appear correct.

r? tmiasko

since he commented on the issue and pointed to these passes.
@bors
Collaborator

bors commented Jun 15, 2025

⌛ Trying commit d24d035 with merge ef7d206...

@bors
Collaborator

bors commented Jun 15, 2025

☀️ Try build successful - checks-actions
Build commit: ef7d206 (ef7d20666974f0dac45b03e051f2e283f9d9f090)

@rust-timer
Collaborator

Finished benchmarking commit (ef7d206): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.4%] | 8 |
| Regressions ❌ (secondary) | 0.3% | [0.2%, 0.4%] | 7 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.3% | [0.2%, 0.4%] | 8 |

Max RSS (memory usage)

Results (primary 0.7%, secondary 3.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.5% | [1.8%, 5.0%] | 5 |
| Regressions ❌ (secondary) | 3.4% | [3.4%, 3.4%] | 1 |
| Improvements ✅ (primary) | -3.9% | [-6.5%, -2.0%] | 3 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [-6.5%, 5.0%] | 8 |

Cycles

Results (primary -0.6%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.8% | [3.8%, 3.8%] | 1 |
| Improvements ✅ (primary) | -0.6% | [-0.6%, -0.6%] | 1 |
| Improvements ✅ (secondary) | -4.1% | [-4.1%, -4.1%] | 1 |
| All ❌✅ (primary) | -0.6% | [-0.6%, -0.6%] | 1 |

Binary size

Results (primary 0.0%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.8%] | 10 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.1%] | 5 |
| Improvements ✅ (primary) | -0.2% | [-0.8%, -0.0%] | 8 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 1 |
| All ❌✅ (primary) | 0.0% | [-0.8%, 0.8%] | 18 |

Bootstrap: 757.399s -> 756.065s (-0.18%)
Artifact size: 372.20 MiB -> 372.12 MiB (-0.02%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 15, 2025
@ohadravid
Contributor Author

@matthiaskrgr - I updated the implementation to stop re-checking a head once it has been found to be maybe-dead, which should reduce the overhead a bit.
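
The early-out described here can be sketched roughly as follows. This is an illustrative model rather than the code in `copy_prop.rs`: the function name `heads_needing_storage_removal`, the `(head, location)` pairs, and the `is_maybe_dead` callback (standing in for a per-location analysis query) are all hypothetical.

```rust
use std::collections::BTreeSet;

/// Decide which heads must have their storage statements removed.
///
/// `accesses` lists hypothetical `(head, location)` pairs where a local would be
/// rewritten to its head. `is_maybe_dead` stands in for the (possibly costly)
/// query of the dataflow results at that location.
pub fn heads_needing_storage_removal(
    accesses: &[(usize, usize)],
    mut is_maybe_dead: impl FnMut(usize, usize) -> bool,
) -> BTreeSet<usize> {
    let mut remove = BTreeSet::new();
    for &(head, location) in accesses {
        // Early-out: once a head is marked, later accesses cannot change the
        // verdict, so the analysis is not queried again for it.
        if remove.contains(&head) {
            continue;
        }
        if is_maybe_dead(head, location) {
            remove.insert(head);
        }
    }
    remove
}
```

A head that never shows up in the returned set was alive at every relevant access in this model, so its `StorageLive`/`StorageDead` statements could be kept.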

@matthiaskrgr
Member

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 15, 2025
@bors
Collaborator

bors commented Jun 15, 2025

⌛ Trying commit 905e968 with merge c0a2949...

bors added a commit that referenced this pull request Jun 15, 2025
…try>

@cjgillot
Contributor

Should this check happen in Replacer::visit_local, and move the replacement of storage statements to a dedicated cleanup visitor?

@bors
Collaborator

bors commented Jun 15, 2025

☀️ Try build successful - checks-actions
Build commit: c0a2949 (c0a294957df10fc3880e1677c72c0cf122485509)

@ohadravid
Contributor Author

> Should this check happen in Replacer::visit_local

I'm not sure how to make this work: using ResultsCursor requires a &body, but it's not possible to have that while running a MutVisitor since it requires a &mut body.

Is there a different way to do this?
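
For context, the borrow conflict described above is often sidestepped in ordinary Rust code by splitting the work into a read-only collection phase and a separate mutation phase. The sketch below shows that general pattern on toy types; `Body`, `Stmt`, `Cursor`, `collect_maybe_dead_uses`, and `strip_storage` are hypothetical stand-ins, not rustc's `Body`, `ResultsCursor`, or `MutVisitor`, and whether this shape suits the pass is a question for the reviewers.

```rust
/// Hypothetical statement kinds; these are not rustc's MIR statements.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Stmt {
    StorageLive(usize),
    StorageDead(usize),
    Use(usize),
    Nop,
}

/// Hypothetical stand-in for a function body.
pub struct Body {
    pub stmts: Vec<Stmt>,
}

/// Read-only cursor that holds a shared borrow of the body.
struct Cursor<'a> {
    body: &'a Body,
}

impl<'a> Cursor<'a> {
    /// Toy query: is `local` outside its StorageLive..StorageDead range just before `index`?
    fn maybe_dead_at(&self, local: usize, index: usize) -> bool {
        let mut live = false;
        for stmt in &self.body.stmts[..index] {
            match *stmt {
                Stmt::StorageLive(l) if l == local => live = true,
                Stmt::StorageDead(l) if l == local => live = false,
                _ => {}
            }
        }
        !live
    }
}

/// Phase 1: borrow the body immutably and record locals used while maybe-dead.
pub fn collect_maybe_dead_uses(body: &Body) -> Vec<usize> {
    let cursor = Cursor { body };
    let mut out = Vec::new();
    for (index, stmt) in body.stmts.iter().enumerate() {
        if let Stmt::Use(local) = *stmt {
            if cursor.maybe_dead_at(local, index) && !out.contains(&local) {
                out.push(local);
            }
        }
    }
    out
}

/// Phase 2: the shared borrow has ended, so the body can now be mutated.
pub fn strip_storage(body: &mut Body, locals: &[usize]) {
    for stmt in &mut body.stmts {
        let local = match *stmt {
            Stmt::StorageLive(l) | Stmt::StorageDead(l) => Some(l),
            _ => None,
        };
        if let Some(l) = local {
            if locals.contains(&l) {
                *stmt = Stmt::Nop;
            }
        }
    }
}
```

Calling `collect_maybe_dead_uses(&body)` first and `strip_storage(&mut body, &locals)` afterwards keeps the shared and exclusive borrows from overlapping, at the cost of a second walk over the statements.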

@rust-timer
Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (c0a2949): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.4%] | 9 |
| Regressions ❌ (secondary) | 0.3% | [0.2%, 0.4%] | 7 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 1 |
| All ❌✅ (primary) | 0.3% | [0.2%, 0.4%] | 9 |

Max RSS (memory usage)

Results (primary -0.1%, secondary -1.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.2% | [3.4%, 5.8%] | 4 |
| Regressions ❌ (secondary) | 3.1% | [3.1%, 3.1%] | 1 |
| Improvements ✅ (primary) | -4.4% | [-6.6%, -1.8%] | 4 |
| Improvements ✅ (secondary) | -5.8% | [-5.8%, -5.8%] | 1 |
| All ❌✅ (primary) | -0.1% | [-6.6%, 5.8%] | 8 |

Cycles

Results (secondary -1.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.3% | [2.3%, 2.3%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.6% | [-2.6%, -2.5%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary -0.0%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.8%] | 10 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.1%] | 5 |
| Improvements ✅ (primary) | -0.2% | [-0.8%, -0.0%] | 8 |
| Improvements ✅ (secondary) | -0.2% | [-0.2%, -0.2%] | 1 |
| All ❌✅ (primary) | -0.0% | [-0.8%, 0.8%] | 18 |

Bootstrap: 756.494s -> 757.685s (0.16%)
Artifact size: 372.15 MiB -> 372.11 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 15, 2025
Comment thread compiler/rustc_mir_transform/src/copy_prop.rs Outdated
rust-bors bot pushed a commit that referenced this pull request Apr 14, 2026
…miasko,cjgillot,saethlin

Remove fewer Storage calls in CopyProp and GVN



Modify the CopyProp and GVN MIR optimization passes to remove fewer `Storage{Live,Dead}` calls, allowing for better optimizations by LLVM - see #141649.

### Details

The idea is to use a new `MaybeUninitializedLocals` analysis and remove only the storage calls of locals that are maybe-uninit when accessed in a new location.
@rust-bors rust-bors bot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 14, 2026
@rust-bors
Contributor

rust-bors bot commented Apr 14, 2026

💔 Test for f76e482 failed: CI. Failed job:

@ohadravid ohadravid force-pushed the better-storage-calls-copy-prop branch from 97b7a3f to 7686d1e Compare April 14, 2026 05:13
@rust-bors rust-bors bot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 14, 2026
@ohadravid
Contributor Author

ohadravid commented Apr 14, 2026

@saethlin fixed.

Seems like the overall stack usage of the test function in issue-141649.rs went down since I added it:

  failures:
  
  ---- [assembly] tests/assembly-llvm/issue-141649.rs#x86_64 stdout ----
  ------FileCheck stdout------------------------------
  
  ------FileCheck stderr------------------------------
  /checkout/tests/assembly-llvm/issue-141649.rs:42:13: error: x86_64: expected string not found in input
   // x86_64: subq $24, %rsp
              ^
  /checkout/obj/build/x86_64-unknown-linux-gnu/test/assembly-llvm/issue-141649.x86_64/issue-141649.s:33:41: note: scanning from here
   .section .text.scoped_two_small_structs,"ax",@progbits
                                          ^
  /checkout/obj/build/x86_64-unknown-linux-gnu/test/assembly-llvm/issue-141649.x86_64/issue-141649.s:45:2: note: possible intended match here
   subq $16, %rsp
   ^
  /checkout/tests/assembly-llvm/issue-141649.rs:68:13: error: x86_64: expected string not found in input
   // x86_64: subq $24, %rsp
              ^
  /checkout/obj/build/x86_64-unknown-linux-gnu/test/assembly-llvm/issue-141649.x86_64/issue-141649.s:80:43: note: scanning from here
   .section .text.scoped_three_small_structs,"ax",@progbits
                                            ^
  /checkout/obj/build/x86_64-unknown-linux-gnu/test/assembly-llvm/issue-141649.x86_64/issue-141649.s:92:2: note: possible intended match here
   subq $16, %rsp
   ^

Changed

// x86_64: subq $24, %rsp

to

// x86_64: subq $16, %rsp

It now passes locally on Linux x86_64. I'm not sure why this test didn't fail during the regular bors run.

P.S.
I also ran issue-141649.rs with stage0 to confirm that the improvement is still real: the stack adjustment goes from 32 to 16 and from 48 to 16, as expected.

$ ./x test --stage=0 --target x86_64-unknown-linux-gnu tests/assembly-llvm/issue-141649.rs
.section .text.scoped_two_small_structs,"ax",@progbits
..
subq $32, %rsp
..
.section .text.scoped_three_small_structs,"ax",@progbits
..
subq $48, %rsp

@ohadravid ohadravid force-pushed the better-storage-calls-copy-prop branch from 7686d1e to d18b665 Compare April 14, 2026 06:13
@rustbot
Collaborator

rustbot commented Apr 14, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@saethlin
Member

@bors r=tmiasko,cjgillot,saethlin rollup=never

@rust-bors
Contributor

rust-bors bot commented Apr 15, 2026

📌 Commit d18b665 has been approved by tmiasko,cjgillot,saethlin

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 15, 2026
rust-bors bot pushed a commit that referenced this pull request Apr 15, 2026
…miasko,cjgillot,saethlin

@rust-bors rust-bors bot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 15, 2026
@rust-bors
Contributor

rust-bors bot commented Apr 15, 2026

💔 Test for af9ddc6 failed: CI. Failed job:

@ohadravid
Contributor Author

Hi @saethlin looks like the -msvc target has a higher stack usage - how should I modify the test? 🙏

  check:68'0     ~~~~~~~~~~~~~~~~~~~
            101:  subq $48, %rsp 
  check:68'0     ~~~~~~~~~~~~~~~~
  check:68'1      ?               possible intended match

@saethlin
Member

Codegen test annotations support revisions, and the revision name can be used instead of the CHECK parts of FileCheck comments. You can draw inspiration from this test:

//@ revisions: windows-gnu

@ohadravid ohadravid force-pushed the better-storage-calls-copy-prop branch from d18b665 to 5632001 Compare April 17, 2026 13:56
@rust-bors rust-bors bot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 17, 2026
@ohadravid
Contributor Author

@saethlin done, I split the checks into aarch64, x86_64-unknown-linux-gnu, and x86_64-pc-windows-msvc.

Labels

A-mir-opt-GVN Area: MIR opt Global Value Numbering (GVN) perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
