Stabilize -Zprofile-sample-use and -Zdebug-info-for-profiling #155942

Open
zamazan4ik wants to merge 3 commits into rust-lang:main from zamazan4ik:stabilize-profile-sample-use-and-debug-info-for-profiling

zamazan4ik (Contributor) commented Apr 28, 2026

I propose stabilizing the following options:

  • -Zprofile-sample-use as -Cprofile-sample-use
  • -Zdebug-info-for-profiling as -Cdebuginfo-for-profiling (see the renaming details below)

Stabilization report

Summary

Remind us what this feature is and what value it provides. Tell the story of what led up to this stabilization.

E.g., see:

This PR stabilizes Sample-based PGO in Rustc together with its two flags: -Zprofile-sample-use becomes -Cprofile-sample-use, and -Zdebug-info-for-profiling becomes -Cdebuginfo-for-profiling.

Sample-based PGO is another way to perform PGO, in addition to the existing, stable Instrumentation PGO in Rustc (the -Cprofile-generate/-Cprofile-use flags). It optimizes Rust binaries without instrumenting them: PGO-suitable profiles are collected by an external sampling profiler such as Linux perf, with much lower runtime overhead than Instrumentation PGO.

More information can be found in the "Profile-guided Optimization" guide, which this PR updates, or in the Clang PGO guide.

Tracking:

Reference PRs:

I wasn't sure whether I should create dedicated Reference-only PRs or put everything in one PR. As a reference, I used #145974, where the Reference documentation was updated in the same PR.

cc @rust-lang/lang @rust-lang/lang-advisors

What is stabilized

Describe each behavior being stabilized and give a short example of code that will now be accepted.

No new language syntax is introduced/stabilized by this stabilization PR - only compiler flags.
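Since only flags are stabilized, the closest thing to "code that will now be accepted" is a command line. Below is a minimal dry-run sketch of how the two flags could fit into a Sample-based PGO cycle. It only prints the commands. The file names (main.rs, app, perf.data, app.prof) are hypothetical, and -Cdebuginfo=1 is my assumption, mirroring the Clang guide's advice to build with line tables. The -C spellings are the ones proposed in this PR; until it lands, the -Z spellings on a nightly toolchain apply instead.

```shell
#!/bin/sh
# Dry-run sketch of a Sample-based PGO cycle: every step is printed, not executed.
# Hypothetical names (not from the PR): main.rs, app, perf.data, app.prof.

SRC=main.rs
BIN=app
PROFILE=app.prof

spgo_steps() {
    # 1. Build an optimized binary with profiling-friendly debug info.
    #    (-Cdebuginfo=1 is an assumption, see the note above.)
    echo "rustc -O -Cdebuginfo=1 -Cdebuginfo-for-profiling $SRC -o $BIN"
    # 2. Sample the running binary with branch records (requires hardware
    #    support such as Intel LBR).
    echo "perf record -b -o perf.data ./$BIN"
    # 3. Convert the perf data into an LLVM sample profile.
    echo "llvm-profgen --binary=./$BIN --perfdata=perf.data --output=$PROFILE"
    # 4. Rebuild, this time feeding the sample profile to the compiler.
    echo "rustc -O -Cdebuginfo=1 -Cdebuginfo-for-profiling -Cprofile-sample-use=$PROFILE $SRC -o $BIN"
}

spgo_steps
```

Dropping the `echo` prefixes turns the sketch into a real run, provided perf and llvm-profgen are installed and the workload in step 2 is representative of production use.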

What isn't stabilized

Describe any parts of the feature not being stabilized. Talk about what we might want to do later and what doors are being left open for that. If what we're not stabilizing might lead to surprises for users, talk about that in particular.

I think this is the right section to compare the Sample-based PGO (SPGO) implementation in Rustc with its "big brother", Sample-based PGO in Clang.

Rustc has -Zprofile-sample-use and -Zdebug-info-for-profiling; their Clang counterparts are -fprofile-sample-use and -fdebug-info-for-profiling/-fno-debug-info-for-profiling. Clang additionally supports the following SPGO-related switches:

  • -fpseudo-probe-for-profiling, -fno-pseudo-probe-for-profiling flags. According to Clang's PGO guide, this switch is optional for SPGO. It originates from an extension of SPGO called "Context-sensitive Sample PGO with Pseudo-Instrumentation", or simply "CSSPGO". Here is the original RFC; I can also link some LLVM commits/discussions on the topic. The flag is not required for regular SPGO: it is an improvement over regular SPGO and could be added to Rustc later in a separate process (first as unstable, then promoted to stable). But that's another story, and we can consider it later.
  • The -f[no-]unique-internal-linkage-names switch is also mentioned in the Clang PGO guide. I don't think it is applicable to Rustc; please correct me if I am wrong.
  • -fsample-profile-use-profi, -fno-sample-profile-use-profi switch. It is also marked as optional in the Clang PGO guide. It applies heuristics to correct some inaccuracies in an SPGO profile. SPGO in Rustc can be stabilized without this flag, since the heuristic is not critical for regular SPGO usage. If we decide to support this switch in Rustc too, we can do so separately without blocking the stabilization process.
  • -fprofile-sample-accurate, -fauto-profile-accurate, -fno-profile-sample-accurate switch. This switch is not even mentioned in the Clang PGO guide :) It resolves this issue/feature request from LLVM upstream for the SPGO use case. According to Clang's description: "Specifies that the sample profile is accurate. If the sample profile is accurate, callsites without profile samples are marked as cold. Otherwise, treat callsites without profile samples as if we have no profile". Since Rustc does not set this flag, branches without profile samples are currently treated as having no profile and are optimized as regular release code. Supporting this in Rustc would definitely be a nice addition for parity with Clang, since there are good use cases for it. But I do not think its absence is a blocker for stabilizing SPGO in Rustc; we can add it later. As evidence that SPGO in Rustc works fine without it, see the Rust-for-Linux bench results with SPGO via AutoFDO.

I am not aware of any other SPGO-related flags in Clang.

My personal opinion on the information above: we definitely have a gap in SPGO-related flags in Rustc compared to Clang, but none of these gaps is a blocker for stabilizing -Zdebug-info-for-profiling/-Zprofile-sample-use right now. It would be nice to close these gaps later: add the flags in unstable form, test them, and eventually stabilize them to reach parity with Clang from the SPGO optimization perspective. Right now all the flags above are missing in Rustc, even in unstable form. Stabilizing -Zdebug-info-for-profiling/-Zprofile-sample-use does not prevent adding the missing SPGO-related features later.

Design

Reference

What updates are needed to the Reference? Link to each PR. If the Reference is missing content needed for describing this feature, discuss that.

RFC history

What RFCs have been accepted for this feature?

No RFC was created for these options. The original discussion took place in the PR that added the unstable flags: #87918

Answers to unresolved questions

What questions were left unresolved by the RFC? How have they been answered? Link to any relevant lang decisions.

No unresolved questions were found in the original PR.

Post-RFC changes

What other user-visible changes have occurred since the RFC was accepted? Describe both changes that the lang team accepted (and link to those decisions) as well as changes that are being presented to the team for the first time in this stabilization report.

Compared to the unstable (RFC-like) state, I've made the following changes:

  • Renamed -Zdebug-info-for-profiling to -Cdebuginfo-for-profiling (debug-info -> debuginfo). When the flag was originally proposed, there was no other stable compiler flag containing the word debuginfo, so the wording was a bit unclear. Nowadays we have the debuginfo and split-debuginfo flags. That's why I decided to change the user-facing flag name to -Cdebuginfo-for-profiling: to be consistent with the other debuginfo-related flags and to avoid confusing users needlessly. The Rust-for-Linux dev team will have to adopt this change (if it is accepted), but they will need to make some changes anyway due to stabilization, replacing -Z... with -C (see current patch), so it shouldn't be an issue at all.
  • Extended Rustc's PGO guide with Sample-based PGO information and usage instructions. This change will resolve Extend Profile-Guided Optimization guide with Sampling PGO #117023. My changes are heavily inspired by, and carefully copied (only the needed parts) from, the Clang guide. From a licensing perspective this should be fine. If it is a problem in any way, I can do some rewording (but I would like to avoid that). I decided to copy some documentation because of Sample-based PGO incompatibilities between Clang and Rustc (at the very least, Clang supports more options). I believe the current version is more user-friendly and easier to use than just referring to the Clang PGO guide.
  • Changed the help message and the corresponding documentation for the -Zdebug-info-for-profiling switch to match what Clang already has. It is rather difficult to describe clearly what the option does without exposing too many LLVM details, which is why the Reference entry for this option links to the documentation of the corresponding LLVM pass.
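For anyone who already passes the unstable flags, the rename in the first item amounts to a mechanical rewrite. Here is a minimal sketch of such a rewrite for a flags string, assuming the -C names proposed in this PR are accepted as-is. The helper name migrate_spgo_flags and the sample profile path are hypothetical, and the sketch only handles the hyphenated spellings.

```shell
#!/bin/sh
# Rewrite the unstable SPGO flag spellings to the stable ones proposed in this PR.
# migrate_spgo_flags is a hypothetical helper, not part of any Rust tooling.
migrate_spgo_flags() {
    printf '%s\n' "$1" | sed \
        -e 's/-Zdebug-info-for-profiling/-Cdebuginfo-for-profiling/g' \
        -e 's/-Zprofile-sample-use/-Cprofile-sample-use/g'
}

# Example: a flags string as it might appear in RUSTFLAGS today.
migrate_spgo_flags "-O -Zdebug-info-for-profiling -Zprofile-sample-use=app.prof"
```

Flags other than these two pass through unchanged, so the helper can be applied to a whole RUSTFLAGS value.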

Key points

What decisions have been most difficult and what behaviors to be stabilized have proved most contentious? Summarize the major arguments on all sides and link to earlier documents and discussions.

No objections have been raised so far in any stabilization discussion of the feature, including the Zulip discussion: #t-compiler > Stabilizing Sample PGO (SPGO): `-Zprofile-sample-use`

Nightly extensions

Are there extensions to this feature that remain unstable? How do we know that we are not accidentally committing to those?

I am not aware of any other unstable SPGO-related switches.

Doors closed

What doors does this stabilization close for later changes to the language? E.g., does this stabilization make any other RFCs, lang experiments, or known in-flight proposals more difficult or impossible to do later?

  • Removing Sample-based PGO support from Rustc will be harder. But the technology is used at large scale in other ecosystems such as Clang (heavily used internally at big tech companies), and Rust-for-Linux has already started to use it with Rustc. I don't think there will be a need to remove it in the near future.
  • Renaming the flags will be harder. But the current naming was chosen to be consistent with Clang, where it has proved robust, so it shouldn't be a concern either.

No other proposals/experiments/etc. are affected.

Feedback

Call for testing

Has a "call for testing" been done? If so, what feedback was received?

No, it wasn't done. It was only briefly discussed in #t-compiler > Stabilizing Sample PGO (SPGO): `-Zprofile-sample-use`.

So far the feature has been tested by me personally (local experiments on an Intel-based Linux machine with LBR, using the llvm-profgen tool on some sample apps and verifying the assembly changes before/after applying SPGO) and by the Rust-for-Linux project in this patch. Additionally, the feature has been in the unstable state for 5 years (since 2021) with no reported concerns (whether due to no bugs or no users, who knows; at least it was implemented for a reason 5 years ago).

I think that is enough verification for this kind of feature.

Nightly use

Do any known nightly users use this feature? Counting instances of #![feature(FEATURE_NAME)] on GitHub with grep might be informative.

The only publicly known user of this feature is the Rust-for-Linux project (see this commit). Besides that, a GitHub search for "-Zprofile-sample-use" found no other users: almost all results are copies of the Unstable book documentation for the option, and the remaining ones are related to the Linux kernel. The two issues found are the tracking issue for these two flags and the corresponding tracking issue in Rust-for-Linux.

I am not personally aware of any closed-source users of this feature. Google (the biggest user of Sample-based PGO, at least for C++) and other big tech companies probably use it somewhere internally too, but that is just a guess.

Implementation

Major parts

Summarize the major parts of the implementation and provide links into the code and to relevant PRs.

See, e.g., this breakdown of the major parts of async closures:

There was no significant development on the Rustc side: the implementation just propagates the SPGO profile to the LLVM part of the compiler and enables an additional LLVM pass when -Cdebuginfo-for-profiling is enabled.

Coverage

Summarize the test coverage of this feature.

Consider what the "edges" of this feature are. We're particularly interested in seeing tests that assure us about exactly what nearby things we're not stabilizing. Tests should of course comprehensively demonstrate that the feature works. Think too about demonstrating the diagnostics seen when common mistakes are made and the feature is used incorrectly.

Within each test, include a comment at the top describing the purpose of the test and what set of invariants it intends to demonstrate. This is a great help to our review.

Describe any known or intentional gaps in test coverage.

Contextualize and link to test folders and individual tests.

The only tests for these two compiler switches are in tests.rs, updated in this PR (they were originally added in the unstable PR).

If more testing is required and it's a blocker for stabilization - it should be discussed.

Outstanding bugs

What outstanding bugs involve this feature? List them. Should any block the stabilization? Discuss why or why not.

I am not aware of any SPGO-related bug in Rustc right now.

Outstanding FIXMEs

What FIXMEs are still in the code for that feature and why is it OK to leave them there?

No FIXMEs left in the code.

Tool changes

What changes must be made to our other tools to support this feature. Has this work been done? Link to any relevant PRs and issues.

  • rustfmt
    • Nothing
  • rust-analyzer
    • Nothing
  • rustdoc (both JSON and HTML)
    • Nothing
  • cargo
    • Nothing
  • clippy
    • Nothing
  • rustup
    • Nothing
  • docs.rs
    • Nothing

No changes are required in other tools for the stabilization.

Some ecosystem tools, such as cargo-pgo, could later be extended with support for SPGO use cases, but that is definitely not a blocker for the stabilization.

Breaking changes

If this stabilization represents a known breaking change, link to the crater report, the analysis of the crater report, and to all PRs we've made to ecosystem projects affected by this breakage. Discuss any limitations of what we're able to know about or to fix.

The only thing that could be considered "breaking" is the renaming of -Zdebug-info-for-profiling to -Cdebuginfo-for-profiling, described above.

No other breaking changes are expected from this stabilization.

Crater report:

  • N/A

Crater analysis:

  • N/A

PRs to affected crates:

  • N/A

Type system, opsem

Compile-time checks

What compilation-time checks are done that are needed to prevent undefined behavior?

Link to tests demonstrating that these checks are being done.

No compile-time checks are required.

Type system rules

What type system rules are enforced for this feature and what is the purpose of each?

Not applicable

Sound by default?

Does the feature's implementation need specific checks to prevent UB, or is it sound by default and need specific opt-in to perform the dangerous/unsafe operations? If it is not sound by default, what is the rationale?

Sound by default. No additional UB checks are required.

Breaks the AM?

Can users use this feature to introduce undefined behavior, or use this feature to break the abstraction of Rust and expose the underlying assembly-level implementation? Describe this if so.

No, it's not possible.

Common interactions

Temporaries

Does this feature introduce new expressions that can produce temporaries? What are the scopes of those temporaries?

No, it doesn't.

Drop order

Does this feature raise questions about the order in which we should drop values? Talk about the decisions made here and how they're consistent with our earlier decisions.

No, it doesn't.

Pre-expansion / post-expansion

Does this feature raise questions about what should be accepted pre-expansion (e.g. in code covered by #[cfg(false)]) versus what should be accepted post-expansion? What decisions were made about this?

No, it doesn't.

Edition hygiene

If this feature is gated on an edition, how do we decide, in the context of the edition hygiene of tokens, whether to accept or reject code. E.g., what token do we use to decide?

Not applicable.

SemVer implications

Does this feature create any new ways in which library authors must take care to prevent breaking downstreams when making minor-version releases? Describe these. Are these new hazards "major" or "minor" according to RFC 1105?

No, it doesn't.

Exposing other features

Are there any other unstable features whose behavior may be exposed by this feature in any way? What features present the highest risk of that?

No, there are not.

History

List issues and PRs that are important for understanding how we got here.

Acknowledgments

Summarize contributors to the feature by name for recognition and so that those people are notified about the stabilization. Does anyone who worked on this not think it should be stabilized right now? We'd like to hear about that if so.

I am not aware of anyone who is against stabilization of these two flags.

Open items

List any known items that have not yet been completed and that should be before this is stabilized.

I am not aware of any open issue that is a blocker for stabilization of this feature.

SPGO-related items that are not stabilization blockers, in my opinion:

  • Include llvm-profgen in llvm-tools component #155525 - this could improve the SPGO UX with Rustc, but it is not a strict requirement: an externally installed llvm-profgen can be used instead (I tested this locally by using llvm-profgen-21 for SPGO with Rustc 1.95, which is LLVM 22-based).

- remove pages from the Unstable book for -Zprofile-sample-use and
  -Zdebug-info-for-profiling
- add corresponding pages to the Rustc codegen docs
- update all related to the flags entities from unstable to stable ones:
  internal structs, tests
- rename debug-info-for-profiling into debuginfo-for-profiling to be
  consistent with other "debuginfo" flags like -Cdebuginfo and
  -Csplit-debuginfo
rustbot added the A-LLVM, S-waiting-on-author, and T-compiler labels Apr 28, 2026
zamazan4ik marked this pull request as ready for review April 28, 2026 20:31
rustbot added the S-waiting-on-review label Apr 28, 2026
rustbot removed the S-waiting-on-author label Apr 28, 2026
rustbot (Collaborator) commented Apr 28, 2026

r? @JohnTitor

rustbot has assigned @JohnTitor.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @ehuss, compiler
  • @ehuss, compiler expanded to 74 candidates
  • Random selection from 22 candidates

ojeda (Contributor) commented Apr 28, 2026

This change shall be adopted by the Rust-for-Linux dev team (if my change will be accepted), but they will need to do some changes anyway due to stabilization - replace -Z... with -C (see current patch) - so it shouldn't be an issue at all.

That is fine, yes -- we are accustomed to flag name changes on stabilization etc. :)

Rust-for-Linux already started to use it even with Rustc.

(...)

The only publicly-known user of this feature is Rust-for-Linux project (see this commit).

Just a quick clarification: it is not a commit yet. While we plan to land it during this cycle, i.e. for v7.2, it has not seen use in mainline Linux yet.

Thanks for this!

@rustbot label +A-rust-for-linux

rustbot added the A-rust-for-linux label Apr 28, 2026
Labels

  • A-LLVM - Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
  • A-rust-for-linux - Relevant for the Rust-for-Linux project
  • S-waiting-on-review - Status: Awaiting review from the assignee but also interested parties.
  • T-compiler - Relevant to the compiler team, which will review and decide on the PR/issue.

Development

Successfully merging this pull request may close these issues.

  • Extend Profile-Guided Optimization guide with Sampling PGO
  • Partial training option for PGO

4 participants