[MC] Explicitly use memcpy in emitBytes() (NFC) #177187

Merged
nikic merged 1 commit into llvm:main from nikic:objectstreamer-memcpy
Jan 22, 2026

Conversation

@nikic (Contributor) commented Jan 21, 2026

We've observed a compile-time regression in LLVM 22 when including large blobs. The root cause was that emitBytes() was copying bytes one-by-one, which is much slower than using memcpy for large objects.

Optimization of std::copy to memmove is apparently much less reliable than one might think. In particular, when using a non-bleeding-edge libstdc++ (anything older than version 15), this does not happen if the types of the input and output iterators do not match (like here, where there is a signed/unsigned mismatch).

As this code is performance sensitive, I think it makes sense to directly use memcpy.

Previously this code used SmallVector::append, which explicitly uses memcpy here:

std::memcpy(reinterpret_cast<void *>(Dest), I, (E - I) * sizeof(T));
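
The difference described above can be sketched in isolation. This is a minimal illustration, not the LLVM code: the helper names `appendWithCopy` and `appendWithMemcpy` are hypothetical, and the `std::uint8_t *` destination type is an assumption chosen to reproduce a signed/unsigned mismatch with a `char` source. Both helpers write identical bytes; per the description, whether the `std::copy` version is lowered to a bulk `memmove` depends on the standard library in use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: element-wise copy from char to uint8_t.
// The source and destination element types differ, so an older
// standard library may not turn this into a single memmove.
inline void appendWithCopy(const char *Src, std::size_t Size,
                           std::uint8_t *Dest) {
  std::copy(Src, Src + Size, Dest);
}

// Hypothetical helper: always one bulk byte copy, regardless of how
// the standard library implements std::copy.
inline void appendWithMemcpy(const char *Src, std::size_t Size,
                             std::uint8_t *Dest) {
  // This sketch assumes valid, non-overlapping buffers.
  assert(Src != nullptr && Dest != nullptr);
  std::memcpy(Dest, Src, Size);
}
```

The explicit `memcpy` form trades a little type safety for predictable performance, which matches the reasoning given for a performance-sensitive path.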

@nikic nikic requested review from MaskRay and aengelke January 21, 2026 15:56
@llvmbot llvmbot added the llvm:mc Machine (object) code label Jan 21, 2026
@llvmbot (Member) commented Jan 21, 2026

@llvm/pr-subscribers-llvm-mc

Author: Nikita Popov (nikic)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/177187.diff

1 Files Affected:

  • (modified) llvm/lib/MC/MCObjectStreamer.cpp (+3-1)
diff --git a/llvm/lib/MC/MCObjectStreamer.cpp b/llvm/lib/MC/MCObjectStreamer.cpp
index 261e9a37ecb55..d44e14a35cac8 100644
--- a/llvm/lib/MC/MCObjectStreamer.cpp
+++ b/llvm/lib/MC/MCObjectStreamer.cpp
@@ -109,7 +109,9 @@ void MCObjectStreamer::addSpecialFragment(MCFragment *Frag) {
 void MCObjectStreamer::appendContents(ArrayRef<char> Contents) {
   ensureHeadroom(Contents.size());
   assert(FragSpace >= Contents.size());
-  llvm::copy(Contents, getCurFragEnd());
+  // As this is performance-sensitive code, explicitly use std::memcpy.
+  // Optimization of std::copy to memmove is unreliable.
+  std::memcpy(getCurFragEnd(), Contents.begin(), Contents.size());
   CurFrag->FixedSize += Contents.size();
   FragSpace -= Contents.size();
 }

@nikic nikic merged commit 15e421d into llvm:main Jan 22, 2026
13 checks passed
@nikic nikic deleted the objectstreamer-memcpy branch January 22, 2026 08:24
@nikic nikic added this to the LLVM 22.x Release milestone Jan 22, 2026
@nikic (Contributor, Author) commented Jan 22, 2026

/cherry-pick 15e421d

@llvmbot (Member) commented Jan 22, 2026

/pull-request #177320

c-rhodes pushed a commit to llvmbot/llvm-project that referenced this pull request Jan 23, 2026
(cherry picked from commit 15e421d)
@nikic nikic mentioned this pull request Jan 23, 2026
10 tasks
@pcc (Contributor) commented Jan 23, 2026

Looks like this caused a ubsan failure: https://lab.llvm.org/buildbot/#/builders/85/builds/17944 . Can you please take a look?

(Specifically the tools/llvm-dwarfutil/ELF/X86/dwarf4-macro-vendor-specific.test failure; the LLD failures are unrelated and were fixed by #177562.)

@nikic (Contributor, Author) commented Jan 23, 2026

I've pushed a speculative fix at d064f39.
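
The thread does not say what the UBSan report contained or what d064f39 changed. As background only, one class of problem that a switch from `std::copy` to `std::memcpy` can expose is an empty copy with a null pointer: `std::copy` over an empty range never dereferences anything, whereas passing a null pointer to `memcpy` is undefined behavior in C++ even when the size is zero, and sanitizers can flag it. The sketch below shows a guard for that case, assuming this kind of issue; the helper name `appendBytes` is hypothetical and this is not the code from the fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the code from d064f39): copies Size bytes
// and returns the number of bytes written.
inline std::size_t appendBytes(const char *Src, std::size_t Size,
                               std::uint8_t *Dest) {
  // An empty input may come with a null data pointer, so skip the
  // memcpy call entirely instead of passing null with a zero length.
  if (Size == 0)
    return 0;
  assert(Src != nullptr && Dest != nullptr);
  std::memcpy(Dest, Src, Size);
  return Size;
}
```

With the early return, an empty input leaves the destination untouched and never reaches `memcpy`.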

@nikic (Contributor, Author) commented Jan 26, 2026

/cherry-pick d064f39

@llvmbot (Member) commented Jan 26, 2026

/pull-request #177907

rust-bors bot pushed a commit to rust-lang/rust that referenced this pull request Jan 27, 2026
Update to LLVM 22

Scheduled release date: Feb 24
1.94 becomes stable: Mar 5

Changes:
 * Update to rc2, with one patch to work around our outdated illumos sysroot (rust-lang/llvm-project@41256ab).
 * Update the host toolchain as well, otherwise we lose cross-language LTO, in particular for jemalloc.
 * Adjust one loongarch assembly test. The split into r and s variants is based on the suggestion in #151134.

Depends on:

 * [x] #151410
 * [ ] #150756
 * [x] llvm/llvm-project#175190
 * [x] llvm/llvm-project#175912
 * [x] llvm/llvm-project#175965
 * [x] llvm/llvm-project#176195
 * [x] llvm/llvm-project#157073
 * [x] llvm/llvm-project#176421
 * [x] llvm/llvm-project#176925
 * [x] llvm/llvm-project#177187
rust-bors bot pushed a commit to rust-lang/rust that referenced this pull request Jan 28, 2026
github-actions bot pushed a commit to rust-lang/rust-analyzer that referenced this pull request Jan 29, 2026
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Jan 29, 2026
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Jan 30, 2026
github-actions bot pushed a commit to rust-lang/stdarch that referenced this pull request Feb 5, 2026
JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Mar 26, 2026
Labels

llvm:mc Machine (object) code
