
Reduce the diff with the pytorch:main branch #3

Merged
SigureMo merged 5 commits into PFCCLab:paddle from gouzil:feat/reduce_diff on Apr 11, 2026

Conversation


Copilot AI left a comment


Pull request overview

This PR aims to reduce divergence from the upstream PyTorch torchcodec main branch by aligning several C++ core behaviors and operator registrations.

Changes:

  • Implemented previously stubbed/unimplemented paths for audio resampler flush (maybeFlushSwrBuffers) and tensor-backed AVIO output (AVIOToTensorContext::getOutputTensor).
  • Adjusted custom op registration blocks (including adding a BackendSelect registration block) and minor include/formatting alignment.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File descriptions:

  • src/torchcodec/_core/SingleStreamDecoder.cpp: Implements returning flushed resampled audio samples instead of throwing.
  • src/torchcodec/_core/custom_ops.cpp: Reorders includes and changes TORCH_LIBRARY_IMPL registrations (adds BackendSelect).
  • src/torchcodec/_core/AVIOTensorContext.cpp: Implements returning the encoded output tensor slice instead of throwing.


@@ -1419,11 +1419,8 @@ std::optional<torch::Tensor> SingleStreamDecoder::maybeFlushSwrBuffers() {
auto actualNumRemainingSamples = swr_convert(
swrContext_.get(), outputBuffers.data(), numRemainingSamples, nullptr, 0);

Copilot AI Apr 4, 2026


swr_convert() can return a negative value on error. Passing that directly as the length to lastSamples.narrow(...) will throw an unrelated/cryptic error (or potentially create an invalid slice). Add an explicit check for actualNumRemainingSamples < 0 and surface a clear failure (e.g., TORCH_CHECK(actualNumRemainingSamples >= 0, ...)) before narrowing; optionally also handle the == 0 case explicitly.

Suggested change:

      swrContext_.get(), outputBuffers.data(), numRemainingSamples, nullptr, 0);
  +   TORCH_CHECK(
  +       actualNumRemainingSamples >= 0,
  +       "swr_convert() failed while flushing buffered audio samples: ",
  +       actualNumRemainingSamples);
  +   if (actualNumRemainingSamples == 0) {
  +     return std::nullopt;
  +   }

Comment on lines +923 to 926
TORCH_LIBRARY_IMPL(torchcodec_ns, BackendSelect, m) {
m.impl("create_from_file", &create_from_file);
m.impl("create_from_tensor", &create_from_tensor);
m.impl("_create_from_file_like", &_create_from_file_like);

Copilot AI Apr 4, 2026


Registering create_from_tensor under BackendSelect is risky because it has a Tensor argument and the implementation assumes host-accessible memory (no video_tensor.is_cpu() check; it is passed into AVIOFromTensorContext which reads via data_ptr()/memcpy). With this dispatch key change, CUDA tensors may no longer fail dispatch and could reach this CPU-only code path. Consider registering create_from_tensor under CPU (and only keeping no-Tensor factory ops like create_from_file / _get_json_ffmpeg_library_versions under BackendSelect), or add a strict CPU device check for video_tensor if BackendSelect is intentional.
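The suggested guard lives in C++, but the device concern can be illustrated with a small Python sketch. `create_from_tensor_guard` is a hypothetical helper, not part of torchcodec; it mirrors the strict CPU check the review suggests before the tensor's memory is read via data_ptr()/memcpy:

```python
import torch

def create_from_tensor_guard(video_tensor: torch.Tensor) -> None:
    # Hypothetical guard mirroring the review suggestion: the underlying C++
    # path reads the tensor through data_ptr()/memcpy, which assumes
    # host-accessible memory, so non-CPU tensors must be rejected up front
    # with a clear error instead of a crash deep inside the AVIO layer.
    if video_tensor.device.type != "cpu":
        raise ValueError(
            f"create_from_tensor expects a CPU tensor, got {video_tensor.device}"
        )

# A CPU tensor passes the guard silently.
create_from_tensor_guard(torch.zeros(16, dtype=torch.uint8))
```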


@SigureMo SigureMo left a comment


The core problem that schema support on the Paddle side is supposed to solve should be this diff:

[image]

Have you tried reverting this part?

"""

if isinstance(indices, list):
indices = torch.tensor(indices, dtype=torch.int64).cpu()

Can this be removed? It may be correct on a CPU machine, but on a GPU machine it could end up running on the GPU. This needs to be confirmed; CI can't catch it.
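The `.cpu()` call under discussion can be exercised directly. This is a minimal sketch (not the torchcodec code path): `torch.tensor(list)` follows the current default device, so the explicit `.cpu()` pins the indices tensor to host memory regardless of the machine.

```python
import torch

# Minimal sketch of the line under discussion: converting a list[int] to a
# tensor and explicitly moving it to CPU guarantees a host tensor even if
# the default device has been changed (e.g. to a GPU).
indices = [0, 5, 10]
indices_tensor = torch.tensor(indices, dtype=torch.int64).cpu()

assert indices_tensor.device.type == "cpu"
assert indices_tensor.dtype == torch.int64
```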


Strange, I can no longer reproduce this on 3.3.1. Did some PR fix it? Back then we never pinpointed which conversion layer diverged from PyTorch and caused a list[int] to be converted to a GPU Tensor by default; it seems that no longer happens.

@SigureMo SigureMo changed the title from "Reduce the diff with the pytrch:main branch" to "Reduce the diff with the pytorch:main branch" on Apr 10, 2026
"""

if isinstance(indices, list):
indices = torch.tensor(indices, dtype=torch.int64).cpu()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

奇怪,我在 3.3.1 都复现不了了,难道是哪个 PR 修复了?当初也没定位到是哪一层的转换与 PyTorch 不一致导致 list[int] 会被默认转换为 GPU Tensor,现在看来不会了

@SigureMo SigureMo merged commit 805df30 into PFCCLab:paddle Apr 11, 2026
27 checks passed
@SigureMo

All the ARM jobs failed.

https://github.com/PFCCLab/paddlecodec/actions/runs/24289565129/job/70924993156

Only packages up to April 7 were installed; after April 7 the aarch64 package builds appear to be broken, with no new packages since. Surely it wasn't your PR that broke them? [doge]

That doesn't seem likely; it looks more like PaddlePaddle/Paddle#78606.

@SigureMo

SigureMo commented Apr 11, 2026

@ShigureNyako Can you tell which PR between April 7 and April 8 broke the linux arm build? macos arm looks fine.

@SigureMo

> All the ARM jobs failed.

@youge325 The failure on main is a known issue; no need to rerun for now. I'll ask the packaging folks tomorrow whether the linux arm nightly build is broken.
