
Reduce the diff with the pytorch:main branch #3

Merged
SigureMo merged 5 commits into PFCCLab:paddle from gouzil:feat/reduce_diff on Apr 11, 2026

Conversation


Copilot AI left a comment


Pull request overview

This PR aims to reduce divergence from the upstream PyTorch torchcodec main branch by aligning several C++ core behaviors and operator registrations.

Changes:

  • Implemented previously stubbed/unimplemented paths for audio resampler flush (maybeFlushSwrBuffers) and tensor-backed AVIO output (AVIOToTensorContext::getOutputTensor).
  • Adjusted custom op registration blocks (including adding a BackendSelect registration block) and minor include/formatting alignment.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File descriptions:

  • src/torchcodec/_core/SingleStreamDecoder.cpp: Implements returning flushed resampled audio samples instead of throwing.
  • src/torchcodec/_core/custom_ops.cpp: Reorders includes and changes TORCH_LIBRARY_IMPL registrations (adds BackendSelect).
  • src/torchcodec/_core/AVIOTensorContext.cpp: Implements returning the encoded output tensor slice instead of throwing.


@@ -1419,11 +1419,8 @@ std::optional<torch::Tensor> SingleStreamDecoder::maybeFlushSwrBuffers() {
auto actualNumRemainingSamples = swr_convert(
swrContext_.get(), outputBuffers.data(), numRemainingSamples, nullptr, 0);

Copilot AI Apr 4, 2026


swr_convert() can return a negative value on error. Passing that directly as the length to lastSamples.narrow(...) will throw an unrelated/cryptic error (or potentially create an invalid slice). Add an explicit check for actualNumRemainingSamples < 0 and surface a clear failure (e.g., TORCH_CHECK(actualNumRemainingSamples >= 0, ...)) before narrowing; optionally also handle the == 0 case explicitly.

Suggested change:

      swrContext_.get(), outputBuffers.data(), numRemainingSamples, nullptr, 0);
  +   TORCH_CHECK(
  +       actualNumRemainingSamples >= 0,
  +       "swr_convert() failed while flushing buffered audio samples: ",
  +       actualNumRemainingSamples);
  +   if (actualNumRemainingSamples == 0) {
  +     return std::nullopt;
  +   }

Comment on lines +923 to 926
TORCH_LIBRARY_IMPL(torchcodec_ns, BackendSelect, m) {
m.impl("create_from_file", &create_from_file);
m.impl("create_from_tensor", &create_from_tensor);
m.impl("_create_from_file_like", &_create_from_file_like);

Copilot AI Apr 4, 2026


Registering create_from_tensor under BackendSelect is risky because it has a Tensor argument and the implementation assumes host-accessible memory (no video_tensor.is_cpu() check; it is passed into AVIOFromTensorContext which reads via data_ptr()/memcpy). With this dispatch key change, CUDA tensors may no longer fail dispatch and could reach this CPU-only code path. Consider registering create_from_tensor under CPU (and only keeping no-Tensor factory ops like create_from_file / _get_json_ffmpeg_library_versions under BackendSelect), or add a strict CPU device check for video_tensor if BackendSelect is intentional.
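The suggested guard lives in C++, but the device concern can be illustrated with a small Python sketch. `create_from_tensor_guard` is a hypothetical helper, not part of torchcodec; it mirrors the strict CPU check the review suggests before the tensor's memory is read via data_ptr()/memcpy:

```python
import torch

def create_from_tensor_guard(video_tensor: torch.Tensor) -> None:
    # Hypothetical guard mirroring the review suggestion: the underlying C++
    # path reads the tensor through data_ptr()/memcpy, which assumes
    # host-accessible memory, so non-CPU tensors must be rejected up front
    # with a clear error instead of a crash deep inside the AVIO layer.
    if video_tensor.device.type != "cpu":
        raise ValueError(
            f"create_from_tensor expects a CPU tensor, got {video_tensor.device}"
        )

# A CPU tensor passes the guard silently.
create_from_tensor_guard(torch.zeros(16, dtype=torch.uint8))
```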


@SigureMo SigureMo left a comment


The core problem that schema support on the Paddle side is supposed to solve should be this diff:

[image]

Have you tried reverting this part?

"""

if isinstance(indices, list):
indices = torch.tensor(indices, dtype=torch.int64).cpu()

Can this be removed? It may be correct on a CPU machine, but on a GPU machine it could end up running on the GPU. This needs to be confirmed; CI can't catch it.
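The `.cpu()` call under discussion can be exercised directly. This is a minimal sketch (not the torchcodec code path): `torch.tensor(list)` follows the current default device, so the explicit `.cpu()` pins the indices tensor to host memory regardless of the machine.

```python
import torch

# Minimal sketch of the line under discussion: converting a list[int] to a
# tensor and explicitly moving it to CPU guarantees a host tensor even if
# the default device has been changed (e.g. to a GPU).
indices = [0, 5, 10]
indices_tensor = torch.tensor(indices, dtype=torch.int64).cpu()

assert indices_tensor.device.type == "cpu"
assert indices_tensor.dtype == torch.int64
```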


Strange, I can no longer reproduce this on 3.3.1. Did some PR fix it? Back then we never pinpointed which conversion layer diverged from PyTorch and caused a list[int] to be converted to a GPU Tensor by default; it seems that no longer happens.

@SigureMo SigureMo changed the title from "Reduce the diff with the pytrch:main branch" to "Reduce the diff with the pytorch:main branch" on Apr 10, 2026
"""

if isinstance(indices, list):
indices = torch.tensor(indices, dtype=torch.int64).cpu()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

奇怪,我在 3.3.1 都复现不了了,难道是哪个 PR 修复了?当初也没定位到是哪一层的转换与 PyTorch 不一致导致 list[int] 会被默认转换为 GPU Tensor,现在看来不会了

@SigureMo SigureMo merged commit 805df30 into PFCCLab:paddle Apr 11, 2026
27 checks passed
@SigureMo

All the ARM jobs failed.

https://github.com/PFCCLab/paddlecodec/actions/runs/24289565129/job/70924993156

Only packages up to April 7 were installed; after April 7 the aarch64 package builds appear to be broken, with no new packages since. Surely it wasn't your PR that broke them? [doge]

That doesn't seem likely; it looks more like PaddlePaddle/Paddle#78606.

@SigureMo

SigureMo commented Apr 11, 2026

@ShigureNyako Can you tell which PR between April 7 and April 8 broke the linux arm build? macos arm looks fine.

@SigureMo

> All the ARM jobs failed.

@youge325 The failure on main is a known issue; no need to rerun for now. I'll ask the packaging folks tomorrow whether the linux arm nightly build is broken.
