Skip to content

Test merged Publish ARM wheels to GitHub#6

Closed
youge325 wants to merge 175 commits intopaddlefrom
arm_merge
Closed

Test merged Publish ARM wheels to GitHub#6
youge325 wants to merge 175 commits intopaddlefrom
arm_merge

Conversation

@youge325
Copy link
Copy Markdown

No description provided.

dvrogozh and others added 30 commits October 30, 2025 18:50
…torch#1002)

Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@gmail.com>
Co-authored-by: Nicolas Hug <nicolashug@meta.com>
Co-authored-by: Molly Xu <mollyxu@fb.com>
Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@gmail.com>
Co-authored-by: Nicolas Hug <nicolashug@meta.com>
Co-authored-by: Molly Xu <mollyxu@fb.com>

Closes meta-pytorch#1009
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Differential Revision: D87109526

Pull Request resolved: meta-pytorch#1059
Copilot AI review requested due to automatic review settings April 11, 2026 14:53
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates TorchCodec’s packaging/build/test infrastructure (including FFmpeg variant handling and wheel testing), introduces new core decoding/encoding capabilities (notably NVDEC cache controls, transforms, and WAV decoding), and refreshes documentation/examples to match the new APIs and hosting.

Changes:

  • Added multi-FFmpeg-version CMake package config and refactored FFmpeg target creation utilities.
  • Expanded core/runtime functionality (stable-ABI refactors, decoder/encoder enhancements, NVDEC cache capacity API, transforms/rotation/resize/crop plumbing, WAV decoder support).
  • Updated CI workflows, docs theme and site links, examples, benchmarks, and added new test/reference resources.

Reviewed changes

Copilot reviewed 128 out of 154 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
test/resources/test_non_zero_start.mp4.stream0.all_frames_info.json Adds new reference frame timing data for tests
test/resources/sine_16ch_s16.wav.stream0.all_frames_info.json Adds new audio reference frame timing data
test/resources/nasa_13013_rotated.mp4.stream0.all_frames_info.json Adds rotated-video reference timing data
test/generate_reference_resources.py Improves frame generation filtergraph construction and adds new NASA reference generators
test/conftest.py Adds ffmpeg-cli marker and fbcode collection behavior
test/MetadataTest.cpp Adds C++ unit tests for StreamMetadata fallback logic
test/CMakeLists.txt Bumps C++ standard and wires up new C++ test target
src/torchcodec/transforms/init.py Exposes decoder transform classes in public module
src/torchcodec/share/cmake/TorchCodec/ffmpeg_versions.cmake Adds CMake helpers to define FFmpeg-versioned targets
src/torchcodec/share/cmake/TorchCodec/TorchCodecConfig.cmake Adds install-time CMake package config for consumers
src/torchcodec/samplers/_time_based.py Modernizes typing and makes timestamp tensors float64 for precision
src/torchcodec/samplers/_index_based.py Modernizes typing with PEP 604 unions
src/torchcodec/samplers/_common.py Switches to collections.abc and PEP 604 union types
src/torchcodec/encoders/_audio_encoder.py Modernizes typing (PEP 604) in encoder API
src/torchcodec/decoders/_wav_decoder.py Adds a Python WAV-only decoder binding
src/torchcodec/decoders/_decoder_utils.py Adds NVDEC cache capacity API wrappers; removes generic decoder factory from this layer
src/torchcodec/decoders/_audio_decoder.py Refactors AudioDecoder creation through core decoder utils
src/torchcodec/decoders/init.py Exposes new CUDA/NVDEC cache APIs and CpuFallbackStatus
src/torchcodec/_samplers/video_clip_sampler.py Typing updates and core API import refactor
src/torchcodec/_internally_replaced_utils.py Adds robust shared-library loading across FFmpeg variants with tracebacks
src/torchcodec/_frame.py Typing refactor to collections.abc and PEP 604 unions
src/torchcodec/_core/pybind_ops.cpp Adjusts include paths for in-tree headers
src/torchcodec/_core/fetch_and_expose_non_gpl_ffmpeg_libs.cmake Adds Linux aarch64 FFmpeg fetch support and reuses shared ffmpeg target helper
src/torchcodec/_core/_decoder_utils.py Introduces core-level decoder factories (audio/video/wav) and transform spec handling
src/torchcodec/_core/init.py Updates exported ops and adds NVDEC cache/wav functions
src/torchcodec/_core/WavDecoder.h Adds C++ WAV decoder interface
src/torchcodec/_core/ValidationUtils.h Adds uint64->streampos validation API
src/torchcodec/_core/ValidationUtils.cpp Implements streampos validation and uses stable ABI checks
src/torchcodec/_core/Transform.h Expands transform system (crop center mode, rotation, validation signature changes)
src/torchcodec/_core/Transform.cpp Implements crop validation updates and adds rotation transform logic
src/torchcodec/_core/SwScale.h Adds SwScale wrapper for two-pass conversion+resize pipeline
src/torchcodec/_core/SwScale.cpp Implements SwScale conversion pipeline
src/torchcodec/_core/StreamOptions.h Expands encoding options and adopts stable ABI device type
src/torchcodec/_core/StableABICompat.h Adds stable-ABI utilities/helpers and visibility macros
src/torchcodec/_core/NVDECCacheConfig.h Adds global NVDEC cache capacity config interface
src/torchcodec/_core/NVDECCacheConfig.cpp Implements NVDEC cache capacity config with eviction
src/torchcodec/_core/NVDECCache.h Refactors NVDEC cache to LRU multimap and stable device types
src/torchcodec/_core/NVDECCache.cpp Implements LRU eviction and cross-device eviction helpers
src/torchcodec/_core/NVCUVIDRuntimeLoader.cpp Converts checks to stable-ABI-safe macro and adjusts includes
src/torchcodec/_core/Metadata.h Extends metadata model and adds computed fallback methods and color metadata
src/torchcodec/_core/Metadata.cpp Implements StreamMetadata fallback/computed methods
src/torchcodec/_core/Frame.h Converts outputs to torch::stable::Tensor and stable device types
src/torchcodec/_core/Frame.cpp Updates tensor allocation to stable ABI APIs
src/torchcodec/_core/FilterGraph.h Renames filter context struct to FiltersConfig
src/torchcodec/_core/FilterGraph.cpp Refactors filtergraph setup, adds more robust checks and naming
src/torchcodec/_core/FFMPEGCommon.h Adds UniqueAVDictionary, rotation extraction, and SwsConfig changes
src/torchcodec/_core/Encoder.h Refactors encoders to stable ABI tensors; adds MultiStreamEncoder API
src/torchcodec/_core/DeviceInterface.h Refactors device interface to stable ABI types and adds encoding hooks
src/torchcodec/_core/DeviceInterface.cpp Updates device parsing and stable tensor conversion, improves checks
src/torchcodec/_core/CudaDeviceInterface.h Updates interface for stable ABI and encoding support
src/torchcodec/_core/CpuDeviceInterface.h Adds audio init/flush, SwScale, encoding conversion hooks, and config caching
src/torchcodec/_core/Cache.h Refactors cache to stable ABI device and checks
src/torchcodec/_core/CUDACommon.h Refactors CUDA helpers to stable ABI tensor/device and cudaStream_t
src/torchcodec/_core/BetaCudaDeviceInterface.h Refactors to stable ABI and adds rotation support surface
src/torchcodec/_core/AVIOTensorContext.h Refactors tensor I/O to stable ABI and exports visibility
src/torchcodec/_core/AVIOTensorContext.cpp Updates stable tensor reads/writes and uses stableCat
src/torchcodec/_core/AVIOFileLikeContext.h Adjusts includes for in-tree header layout
src/torchcodec/_core/AVIOFileLikeContext.cpp Converts checks to stable-ABI-safe macro
src/torchcodec/_core/AVIOContextHolder.h Adjusts includes for in-tree header layout
src/torchcodec/_core/AVIOContextHolder.cpp Converts checks to stable-ABI-safe macro and improves errors
src/torchcodec/init.py Exposes encoders/transforms at package top-level
setup.py Adds env flag for Homebrew rpath handling and adjusts version file logic
pyproject.toml Bumps minimum Python to 3.10 and updates tooling/pytest config
packaging/remove_src.sh Adds helper script to ensure tests run against installed wheel
packaging/install_torchcodec_wheel.sh Adds helper script to locate/install built wheel artifacts
packaging/install_test_dependencies.sh Adds helper script for Python test dependencies
packaging/install_pytorch.sh Adds helper script to install PyTorch from appropriate channel
packaging/install_ffmpeg.sh Adds helper script to install FFmpeg via conda-forge after asserting absence
mypy.ini Enables follow_untyped_imports
examples/encoding/audio_encoding.py Adds sphinx-gallery thumbnail metadata
examples/decoding/sampling.py Updates Optional typing to PEP 604
examples/decoding/parallel_decoding.py Typing modernization and adds sphinx-gallery thumbnail metadata
examples/decoding/file_like.py Adds sphinx-gallery thumbnail metadata
examples/decoding/custom_frame_mappings.py Adds thumbnail metadata and reformats ffprobe command
examples/decoding/basic_example.py Updates Optional typing to PEP 604
examples/decoding/basic_cuda_example.py Updates README links and adds CPU fallback detection section
examples/decoding/audio_decoding.py Adds thumbnail metadata
examples/decoding/approximate_mode.py Documentation wording improvements and adds thumbnail metadata
docs/source/index.rst Updates links/structure, adds new examples and API ref entrypoints
docs/source/conf.py Migrates to pytorch_sphinx_theme2 and adds sitemap/mermaid/version switcher config
docs/source/api_ref_transforms.rst Adds transforms API reference page
docs/source/api_ref_encoders.rst Adds VideoEncoder to encoder API docs
docs/source/api_ref_decoders.rst Adds metadata summaries, NVDEC cache APIs, and CpuFallbackStatus docs
docs/source/api_ref.rst Adds central API Reference toctree
docs/source/_templates/layout.html Removes old theme layout override
docs/source/_static/css/custom_torchcodec.css Removes custom CSS for old theme
docs/requirements.txt Updates doc build dependencies for new Sphinx/theme stack
benchmarks/encoders/benchmark_encoders.py Adds encoder benchmarking script
benchmarks/decoders/memprofile_decoders.py Updates imports split between _core and _core.ops
benchmarks/decoders/gpu_benchmark.py Refactors to use direct _core/_core.ops imports
benchmarks/decoders/benchmark_transforms.py Updates transforms benchmark to use VideoDecoder transforms interface
benchmarks/decoders/benchmark_decoders_library.py Refactors imports split between _core and _core.ops
README.md Updates docs links/FFmpeg guidance/compat table and CPU/CUDA install instructions
CONTRIBUTING.md Updates FFmpeg install guidance and adds local docs serving instructions
CMakeLists.txt Defines LINUX variable globally for internal builds
.pre-commit-config.yaml Adds pyupgrade hook for py310+ in src/test
.github/workflows/windows_wheel.yaml Refactors wheel test job to use new packaging scripts and removes src/
.github/workflows/reference_resources.yaml Refactors reference generation job to use packaging scripts
.github/workflows/macos_wheel.yaml Refactors wheel test job to use packaging scripts and removes src/
.github/workflows/linux_wheel.yaml Refactors wheel test jobs, adds third-party-interface job, uses packaging scripts
.github/workflows/linux_cuda_aarch64_wheel.yaml Adds new workflow for Linux CUDA aarch64 wheel build/test
.github/workflows/lint.yaml Updates actions versions and uses packaging scripts for PyTorch install
.github/workflows/docs.yaml Removes previous docs workflow
.github/workflows/cpp_tests.yaml Updates actions versions and uses packaging scripts for PyTorch install
.github/workflows/build_ffmpeg.yaml Adds Linux aarch64 FFmpeg build job

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +52 to +56
if in_fbcode():
# fbcode doesn't like skipping tests, so instead we just don't collect the test
# so that they don't even "exist", hence the continue statements.
if needs_ffmpeg_cli or has_skip_marker or skipif_condition_is_true:
continue
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In pytest_collection_modifyitems(), using continue does not remove the item from the items list, so the test will still be collected and executed. To actually avoid collection, filter items in-place (e.g., rebuild a list of kept items and assign to items[:]) or add an unconditional skip marker in fbcode rather than continuing.

Copilot uses AI. Check for mistakes.
Comment on lines +24 to +35
if (LINUX)
if (ffmpeg_major_version EQUAL 4)
set(library_file_names libavutil.so.56 libavcodec.so.58 libavformat.so.58 libavdevice.so.58 libavfilter.so.7 libswscale.so.5 libswresample.so.3)
elseif (ffmpeg_major_version EQUAL 5)
set(library_file_names libavutil.so.57 libavcodec.so.59 libavformat.so.59 libavdevice.so.59 libavfilter.so.8 libswscale.so.6 libswresample.so.4)
elseif (ffmpeg_major_version EQUAL 6)
set(library_file_names libavutil.so.58 libavcodec.so.60 libavformat.so.60 libavdevice.so.60 libavfilter.so.9 libswscale.so.7 libswresample.so.4)
elseif (ffmpeg_major_version EQUAL 7)
set(library_file_names libavutil.so.59 libavcodec.so.61 libavformat.so.61 libavdevice.so.61 libavfilter.so.10 libswscale.so.8 libswresample.so.5)
elseif (ffmpeg_major_version EQUAL 8)
set(library_file_names libavutil.so.60 libavcodec.so.62 libavformat.so.62 libavdevice.so.62 libavfilter.so.11 libswscale.so.9 libswresample.so.6)
endif()
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LINUX is not a standard CMake platform variable for downstream consumers, so on Linux this will often evaluate false and fall through to the 'unsupported OS' fatal error path. Use CMake’s built-in platform conditionals instead (e.g., if(UNIX AND NOT APPLE) for Linux) or CMAKE_SYSTEM_NAME checks inside this helper, since this file is intended to be used from an installed package config.

Copilot uses AI. Check for mistakes.
Comment on lines +24 to +30
if (CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64|arm64|ARM64")
set(
platform_url
${base_url}/linux_aarch64
)

set(
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CMake logic contains a duplicated if (CMAKE_SYSTEM_PROCESSOR MATCHES ...) block (one starting at line 18 and another at line 24) without a clearly corresponding endif() for the first one in the shown hunk. As written, this is very likely to produce a CMake parse error or unintended nesting/branching. Remove the duplicate block and ensure the if/else/endif structure is balanced and only sets platform_url once.

Suggested change
if (CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64|arm64|ARM64")
set(
platform_url
${base_url}/linux_aarch64
)
set(
set(

Copilot uses AI. Check for mistakes.
Comment on lines +128 to +133
Rotation rotationFromDegrees(std::optional<double> degrees) {
if (!degrees.has_value()) {
return Rotation::NONE;
}
// Round to nearest multiple of 90 degrees
int rounded = static_cast<int>(std::round(*degrees / 90.0)) * 90;
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::round requires including <cmath>. This file’s includes in the diff do not show <cmath>, which can cause compilation failures depending on transitive includes. Add an explicit #include <cmath> in this translation unit.

Copilot uses AI. Check for mistakes.
Comment on lines +175 to +177
default:
return "";
}
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default branch silently returns an empty filter string, which can hide unsupported/invalid Rotation values and makes failures harder to diagnose. Prefer a checked default (e.g., STD_TORCH_CHECK(false, ...)) so unexpected enum values fail loudly.

Suggested change
default:
return "";
}
}
STD_TORCH_CHECK(
false, "Unknown rotation: ", static_cast<int>(rotation_));

Copilot uses AI. Check for mistakes.
0,
config_.inputHeight,
dstPointers,
dstLinesizes);
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resize swscale pass result is not validated. If sws_scale returns a value different from config_.outputHeight, callers may get partially converted output without an error. Consider adding a STD_TORCH_CHECK validating the returned height matches the expected output height when needsResize_ is true.

Suggested change
dstLinesizes);
dstLinesizes);
STD_TORCH_CHECK(
colorConvertedHeight == config_.outputHeight,
"Resize swscale pass failed: colorConvertedHeight != config_.outputHeight: ",
colorConvertedHeight,
" != ",
config_.outputHeight);

Copilot uses AI. Check for mistakes.
Comment on lines 203 to 211
# the content of `version.txt` plus some suffix like "+cpu" or "+cu112".
# See
# https://github.com/pytorch/test-infra/blob/61e6da7a6557152eb9879e461a26ad667c15f0fd/tools/pkg-helpers/pytorch_pkg_helpers/version.py#L113
version = version.replace("+cpu", "")
with open(_ROOT_DIR / "version.txt", "w") as f:
f.write(f"{version}")
else:
with open(_ROOT_DIR / "version.txt") as f:
version = f.readline().strip()
try:
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The +cpu suffix stripping is now only applied in the else: branch but not in the branch that writes version.txt. This creates inconsistent version normalization depending on whether version.txt already exists. If the intent is to always strip +cpu, apply the same normalization in both branches (before writing and after reading).

Copilot uses AI. Check for mistakes.


class WavDecoder:
# TODO: Docstrings
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WavDecoder is introduced as a new public-facing decoder class but has an explicit 'TODO: Docstrings' and no class/method docstrings. Please add at least a class docstring describing supported WAV constraints (e.g., uncompressed only), what metadata contains, and the return type/shape/dtype of get_all_samples() to make the API self-documenting.

Suggested change
# TODO: Docstrings
"""Decode audio from a WAV file.
This decoder supports a single audio stream from WAV input backed by the
native WAV decoder. It is intended for standard, uncompressed WAV audio
data (for example PCM-style WAV files).
Args:
source: Path to the WAV file to decode, as a string or ``pathlib.Path``.
Attributes:
metadata (AudioStreamMetadata): Metadata for the decoded audio stream.
The populated fields are:
- ``sample_rate``: sampling rate in Hz.
- ``num_channels``: number of audio channels.
- ``sample_format``: sample format reported by the decoder.
- ``duration_seconds``: decoded stream duration in seconds.
- ``stream_index``: stream index in the container.
- ``codec``: codec name reported for the WAV audio stream.
- ``bit_rate``: bit rate reported by the decoder.
- ``duration_seconds_from_header``: duration derived from the WAV
header when available.
- ``begin_stream_seconds``: stream start time in seconds.
- ``begin_stream_seconds_from_header``: always ``None`` for WAV,
because the format does not provide stream start time metadata.
Methods:
get_all_samples(): Returns all decoded audio samples as a
``torch.Tensor`` with shape ``[num_channels, num_samples]`` and
dtype ``torch.float32``.
"""

Copilot uses AI. Check for mistakes.
)
transformed_frames = decoder.get_frames_played_at(pts_seconds).data
assert len(transformed_frames) == len(pts_seconds)
return transformed_frames.data
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here transformed_frames is already a Tensor (because of the .data on line 77). Returning transformed_frames.data uses the deprecated/unsafe .data attribute and is redundant. Return transformed_frames directly to avoid surprising autograd semantics and keep the benchmark representative of normal tensor usage.

Suggested change
return transformed_frames.data
return transformed_frames

Copilot uses AI. Check for mistakes.
@youge325 youge325 marked this pull request as draft April 11, 2026 15:19
@youge325 youge325 closed this Apr 11, 2026
@youge325 youge325 deleted the arm_merge branch April 12, 2026 00:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.