
chore: GPU dyn dispatch plumbing #7563

Draft
0ax1 wants to merge 2 commits into develop from ad/cuda-patches-v2

0ax1 commented Apr 20, 2026

Structural plumbing for per-op exception patches in the fused dynamic dispatch kernel. Adds PackedPatchesHeader and kernel helpers.

0ax1 requested a review from myrrc April 20, 2026 11:02
0ax1 changed the title from "chore: add patches_ptr to BitunpackParams and AlpParams" to "chore: GPU dyn dispatch plumbing" Apr 20, 2026
0ax1 added the changelog/chore (A trivial change) label Apr 20, 2026

0ax1 commented Apr 20, 2026

A follow-up PR will use the patches in ALP and bitpacking on the GPU.

0ax1 force-pushed the ad/cuda-patches-v2 branch from 1e6e8a5 to 4f2397d April 20, 2026 11:04
0ax1 marked this pull request as draft April 20, 2026 11:42
0ax1 force-pushed the ad/cuda-patches-v2 branch from 4f2397d to af8e9cc April 20, 2026 13:36

Structural plumbing for per-op exception patches in the fused
dynamic dispatch kernel. Adds PackedPatchesHeader and kernel
helpers (patch_fl_chunk, patch_all_fl_chunks) but does not yet
populate patches_ptr - all constructors initialize it to 0.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
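
For readers unfamiliar with exception patches, here is a minimal host-side C++ sketch of the general idea: after a chunk is decoded, a small list of (index, value) exceptions overwrites the positions the encoding could not represent. Everything in it is an assumption made for illustration. `PatchesSketch` and `patch_chunk_sketch` are hypothetical names, the field layout is not the PR's `PackedPatchesHeader`, and the real helpers (`patch_fl_chunk`, `patch_all_fl_chunks`) run inside the CUDA kernel and may work quite differently. The null check mirrors the commit message's note that `patches_ptr` is currently initialized to 0, which here is taken to mean "no patches".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout for illustration only; NOT the PR's PackedPatchesHeader.
struct PatchesSketch {
    uint32_t num_patches;     // number of exception entries
    const uint32_t* indices;  // positions in the decoded output to overwrite
    const float* values;      // replacement values, one per index
};

// Overwrite decoded values with exceptions whose index falls inside
// [chunk_start, chunk_start + chunk_len). A null `patches` pointer is
// assumed to mean "this op has no patches", so the call is a no-op.
inline void patch_chunk_sketch(float* out, size_t chunk_start, size_t chunk_len,
                               const PatchesSketch* patches) {
    assert(out != nullptr);
    if (patches == nullptr) {
        return;  // analogous to patches_ptr == 0
    }
    for (uint32_t i = 0; i < patches->num_patches; ++i) {
        const size_t idx = patches->indices[i];
        if (idx >= chunk_start && idx < chunk_start + chunk_len) {
            out[idx] = patches->values[i];
        }
    }
}
```

A linear scan keeps the sketch short; a GPU implementation would more plausibly have each thread handle a slice of the exceptions belonging to its chunk.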
0ax1 force-pushed the ad/cuda-patches-v2 branch from af8e9cc to 9d3b988 April 20, 2026 13:46

codspeed-hq bot commented Apr 20, 2026

Merging this PR will improve performance by 13.4%

⚡ 1 improved benchmark
✅ 1162 untouched benchmarks
⏩ 1462 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | `new_alp_prim_test_between[f32, 16384]` | 120.4 µs | 106.2 µs | +13.4% |

Comparing ad/cuda-patches-v2 (d6486c0) with develop (f77cf60)

Footnotes

  1. 1462 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>