perf: cache default ArraySpec for regular chunk grids #3908
d-v-b wants to merge 6 commits into zarr-developers:main
For regular grids, all chunks have the same codec_shape, so we can build the ArraySpec once and reuse it for every chunk, avoiding the per-chunk ChunkGrid.__getitem__ + ArraySpec construction overhead. Adds _get_default_chunk_spec() and uses it in _get_selection and _set_selection. Saves ~5ms per 1000 chunks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
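A minimal sketch of the idea described above, under stated assumptions: `ToySpec`, `ToyGrid`, `chunk_spec`, `default_chunk_spec`, and `specs_for` are hypothetical stand-ins invented for illustration, not zarr's `ArraySpec`, `ChunkGrid`, or the PR's actual implementation. When every chunk has the same shape, the spec is built once outside the loop and shared; otherwise the helper returns `None` and the caller falls back to building a spec per chunk.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToySpec:
    """Hypothetical stand-in for ArraySpec: only a shape and a dtype name."""

    shape: tuple[int, ...]
    dtype: str


class ToyGrid:
    """Hypothetical stand-in for a chunk grid: maps chunk coords to chunk shape."""

    def __init__(self, chunk_shapes: dict[tuple[int, ...], tuple[int, ...]]) -> None:
        self.chunk_shapes = chunk_shapes

    @property
    def is_regular(self) -> bool:
        # In this sketch, "regular" means every chunk has the same shape.
        return len(set(self.chunk_shapes.values())) == 1


def chunk_spec(grid: ToyGrid, coords: tuple[int, ...], dtype: str) -> ToySpec:
    """Per-chunk path: look up the chunk shape and build a new spec each time."""
    return ToySpec(shape=grid.chunk_shapes[coords], dtype=dtype)


def default_chunk_spec(grid: ToyGrid, dtype: str) -> ToySpec | None:
    """Return one shared spec for a regular grid, or None if shapes differ."""
    if not grid.is_regular:
        return None
    any_shape = next(iter(grid.chunk_shapes.values()))
    return ToySpec(shape=any_shape, dtype=dtype)


def specs_for(
    grid: ToyGrid, coords_list: list[tuple[int, ...]], dtype: str
) -> list[ToySpec]:
    """Build the default spec once, then reuse it for every requested chunk."""
    default = default_chunk_spec(grid, dtype)  # constructed outside the loop
    return [
        default if default is not None else chunk_spec(grid, coords, dtype)
        for coords in coords_list
    ]
```

For a regular grid, every element of the returned list is the same object, so the per-chunk lookup and construction cost is paid once rather than once per chunk.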
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##             main    #3908   +/-   ##
=======================================
  Coverage   93.11%   93.11%
=======================================
  Files          85       85
  Lines       11369    11375    +6
=======================================
+ Hits        10586    10592    +6
  Misses        783      783
def _get_default_chunk_spec(
    metadata: ArrayMetadata,
    chunk_grid: ChunkGrid,
    array_config: ArrayConfig,
    prototype: BufferPrototype,
) -> ArraySpec | None:
given the name of _get_default_chunk_spec, should this be a ChunkSpec?
It's the ArraySpec for a chunk, not a ChunkSpec. The consumer needs an ArraySpec, so we can't change the return type.
ok, still a bit confusing but I guess that might relate to the fact that a better design wouldn't need a per-chunk ArraySpec.
My only other question is why build a new function rather than making chunk_coords: tuple[int, ...] optional in _get_chunk_spec, such that it could return a default?
what would the default chunk coordinates be? the "origin" chunk coordinate depends on the dimensionality of the array
Part of the goal here is to avoid object creation overhead inside _get_chunk_spec. Adding default parameters to _get_chunk_spec would not help us, because we would still create many identical ArraySpec objects. The only change to _get_chunk_spec that would help is adding a caching layer via @lru_cache. I should see if that works.
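A rough sketch of what an `@lru_cache` layer could look like, assuming the inputs can be reduced to hashable values. `ToySpec` and `cached_chunk_spec` are hypothetical names for illustration, not zarr APIs. Note that `functools.lru_cache` requires every argument to be hashable, so whether this works for the real `_get_chunk_spec` depends on whether its metadata, config, and prototype arguments are hashable, which this sketch does not establish.

```python
from functools import lru_cache
from typing import NamedTuple, Tuple


class ToySpec(NamedTuple):
    """Hypothetical stand-in for ArraySpec: only a shape and a dtype name."""

    shape: Tuple[int, ...]
    dtype: str


@lru_cache(maxsize=None)
def cached_chunk_spec(chunk_shape: Tuple[int, ...], dtype: str) -> ToySpec:
    """Build a spec once per distinct (chunk_shape, dtype) pair.

    Repeated calls with equal arguments return the cached object instead of
    constructing a new one. All arguments must be hashable.
    """
    return ToySpec(shape=chunk_shape, dtype=dtype)
```

With a cache like this, a regular grid would hit the cache on every chunk after the first, while an irregular grid would get one entry per distinct chunk shape. Each call still pays for hashing the arguments and a dictionary lookup, which a spec hoisted out of the loop avoids.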
Yeah, I agree with the goal. It might just be me, but I find the code confusing, which was the reason for all my comments. I think a more interpretable flow that accomplishes the same thing could look like:
regular_grid = chunk_grid.is_regular
if regular_grid:
    regular_chunk_spec = _get_chunk_spec(metadata, chunk_grid, _config, prototype)
results = await codec_pipeline.read(
    [
        (
            store_path / metadata.encode_chunk_key(chunk_coords),
            regular_chunk_spec
            if regular_grid
            else _get_chunk_spec(metadata, chunk_grid, chunk_coords, _config, prototype),
            chunk_selection,
            out_selection,
            is_complete_chunk,
        ...
Merging this PR will improve performance by 11.1%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | WallTime | test_slice_indexing[None-(slice(None, None, None), slice(None, None, None), slice(None, None, None))-memory] | 389.8 ms | 352.5 ms | +10.59% |
| ⚡ | WallTime | test_slice_indexing[None-(slice(0, None, 4), slice(0, None, 4), slice(0, None, 4))-memory] | 386.7 ms | 348.3 ms | +11.01% |
| ⚡ | WallTime | test_slice_indexing[None-(slice(10, -10, 4), slice(10, -10, 4), slice(10, -10, 4))-memory] | 211.6 ms | 190.4 ms | +11.1% |
| ⚡ | WallTime | test_slice_indexing[None-(slice(None, None, None), slice(0, 3, 2), slice(0, 10, None))-memory_get_latency] | 4.3 ms | 3.9 ms | +10.54% |
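The Efficiency column looks consistent with BASE / HEAD − 1, up to rounding of the displayed timings. That formula is inferred from the numbers in the table, not taken from the benchmarking tool's documentation, so treat the helper below (`relative_speedup`, a hypothetical name) as a sanity check rather than the tool's definition.

```python
def relative_speedup(base_ms: float, head_ms: float) -> float:
    """Percent improvement, assuming the metric is BASE / HEAD - 1."""
    return (base_ms / head_ms - 1.0) * 100.0


# (BASE, HEAD) wall times in milliseconds, copied from the table above.
rows = [(389.8, 352.5), (386.7, 348.3), (211.6, 190.4), (4.3, 3.9)]

for base, head in rows:
    print(f"{base} ms -> {head} ms: {relative_speedup(base, head):+.2f}%")
```

The last row deviates most from the reported value, which is expected if the report computes from unrounded timings: at one decimal place, 4.3 ms and 3.9 ms carry much larger relative rounding error than the other rows.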
Comparing d-v-b:perf/cache-default-chunk-spec (6fa93cb) with main (029c376) [2]
Footnotes
1. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.
2. No successful run was found on main (c4730be) during the generation of this report, so 029c376 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.