Refactor compression pipeline: shared zarr LocalStore, hybrid MPI+threads, in-memory evaluation #4
Open
kotsaloscv wants to merge 14 commits into
nfarabullini approved these changes Apr 28, 2026
## Summary
End-to-end refactor of the `evaluate_combos` / `compress_with_optimal` / `merge_compressed_fields` pipeline. The sweep now runs fully in-memory, within-node parallelism moves from MPI ranks to a `ThreadPoolExecutor`, and compressed fields land directly in a shared `{dataset}.zarr` `LocalStore` instead of per-field `.zarr.zip` files that had to be unzipped and re-merged.

## What changed
### Storage layout
- `.zarr.zip` is gone. `compress_with_optimal` writes each field directly into a shared `{where_to_write}/{dataset}.zarr` `LocalStore` (component = field name); see the sketch after this list.
- `merge_compressed_fields` no longer unzips/copies/rezips; it just runs `zarr.consolidate_metadata`.
- `open_zarr_zip_file_and_inspect` → `open_zarr_and_inspect`, `from_zarr_zip_to_netcdf` → `from_zarr_to_netcdf`. Both now take a `.zarr` directory path.
- `utils.open_dataset` drops `.zarr.zip` support; `.nc`, `.grib`, `.zarr` remain.
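A minimal sketch of the new layout, assuming zarr-python ≥ 3 (`LocalStore`, `Group.create_array`); the path, field name, and shapes are illustrative, not taken from the PR:

```python
import numpy as np
import zarr
from zarr.storage import LocalStore

# One shared store per dataset; each field becomes a component (group member).
store = LocalStore("output/dataset.zarr")            # hypothetical path
root = zarr.open_group(store=store, mode="a")

field = root.create_array("temperature", shape=(24, 180, 360),
                          chunks=(1, 180, 360), dtype="f4")
field[:] = np.zeros((24, 180, 360), dtype="f4")      # stand-in for compressed output

# The "merge" step is now just metadata consolidation: no unzip/copy/rezip.
zarr.consolidate_metadata(store)
```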
### Parallelism (hybrid MPI + threads)

- `evaluate_combos` now requires 1 MPI rank per node; within-node parallelism is a `ThreadPoolExecutor`. Multi-rank-per-node launches are rejected with a clear launch-string hint.
- New helpers: `detect_node_topology` (MPI-3 `Split_type`, hostname fallback), `detect_cores_available` (cgroup/Slurm-aware via `sched_getaffinity`), `compute_default_threads_per_rank`. A sketch follows this list.
- New `--threads-per-rank` flag (auto-detected if omitted).
- `--oversubscription-check` warns/aborts if `OMP_NUM_THREADS`/`MKL_NUM_THREADS`/`OPENBLAS_NUM_THREADS`/`BLOSC_NTHREADS`/`NUMBA_NUM_THREADS` aren't pinned to 1. Zarr v3's internal thread pool is also pinned.
- `progress_bar` and `Timer` are now thread-safe (locks around shared counters/dict).
- `sys.exit(1)` paths that could hang siblings at the next collective are now `comm.Abort(1)`.
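A sketch of the core-detection and oversubscription checks described above; `detect_cores_available` mirrors the PR's helper name, `unpinned_thread_vars` is illustrative:

```python
import os

def detect_cores_available() -> int:
    """Cores this process may actually use (honours cgroup/Slurm CPU masks)."""
    try:
        return len(os.sched_getaffinity(0))   # Linux-only; respects affinity
    except AttributeError:
        return os.cpu_count() or 1            # fallback on other platforms

def unpinned_thread_vars() -> list[str]:
    """Library thread-count env vars that are not pinned to 1."""
    watched = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS",
               "BLOSC_NTHREADS", "NUMBA_NUM_THREADS")
    return [v for v in watched if os.environ.get(v) != "1"]

# --oversubscription-check: warn (or abort the whole MPI job) if e.g. OpenMP
# would spawn its own pool inside every ThreadPoolExecutor worker.
if bad := unpinned_thread_vars():
    print(f"warning: thread env vars not pinned to 1: {', '.join(bad)}")
```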
### Evaluation pipeline

- `evaluate_codec_pipeline` runs entirely against a `MemoryStore`: no disk I/O per combo, no zip wrapping.
- Error norms are accumulated chunk-wise, so a full decompressed copy of the sample is never held in memory (see the sketch below).
- Inside worker threads the Dask scheduler is pinned (with `dask.config.set`) to prevent nested pools.
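A sketch of the chunk-wise error accumulation against an in-memory store, assuming zarr-python ≥ 3 (`MemoryStore`); the function name and the particular norms are illustrative, not the PR's exact code:

```python
import numpy as np
import zarr
from zarr.storage import MemoryStore

def chunkwise_error_norms(original: np.ndarray, store: MemoryStore, name: str):
    """Accumulate RMSE / max-abs error block by block along the leading dim,
    so the full decompressed sample is never materialised at once."""
    z = zarr.open_array(store=store, path=name)
    sq_sum, max_err, count = 0.0, 0.0, 0
    step = z.chunks[0]
    for start in range(0, z.shape[0], step):
        sl = slice(start, min(start + step, z.shape[0]))
        diff = original[sl].astype("f8") - np.asarray(z[sl], dtype="f8")
        sq_sum += float(np.sum(diff * diff))
        max_err = max(max_err, float(np.max(np.abs(diff), initial=0.0)))
        count += diff.size
    return np.sqrt(sq_sum / count), max_err
```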
### Persistence path (with sharding)

- `persist_with_codec_pipeline` writes via `dask.array.to_zarr` with inner chunks + shards; Dask chunks are rechunked to the shard shape so each write = one shard (sketched below).
- `compress_with_optimal` gains `--inner-chunk-mib` (default 16), `--shard-mib` (default 512), `--threads`, and `--verify`/`--no-verify` (on by default; disable to skip the re-read pass for trusted combos).
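A sketch of the shard-aligned write, assuming zarr-python ≥ 3 with sharding (`shards=` in `create_array`). For clarity it pre-creates the array and writes with `dask.array.store` rather than the PR's `dask.array.to_zarr` call, and all shapes are illustrative:

```python
import dask.array as da
import zarr
from zarr.storage import LocalStore

inner_chunks = (1, 256, 256)    # chunk shape inside each shard
shard_shape = (16, 256, 256)    # one shard = 16 inner chunks

store = LocalStore("output/dataset.zarr")
group = zarr.open_group(store=store, mode="a")
target = group.create_array("temperature", shape=(160, 256, 256), dtype="f4",
                            chunks=inner_chunks, shards=shard_shape)

x = da.random.random((160, 256, 256)).astype("f4")
# Rechunk the Dask array to the shard shape so every write task produces
# exactly one complete shard (no read-modify-write of partial shards).
da.store(x.rechunk(shard_shape), target, lock=False)
```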
### Representative sampling

- `build_representative_sample`: stride-samples along the leading dim via `np.linspace`, keeping trailing spatial dims full. Deterministic, so `evaluate_combos` and `compress_with_optimal` build identical codec spaces (see the sketch after this list).
- New `--eval-data-size-limit` flag (default `"5GB"`) with a `parse_size` helper for `GB`/`GiB`/`MiB`/... strings. Must match between `evaluate_combos` and `compress_with_optimal` so the codec-space indices resolve to the same objects.
- `--field-percentage-to-compress` is removed.
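A sketch of the sampling and size-parsing helpers; the names match the PR, but these bodies are illustrative rather than the actual implementation:

```python
import numpy as np

def parse_size(text: str) -> int:
    """Turn '5GB', '512MiB', ... into a byte count (decimal and binary units)."""
    units = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
             "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40, "B": 1}
    for unit in sorted(units, key=len, reverse=True):   # longest suffix first
        if text.endswith(unit):
            return int(float(text[: -len(unit)]) * units[unit])
    return int(text)  # bare number of bytes

def build_representative_sample(arr: np.ndarray, size_limit: str = "5GB") -> np.ndarray:
    """Stride-sample along the leading dim (trailing spatial dims stay full).
    Deterministic: the sweep and the final compression pick the same slices."""
    budget = parse_size(size_limit)
    per_slice = arr[0].nbytes
    n = int(np.clip(budget // per_slice, 1, arr.shape[0]))
    idx = np.unique(np.linspace(0, arr.shape[0] - 1, num=n, dtype=int))
    return arr[idx]
```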
### MPI

- `broadcast_numpy` uses `Bcast` (uppercase, buffer-protocol) with shape/dtype metadata piggybacked over pickle. This lifts the payload ceiling from ~2 GB (the old `[buf, MPI.BYTE]` path would silently cap / corrupt at the 5 GB default sample budget) to ~16 GB for float64 (sketched below).
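A sketch of the uppercase-`Bcast` pattern, assuming mpi4py; it follows the description above but is illustrative, not the PR's exact code:

```python
import numpy as np
from mpi4py import MPI

def broadcast_numpy(arr, comm, root=0):
    """Broadcast a (possibly multi-GB) ndarray: shape/dtype go over the small
    pickled bcast, the payload over the buffer-protocol Bcast. Using the real
    dtype (not MPI.BYTE) makes the count a number of elements, so the ceiling
    becomes ~2**31 elements (~16 GB for float64) instead of ~2 GB of raw bytes."""
    meta = (arr.shape, arr.dtype) if comm.rank == root else None
    shape, dtype = comm.bcast(meta, root=root)      # tiny pickled metadata
    if comm.rank != root:
        arr = np.empty(shape, dtype=dtype)
    comm.Bcast(arr, root=root)                      # zero-copy buffer broadcast
    return arr
```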
### Results / audit trail

- Each rank writes `config_space_{var}_rank{N}.csv` (flushed per row; survives mid-sweep crashes), consolidated into `results_{var}.parquet` on rank 0. Both passing and filtered-out combos are recorded, distinguished by a `keep` column. A sketch follows this list.
- Output names are keyed by `var` (not `field_to_compress or "all"`) so multi-field runs no longer collide.
- `analyze_clustering` now takes required `--where-to-write` and `--var` flags instead of silently reading `config_space.csv` from cwd.
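A sketch of the audit-trail plumbing; the file-name patterns come from the PR, the helper names and the pandas usage are illustrative:

```python
import csv
import glob
import pandas as pd

def append_row(csv_path: str, row: dict) -> None:
    """Append one combo's result and flush immediately, so a crash mid-sweep
    still leaves every finished row on disk."""
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if f.tell() == 0:
            writer.writeheader()
        writer.writerow(row)
        f.flush()

def consolidate_results(where_to_write: str, var: str) -> None:
    """Rank 0: merge the per-rank CSVs into results_{var}.parquet."""
    parts = sorted(glob.glob(f"{where_to_write}/config_space_{var}_rank*.csv"))
    df = pd.concat((pd.read_csv(p) for p in parts), ignore_index=True)
    df.to_parquet(f"{where_to_write}/results_{var}.parquet", index=False)

# Example row: filtered-out combos are still recorded, with keep=False.
# append_row("out/config_space_t_rank0.csv", {"combo": 17, "ratio": 3.2, "keep": True})
```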
### Imports / deps

- Heavy plotting/clustering deps (`matplotlib`, `sklearn`, `plotly`, `tqdm`) are now imported lazily inside `perform_clustering`/`analyze_clustering`; `evaluate_combos` and `compress_with_optimal` no longer pay their import cost (see the sketch below).
- `pyproject.toml`: dropped strict pins on `numpy`, `dask`, `zarr`, `numcodecs`.
- `Dockerfile`: removed the `pip install --force-reinstall "dask[...]" "numpy==..."` line that conflicted with the loosened pins.
- `.gitignore`: added `*.parquet`.
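The lazy-import pattern, sketched; `perform_clustering` is the PR's function name, but this body is a placeholder:

```python
def perform_clustering(results_df):
    # Heavy, optional deps are imported only when clustering/plotting is
    # actually requested, so evaluate_combos / compress_with_optimal never
    # pay their import cost.
    from sklearn.cluster import KMeans
    import matplotlib.pyplot as plt

    labels = KMeans(n_clusters=4, n_init="auto").fit_predict(results_df)
    plt.scatter(results_df.iloc[:, 0], results_df.iloc[:, 1], c=labels)
    plt.show()
    return labels
```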
## Breaking changes

- `.zarr.zip` output and readers are removed. `open_zarr_zip_file_and_inspect` → `open_zarr_and_inspect`, `from_zarr_zip_to_netcdf` → `from_zarr_to_netcdf`.
- `evaluate_combos`: `where_to_write` is now the flag `--where-to-write` (required); `--field-percentage-to-compress` removed; `--eval-data-size-limit` added.
- `compress_with_optimal`: now requires the same `--eval-data-size-limit` as the sweep; new `--inner-chunk-mib`/`--shard-mib`/`--threads`/`--verify` flags.
- `analyze_clustering`: `--where-to-write` and `--var` are now required.
- `evaluate_combos` now requires 1 MPI rank per node; relaunch with `mpirun -n <NODES> --ntasks-per-node=1 ...` (or `srun --nodes=<N> --ntasks-per-node=1 ...`).
## Migration

Pass the same `--eval-data-size-limit` to `compress_with_optimal`; otherwise the `(comp_idx, filt_idx, ser_idx)` returned by the sweep may resolve to codec objects with slightly different parameters (symptom: worse compression ratio than the sweep reported, no error).