Problem
I would like to load `zarr` data directly onto non-CPU devices (especially GPUs). The current approach appears to rely on `cupy` to load onto `cupy`-supported devices, e.g. https://github.com/rapidsai/kvikio/blob/branch-25.02/notebooks/zarr.ipynb.
Unfortunately, a number of devices are not supported by `cupy` — for example, I don't believe my Apple Metal GPU is. This means that to use such devices I must load from `zarr` via the CPU, e.g. `zarr` on disk -> `numpy` -> `torch` (which has Metal support).
This is slower, and I don't believe the `zarr` specification itself requires it (?).
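The CPU round-trip described above can be sketched as follows. This is a minimal illustration, not a recommended pattern: the in-memory stand-in for a `zarr` read is an assumption (a real load would open a store on disk), and the `torch` step only runs if PyTorch is installed.

```python
import numpy as np

# Stand-in for reading a zarr chunk into host memory; in practice this would
# be something like: arr = zarr.open("...", mode="r")["..."][:]
arr = np.arange(6, dtype=np.float32).reshape(2, 3)

try:
    import torch

    # Pick Apple's Metal backend if available, otherwise stay on the CPU.
    device = "mps" if torch.backends.mps.is_available() else "cpu"

    # torch.from_numpy is zero-copy on the host, but .to("mps") is an extra
    # host-to-device copy -- the step this issue would like to avoid.
    t = torch.from_numpy(arr).to(device)
except ImportError:
    t = arr  # torch not installed; data stays on the host
```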
Background
Multi-device support is an important requirement in the AI/ML community. I would like to use `zarr` (specifically the Python implementation) to run models such as LLMs on multiple devices. The faster a model can be loaded onto a device (and with reduced memory usage, etc.), the better the user and developer experience.
Questions
- Is `cupy` the correct/only way to load directly onto a GPU with `zarr-python`?
- Is there, or will there be, any way of loading directly onto devices such as Metal with `zarr-python`?
- (Related) What is the best way to load a PyTorch neural network onto a GPU with `zarr-python`? Is it `cupy` and then something like DLPack for zero-copy exchange? Are there alternatives?
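As a sketch of the DLPack part of the last question: on the CPU, `torch.from_dlpack` already gives zero-copy exchange from a NumPy array (NumPy >= 1.22 implements the DLPack protocol), and the same protocol is what a `cupy` -> `torch` hand-off would use on a supported GPU. This illustrates the mechanism only; it is not a confirmed answer for the GPU path.

```python
import numpy as np

x = np.arange(4, dtype=np.float32)

shared = None
try:
    import torch

    # DLPack exchange: torch.from_dlpack consumes x.__dlpack__() without
    # copying, so the tensor aliases the NumPy buffer on the CPU.
    t = torch.from_dlpack(x)
    t[0] = 42.0
    shared = float(x[0]) == 42.0  # True only if the buffer is really shared
except (ImportError, AttributeError):
    pass  # torch missing, or NumPy too old to export DLPack

# On a cupy-supported GPU the analogous exchange would be:
#   import cupy as cp
#   t = torch.from_dlpack(cp.asarray(...))  # stays on the GPU, zero-copy
```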
Related issues
#1967
#2574
cc @jhamman (as suggested by @TomNicholas)