What happened?
I'm trying to compute masks from a DataArray's own data and assign them as coordinates, but depending on the combination of coords/dims of the computed masks, .assign_coords sometimes fails.
It seems that:
- it fails when all the mask DataArrays to be assigned as coordinates (each one is a computed mask, but that probably doesn't matter) share a dimension with the target DataArray, and that dimension holds only a single value across all the mask DataArrays
- it doesn't fail when the shared dimension holds more than one value.
It's a bit hard to describe because I don't know xarray's internals, but the self-contained minimal example below should demonstrate the issue much more clearly.
What did you expect to happen?
No response
Minimal Complete Verifiable Example
import xarray as xr

data = xr.DataArray(
    data=[
        [0, 1, 2],
        [0, 1, 2]
    ],
    coords={
        'd1': ['m', 'n'],
        'd2': ['a', 'b', 'c']
    }
)

# this will fail:
data.assign_coords({'mask_d1_m': data.sel(d1='m')==0})
# ValueError: dimension 'd1' already exists as a scalar variable

# this will fail too:
data.assign_coords({'mask_d1_n': data.sel(d1='n')==0})
# ValueError: dimension 'd1' already exists as a scalar variable

# but this will work:
data.assign_coords(
    {
        'mask_d1_m': data.sel(d1='m')==0,
        'mask_d1_n': data.sel(d1='n')==0
    }
)
# <xarray.DataArray (d1: 2, d2: 3)>
# array([[0, 1, 2],
#        [0, 1, 2]])
# Coordinates:
#   * d1         (d1) <U1 'm' 'n'
#   * d2         (d2) <U1 'a' 'b' 'c'
#     mask_d1_m  (d2) bool True False False
#     mask_d1_n  (d2) bool True False False
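A possible workaround, sketched under an assumption I haven't verified against the xarray internals: each mask produced by data.sel(d1=...) still carries d1 as a scalar (0-d) coordinate, and that leftover coordinate may be what collides with the d1 dimension of the target. If so, dropping it before assigning should sidestep the error. The names mask, mask_no_scalar, and result are just illustrative, and dims is passed explicitly here only to make the sketch independent of how dimension names are inferred.

```python
import xarray as xr

data = xr.DataArray(
    data=[[0, 1, 2], [0, 1, 2]],
    dims=("d1", "d2"),
    coords={"d1": ["m", "n"], "d2": ["a", "b", "c"]},
)

# Selecting a single label keeps 'd1' on the result as a scalar (0-d) coordinate.
mask = data.sel(d1="m") == 0
print(mask.coords["d1"].ndim)

# Assumed workaround: drop=True removes the scalar 'd1' coordinate at selection
# time, so the mask only carries the 'd2' coordinate it actually varies along.
mask_no_scalar = data.sel(d1="m", drop=True) == 0
result = data.assign_coords({"mask_d1_m": mask_no_scalar})
print(result.coords)
```

This only avoids the error for a single mask; it does not explain why assigning both masks at once succeeds without dropping anything.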
MVCE confirmation
Relevant log output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[27], line 1
----> 1 data.assign_coords({'mask_d1_n': data.sel(d1='n')==0})
File ~/mambaforge/envs/quickquant/lib/python3.8/site-packages/xarray/core/common.py:615, in DataWithCoords.assign_coords(self, coords, **coords_kwargs)
613 data = self.copy(deep=False)
614 results: dict[Hashable, Any] = self._calc_assign_results(coords_combined)
--> 615 data.coords.update(results)
616 return data
File ~/mambaforge/envs/quickquant/lib/python3.8/site-packages/xarray/core/coordinates.py:177, in Coordinates.update(self, other)
173 self._maybe_drop_multiindex_coords(set(other_vars))
174 coords, indexes = merge_coords(
175 [self.variables, other_vars], priority_arg=1, indexes=self.xindexes
176 )
--> 177 self._update_coords(coords, indexes)
File ~/mambaforge/envs/quickquant/lib/python3.8/site-packages/xarray/core/coordinates.py:393, in DataArrayCoordinates._update_coords(self, coords, indexes)
391 coords_plus_data = coords.copy()
392 coords_plus_data[_THIS_ARRAY] = self._data.variable
--> 393 dims = calculate_dimensions(coords_plus_data)
394 if not set(dims) <= set(self.dims):
395 raise ValueError(
396 "cannot add coordinates with new dimensions to a DataArray"
397 )
File ~/mambaforge/envs/quickquant/lib/python3.8/site-packages/xarray/core/variable.py:3209, in calculate_dimensions(variables)
3207 for dim, size in zip(var.dims, var.shape):
3208 if dim in scalar_vars:
-> 3209 raise ValueError(
3210 f"dimension {dim!r} already exists as a scalar variable"
3211 )
3212 if dim not in dims:
3213 dims[dim] = size
ValueError: dimension 'd1' already exists as a scalar variable
Anything else we need to know?
No response
Environment
~/mambaforge/envs/quickquant/lib/python3.8/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit: None
python: 3.8.17 | packaged by conda-forge | (default, Jun 16 2023, 07:11:32)
[Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.1.0
pandas: 1.5.3
numpy: 1.24.0
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.15.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2023.5.0
distributed: 2023.5.0
matplotlib: 3.7.2
cartopy: None
seaborn: 0.12.2
numbagg: None
fsspec: 2023.9.0
cupy: None
pint: 0.21
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.1.2
pip: 23.2.1
conda: 23.7.3
pytest: 7.4.1
mypy: None
IPython: 8.12.2
sphinx: 4.5.0