Skip to content

Add inherit='all_coords' option to DataTree.to_dataset()#11230

Merged
kmuehlbauer merged 9 commits intopydata:mainfrom
aladinor:inherit-all-coords
Mar 25, 2026
Merged

Add inherit='all_coords' option to DataTree.to_dataset()#11230
kmuehlbauer merged 9 commits intopydata:mainfrom
aladinor:inherit-all-coords

Conversation

@aladinor
Copy link
Copy Markdown
Contributor

Summary

Adds inherit='all' option to DataTree.to_dataset() so users can inherit all parent coordinates, not just indexed ones.

Currently, DataTree only inherits indexed (dimension) coordinates from parent nodes. Non-index coordinates are silently dropped during inheritance. This PR extends the inherit parameter from bool to bool | Literal["all", "indexes"]:

  • True / "indexes" — inherit only indexed coordinates (default, backward compatible)
  • "all" — inherit all coordinates, including non-index coordinates
  • False — node-only coordinates

Motivation: inheriting non-index coordinates

As described in #10812, non-index coordinates defined on a parent node are not inherited by child nodes. This is a common need in projects like xradar, where non-index coordinates such as latitude, longitude, and altitude are stored on the root node and need to be accessible on child sweep nodes:

import numpy as np
import xarray as xr

root = xr.Dataset(
    {"volume_number": ("time", [1, 2])},
    coords={
        "time": np.array(["2026-01-01", "2026-01-02"], dtype="datetime64[ns]"),
        "latitude": 36.74,
        "longitude": -98.13,
        "altitude": 378.0,
    },
)

child = xr.Dataset(
    {"temperature": (["time", "x"], np.random.rand(2, 3))},
    coords={"x": [0, 1, 2]},
)

dt = xr.DataTree.from_dict({"/": root, "sweep_0": child})

# Before: latitude, longitude, altitude are not inherited
dt["sweep_0"].to_dataset(inherit=True)

# After: all parent coords including non-index ones are available
dt["sweep_0"].to_dataset(inherit="all")

See also: openradar/xradar#337

Implementation

  • Added _coord_variables_all property on DataTree (like _coord_variables but inherits all parent coords, not just indexed ones)
  • Added _resolve_inherit() helper to centralize inherit parameter validation and dispatch
  • Updated to_dataset() and _to_dataset_view() to use the new helper

cc @kmuehlbauer

@github-actions github-actions Bot added the topic-DataTree Related to the implementation of a DataTree class label Mar 13, 2026
@aladinor aladinor changed the title Add inherit='all' option to DataTree.to_dataset() Add inherit='all' option to DataTree.to_dataset() Mar 14, 2026
Copy link
Copy Markdown
Contributor

@kmuehlbauer kmuehlbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aladinor Thanks, this is LGTM! Would be great if participants of #10812 could chime in on the implementation here.

cc @shoyer @TomNicholas

@kmuehlbauer kmuehlbauer added the plan to merge Final call for comments label Mar 24, 2026
@kmuehlbauer
Copy link
Copy Markdown
Contributor

kmuehlbauer commented Mar 24, 2026

I'm inclined to get this in soon, latest after dev meeting tomorrow, as this is blocking a quite long chain of PR over at https://github.com/openradar/xradar.

Please add any suggestions or comments. Thanks!

@shoyer
Copy link
Copy Markdown
Member

shoyer commented Mar 24, 2026

Implementation-wise this looks fine to me, but maybe we should call it inherit='all_coords' or inherit='coords'? inherit='all' sounds like it inherits data variables.

@aladinor
Copy link
Copy Markdown
Contributor Author

aladinor commented Mar 24, 2026

Thanks, @shoyer, for your feedback. I'll do "all_coords", which seems to be more accurate/explicit.

@aladinor aladinor changed the title Add inherit='all' option to DataTree.to_dataset() Add inherit='all_coords' option to DataTree.to_dataset() Mar 25, 2026
@kmuehlbauer kmuehlbauer merged commit 561e5e8 into pydata:main Mar 25, 2026
40 checks passed
@aladinor aladinor deleted the inherit-all-coords branch March 25, 2026 16:24
@kmuehlbauer
Copy link
Copy Markdown
Contributor

Thanks @aladinor! Thanks All for reviews and suggestions.

aladinor added a commit to aladinor/xradar that referenced this pull request Mar 25, 2026
kmuehlbauer added a commit to openradar/xradar that referenced this pull request Mar 25, 2026
…Tree (#337)

* ENH: skip redundant station coord reads in DataTree and extract _apply_site_coords helper

Pass site_coords=False for all sweeps when building a DataTree so that
latitude/longitude/altitude are only placed as coordinates on the root
node. Sweeps keep them as plain data variables, avoiding redundant I/O
and coordinate promotion. Extracts the repeated if/else site_coords
logic into a shared _apply_site_coords() helper in common.py, used by
all 10 applicable backends.

Closes #334

* FIX: ensure CfRadial1 sweeps always have station vars as data variables

CfRadial1's conform_cfradial2_sweep_group strips station variables
(latitude, longitude, altitude) from sweep datasets. Previously these
were only re-added when site_coords=True. With the DataTree optimization
(site_coords=False), sweeps had no station vars at all, breaking
georeference functions that access ds.longitude on sweep datasets.

Now station vars are always assigned to sweeps, and _apply_site_coords
controls whether they are promoted to coordinates or kept as data vars.

Also adds a CfRadial1 test to TestStationCoordsOnRoot.

* CI: re-trigger CI after transient network failures

* CI: retry flaky Python 3.11 network failures

* DOC: add changelog entry for skip redundant station coord reads (PR #337)

* STY: rename site_coords to site_as_coords for clarity

Per reviewer feedback from @kmuehlbauer, rename the parameter
across all backends, tests, and documentation for explicitness.

* FIX: revert site_coords tuple property renames per review

The site_as_coords rename should only apply to the boolean kwarg,
not to the site_coords property that returns the (lon, lat, alt) tuple.

* FIX: copy read-only arrays before in-place mutation in _ipol_time

NumPy/xarray may return non-writable arrays from .values and .loc,
causing ValueError when _ipol_time tries to modify them in-place.

* FIX: use da.where instead of copy for read-only array in _ipol_time

Replace da.copy() + in-place mutation with da.where(da.notnull(), interpolated)
to avoid full copy of the DataArray per review suggestion.

* FIX: refactor _assign_root to return cleaned sweeps and drop station vars

- _assign_root returns (root, cleaned_sweeps) instead of mutating input list
- _attach_sweep_groups drops station vars before attaching to DataTree
- Use drop_vars(_STATION_VARS, errors="ignore") for idiomatic xarray

* FIX: use zip and unpack _assign_root tuple in nexrad datatree builder

* FIX: use zip and unpack _assign_root tuple in UF datatree builder

* TST: assert station vars not in sweep data_vars or coords

* TST: update nexrad data_vars counts after station var removal

* TST: update UF data_vars counts after station var removal

* TST: update GAMIC data_vars count after station var removal

* TST: update DataMet data_vars count after station var removal

* TST: update HPL data_vars count after station var removal

* TST: update Iris data_vars count after station var removal

* TST: update Furuno variables count after station var removal

* FIX: use cleaned sweeps from _assign_root in _get_required_root_dataset

* STY: apply black formatting to nexrad and uf datatree builders

* Apply suggestion from @kmuehlbauer

* lint

* CI: install xarray dev to pick up inherit='all' from pydata/xarray#11230

* ENH: use inherit='all_coords' for georef and pin xarray dev

- Use to_dataset(inherit="all_coords") in georef tree functions so
  sweeps inherit lat/lon/alt from root
- Pin xarray to git main (unreleased inherit support)
- Bump requires-python to >=3.11 (matches xarray dev)
- Add explicit setuptools package discovery

* STY: black format projection.py

* ENH: use inherit='all_coords' in map_over_sweeps and export functions

Sweep datasets no longer carry lat/lon/alt directly — they must
inherit them from the root via to_dataset(inherit="all_coords").

* DOC: update notebooks to use to_dataset(inherit="all_coords")

Station coords now live on the root node only. Sweep datasets must
use to_dataset(inherit="all_coords") to access lat/lon/alt.

* STY: black format HaloPhotonics notebook

* DOC: update fix_sitecoords to handle scalar inherited coords

Station coords are now scalar on root and inherited by sweeps.
fix_sitecoords skips coords that are missing or already scalar.

* CI: pin xarray dev in all conda environment files

Pin xarray to git main in environment.yml (RTD), ci/unittests.yml,
and ci/notebooktests.yml so inherit='all_coords' is available
everywhere.

* FIX: make map_over_sweeps decorator use apply_to_sweeps with inherit

The @map_over_sweeps decorator previously used xr.map_over_datasets
which passes DatasetView without inherited coordinates. Now it uses
apply_to_sweeps which calls to_dataset(inherit="all_coords") so
sweep datasets include lat/lon/alt from root.

Also update Assign_GeoCoords notebook to fix per-ray station coords
on root for mobile radar (DOW) before georeferencing.

---------

Co-authored-by: Kai Mühlbauer <kmuehlbauer@wradlib.org>
jsignell pushed a commit to jsignell/xarray that referenced this pull request Apr 15, 2026
…11230)

* Add inherit='all' option to DataTree.to_dataset()

* Add whats-new entry for inherit='all' (pydata#11230)

* Fix prune() signature accidentally modified by ruff-format

* Fix mypy errors: remove unused type-ignore, add typing.cast in test

* Rename inherit='all' to inherit='all_coords' per review feedback
jsignell pushed a commit to jsignell/xarray that referenced this pull request Apr 15, 2026
…11230)

* Add inherit='all' option to DataTree.to_dataset()

* Add whats-new entry for inherit='all' (pydata#11230)

* Fix prune() signature accidentally modified by ruff-format

* Fix mypy errors: remove unused type-ignore, add typing.cast in test

* Rename inherit='all' to inherit='all_coords' per review feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

plan to merge Final call for comments topic-DataTree Related to the implementation of a DataTree class

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Datatree does not inherit non-dimension coordinates

3 participants