
test: make pre-release tests work#1071

Merged
melonora merged 4 commits into main from fix-tests
Feb 19, 2026

Conversation

@flying-sheep
Member

@flying-sheep flying-sheep commented Feb 12, 2026

Fixes #1039

I changed the test setup to use uv add, which actually makes sense and results in a single solve per env, as I’ve been preaching since the dawn of time.

Don’t use [uv] pip install in CI people. Ever. Why does nobody trust me on that?
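
For context, the `uv add` approach can be sketched as a CI step like the following (a hypothetical workflow fragment, not this PR's actual diff; the pinned packages are illustrative):

```yaml
# hypothetical GitHub Actions step; `uv add` records the dependency in
# pyproject.toml and re-resolves the whole environment in one pass, so
# pre-release pins participate in a single solve instead of being layered
# on top of an existing env with `uv pip install`
- name: Install test environment
  run: |
    uv add --prerelease allow pandas
    uv run pytest
```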

@codecov

codecov bot commented Feb 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.94%. Comparing base (cd8546a) to head (a13ac11).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1071   +/-   ##
=======================================
  Coverage   91.94%   91.94%           
=======================================
  Files          51       51           
  Lines        7673     7673           
=======================================
  Hits         7055     7055           
  Misses        618      618           

@flying-sheep flying-sheep marked this pull request as ready for review February 12, 2026 09:11
@LucaMarconato
Member

LucaMarconato commented Feb 16, 2026

@melonora I summon you as a "Windows x Dask" expert. Do you have any insight into why this specific Windows job fails (the dask computational graph shows double the size): https://github.com/scverse/spatialdata/actions/runs/21940345294/job/63754609282?pr=1071.

The failing test means that there may be some performance penalty in that environment. Anyway, since the result is correct and the test doesn't fail on other OS versions or with more recent Dask, this will not block the merge.

@melonora
Collaborator

I will check when I'm back home, but yes, I have a suspicion

@LucaMarconato
Member

Thanks! I will fix the docs in the meantime.

@LucaMarconato
Member

Docs are green!

@melonora
Collaborator

Reporting: I think the failing test is faulty overall, depending on its actual purpose. We are testing dask internals here rather than ensuring that our codebase doesn't have more entry points into dask than expected.

This means that Array.__getitem__ calls can also happen multiple times within the dask codebase itself, which seems to be the case here. By writing a side effect for the mock object, I saw that all calls with only ({"width": 32, "chunk_size": 16, "scale_factors": (2,)},) happened in _write_raster_datatree, specifically in da.compute(*dask_delayed, optimize_graph=False) on line 341. This line was only passed once. Checking downstream ome-zarr-py did not explain the 8 calls made, and furthermore, switching to dask version 2025.11.0 (prior to sharding) makes the test pass. This means the test only fails for dask version 2025.2.0 and maybe earlier.

My suggestion would be to either ensure that there are no more calls than expected from the spatialdata side (e.g. on write there should be only one entry point), or we test that the amount of times that chunks are accessed falls within a range of 2*chunks and if that test fails we report it upstream to dask. @LucaMarconato WDYT?
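
A minimal sketch of the tolerance-based approach (all names are hypothetical; `FakeArray` stands in for `dask.array.Array`, and the duplicate internal accesses are simulated):

```python
from unittest.mock import patch


class FakeArray:
    """Stand-in for dask.array.Array (illustrative only)."""

    def __init__(self, n_chunks):
        self.n_chunks = n_chunks

    def __getitem__(self, key):
        return key


def write_chunks(arr):
    """Simulated writer: internals may touch a chunk more than once."""
    for i in range(arr.n_chunks):
        arr[i]
        if i % 2 == 0:
            arr[i]  # simulated duplicate access inside dask internals


def count_getitem_calls(arr, writer):
    """Record every __getitem__ call via a side-effect wrapper."""
    calls = []
    original = FakeArray.__getitem__

    def recording_getitem(self, key):
        calls.append(key)
        return original(self, key)

    with patch.object(FakeArray, "__getitem__", recording_getitem):
        writer(arr)
    return len(calls)


arr = FakeArray(n_chunks=4)
n_calls = count_getitem_calls(arr, write_chunks)
# tolerate up to 2x the chunk count rather than pinning an exact number,
# so dask-internal duplicate accesses don't break the test
assert arr.n_chunks <= n_calls <= 2 * arr.n_chunks
```

The same wrapper pattern is what lets you see where calls originate (e.g. by capturing a traceback inside the side effect) instead of asserting an exact call count.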

@LucaMarconato
Member

Thanks for reporting!

fall within a range of 2*chunks and if that test fails we report it upstream to dask.

sounds good to me!

@melonora
Collaborator

@flying-sheep Thanks! Just waiting for tests to pass and then will merge. Love your commit messages btw haha

@flying-sheep
Member Author

flying-sheep commented Feb 19, 2026

Love your commit messages btw haha

Hahaha yeah, I’ve given up making these descriptive when I’m just speculating anyway:

  1. the important info lives in the PR discussion and code comments

  2. we’re doing squash & merge anyway, so the commit messages usually don’t become part of the main git history unless you’ve set …

    [screenshot]

    oh you’ve set that! I’m sorry that you have to edit the commit message then. I usually try to make my PR description a summary of what’s done, so if I didn’t get preachy in it, it’d be a better source of info than collecting the commit messages. Here’s a variant you can use:

    Changed the test setup to use uv add to install dependencies with a single solve. This causes anndata-git and pandas >=3 to actually be used and therefore tested.

    Fixed all tests affected by this.

    uv pip install --prerelease allow pandas
    else
    uv sync --extra test
    sed -i '' 's/requires-python.*//' pyproject.toml # otherwise uv complains that anndata requires python>=3.12 and we only do >=3.11 😱
Member Author

@flying-sheep flying-sheep Feb 19, 2026


Changing the whole test setup to hatch would remove the need for this hack, btw.

Not saying you should do it if the current setup works for you; only that having something manage multiple test conditions for you (locally the same way as in CI) is something I enjoy when developing my projects (even though learning hatch’s config semantics takes time)
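
As a rough illustration of what that could look like (a hypothetical pyproject.toml fragment; environment names and pins are made up, not taken from this repo):

```toml
# hypothetical hatch config: the matrix expands into one env per
# combination, and the same envs run locally and in CI
[tool.hatch.envs.test]
dependencies = ["pytest"]

[[tool.hatch.envs.test.matrix]]
python = ["3.11", "3.12"]
deps = ["stable", "pre"]

[tool.hatch.envs.test.overrides]
# the "pre" variant adds pre-release pins on top of the base deps
matrix.deps.dependencies = [
  { value = "pandas>=3.0.0.dev0", if = ["pre"] },
]
```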

Collaborator

Could also use pixi, no? It would solve the same thing (bit biased here as I use pixi more haha)

@melonora
Collaborator

I might just do an interactive rebase and then combine them :) The Windows test is currently stalling for some reason; I will try running it again, but it passes locally.

flying-sheep and others added 4 commits February 19, 2026 12:19
tests: partial fix of tests failing due to anndata-git and pandas >=3
…ask 2025.2.0.

fix tests

test: adjust test to account for dask issues
fix docs

attempt fix docs

attempt 2 fix docs
@melonora melonora merged commit cdf3435 into main Feb 19, 2026
9 checks passed
@melonora melonora deleted the fix-tests branch February 19, 2026 11:59


Development

Successfully merging this pull request may close these issues.

Better pre-release CI workflow

3 participants
