fix: restore performance in Coordinates.to_index() #11308

Closed
armorbreak001 wants to merge 1 commit into pydata:main from armorbreak001:fix/coordinates-to-index-performance

@armorbreak001

Summary

Fixes #11305 — performance regression in Coordinates.to_index() (and downstream to_dataframe()).

Details

Commit a13a255 (mypy type fix) converted code_list from ndarray to Python lists via [list(c) for c in code_list] before passing to pd.MultiIndex(). This introduced a major slowdown because:

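  1. Converting a large NumPy array with list(c) creates one Python-level object per element, so the O(n) pass per array carries a large constant factor
  2. pandas converts list codes back to arrays internally anyway, so the conversion is pure overhead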

The fix passes code_list (ndarray) directly to pd.MultiIndex(codes=...), which pandas accepts natively. The level_list += levels line was already reverted in a later commit, so only the codes= conversion needed fixing.
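
For illustration, the two call styles can be compared in a minimal sketch. The variable names mirror those in the description, but the data here is made up and this is not the actual xarray code; it only assumes that pd.MultiIndex accepts both lists and ndarrays for codes and builds the same index either way.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the level_list / code_list built in to_index()
level_list = [pd.Index(["a", "b"]), pd.Index([10, 20, 30])]
code_list = [
    np.repeat(np.arange(2), 3),  # [0, 0, 0, 1, 1, 1]
    np.tile(np.arange(3), 2),    # [0, 1, 2, 0, 1, 2]
]
names = ["x", "y"]

# Regressed form: every ndarray is copied element by element into a list
slow = pd.MultiIndex(
    levels=level_list,
    codes=[list(c) for c in code_list],
    names=names,
)

# Restored form: the ndarrays are passed through unchanged
fast = pd.MultiIndex(levels=level_list, codes=code_list, names=names)

# Both forms should describe the same index; only construction cost differs
same = fast.equals(slow)
```

With arrays this small the cost difference is not measurable; the list comprehension only becomes noticeable when each code array holds many elements.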

Change: 1 line, net -1/+1.

…ssary list conversions

The mypy type fix in a13a255 converted code_list (ndarray) and level_list
(Index) to Python lists before passing to pd.MultiIndex, causing a major
performance regression in to_index() and downstream operations like
to_dataframe(). Revert the conversions since pandas accepts ndarray codes
and Index levels directly.
@thodson-usgs
Contributor

duplicates #11306

@ianhi
Collaborator

ianhi commented Apr 22, 2026

thank you @armorbreak001 but I'm going to close as an earlier PR fixed this issue

@ianhi ianhi closed this Apr 22, 2026
Linked issue: Major performance regression in Coordinates.to_index()