V1 refactor scadadf #108

Merged: aclerc merged 4 commits into v1 from v1-refactor-scadadf on Jun 25, 2026
@aclerc aclerc commented Jun 25, 2026

Contributor

Fully remove wind-up v0 method data column assumptions from the rest of the benchmarking code.

@aclerc aclerc force-pushed the v1-refactor-scadadf branch from 5ccd61d to e2ee2bd Compare June 25, 2026 09:26
@aclerc aclerc force-pushed the v1-refactor-scadadf branch from e2ee2bd to f783901 Compare June 25, 2026 09:29
@aclerc aclerc marked this pull request as ready for review June 25, 2026 09:35
@aclerc aclerc requested a review from Copilot June 25, 2026 09:40

Copilot AI left a comment

Contributor

Pull request overview

Refactors the benchmarking/synthetic pipeline and harness to operate on source-native SCADA column names (rather than wind-up v0 DataColumns aliases), keeping v0 aliasing confined to the v0 baseline on-ramp.

Changes:

  • Introduces a ColumnSchema abstraction and threads it through synthetic generation, plotting, and ground-truth uplift computation.
  • Updates the Hill of Towie adapter to load/cache source-native wtc_* tags, reshape wide→long for method-facing data, and adds a v0-only long_to_wind_up_format conversion.
  • Makes the harness/method seam carry the turbine identifier column (turbine_col) and updates baselines/tests accordingly (including making NaiveRatioMethod configured by active-power column).
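As a rough illustration of the first and third bullets, the sketch below shows how a schema object can map semantic roles to source-native column names, and how a method can receive the turbine identifier column through its input instead of assuming it. This is not the PR's code: only the names `ColumnSchema`, `MethodInput`, `turbine_col`, and `NaiveRatioMethod` come from the summary above. The dataclass fields, the `EXAMPLE_COLUMNS` values, the row layout, and the ratio logic are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ColumnSchema:
    """Maps semantic roles to source-native column names (fields are hypothetical)."""

    turbine: str
    timestamp: str
    active_power: str


# Placeholder column names for illustration, not the real Hill of Towie schema.
EXAMPLE_COLUMNS = ColumnSchema(
    turbine="turbine_id",
    timestamp="TimeStamp",
    active_power="wtc_ActPower_mean",
)


@dataclass(frozen=True)
class MethodInput:
    """Hypothetical seam object: long-format rows plus the turbine column name."""

    rows: list[dict]
    turbine_col: str
    test_turbine: str
    reference_turbine: str


@dataclass(frozen=True)
class NaiveRatioMethod:
    """Configured with the active-power column instead of a hard-coded alias."""

    power_col: str

    def power_ratio(self, inp: MethodInput) -> float:
        # Mean power of one turbine, selected via the seam-provided column name.
        def mean_power(turbine: str) -> float:
            return mean(
                r[self.power_col] for r in inp.rows if r[inp.turbine_col] == turbine
            )

        return mean_power(inp.test_turbine) / mean_power(inp.reference_turbine)


cols = EXAMPLE_COLUMNS
rows = [
    {cols.turbine: "T01", cols.active_power: 1100.0},
    {cols.turbine: "T01", cols.active_power: 1100.0},
    {cols.turbine: "T02", cols.active_power: 1000.0},
    {cols.turbine: "T02", cols.active_power: 1000.0},
]
inp = MethodInput(
    rows=rows,
    turbine_col=cols.turbine,
    test_turbine="T01",
    reference_turbine="T02",
)
ratio = NaiveRatioMethod(power_col=cols.active_power).power_ratio(inp)
```

In this arrangement the method never imports a shared alias table: swapping data sources would only mean passing a different schema and power column.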

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `tests/benchmarking/synthetic/test_upgrades.py` | Updates synthetic upgrade tests to use HOT_COLUMNS source-native schema. |
| `tests/benchmarking/synthetic/test_plots.py` | Updates plot tests to use source-native column schema. |
| `tests/benchmarking/synthetic/test_make_example_datasets.py` | Updates example dataset tests to use source-native column schema. |
| `tests/benchmarking/synthetic/test_ground_truth.py` | Updates ground-truth tests to use source-native column schema. |
| `tests/benchmarking/synthetic/test_generator.py` | Updates generator tests to use source-native column schema and turbine id column. |
| `tests/benchmarking/synthetic/sources/test_hill_of_towie.py` | Splits tests between source-native reshape and v0 on-ramp conversion. |
| `tests/benchmarking/synthetic/sources/test_hill_of_towie_cache.py` | Adds offline tests for per-(year,turbine) parquet caching behavior. |
| `tests/benchmarking/harness/test_scoring.py` | Updates harness scoring tests to use HOT_COLUMNS and drop DataColumns. |
| `tests/benchmarking/harness/test_replicates.py` | Updates replicate tests to use HOT_COLUMNS and turbine id column. |
| `tests/benchmarking/harness/stubs.py` | Updates stub/oracle helpers to use seam-provided turbine column + source-native power. |
| `tests/benchmarking/baselines/test_v0_binned.py` | Adjusts v0 baseline tests for source-native input + conversion on-ramp. |
| `tests/benchmarking/baselines/test_toggle_end_to_end.py` | Updates end-to-end toggle test wiring for configured naive method + HOT_COLUMNS. |
| `tests/benchmarking/baselines/test_naive_ratio.py` | Reworks naive-ratio tests to prove no wind_up imports and use configured columns. |
| `benchmarking/synthetic/upgrades.py` | Refactors upgrades to be keyed by ColumnSchema (defaulting to HoT schema). |
| `benchmarking/synthetic/sources/hill_of_towie.py` | Loader now keeps wtc_* tags; adds scada_wide_to_long + v0-only long_to_wind_up_format; adds per-turbine-year cache. |
| `benchmarking/synthetic/schema.py` | New ColumnSchema dataclass defining semantic roles for source-native columns. |
| `benchmarking/synthetic/plots.py` | Plotting now selects columns via ColumnSchema instead of DataColumns. |
| `benchmarking/synthetic/ground_truth.py` | Ground-truth uplift now selects columns via ColumnSchema instead of DataColumns. |
| `benchmarking/synthetic/generator.py` | Generator now operates on source-native long SCADA and threads ColumnSchema through outputs. |
| `benchmarking/synthetic/__init__.py` | Re-exports ColumnSchema and HOT_COLUMNS from the synthetic package API. |
| `benchmarking/harness/scoring.py` | Threads ColumnSchema through replicate-building; injects turbine_col into MethodInput. |
| `benchmarking/harness/replicates.py` | Adds columns parameter and subsets by schema’s turbine column. |
| `benchmarking/harness/method.py` | Extends MethodInput to carry the turbine-identifier column name. |
| `benchmarking/harness/example_hot_study.py` | Updates oracle/example wiring to use source-native HOT_COLUMNS. |
| `benchmarking/baselines/v0_binned.py` | Converts source-native long SCADA to wind-up format inside the v0 baseline only. |
| `benchmarking/baselines/naive_ratio.py` | Makes naive method source-agnostic via configured active-power column + seam-provided turbine column. |
| `benchmarking/baselines/inspect_v0_run.py` | Updates inspection tool to provide turbine_col and use source-native schema. |
| `benchmarking/baselines/inspect_naive.py` | Updates inspection tool to configure naive method with source-native power col + turbine_col. |
| `benchmarking/baselines/example_toggle_study.py` | Updates example toggle study to configure naive method with source-native power col. |
| `benchmarking/baselines/example_prepost_study.py` | Updates example prepost study to configure naive method with source-native power col. |
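
The `hill_of_towie.py` and `v0_binned.py` entries describe a wide→long reshape for method-facing data and a conversion to wind-up format that happens only inside the v0 baseline. A minimal sketch of that split follows. It is an assumption-heavy illustration, not the adapter's implementation: the wide layout (one row per timestamp, one column per `<turbine>|<tag>`), the separator, the function names `wide_to_long_sketch` and `to_v0_aliases_sketch`, and the alias `v0_active_power` are all hypothetical.

```python
# Assumed wide layout: one row per timestamp, one column per "<turbine>|<tag>".
# The real loader's layout and separator are not shown in the PR summary.
def wide_to_long_sketch(wide_rows, turbine_col="turbine_id", time_col="TimeStamp", sep="|"):
    """Reshape wide rows into one record per (timestamp, turbine), keeping source tags."""
    long_rows = {}
    for row in wide_rows:
        ts = row[time_col]
        for key, value in row.items():
            if key == time_col:
                continue
            turbine, tag = key.split(sep, 1)
            record = long_rows.setdefault(
                (ts, turbine), {time_col: ts, turbine_col: turbine}
            )
            record[tag] = value  # tag stays source-native, e.g. a wtc_* name
    return list(long_rows.values())


# Hypothetical alias map; in this sketch it is applied only at the v0 boundary.
V0_ALIASES_SKETCH = {"wtc_ActPower_mean": "v0_active_power"}


def to_v0_aliases_sketch(long_rows, aliases=V0_ALIASES_SKETCH):
    """Rename source-native columns to v0-style aliases, leaving other keys untouched."""
    return [{aliases.get(k, k): v for k, v in row.items()} for row in long_rows]


wide = [
    {
        "TimeStamp": "2020-01-01 00:00",
        "T01|wtc_ActPower_mean": 1100.0,
        "T02|wtc_ActPower_mean": 1000.0,
    }
]
long_rows = wide_to_long_sketch(wide)   # source-native, method-facing
v0_rows = to_v0_aliases_sketch(long_rows)  # only the v0 baseline would call this
```

Keeping the rename in a single function called by one baseline means every other method sees the same source-native column names the loader produced.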

Comment thread benchmarking/synthetic/plots.py
Comment thread benchmarking/synthetic/sources/hill_of_towie.py
@aclerc aclerc merged commit 7d810de into v1 Jun 25, 2026