v1 issue4 #107

Merged
aclerc merged 14 commits into v1 from v1-issue4 on Jun 25, 2026

aclerc commented Jun 24, 2026

Issue 4 — Naive energy-ratio method (WS3)

Goal: a second, deliberately simple, fully independent method that validates the
harness is not implicitly tuned to v0 — and proves the existing thin method seam is
genuinely pluggable.

Why this, not the original "data contract" issue: the thin
MethodInput/MethodOutput seam from Issues 2–3 already is the shared, method-
agnostic contract; the drafted "per test-reference conditioned dataset" was over-fit to
v0 (an R-learner fits once per test turbine over all references at once), and the
assessment_method production selector only earns its place once there is a winner to
promote. The durable kernel of the old issue — a treatment-invariant reference-only
feature builder + the §8 bias-guard test (design note §3/§8) — folds into Issue 5.

The method. For a set of rows let ρ = Σ test_power / Σ reference_total_power over
complete-case timestamps (test turbine and every reference finite). Estimate
uplift = ρ(treated) / ρ(baseline) − 1. It never reads the test turbine's own wind
speed (design note §3), shares no code with v0, and has no wind_up dependency. It makes
no covariate-shift correction by design, so it is the "don't condition at all" floor:
biased on prepost, where the wind distribution can differ between the two periods, and
near-unbiased on toggle, where interleaved on/off periods share a wind distribution.
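
The estimator described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in naive_ratio.py: the row layout (one dict per timestamp), the column names T1/R1/R2, the treated flag, and the helper names energy_ratio and naive_uplift are all hypothetical.

```python
import math


def energy_ratio(rows, test_col, ref_cols):
    """rho = sum(test power) / sum(total reference power) over complete-case rows."""
    test_sum = 0.0
    ref_sum = 0.0
    for row in rows:
        values = [row[test_col]] + [row[c] for c in ref_cols]
        # Complete case: the test turbine and every reference must be finite.
        if not all(math.isfinite(v) for v in values):
            continue
        test_sum += row[test_col]
        ref_sum += sum(row[c] for c in ref_cols)
    if ref_sum == 0:
        raise ValueError("no usable reference energy in this segment")
    return test_sum / ref_sum


def naive_uplift(rows, test_col, ref_cols, treated_flag="treated"):
    """uplift = rho(treated) / rho(baseline) - 1, with no wind-speed conditioning."""
    baseline = [r for r in rows if not r[treated_flag]]
    treated = [r for r in rows if r[treated_flag]]
    rho_baseline = energy_ratio(baseline, test_col, ref_cols)
    rho_treated = energy_ratio(treated, test_col, ref_cols)
    return rho_treated / rho_baseline - 1.0


nan = float("nan")
# Hypothetical power values: one test turbine (T1) and two references (R1, R2).
rows = [
    {"T1": 100.0, "R1": 100.0, "R2": 100.0, "treated": False},
    {"T1": 200.0, "R1": 200.0, "R2": 200.0, "treated": False},
    {"T1": nan, "R1": 150.0, "R2": 150.0, "treated": False},  # dropped: test is NaN
    {"T1": 103.0, "R1": 100.0, "R2": 100.0, "treated": True},
    {"T1": 206.0, "R1": 200.0, "R2": 200.0, "treated": True},
    {"T1": 500.0, "R1": nan, "R2": 100.0, "treated": True},  # dropped: a reference is NaN
]

uplift = naive_uplift(rows, "T1", ["R1", "R2"])
print(f"{uplift:.4f}")  # 0.0300
```

Here the baseline ratio is 300/600 and the treated ratio is 309/600, so the estimate is 3%. Only power columns enter the calculation, which matches the statement that the method never reads the test turbine's wind speed.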

Scope

  • NaiveRatioMethod behind the existing Method seam; prepost and toggle.
  • Rich per-run diagnostics (a data-stats CSV per all/baseline/upgraded segment, a
    headline-results CSV, optional plots) so a human can confirm the right data was
    received and interpreted; the headline uplift is re-derivable from the stats CSV.
  • Add toggle support to V0BinnedMethod (wiring wind_up's native toggle assessment) so
    v0 can be scored on toggle campaigns too.
  • Add the naive method to the existing prepost driver; add a new toggle example driver
    (3% Cp increase, 20-min-on/20-min-off) scoring naive + v0 + oracle.
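
The 20-min-on/20-min-off schedule in the last bullet can be sketched as follows. This is an illustration only, not the example driver's code: toggle_state is a hypothetical helper, and it assumes 10-minute timestamps that mark the start of each period, with the cycle beginning in the "on" state.

```python
from datetime import datetime, timedelta


def toggle_state(ts, start, on_minutes=20, off_minutes=20):
    """Return True when the upgrade is 'on' at timestamp ts.

    Assumes the cycle starts 'on' at `start` and repeats every
    on_minutes + off_minutes.
    """
    elapsed_minutes = (ts - start).total_seconds() / 60
    return (elapsed_minutes % (on_minutes + off_minutes)) < on_minutes


start = datetime(2026, 1, 1)
# Eight consecutive 10-minute timestamps (assumed SCADA resolution).
timestamps = [start + timedelta(minutes=10 * i) for i in range(8)]
states = [toggle_state(t, start) for t in timestamps]
print(states)  # [True, True, False, False, True, True, False, False]
```

With 10-minute data, each 40-minute cycle contributes two "on" rows and two "off" rows, so both segments sample nearly the same wind conditions.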

Done when: naive_ratio is scored alongside v0_binned and the oracle on the
synthetic profiles for both prepost and toggle, the per-run diagnostics are written, and
its accuracy/precision appears in the leaderboard.

aclerc requested a review from Copilot on June 24, 2026 16:12
aclerc marked this pull request as ready for review on June 24, 2026 16:17

Copilot AI left a comment

Pull request overview

This PR adds a second, fully independent baseline uplift estimator (a naive energy-ratio method) and extends the existing v0 binned baseline to support toggle campaigns, enabling end-to-end scoring for both prepost and toggle synthetic studies (with rich per-run diagnostics).

Changes:

  • Introduce NaiveRatioMethod (prepost + toggle) with per-run CSV diagnostics and optional diagnostic plots.
  • Add toggle support to V0BinnedMethod by wiring wind_up’s native toggle assessment and generating a toggle signal dataframe.
  • Improve Hill of Towie 10‑minute loader performance via per-(year, turbine) parquet caching; add new example driver(s) + tests (including a slow toggle E2E).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/benchmarking/baselines/test_v0_end_to_end.py | Updates HoT v0 context construction to use explicit turbine names. |
| tests/benchmarking/baselines/test_v0_binned.py | Adds unit tests for toggle wiring/config and toggle signal dataframe semantics. |
| tests/benchmarking/baselines/test_toggle_end_to_end.py | Adds slow end-to-end toggle study test scoring v0 + naive + oracle. |
| tests/benchmarking/baselines/test_naive_ratio.py | Adds comprehensive unit tests for naive ratio estimator, diagnostics, and plots. |
| docs/v1/issues.md | Rescopes Issue 4 documentation to the naive energy-ratio method and toggle/v0 wiring. |
| benchmarking/synthetic/sources/hill_of_towie.py | Adds cached unpacking of HoT year zips into per-turbine-year parquet files. |
| benchmarking/baselines/v0_binned.py | Implements toggle-mode support via wind_up toggle config + toggle_df generation. |
| benchmarking/baselines/naive_ratio.py | Implements the new naive energy-ratio baseline with diagnostics and plots. |
| benchmarking/baselines/inspect_naive.py | Adds a manual inspection driver to run naive ratio replicates with plots enabled. |
| benchmarking/baselines/example_v0_study.py | Adds NaiveRatioMethod to the existing prepost example driver. |
| benchmarking/baselines/example_toggle_study.py | Adds a new toggle-mode example driver scoring v0 + naive + oracle. |

4 comment threads on benchmarking/baselines/naive_ratio.py

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

benchmarking/baselines/example_prepost_study.py:111

  • main(..., data_dir=...) passes data_dir to load_hot_scada, but run_prepost_study builds the v0 context with build_hot_v0_context(...) using the default data dir. This can lead to duplicated downloads/caches (SCADA/metadata) or surprising behavior when data_dir is overridden.

benchmarking/baselines/example_prepost_study.py:185

  • main(..., data_dir=...) passes data_dir to load_hot_scada, but it is not forwarded into run_prepost_study(...) (and thus into build_hot_v0_context(...)). Forwarding it keeps SCADA + metadata using the same cache directory when data_dir is overridden.
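
The forwarding these two comments ask for can be sketched with stand-in functions. The function names come from the comments, but the bodies, signatures, return values, and DEFAULT_DATA_DIR below are hypothetical stubs written for illustration. They are not the repository's code.

```python
from pathlib import Path

DEFAULT_DATA_DIR = Path("default_cache")  # hypothetical default location


def load_hot_scada(data_dir=DEFAULT_DATA_DIR):
    # Stub: records which directory the SCADA loader would read from.
    return {"scada_dir": Path(data_dir)}


def build_hot_v0_context(data_dir=DEFAULT_DATA_DIR):
    # Stub: records which directory the metadata loader would read from.
    return {"metadata_dir": Path(data_dir)}


def run_prepost_study(scada, data_dir=DEFAULT_DATA_DIR):
    # Forward data_dir instead of letting the context builder fall back
    # to its default.
    context = build_hot_v0_context(data_dir=data_dir)
    return {**scada, **context}


def main(data_dir=DEFAULT_DATA_DIR):
    scada = load_hot_scada(data_dir=data_dir)
    # The same directory is passed down, so both loaders share one cache.
    return run_prepost_study(scada, data_dir=data_dir)


result = main(data_dir=Path("custom_cache"))
print(result["scada_dir"] == result["metadata_dir"])  # True
```

Threading the argument through every layer is verbose, but it keeps the override explicit and avoids a second cache appearing silently under the default directory.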

Comment thread benchmarking/baselines/naive_ratio.py (Outdated)
Comment on lines +162 to +163

    if self.save_plots:
        _save_plots(run_dir / "plots", wide=wide, mi=mi, test=mi.test_wtg)

Comment thread benchmarking/baselines/example_toggle_study.py

            n_replicates=n_replicates,
            seed=0,
        )
        return run_toggle_study(scada_df, profiles=TOGGLE_PROFILES, study=study, out_root=out_root)

aclerc merged commit 2c7f8bb into v1 on Jun 25, 2026
aclerc deleted the v1-issue4 branch on June 25, 2026 09:15