Skip to content

Fix MAG L1C gap-fill bugs for validation tests T015 and T016#2899

Open
sapols wants to merge 10 commits intoIMAP-Science-Operations-Center:devfrom
sapols:mag-l1c-fix
Open

Fix MAG L1C gap-fill bugs for validation tests T015 and T016#2899
sapols wants to merge 10 commits intoIMAP-Science-Operations-Center:devfrom
sapols:mag-l1c-fix

Conversation

@sapols
Copy link
Copy Markdown

@sapols sapols commented Apr 1, 2026

Summary

Closes issue #1993

Six localized bug fixes to the existing MAG L1C gap-filling code. These fixes un-skip and pass validation tests T015 and T016 (both mago and magi) while preserving the existing passes on T013, T014, and T024. All 10 L1C validation cases now pass with exact row-count matches against the checked-in expected outputs.

This PR intentionally keeps the current production algorithm shape. It does not rewrite the gap-fill pipeline, add new dataclasses, or replace the simplified L1C flow that was developed collaboratively between SDC and the MAG team. It only patches the specific correctness bugs that were causing T015 and T016 to fail.

Background

What was failing and why

Current dev skips T015 and T016 because they produce wrong output. Running them un-skipped shows three concrete failures:

Test Sensor Failure Root cause fixes
T015 magi +5 extra rows Fix 5 alone
T016 mago -663 missing rows Fix 1 + Fix 3 + Fix 4 jointly
T016 magi +330 extra rows Fix 1 + Fix 3 + Fix 4 jointly

T015-mago and T024 (both sensors) already passed on dev.

The validation tests exercise increasingly complex mode transitions:

  • T013/T014: Simple NM-to-BM-to-NM transitions, no Config mode. Already passing because the NM rate is always 2 Hz (matching the old hardcoded cadence) and there are no Config-mode boundary artifacts.
  • T015: NM-to-BM-to-NM with a Config mode transition at the start of burst. NM rate changes from 2 Hz to 4 Hz (mago) or 2 Hz to 1 Hz (magi).
  • T016: NM-to-BM-to-NM with a Config mode transition at the end of burst. NM rate changes from 4 Hz to 2 Hz (mago) or 1 Hz to 2 Hz (magi).
  • T024: Burst-only, no normal data at all.

Why T015 magi failed (+5 rows): When MAG transitions through Config mode, the vectors_per_second metadata timestamp doesn't perfectly align with where the cadence actually changes in the data. The old find_all_gaps() split the timeline at the declared metadata timestamp, and find_gaps() misidentified the boundary misalignment as 1-sample phantom gaps, generating 5 spurious gap-fill rows. T015 mago was unaffected because its Config transition boundary happened to align cleanly. Fix 5 (_find_rate_segments walkback) eliminates these phantom gaps at the root.

Why T016 failed (-663 mago / +330 magi rows): Three bugs interacted to produce the T016 failures. No single fix alone was sufficient:

  • Fix 1 (rate-aware cadence): Gap-fill timestamps were generated at a hardcoded 0.5s (2 Hz) regardless of the actual NM rate. T016 mago has a 4 Hz pre-gap rate (needs 0.25s cadence) and T016 magi has a 1 Hz pre-gap rate (needs 1.0s cadence). The wrong cadence produced the wrong number of gap-fill timestamps. Reverting Fix 1 alone: T016 mago -662, T016 magi +332.
  • Fix 3 (off-by-one burst endpoint): The interpolation window silently dropped the last burst sample, producing wrong interpolated values near the burst-resume boundary. Reverting Fix 3 alone: row counts matched, but first value mismatch at index 2475 (mago) and 618 (magi).
  • Fix 4 (CIC filter clipping): The CIC delay compensation clipped the final interpolated timestamp at the tail of the burst region. Reverting Fix 4 alone: T016 mago -1, T016 magi -1.

Reverting all three together reproduces the exact dev failures (-663 mago, +330 magi), confirming they interact non-linearly.

Why T024 was already passing but needed Fix 6: T024 passed on dev because it's burst-only and doesn't exercise the broken transition paths. However, Fix 4 (extrapolate=True) causes T024 to produce one extra trailing sample. Fix 6 (burst-only trailing-edge trim) compensates for this, preventing a regression.

How this was investigated

This issue was investigated using two AI coding agents (Claude Code and Codex) in parallel, guided by manual review. The process was:

  1. Initial exploration: Both agents independently identified the same set of bugs from the failing tests. Their proposed fixes overlapped significantly but differed in approach.
  2. Spec review: The MAG Science Algorithm Document (IMAP-MAG-SW-009-01B, i5r1) and the SDC Data Validation Document (IMAP-OPS-TP-ICL-001) were consulted. Both agents initially proposed full spec-aligned rewrites of section 7.3.4 step 3, but this was scaled back after MAG team feedback (see below).
  3. MAG team feedback: @alastairtree and @maxinelasp clarified that the current code was developed in collaboration with the MAG team and that deviations from the algorithm document may be intentional simplifications. The team requested a conservative fix rather than a full rewrite.
  4. Minimal patches: Both agents produced minimal draft PRs that preserved the existing algorithm shape. These were iteratively compared and refined until both converged on identical code.
  5. CDF validation on flight data: The code was run on real commissioning data (2025-09-27, 2025-09-28, 2025-10-03, 2025-12-05, 2025-12-06) to compare outputs between dev and the patched code. An exhaustive search of the MAG L1B archive through 2026-03-31 found only 8 same-day norm+burst sensor-day cases. Unfortunately, none of these real-flight cases separated current dev from the patched code (they both produced identical CDFs; the available real data does not exercise the exact broken transition patterns).
  6. Semi-synthetic validation: Because real flight data was insufficient to demonstrate the fix, semi-synthetic input data was derived from real 2025-09-27 MAG transitions to create cases that specifically exercise the T015/T016 failure modes. These cases demonstrated clear semantic differences between dev and the patched code, with dev producing incorrect row counts and the patched code matching expectations.

Relationship to the algorithm document

The MAG Science Algorithm Document (section 7.3.4) describes a six-step L1C process. The current code implements a simplified version of these steps that was developed collaboratively between SDC and the MAG team. This PR preserves that simplified implementation. Specifically:

Step 3 (Generate a new NM time sequence) is the area with the largest gap between the document and the code. The document describes a burst-driven approach: find the BM anchor time tC closest to the gap start tA, decimate BM timestamps, shift by (tA - tC), and trim. The current code instead generates gap timestamps arithmetically from the gap boundaries at the appropriate NM cadence, then interpolates burst data onto those timestamps.

Both approaches produce a regular NM time sequence that bridges the gap. The current approach is simpler and was explicitly approved by the MAG team. The key fix in this PR is that the cadence used for gap-fill timestamps now respects the actual NM rate declared in the gap metadata, rather than always using 0.5s (2 Hz). This aligns with the document's intent that "the rate depends on the rate of the NM data before the gap."

Step 5 (Identify remaining gaps with >1.1s threshold) is not changed by this PR. The current implementation uses a per-rate tolerance check rather than a fixed 1.1s threshold, which is arguably more correct for non-2 Hz rates.

The six bugs and their fixes

Fix 1: Rate-aware gap cadence (generate_missing_timestamps)

Bug: Gap-fill timestamps were always generated at 0.5s intervals (2 Hz) regardless of the actual normal mode rate. For non-2 Hz gaps, this was wrong by construction.

Fix: Read the rate from gap[2] (the third element of each gap tuple) and compute the cadence as int(1e9 / rate). Fall back to 0.5s if no rate is provided. Use integer arithmetic throughout to avoid float precision loss at TTJ2000-scale nanosecond timestamps.

Which tests this fixed: T016 (jointly with Fixes 3 and 4). Reverting this fix alone produces T016 mago -662 rows and T016 magi +332 rows -- the dominant contributor to the T016 failure. Also covered by the new test_generate_missing_timestamps_uses_gap_rate() and the updated rate-aware case in test_generate_timeline().

Spec alignment: This directly implements the document's instruction that the interpolated rate should depend on "the rate of the NM data before the gap."

Fix 2: 2D array indexing in interpolate_gaps

Bug: filled_norm_timeline is an (n, 8) array, but the gap-timeline extraction compared the entire 2D array against scalar timestamps without specifying a column. The boolean mask broadcasted across all 8 columns instead of just the epoch column (index 0).

Fix: Extract norm_epochs = filled_norm_timeline[:, 0] once at the top of the function and filter against it explicitly.

Which tests this fixed: No isolated validation delta reproduced when this line was reverted in the current patch set. This is still a real correctness fix: it makes the gap-timeline mask operate on the epoch column by construction instead of relying on non-epoch columns being numerically unrelated to TTJ2000 timestamps.

Fix 3: Off-by-one burst slice endpoint

Bug: burst_end was computed as an inclusive index (via min(len-1, ...)), but then used in a Python slice burst_epochs[burst_start:burst_end] which is end-exclusive, silently dropping the last intended burst sample from the interpolation window.

Fix: Make burst_end exclusive from the start: burst_end = min(len(burst_epochs), burst_gap_end + burst_buffer + 1).

Which tests this fixed: T016. In isolated replay, reverting this fix preserved row counts but changed the first mismatched interpolated vector at the burst-resume boundary for both mago and magi.

Fix 4: CIC filter edge clipping (linear_filtered)

Bug: After the CIC filter compensates for group delay, the filtered timestamp range is shorter than the input by delay samples at the tail. linear() then calls remove_invalid_output_timestamps(), which clips output timestamps to this shortened filtered range, dropping valid gap timestamps near the burst boundary.

Fix: Pass extrapolate=True to linear() inside linear_filtered(). This matches the behavior of the spec's reference CIC implementation, which uses InterpolatedUnivariateSpline (extrapolates by default).

Which tests this fixed: T016. In isolated replay, reverting this fix alone causes both T016-mago and T016-magi to lose the final interpolated row.

Interaction with T024: Adding extrapolate=True alone would cause a +1 row regression on T024 (burst-only). Fix 6 below compensates for this.

Fix 5: Spurious micro-gaps at Config mode transitions (_find_rate_segments)

Bug: When MAG transitions between modes via Config mode, the vectors_per_second metadata records a transition timestamp that does not perfectly align with where the cadence actually changes in the data. The existing find_all_gaps() split the timeline at the declared metadata timestamps, and find_gaps() misidentified the boundary misalignment as 1-sample phantom gaps. These caused extra rows in the T015 output.

Fix: New _find_rate_segments() function that walks each configured transition backward while the observed cadence already matches the new rate. This aligns rate segment boundaries to the actual cadence in the data, preventing spurious micro-gaps from being generated in the first place. Also extracts the existing 0.075 tolerance magic number into a shared GAP_TOLERANCE constant used by find_gaps().

Which tests this fixed: T015-magi (had 5 extra rows due to phantom gaps at the Config boundary). Reverting the walkback logic alone reintroduces that exact +5 row failure.

Why not a post-hoc filter: An earlier approach filtered out gaps shorter than 2 cadence periods after detection. This is simpler (around 5 lines vs around 35 lines) but cannot distinguish Config-transition artifacts from genuine single-sample data loss. The root-cause fix via _find_rate_segments() preserves detection of legitimate short gaps while eliminating only the artifacts.

Fix 6: Burst-only trailing-edge trim for T024

Bug: Fix 4 (extrapolate=True) allows interpolation slightly past the CIC-filtered range, which is correct for T016 but causes T024 (burst-only, no normal data) to produce one extra trailing sample beyond the expected output.

Fix: Detect the burst-only case by checking whether any samples in the filled timeline have ModeFlags.NORM status (has_norm_context). When there is no normal-mode context, shorten the burst coverage upper bound by one output cadence to compensate for the CIC filter delay. This removes the extra trailing sample without affecting the normal+burst path.

Which tests this fixed: T024-mago and T024-magi (regressed to 687 rows with Fix 4 alone, expected 686). Reverting this trim alone reproduces that exact +1 row failure.

Additional cleanup: Dead np.delete removal

The existing code called np.delete(filled_norm_timeline, timeline_index) without assigning the result. Since np.delete() returns a new array, this was a no-op. Removed.

Defensive additions

  • Guard clause in find_gaps(): Returns empty when timeline_data has fewer than 2 elements, preventing crashes on single-timestamp rate segments.
  • Integer dtype consistency: Gap arrays use dtype=np.int64 throughout to prevent silent float precision loss on nanosecond timestamps.
  • generate_timeline() bug fix: Update last_index when generated timestamps are empty, preventing duplicate epoch data copies with adjacent gaps.

Test changes

  • Un-skip T015 and T016: Removed the pytest.skip() guard in test_mag_l1c_validation.
  • Exact row-count assertion: Added an assert checking len(expected_output.index) == len(l1c["epoch"].data) before the per-row value checks. This catches off-by-one regressions immediately rather than silently passing when the output has extra rows.
  • Updated test_process_mag_l1c expectations: The existing unit test now expects rate-aware cadence (17 timestamps instead of 15 in the 4 Hz gap region, with updated flag indices).
  • New: test_find_all_gaps_uses_observed_transition_boundary(): Regression test for the _find_rate_segments() micro-gap fix.
  • New: test_generate_missing_timestamps_uses_gap_rate(): Regression test for rate-aware cadence, including the 2 Hz fallback.
  • New: Rate-aware gap fill case in test_generate_timeline(): Verifies 4 Hz gap timestamps are spaced at 0.25s.
  • New: Adjacent-boundary case in test_generate_timeline(): Verifies no duplicate timestamps when adjacent gaps share a real epoch boundary, with a mixed-rate transition.

Files changed

File Lines What
imap_processing/mag/l1c/mag_l1c.py +131/-32 Fixes 1-3, 5-6, guard clauses, cleanup, and helper docstrings required by numpydoc
imap_processing/mag/l1c/interpolation_methods.py +1/-1 Fix 4 (extrapolate=True)
imap_processing/tests/mag/test_mag_l1c.py +73/-6 Updated expectations + 4 new regression tests
imap_processing/tests/mag/test_mag_validation.py +1/-4 Remove skip guard + add exact row-count assertion
poetry.lock +10/-0 Required poetry-lock hook refresh

What this PR does NOT change

  • No rewrite of the gap-fill pipeline architecture
  • No GapFillPlan dataclass or BM-anchor/shift/trim step-3 implementation
  • No day-boundary / inherited-timeline work (T017-T019 scope)
  • No changes to public API or CLI interfaces
  • No timestamp precision fixes outside the gap-fill path

Validation results

Rerun locally on mag-l1c-fix:

imap_processing/tests/mag/test_mag_l1c.py: 18 passed in 0.84s
compare_mag_l1c_validation.py on T013/T014/T015/T016/T024 (mago,magi): all 10 cases PASS
pre-commit run --all-files: PASS

Exact row-count validation (previously skipped T015/T016 now pass):

Test Sensor Expected Got Status
T013 mago 1856 1856 MATCH
T013 magi 1856 1856 MATCH
T014 mago 1792 1792 MATCH
T014 magi 1792 1792 MATCH
T015 mago 2498 2498 MATCH
T015 magi 1538 1538 MATCH
T016 mago 3052 3052 MATCH
T016 magi 1195 1195 MATCH
T024 mago 686 686 MATCH
T024 magi 686 686 MATCH

sapols and others added 3 commits April 1, 2026 01:12
Six localized bug fixes to the existing MAG L1C gap-filling algorithm,
plus targeted regression tests. Keeps the current production algorithm
shape — no architectural rewrite.

Fixes:
- Rate-aware gap cadence in generate_missing_timestamps()
- 2D array indexing in interpolate_gaps() (column 0 explicitly)
- Off-by-one burst slice endpoint
- CIC filter edge clipping via extrapolate=True in linear_filtered()
- Spurious micro-gaps at Config transitions via _find_rate_segments()
- Burst-only trailing-edge trim for T024 exact row count
- Remove dead np.delete() no-op

Refs IMAP-Science-Operations-Center#1993

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Codex <noreply@openai.com>
@sapols
Copy link
Copy Markdown
Author

sapols commented Apr 1, 2026

One scope clarification: My read is that this PR is enough to close the issue, but we haven't completed every historical MAG L1C TODO that came up in the discussion.

Relative to @maxinelasp’s earlier checklist:

  • done: one-input-file handling
  • done/effectively already present: CIC delay compensation and 2x buffer sizing
  • partial: beginning/end-of-day handling (synthetic day bounds are handled, but previous/next-day files are still not loaded)
  • not done here: using FILLVAL instead of zero placeholders

Relative to @alastairtree’s broader “L1C should” list:

  • this PR improves the validated transition/gap-fill behavior substantially
  • but it does not implement true previous/next-day context stitching
  • it does not add broader proof for out-of-order data handling
  • it does not claim comprehensive MAG OFF / no-science coverage

@sapols sapols marked this pull request as ready for review April 1, 2026 09:37
@sapols sapols requested a review from maxinelasp April 1, 2026 09:38
@sapols
Copy link
Copy Markdown
Author

sapols commented Apr 1, 2026

Another clarification: we're probably sweeping a lot under the rug by not counting day-boundary / inherited-timeline work (T017-T019) as in-scope for this ticket. E.g. Alastair's comment about gaps in the Dec 5-6 data appears to be directly related to L1C not treating leading/trailing burst-only intervals on mixed norm+burst days as fillable gaps. In those windows the parent burst L1B data does exist, but the current mixed-input L1C path collapses the output window to the existing normal-mode coverage instead of extending it across those burst-only spans.

@maxinelasp
Copy link
Copy Markdown
Contributor

@sapols Agreed, would you be able to open a follow-up ticket for the day boundary coverage if one does not exist?

@maxinelasp
Copy link
Copy Markdown
Contributor

Initial thoughts from reading through the PR description only:

  1. We should get a thumbs up from the MAG team around using extrapolation in the CIC filter only. Am I correct in understanding that these extrapolated data points are later clipped?
  2. for _find_rate_segements - sounds like a nice change. Does this still use the vecsec transitions attribute in any way?
  3. I'm a little confused by step 6 - what is the root of this extra timestamp getting produced? I worry this solution is a little too closely aligned with the validation test, will there ever be more than one extra timestamps?

Nice work! I'll review the code more thoroughly in a bit.

@sapols
Copy link
Copy Markdown
Author

sapols commented Apr 1, 2026

@maxinelasp

  1. Nah, they're not later clipped, and that's intentional. linear_filtered() passes extrapolate=True to linear(), which skips the remove_invalid_output_timestamps() clipping; that's the whole point of the fix. interpolate_gaps() bounds the candidate gap timestamps to the burst window via the short mask (lines 506/514) before interpolation, and within that window the boundary samples get extrapolated, filled, marked burst, and kept. So the T016 boundary sample is intentionally preserved.
    The MAG team should know that in this small window (less than one output cadence), the linear spline is extrapolating beyond the CIC-filtered input rather than interpolating within it.

  2. Yes, _find_rate_segments still uses vectors_per_second. process_mag_l1c() parses the attribute and passes it through find_all_gaps() into _find_rate_segments(), which iterates the configured (start_time, rate) pairs. The only change is that it walks each boundary backward while the observed cadence already matches the new rate—so it still uses the metadata, just not as an exact sample-level split point.

  3. The root of that extra timestamp is the CIC filter's group delay shortening the filtered range by delay = decimation_factor - 1 burst-rate samples at the tail. With extrapolate=True, the burst-only path can fill trailing NM timestamps past that shortened range. The delay in output-cadence terms is (decimation_factor - 1) / decimation_factor, which is always < 1.0 (e.g. 0.75 for 4x, 0.5 for 2x decimation). So at most one extra output timestamp can appear—this is a mathematical bound, not just empirically matched to the test.

input_filtered, vectors_filtered = cic_filter(
input_vectors, input_timestamps, output_timestamps, input_rate, output_rate
)
return linear(vectors_filtered, input_filtered, output_timestamps)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, we should add extrapolate=true to all the _filtered methods if we're doing it for one.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, you're right, the same post-CIC tail-clipping bug exists in quadratic_filtered and cubic_filtered. It's dormant because L1C_INTERPOLATION_METHOD = "linear_filtered", but it'd bite us if the config changed. I'll add it to the other methods, and add an edge-case test that hits all three at the tail boundary.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread imap_processing/mag/l1c/mag_l1c.py Outdated

logger = logging.getLogger(__name__)

GAP_TOLERANCE = 0.075
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should go in mag/constants.py. Might need a clearer name too (L1C_GAP_TOLERANCE) and a comment describing what this is.

Copy link
Copy Markdown
Author

@sapols sapols Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Moved to constants.py under the new name L1C_CADENCE_TOLERANCE (to make clear that it applies to cadence matching generally, not just gap detection) and I added this comment:

# Relative tolerance (7.5%) for L1C cadence checks; allows small clock-drift
# variation around the expected cadence before a spacing is treated as a gap.

(If you like L1C_GAP_TOLERANCE better we can use it just lmk)

Comment thread imap_processing/mag/l1c/mag_l1c.py Outdated
6-7 - compression flags.
"""
burst_epochs = burst_dataset["epoch"].data
norm_epochs = filled_norm_timeline[:, 0]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename this variable to make it clear it's the full or filled timeline?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed norm_epochs to filled_timeline_epochs

Comment thread imap_processing/mag/l1c/mag_l1c.py Outdated
Comment on lines +510 to +512
# In the burst-only fallback, CIC delay compensation shortens the usable
# filtered range at the trailing edge by roughly one output cadence.
short_end -= int(1e9 / norm_rate.value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment in the main section about my concerns about this section.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this refers to your question 3, I replied above. Happy to expand here too if useful.

Comment thread imap_processing/mag/l1c/mag_l1c.py Outdated
]
gap_timeline = norm_epochs[(norm_epochs > gap[0]) & (norm_epochs < gap[1])]

short_end = burst_epochs[burst_end - 1]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this variable is kind of confusing to me

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally fair. It was called short_end because it’s the shortened upper bound used for the short mask after the optional one-cadence trim in the burst-only fallback, but that name wasn’t clear about what was being shortened or that it was an epoch. So I renamed it to usable_burst_end_epoch.

)
generated_timestamps = generate_missing_timestamps(gap)
if generated_timestamps.size == 0:
last_index = int(np.searchsorted(epoch_data, gap[1], side="left"))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

difference_ns = 0.5 * 1e9
output: np.ndarray = np.arange(gap[0], gap[1], difference_ns)
difference_ns = int(0.5 * 1e9)
if len(gap) > 2:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reasoning for this if statement? Add a comment plz

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if len(gap) > 2 is there to support both legacy (start, end) gaps, which keep the historical 0.5 s cadence, and newer (start, end, rate) gaps, which use the declared cadence.

I added this comment:

# Support both legacy (start, end) gaps, which use the historical 0.5 s cadence,
# and newer (start, end, rate) gaps, which use the declared cadence.

@sapols
Copy link
Copy Markdown
Author

sapols commented Apr 6, 2026

@sapols Agreed, would you be able to open a follow-up ticket for the day boundary coverage if one does not exist?

Created #2925

@sapols
Copy link
Copy Markdown
Author

sapols commented Apr 9, 2026

@maxinelasp I addressed all comments. Ready for review again!

(I also updated the PR description to better explain exactly which changes fixed which tests)

Not required for this interpolation method.
extrapolate : bool, optional
Whether to allow extrapolation of output timestamps outside the range of input
timestamps. Default is False.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you convince/reassure me this does not lead to any extrapolated science. We are only allowed to filter/decimate science between any two MAG timestamps. those timestamps could be organic/real ones or ones where we have created a timestamp between 2 real MAG timestamps because we are downsampling/decimating the data.
Wherever we are gap filling we should usually have a period where we have data in both BM and NM streams. this allows you throw away data in the overlap period when filtering (because you need to throw away points like in the CIC filter) and still have a complete series across the gap because of the overlap.
Sometimes we change mode and do not overlap because if instrument buffer requirements. In these cases there is a short gap (up to a few secs) between streams. This should appear as a real gap and not be extrapolated.
So at least the science should never be extrapolated, the overlapping data streams allow that to be possible without gaps most of the time, and gaps should not be filled with extrapolation usually. Adding an extrapolation option makes me nervous as it sounds like making up science data which is obliviously a bad look.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this concern - from our discussions, I believe Shawn that this will never extrapolate timestamps that end up in the final output, but maybe we could add another step in here that tracks and removes all extrapolated timestamps before returning anything, to ensure they are always cropped out. It's my understanding that in this method, the extrapolated values would never be passed out of the interpolation step - so it shouldn't affect anything to call remove_invalid_output_timestamps as a final step, right?

Copy link
Copy Markdown
Author

@sapols sapols Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alastairtree @maxinelasp Okay, I think the requirement Alastair is describing here is stricter than I had been interpreting. So unfortunately, we need to stop here and discuss before I know how to move forward.

The current code is not extrapolating outside the original burst timestamp range, but it is technically extrapolating science beyond the CIC-filtered support a tiny amount (by less than one output cadence). For T016, the difference is exactly one L1C output row in each affected product. Without extrapolate=True, T016 is short by one row (and therefore fails); with it, that final boundary row is kept (so it passes).

That extrapolation is bounded by the CIC group delay. The filtered range is shortened at the trailing edge by decimation_factor - 1 burst-rate samples, which is (decimation_factor - 1) / decimation_factor output cadences. That is always less than one output cadence, so at most one output timestamp can land in that tail interval. In the T016 case, the extrapolated row is ~21.4 ms past the last CIC-filtered MAGO sample and ~5.8 ms past the last CIC-filtered MAGI sample.

That seems to put two expectations in conflict. The T016 validation case expects a full gapless Normal product across this Config-mode transition, including that final boundary row. But the requirement Alastair is describing says that an output row should only be produced where the CIC-filtered burst stream has real support; if the BM/NM streams do not overlap enough after filtering, the remaining boundary interval should stay a real gap.

So I think that leaves us with two clean paths:

  1. If MAG considers this one sub-cadence CIC-edge row acceptable, then the current extrapolate=True approach is doing what T016 expects, and I can tighten the comments/tests around that bounded behavior.

  2. If MAG says absolutely no science may be produced outside the CIC-filtered support, then I should remove the extrapolate=True changes from the filtered interpolation methods. That would shrink this PR and guarantee no post-CIC extrapolated samples leave interpolation, but T016 will go back to failing by one row in each affected product because the available burst data does not contain enough CIC-supported samples to produce that final boundary row.

So I need MAG direction before changing this further: should the T016 gapless-output expectation win for this one bounded edge sample, or should the no-extrapolated-science rule win even though that means T016 is one row short with the current validation inputs/expected output?

Comment thread imap_processing/mag/constants.py Outdated
FILLVAL = -1e31
# Relative tolerance (7.5%) for L1C cadence checks; allows small clock-drift
# variation around the expected cadence before a spacing is treated as a gap.
L1C_CADENCE_TOLERANCE = 0.075
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the units of this? Percentage of what? Might be useful to see what this actually is for 1/2/4/8Hz products. We are pretty sensitive to timestamp jitter (appears as sin waves in our data due to s/c spin)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a percentage of gap, which is defined as expected_gap = 1 / vectors_per_second * 1e9. So this attempts to compensate for very small amounts of timestamp jitter which may appear as a data gap to a more strict gap-checking system.

This is already in the existing code as a magic number. I believe we landed on that percentage when meeting in person about L1C to compensate for potential clock drift.

@sapols, can you expand the comment a little to address Alastair's question?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And possibly rename to something like, L1C_TIMESTAMP_GAP_TOLERANCE?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing! Renamed to L1C_TIMESTAMP_GAP_TOLERANCE and updated the comment to:

# Relative tolerance for L1C timestamp-gap checks; allows small clock-drift
# variation around the expected cadence before a spacing is treated as a gap.
# This is 7.5% of expected_gap = 1e9 / vectors_per_second ns
# (75, 37.5, 18.75, or 9.375 ms at 1, 2, 4, or 8 Hz, respectively).
L1C_TIMESTAMP_GAP_TOLERANCE = 0.075

@alastairtree
Copy link
Copy Markdown
Collaborator

Thanks for this. Have added a couple of comments inline. Feel free to send over some CDFs generated with this to the MAG team and we can take a look at the results if that helps.

Copy link
Copy Markdown
Contributor

@maxinelasp maxinelasp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, as long as we're able to address @alastairtree's concerns. I'll approve once we get Alastair's thumbs up, but no major changes requested from me :)
Thanks Shawn!!

Not required for this interpolation method.
extrapolate : bool, optional
Whether to allow extrapolation of output timestamps outside the range of input
timestamps. Default is False.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this concern - from our discussions, I believe Shawn that this will never extrapolate timestamps that end up in the final output, but maybe we could add another step in here that tracks and removes all extrapolated timestamps before returning anything, to ensure they are always cropped out. It's my understanding that in this method, the extrapolated values would never be passed out of the interpolation step - so it shouldn't affect anything to call remove_invalid_output_timestamps as a final step, right?

Comment thread imap_processing/mag/constants.py Outdated
FILLVAL = -1e31
# Relative tolerance (7.5%) for L1C cadence checks; allows small clock-drift
# variation around the expected cadence before a spacing is treated as a gap.
L1C_CADENCE_TOLERANCE = 0.075
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a percentage of gap, which is defined as expected_gap = 1 / vectors_per_second * 1e9. So this attempts to compensate for very small amounts of timestamp jitter which may appear as a data gap to a more strict gap-checking system.

This is already in the existing code as a magic number. I believe we landed on that percentage when meeting in person about L1C to compensate for potential clock drift.

@sapols, can you expand the comment a little to address Alastair's question?

)


def _find_rate_segments(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Comment thread imap_processing/mag/constants.py Outdated
FILLVAL = -1e31
# Relative tolerance (7.5%) for L1C cadence checks; allows small clock-drift
# variation around the expected cadence before a spacing is treated as a gap.
L1C_CADENCE_TOLERANCE = 0.075
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And possibly rename to something like, L1C_TIMESTAMP_GAP_TOLERANCE?

)
assert np.array_equal(output_rate_gap, expected_rate_gap)

# Test Case: Adjacent gaps share a real epoch boundary at gap[0].
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like this test case

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants