Skip to content

Fix DAG auto-pause ordering to use run_after#65207

Merged
ephraimbuddy merged 4 commits into
apache:mainfrom
astronomer:fix-max-consecutive-failed-dagruns
Apr 17, 2026
Merged

Fix DAG auto-pause ordering to use run_after#65207
ephraimbuddy merged 4 commits into
apache:mainfrom
astronomer:fix-max-consecutive-failed-dagruns

Conversation

@ephraimbuddy

Copy link
Copy Markdown
Contributor

Auto-pausing for consecutive failed DAG runs should evaluate the most recent runs by run_after, not by logical_date. This fixes cases where manual runs or other non-chronological logical dates could cause
the scheduler to count the wrong runs and pause or not pause a DAG incorrectly.

closes: #65125

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes DAG auto-pause evaluation for consecutive failed runs by ordering recent DagRun records using run_after (with deterministic tie-breaker) instead of logical_date, avoiding incorrect behavior when logical_date is nullable or non-chronological.

Changes:

  • Update _check_last_n_dagruns_failed to order DagRun selection by run_after DESC, id DESC.
  • Add unit tests intended to validate auto-pause behavior when run_after and logical_date ordering differ.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
airflow-core/src/airflow/models/dagrun.py Switch consecutive-failure evaluation ordering to run_after (plus id tie-break) instead of logical_date.
airflow-core/tests/unit/models/test_dag.py Add tests to cover ordering-sensitive auto-pause scenarios.
Comments suppressed due to low confidence (1)

airflow-core/src/airflow/models/dagrun.py:872

  • There is a standalone triple-quoted string (""" Marking dag as paused, if needed""") after the query. This isn’t a docstring (it’s a no-op string literal) and is easy to mistake for documentation. Please convert it to a normal comment or remove it.
            .order_by(DagRun.run_after.desc(), DagRun.id.desc())
            .limit(max_consecutive_failed_dag_runs)
        ).all()
        """ Marking dag as paused, if needed"""
        to_be_paused = len(dag_runs) >= max_consecutive_failed_dag_runs and all(

Comment thread airflow-core/src/airflow/models/dagrun.py
Comment thread airflow-core/tests/unit/models/test_dag.py Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread airflow-core/tests/unit/models/test_dag.py Outdated
Comment thread airflow-core/tests/unit/models/test_dag.py
Comment thread airflow-core/tests/unit/models/test_dag.py Outdated
ephraimbuddy and others added 4 commits April 17, 2026 14:58
Auto-pausing for consecutive failed DAG runs should
evaluate the most recent runs by run_after, not by
logical_date. This fixes cases where manual runs or
other non-chronological logical dates could cause
the scheduler to count the wrong runs and pause or
not pause a DAG incorrectly.

closes: apache#65125
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@ephraimbuddy ephraimbuddy force-pushed the fix-max-consecutive-failed-dagruns branch from 8e1736c to 2e33037 Compare April 17, 2026 14:02
@ephraimbuddy ephraimbuddy merged commit f1cd3f9 into apache:main Apr 17, 2026
78 checks passed
@ephraimbuddy ephraimbuddy deleted the fix-max-consecutive-failed-dagruns branch April 17, 2026 20:28
github-actions Bot pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Apr 17, 2026
* Fix DAG auto-pause ordering to use run_after

Auto-pausing for consecutive failed DAG runs should
evaluate the most recent runs by run_after, not by
logical_date. This fixes cases where manual runs or
other non-chronological logical dates could cause
the scheduler to count the wrong runs and pause or
not pause a DAG incorrectly.

closes: apache#65125

* fixup! Fix DAG auto-pause ordering to use run_after

* Update airflow-core/tests/unit/models/test_dag.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* apply suggestions from code review

---------
(cherry picked from commit f1cd3f9)

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
vatsrahul1001 added a commit that referenced this pull request May 15, 2026
* Fix DAG auto-pause ordering to use run_after

Auto-pausing for consecutive failed DAG runs should
evaluate the most recent runs by run_after, not by
logical_date. This fixes cases where manual runs or
other non-chronological logical dates could cause
the scheduler to count the wrong runs and pause or
not pause a DAG incorrectly.

closes: #65125

* fixup! Fix DAG auto-pause ordering to use run_after

* Update airflow-core/tests/unit/models/test_dag.py



* apply suggestions from code review

---------


(cherry picked from commit f1cd3f9)

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
vatsrahul1001 added a commit that referenced this pull request May 20, 2026
* Fix DAG auto-pause ordering to use run_after

Auto-pausing for consecutive failed DAG runs should
evaluate the most recent runs by run_after, not by
logical_date. This fixes cases where manual runs or
other non-chronological logical dates could cause
the scheduler to count the wrong runs and pause or
not pause a DAG incorrectly.

closes: #65125

* fixup! Fix DAG auto-pause ordering to use run_after

* Update airflow-core/tests/unit/models/test_dag.py



* apply suggestions from code review

---------


(cherry picked from commit f1cd3f9)

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
vatsrahul1001 added a commit that referenced this pull request May 20, 2026
* Fix DAG auto-pause ordering to use run_after

Auto-pausing for consecutive failed DAG runs should
evaluate the most recent runs by run_after, not by
logical_date. This fixes cases where manual runs or
other non-chronological logical dates could cause
the scheduler to count the wrong runs and pause or
not pause a DAG incorrectly.

closes: #65125

* fixup! Fix DAG auto-pause ordering to use run_after

* Update airflow-core/tests/unit/models/test_dag.py



* apply suggestions from code review

---------


(cherry picked from commit f1cd3f9)

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
vatsrahul1001 added a commit that referenced this pull request May 21, 2026
* Fix DAG auto-pause ordering to use run_after

Auto-pausing for consecutive failed DAG runs should
evaluate the most recent runs by run_after, not by
logical_date. This fixes cases where manual runs or
other non-chronological logical dates could cause
the scheduler to count the wrong runs and pause or
not pause a DAG incorrectly.

closes: #65125

* fixup! Fix DAG auto-pause ordering to use run_after

* Update airflow-core/tests/unit/models/test_dag.py



* apply suggestions from code review

---------


(cherry picked from commit f1cd3f9)

Co-authored-by: Ephraim Anierobi <splendidzigy24@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scheduler: max_consecutive_failed_dag_runs broken by NULL logical_date

3 participants