
Fix catchup by limiting queued dagrun creation using max_active_runs #18897

Merged
jedcunningham merged 3 commits into apache:main from astronomer:fix-catchup on Oct 20, 2021

Conversation

@ephraimbuddy

Contributor

Currently, when catchup is True, we create a lot of dagruns, limited only by the
max_queued_runs_per_dag setting. This is inefficient because some dagruns take
longer to run than others.

This PR brings back the old behaviour of not creating dagruns once max_active_runs
is reached, thereby solving the catchup issue.

Now, the dagruns appear as though they were created in the running state.

Closes: #18487
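
For readers following along, the restored behaviour can be illustrated with a toy model. This is only a sketch, not the scheduler code from this PR: `can_create_run`, `simulate_catchup`, and the plain string states are hypothetical names chosen for the example, and it assumes that both queued and running runs count towards max_active_runs.

```python
from typing import List

# Assumption for this sketch: queued and running runs both count as "active".
ACTIVE_STATES = {"queued", "running"}


def can_create_run(run_states: List[str], max_active_runs: int) -> bool:
    """Return True while the DAG is still below its active-run cap."""
    active = sum(state in ACTIVE_STATES for state in run_states)
    return active < max_active_runs


def simulate_catchup(
    pending_dates: List[str], max_active_runs: int, run_states: List[str]
) -> List[str]:
    """Create runs for pending dates, oldest first, until the cap is reached."""
    states = list(run_states)  # copy so the caller's list is not mutated
    created = []
    for date in pending_dates:
        if not can_create_run(states, max_active_runs):
            break  # the remaining dates wait until an active run finishes
        states.append("queued")
        created.append(date)
    return created
```

With one run already running and a cap of 2, only one of four pending dates gets a run; the other three are not created until a slot frees up, instead of piling up as queued runs.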



@boring-cyborg boring-cyborg Bot added the area:Scheduler including HA (high availability) scheduler label Oct 11, 2021
Comment thread airflow/config_templates/config.yml Outdated
@ephraimbuddy

Contributor Author

Most of the code here is from previous versions.

ashb
ashb previously requested changes Oct 12, 2021

@ashb ashb left a comment

Member


This needs more tests please:

  • Check that when there are n-1 dag_runs and we call _schedule_dag_run to create the n-th dag run, the relevant date columns are set to null
  • Check that when at max active runs (including queued), parsing the dag does not reset the columns
  • Check that when a dag run was in the running state and its final task has completed, _schedule_dag_runs correctly sets the next_dagrun columns
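
A rough sketch of what these three checks could look like, written against a toy model rather than the real scheduler. `ToyDagModel` and `calculate_next_dagrun` are hypothetical stand-ins invented for this example, the daily schedule is an assumption, and the actual tests added in the PR exercise the scheduler job and DagModel directly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ToyDagModel:
    """Hypothetical stand-in for the DAG row; not Airflow's DagModel."""

    max_active_runs: int
    next_dagrun: Optional[datetime] = None
    next_dagrun_create_after: Optional[datetime] = None


def calculate_next_dagrun(
    model: ToyDagModel, active_runs: int, following: datetime
) -> None:
    """Clear the date columns at the cap; otherwise point them at the next run."""
    if active_runs >= model.max_active_runs:
        model.next_dagrun = None
        model.next_dagrun_create_after = None
    else:
        model.next_dagrun = following
        # Assumes a daily schedule: the run is created once its interval ends.
        model.next_dagrun_create_after = following + timedelta(days=1)


NEXT = datetime(2021, 10, 12)


def test_creating_nth_run_clears_columns():
    model = ToyDagModel(max_active_runs=3, next_dagrun=datetime(2021, 10, 11))
    # n-1 runs were active; creating the n-th run brings the count to the cap.
    calculate_next_dagrun(model, active_runs=2 + 1, following=NEXT)
    assert model.next_dagrun is None
    assert model.next_dagrun_create_after is None


def test_parsing_at_cap_keeps_columns_null():
    model = ToyDagModel(max_active_runs=3)
    # Re-parsing with 2 running + 1 queued run must not repopulate the columns.
    calculate_next_dagrun(model, active_runs=3, following=NEXT)
    assert model.next_dagrun is None
    assert model.next_dagrun_create_after is None


def test_finished_run_restores_columns():
    model = ToyDagModel(max_active_runs=3)
    # One of three runs finished, so a slot is free again.
    calculate_next_dagrun(model, active_runs=2, following=NEXT)
    assert model.next_dagrun == NEXT
    assert model.next_dagrun_create_after == NEXT + timedelta(days=1)
```

The point of routing all three cases through one function is that creation, parsing, and run completion cannot disagree about when the columns should be null.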

Comment thread airflow/jobs/scheduler_job.py Outdated
Comment thread airflow/jobs/scheduler_job.py Outdated
Comment thread airflow/models/dag.py Outdated
Comment thread airflow/models/dag.py Outdated
@ephraimbuddy ephraimbuddy added the full tests needed We need to run full set of tests for this PR to merge label Oct 19, 2021
Comment thread airflow/jobs/scheduler_job.py Outdated
Comment thread airflow/jobs/scheduler_job.py Outdated
Comment thread airflow/models/dagrun.py Outdated
Comment thread airflow/jobs/scheduler_job.py Outdated
Comment thread airflow/jobs/scheduler_job.py Outdated
@ashb ashb added this to the Airflow 2.2.1 milestone Oct 20, 2021
ephraimbuddy and others added 2 commits October 20, 2021 10:20
Currently, when catchup is True, we create a lot of dagruns, limited only by the
max_queued_runs_per_dag setting. This is inefficient because some dagruns take
longer to run than others.

This PR brings back the old behaviour of not creating dagruns once max_active_runs
is reached, thereby solving the catchup issue.

Now, the dagruns appear as though they were created in the running state.

improve code and add more tests

fixup! improve code and add more tests

Add information about removing of max_queued_runs_per_dag

add method for active runs

remove comment

deduplicate dag_ids

Update airflow/models/dagrun.py

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

Update airflow/jobs/scheduler_job.py

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

Update airflow/jobs/scheduler_job.py

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
Comment thread UPDATING.md
@jedcunningham jedcunningham dismissed ashb’s stale review October 20, 2021 20:32

Tests have been added and feedback addressed.

@jedcunningham jedcunningham merged commit 05eea00 into apache:main Oct 20, 2021
@jedcunningham jedcunningham deleted the fix-catchup branch October 20, 2021 20:33
jedcunningham pushed a commit that referenced this pull request Oct 20, 2021
…18897)

Currently, when catchup is True, we create a lot of dagruns, limited only by the
max_queued_runs_per_dag setting. This is inefficient because some dagruns take
longer to run than others.

This PR brings back the old behavior of not creating dagruns once max_active_runs
is reached, thereby solving the catchup issue.

Now, the dagruns appear as though they were created in the running state.

(cherry picked from commit 05eea00)
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Oct 26, 2021
…pache#18897)


(cherry picked from commit 05eea00)
sharon2719 pushed a commit to sharon2719/airflow that referenced this pull request Oct 27, 2021
…pache#18897)

jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Oct 27, 2021
…pache#18897)


(cherry picked from commit 05eea00)
kaxil pushed a commit that referenced this pull request Nov 1, 2021
Fix #19304, and also an issue on scheduling a DAG's first-ever run introduced in #18897. We could fix it outside this function, but if `next_dagrun` is None, the next run's data interval is supposed to be None in the first place, so checking inside this function just makes sense.

closes #19343
closes #19304
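
The guard described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the referenced commit: `ToyDagModel` and `next_data_interval` are hypothetical names, and the fixed one-day interval is assumed for simplicity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class ToyDagModel:
    """Hypothetical stand-in for the DAG row; only the column discussed here."""

    next_dagrun: Optional[datetime]


def next_data_interval(
    model: ToyDagModel, interval: timedelta = timedelta(days=1)
) -> Optional[Tuple[datetime, datetime]]:
    """Return (start, end) of the next run's data interval, or None."""
    # Check inside the function: with no next run there is no interval either,
    # so callers do not each need their own None check.
    if model.next_dagrun is None:
        return None
    return (model.next_dagrun, model.next_dagrun + interval)
```

Calling `next_data_interval(ToyDagModel(next_dagrun=None))` returns None rather than raising, which mirrors the case where next_dagrun has been cleared because the DAG is at max_active_runs.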
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Nov 1, 2021
)

Fix apache#19304, and also an issue on scheduling a DAG's first-ever run introduced in apache#18897. We could fix it outside this function, but if `next_dagrun` is None, the next run's data interval is supposed to be None in the first place, so checking inside this function just makes sense.

closes apache#19343
closes apache#19304

(cherry picked from commit dc4dcaa)
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Nov 1, 2021
)


(cherry picked from commit dc4dcaa)
kaxil pushed a commit that referenced this pull request Nov 2, 2021

(cherry picked from commit dc4dcaa)
jedcunningham pushed a commit that referenced this pull request Nov 3, 2021

(cherry picked from commit dc4dcaa)
@jedcunningham jedcunningham added the type:bug-fix Changelog: Bug Fixes label Apr 7, 2022

Labels

area:Scheduler including HA (high availability) scheduler
full tests needed We need to run full set of tests for this PR to merge
type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2.1.3/4 queued dag runs changes catchup=False behaviour

4 participants