Under which category would you file this issue?
Airflow Core
Apache Airflow version
3.2.2
What happened and how to reproduce it?
Again more slop but hopefully useful enough.
<🤖>
When a backfill is created for a DAG with fast-completing tasks (sub-second per run),
the scheduler marks the backfill as complete before all queued runs have been executed.
The root cause is in _mark_backfills_complete (scheduler_job_runner.py ~line 1967),
which runs every 30 seconds and marks a backfill complete when no dag runs are in
running or queued state:
~exists(
select(DagRun.id).where(
and_(DagRun.backfill_id == Backfill.id, DagRun.state.in_(unfinished_states))
)
)
When tasks complete faster than the scheduler's next scheduling loop can queue new
runs, there is a window where all current runs are success and the next batch has
not yet been dispatched. The completion check fires in this window and incorrectly
marks the backfill done, leaving remaining queued runs stranded.
To reproduce:
- Create a DAG with a no-op task and a long date range:
from airflow.sdk import dag, task
from datetime import datetime
@dag(
dag_id="test_backfill_bug",
schedule="@daily",
start_date=datetime(2020, 1, 1),
end_date=datetime(2022, 12, 31),
catchup=False,
)
def test_backfill_bug():
@task
def noop():
pass
noop()
test_backfill_bug()
- Create a backfill:
airflow backfill create \
--dag-id test_backfill_bug \
--from-date 2020-01-01 \
--to-date 2022-12-31 \
--max-active-runs 10
- Observe that the backfill completes having only processed a fraction of the 1096 runs:
import sqlite3, os
conn = sqlite3.connect(os.path.expanduser('~/airflow/airflow.db'))
cur = conn.cursor()
cur.execute('SELECT id, completed_at FROM backfill WHERE dag_id="test_backfill_bug"')
b_id, completed_at = cur.fetchone()
cur.execute('SELECT state, COUNT(*) FROM dag_run WHERE backfill_id=? GROUP BY state', (b_id,))
print('completed_at:', completed_at)
for row in cur.fetchall(): print(row)
conn.close()
Observed output:
completed_at: 2026-06-18 03:42:47.759611
('success', 441)
('queued', 455) <- remaining runs never executed
What you think should happen instead?
The backfill should only be marked complete when all dag runs associated with it have
reached a terminal state (success or failed), regardless of whether there is a
momentary window where none are running or queued.
A possible fix: check that the count of terminal dag runs equals the total
BackfillDagRun associations (excluding skipped entries) before marking complete.
Operating System
macOS
Deployment
Virtualenv installation
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
No response
Official Helm Chart version
Not Applicable
Kubernetes Version
No response
Helm Chart configuration
No response
Docker Image customizations
No response
Anything else?
Note: PR #62561 (merged in 3.2.2) fixed a related but distinct issue where a backfill
was marked complete before any dag runs were created (zero-runs race). This issue
occurs after dag runs are created and begin executing — the completion window opens
between scheduling batches when tasks complete faster than new ones are dispatched.
Related issue: the SQLite database is locked error (see
apache/airflow#68699) can cause fewer
dag runs to be created than expected, which makes this bug easier to trigger since
fewer runs complete faster.
Are you willing to submit PR?
Code of Conduct
Under which category would you file this issue?
Airflow Core
Apache Airflow version
3.2.2
What happened and how to reproduce it?
Again more slop but hopefully useful enough.
<🤖>When a backfill is created for a DAG with fast-completing tasks (sub-second per run),
the scheduler marks the backfill as complete before all queued runs have been executed.
The root cause is in
_mark_backfills_complete(scheduler_job_runner.py~line 1967),which runs every 30 seconds and marks a backfill complete when no dag runs are in
runningorqueuedstate:When tasks complete faster than the scheduler's next scheduling loop can queue new
runs, there is a window where all current runs are
successand the next batch hasnot yet been dispatched. The completion check fires in this window and incorrectly
marks the backfill done, leaving remaining queued runs stranded.
To reproduce:
Observed output:
What you think should happen instead?
The backfill should only be marked complete when all dag runs associated with it have
reached a terminal state (
successorfailed), regardless of whether there is amomentary window where none are
runningorqueued.A possible fix: check that the count of terminal dag runs equals the total
BackfillDagRunassociations (excluding skipped entries) before marking complete.Operating System
macOS
Deployment
Virtualenv installation
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
No response
Official Helm Chart version
Not Applicable
Kubernetes Version
No response
Helm Chart configuration
No response
Docker Image customizations
No response
Anything else?
Note: PR #62561 (merged in 3.2.2) fixed a related but distinct issue where a backfill
was marked complete before any dag runs were created (zero-runs race). This issue
occurs after dag runs are created and begin executing — the completion window opens
between scheduling batches when tasks complete faster than new ones are dispatched.
Related issue: the SQLite
database is lockederror (seeapache/airflow#68699) can cause fewer
dag runs to be created than expected, which makes this bug easier to trigger since
fewer runs complete faster.
Are you willing to submit PR?
Code of Conduct