Use execution_date to check for existing DagRun for TriggerDagRunOperator #18968
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
As a side note, we are moving toward decoupling DAG runs against
Actually, DagRun has a unique constraint on both
airflow/airflow/api_connexion/endpoints/dag_run_endpoint.py, lines 256 to 263 in 86a2a19
      dag_id: Optional[Union[str, List[str]]] = None,
      run_id: Optional[str] = None,
-     execution_date: Optional[datetime] = None,
+     execution_date: Optional[Union[datetime, List[datetime]]] = None,
Not related to this PR, but I noticed that this did not match the type in the docstring.
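To make the annotation change concrete, here is a minimal sketch of a lookup that accepts either a single datetime or a list of them for `execution_date`. The `Run` dataclass and the `find_runs` function are hypothetical stand-ins written for illustration only; this is not Airflow's `DagRun.find` implementation, and the run IDs are made-up sample values.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Union


@dataclass
class Run:
    # Hypothetical stand-in for a DAG run record (not Airflow's DagRun model).
    dag_id: str
    run_id: str
    execution_date: datetime


def find_runs(
    runs: List[Run],
    dag_id: Optional[str] = None,
    execution_date: Optional[Union[datetime, List[datetime]]] = None,
) -> List[Run]:
    """Filter runs; execution_date may be one datetime or a list of them."""
    result = runs
    if dag_id is not None:
        result = [r for r in result if r.dag_id == dag_id]
    if execution_date is not None:
        # Normalize the single-value case so both forms share one code path.
        dates = execution_date if isinstance(execution_date, list) else [execution_date]
        result = [r for r in result if r.execution_date in dates]
    return result


runs = [
    Run("example", "scheduled__2021-10-01", datetime(2021, 10, 1)),
    Run("example", "scheduled__2021-10-02", datetime(2021, 10, 2)),
]

# A single datetime and a list of datetimes are both accepted.
single = find_runs(runs, dag_id="example", execution_date=datetime(2021, 10, 1))
several = find_runs(runs, execution_date=[datetime(2021, 10, 1), datetime(2021, 10, 2)])
```

Annotating the parameter as `Optional[Union[datetime, List[datetime]]]` tells callers and type checkers that both call styles are valid, which is the mismatch the comment above points out.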
Hi @uranusjr, thanks for taking the time to review this PR and for the suggestion.
@@ -270,7 +270,7 @@ def next_dagruns_to_examine(
      def find(
@uranusjr, while we're updating this file, should we also convert this function into a classmethod?
Awesome work, congrats on your first merged pull request!
…DagRunOperator`` (#18968)

A small suggestion to change `DagRun.find` in `trigger_dag` to use `execution_date` as a parameter rather than `run_id`.

I feel it would be better to use this rather than `run_id` as a parameter since using `run_id` will miss out checking for a scheduled run that ran at the same `execution_date` and throw the error below when it tries to create a new run with the same `execution_date`:

```
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_run_dag_id_execution_date_key"
```

There is a constraint in `dag_run` called `dag_run_dag_id_execution_date_key` which can be found [here](https://github.com/apache/airflow/blob/c4f5233cd10ae03ee69fba861c8a6fa64e1f8a71/airflow/models/dagrun.py#L103).

(cherry picked from commit e54ee6e)
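A small suggestion to change `DagRun.find` in `trigger_dag` to use `execution_date` as a parameter rather than `run_id`.

I feel it would be better to use this rather than `run_id` as a parameter since using `run_id` will miss out checking for a scheduled run that ran at the same `execution_date` and throw the error below when it tries to create a new run with the same `execution_date`:

`sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_run_dag_id_execution_date_key"`

There is a constraint in `dag_run` called `dag_run_dag_id_execution_date_key` which can be found [here](https://github.com/apache/airflow/blob/c4f5233cd10ae03ee69fba861c8a6fa64e1f8a71/airflow/models/dagrun.py#L103).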
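The failure mode described above can be reproduced in miniature with the standard library's `sqlite3` module. This is only a sketch under stated assumptions: the table is a simplified stand-in for `dag_run` with just three columns, the helper functions `exists_by_run_id` and `exists_by_execution_date` are hypothetical, the run IDs are illustrative values, and SQLite raises `sqlite3.IntegrityError` rather than the psycopg2 error quoted in the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the dag_run table; only the constraint named in
# the description is modelled here.
conn.execute(
    """
    CREATE TABLE dag_run (
        dag_id TEXT NOT NULL,
        run_id TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        CONSTRAINT dag_run_dag_id_execution_date_key UNIQUE (dag_id, execution_date)
    )
    """
)


def exists_by_run_id(dag_id: str, run_id: str) -> bool:
    # Hypothetical helper: the check keyed on run_id.
    row = conn.execute(
        "SELECT 1 FROM dag_run WHERE dag_id = ? AND run_id = ?", (dag_id, run_id)
    ).fetchone()
    return row is not None


def exists_by_execution_date(dag_id: str, execution_date: str) -> bool:
    # Hypothetical helper: the check keyed on execution_date.
    row = conn.execute(
        "SELECT 1 FROM dag_run WHERE dag_id = ? AND execution_date = ?",
        (dag_id, execution_date),
    ).fetchone()
    return row is not None


date = "2021-10-01T00:00:00"
# A scheduled run already exists for this execution_date.
conn.execute(
    "INSERT INTO dag_run VALUES (?, ?, ?)", ("example", f"scheduled__{date}", date)
)

manual_run_id = f"manual__{date}"
# The run_id check does not see the scheduled run, because the IDs differ.
missed = exists_by_run_id("example", manual_run_id)
# The execution_date check does see it.
caught = exists_by_execution_date("example", date)

# Trusting the run_id check and inserting anyway violates the constraint.
try:
    conn.execute(
        "INSERT INTO dag_run VALUES (?, ?, ?)", ("example", manual_run_id, date)
    )
    collided = False
except sqlite3.IntegrityError:
    collided = True
```

Because the uniqueness rule is defined on `execution_date`, a pre-check keyed on the same column is the one that can detect the conflict before the insert is attempted.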