Apache Airflow Provider(s)
google
Versions of Apache Airflow Providers
apache-airflow-providers-google==7.0.0
Apache Airflow version
2.3.2
Operating System
Debian GNU/Linux
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
Deferrable mode for BigQueryToGCSOperator #27683 changed the functionality of the BigQueryToGCSOperator so that it no longer waits for the completion of the operation. This is because the nowait=True parameter is now being set.
What you think should happen instead
This is unexpected behavior. Any downstream tasks of the BigQueryToGCSOperator that expect the CSVs to have been written by the time they are called may result in errors (and have done so in our own operations).
The property should at least be configurable.
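For illustration, here is a rough sketch of the blocking behavior we expected, written as a generic polling helper. This is not the provider's code: `wait_for_job`, `get_state`, `poll_interval` and `timeout` are hypothetical names, and the sketch assumes the job reports `DONE` as its only terminal state.

```python
import time
from typing import Callable, Tuple


def wait_for_job(
    get_state: Callable[[], str],
    poll_interval: float = 1.0,
    timeout: float = 600.0,
    terminal_states: Tuple[str, ...] = ("DONE",),
) -> str:
    """Poll get_state() until it returns a terminal state.

    Hypothetical helper: get_state stands in for whatever call reports
    the status of the extract job. Raises TimeoutError if the job has
    not reached a terminal state once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in terminal_states:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {state!r} after {timeout} seconds")
        time.sleep(poll_interval)
```

With something like this in place, a downstream task would only start once the export has actually finished, and a flag on the operator could decide whether the wait happens at all.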
How to reproduce
- Leverage the BigQueryToGCSOperator in your DAG.
- Have it write a large table to a CSV somewhere in GCS
- Notice that the task completes almost immediately but the CSVs may not exist in GCS until later.
Anything else
No response
Are you willing to submit PR?
Code of Conduct