GKEStartJobOperator's job_poll_interval parameter is not used by its GKEJobTrigger #41705

Description

@bwatan

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

apache-airflow-providers-google == 10.21.0

Apache Airflow version

2.9.3

Operating System

linux/arm64

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened

My DAG kicks off a Kubernetes job using a GKEStartJobOperator. The operator has the following parameters set:

  • deferrable=True
  • wait_until_job_complete=True
  • job_poll_interval=60

When the task created from the GKEStartJobOperator is executed, the deferred task polls the job every 10 seconds and logs the message: "The job 'name-of-my-job' is incomplete. Sleeping for 10 sec."

What you think should happen instead

In the above example, the deferred task should poll the job every 60 seconds, not every 10 seconds.

This happens because GKEStartJobOperator's job_poll_interval is not forwarded by GKEJobTrigger when it calls AsyncKubernetesHook.wait_until_job_complete, so that function falls back to its default poll_interval of 10 seconds.

The faulty code is in airflow/airflow/providers/google/cloud/triggers/kubernetes_engine.py in the method GKEJobTrigger.run on line 320.
This:
job: V1Job = await self.hook.wait_until_job_complete(name=self.job_name, namespace=self.job_namespace)
Should be this:
job: V1Job = await self.hook.wait_until_job_complete(name=self.job_name, namespace=self.job_namespace, poll_interval=self.poll_interval)
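The effect of the missing argument can be illustrated with a minimal, self-contained sketch. The classes below (StandInHook, StandInTrigger) and the forward_interval flag are hypothetical stand-ins written for this illustration, not the provider's code. The only things they take from this report are the call shape of wait_until_job_complete and its default poll_interval of 10.

```python
import asyncio


class StandInHook:
    """Hypothetical stand-in for the async hook; it only records the interval it receives."""

    def __init__(self):
        self.seen_poll_interval = None

    async def wait_until_job_complete(self, name, namespace, poll_interval=10):
        # Record what the caller passed (or the default) instead of really polling.
        self.seen_poll_interval = poll_interval
        await asyncio.sleep(0)
        return {"name": name, "namespace": namespace}


class StandInTrigger:
    """Hypothetical stand-in for the trigger; forward_interval switches between both call styles."""

    def __init__(self, hook, job_name, job_namespace, poll_interval, forward_interval):
        self.hook = hook
        self.job_name = job_name
        self.job_namespace = job_namespace
        self.poll_interval = poll_interval
        self.forward_interval = forward_interval

    async def run(self):
        if self.forward_interval:
            # Proposed call: the configured interval reaches the hook.
            return await self.hook.wait_until_job_complete(
                name=self.job_name,
                namespace=self.job_namespace,
                poll_interval=self.poll_interval,
            )
        # Current call: poll_interval is omitted, so the hook's default applies.
        return await self.hook.wait_until_job_complete(
            name=self.job_name, namespace=self.job_namespace
        )


def observed_interval(forward_interval, poll_interval=60):
    """Run the stand-in trigger once and return the interval the hook actually saw."""
    hook = StandInHook()
    trigger = StandInTrigger(hook, "name-of-my-job", "default", poll_interval, forward_interval)
    asyncio.run(trigger.run())
    return hook.seen_poll_interval
```

With these stand-ins, observed_interval(False) returns the hook's default and observed_interval(True) returns the configured value, which mirrors the 10-second versus 60-second behavior described above.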

How to reproduce

  1. In GCP, create a GKE cluster with a default node pool.
  2. Create a GCP service account with the roles/container.developer role.
  3. Create a connection in Airflow that uses that service account.
  4. Create a DAG that uses a GKEStartJobOperator.
  5. Configure the GKE job to run for 60 seconds.
  6. On the GKEStartJobOperator, set deferrable=True, wait_until_job_complete=True, and job_poll_interval=20.
  7. Execute the DAG. Verify that the Kubernetes job task is deferred at some point.
  8. View the logs for the Kubernetes job task. Verify that the task polled the job every 10 seconds, not every 20 seconds.
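Steps 4 to 6 could look roughly like the DAG sketch below. It is an untested illustration. The DAG id, project, location, cluster name, connection id, image, and command are placeholder assumptions, not values from this report. Only deferrable, wait_until_job_complete, and job_poll_interval come from the steps above.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.kubernetes_engine import GKEStartJobOperator

with DAG(
    dag_id="gke_job_poll_interval_repro",  # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule=None,  # trigger manually
    catchup=False,
) as dag:
    start_job = GKEStartJobOperator(
        task_id="start_job",
        project_id="my-gcp-project",      # placeholder
        location="us-central1",           # placeholder
        cluster_name="my-gke-cluster",    # placeholder
        gcp_conn_id="my_gke_connection",  # placeholder: the connection from step 3
        namespace="default",
        name="name-of-my-job",
        image="busybox",                  # placeholder image
        cmds=["sh", "-c", "sleep 60"],    # step 5: job runs for about 60 seconds
        deferrable=True,
        wait_until_job_complete=True,
        job_poll_interval=20,             # expected interval; the logs show 10 instead
    )
```

If the bug is present, the deferred task's log should show "Sleeping for 10 sec." even though job_poll_interval is set to 20.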

Anything else

The docstring for GKEStartJobOperator says that the poll interval parameter is called poll_interval, but the __init__ signature names it job_poll_interval. The docstring should be changed to say job_poll_interval.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels: area:providers, kind:bug, needs-triage, provider:google