Description
Support multinode job types in the AWS Batch Operator.
The boto3 submit_job method supports container, multinode, and array batch jobs through the mutually exclusive nodeOverrides and containerOverrides (+ arrayProperties) parameters. Currently, however, the AWS Batch Operator only supports submitting container jobs and array jobs, because it hardcodes the boto3 submit_job parameter containerOverrides:
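    containerOverrides=self.overrides,
(airflow/airflow/providers/amazon/aws/operators/batch.py, line 200 in 3c08cef)
and
    containerOverrides: Dict,
(airflow/airflow/providers/amazon/aws/hooks/batch_client.py, line 99 in 3c08cef)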
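A minimal sketch of how the submit_job arguments could be assembled so that either override type is accepted. The helper name build_submit_job_kwargs and its argument names are hypothetical and are not part of the current operator; only the boto3 parameter names (jobName, jobQueue, jobDefinition, containerOverrides, nodeOverrides, arrayProperties) are taken from the submit_job API.

```python
from typing import Any, Dict, Optional


def build_submit_job_kwargs(
    job_name: str,
    job_queue: str,
    job_definition: str,
    container_overrides: Optional[Dict[str, Any]] = None,
    node_overrides: Optional[Dict[str, Any]] = None,
    array_properties: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for boto3 submit_job."""
    # The two override parameters are mutually exclusive, so reject both at once.
    if container_overrides is not None and node_overrides is not None:
        raise ValueError(
            "containerOverrides and nodeOverrides are mutually exclusive"
        )

    kwargs: Dict[str, Any] = {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
    }
    # Pass exactly one override key, depending on the job type.
    if node_overrides is not None:
        kwargs["nodeOverrides"] = node_overrides
    elif container_overrides is not None:
        kwargs["containerOverrides"] = container_overrides
    # arrayProperties is only forwarded when the caller supplies it.
    if array_properties is not None:
        kwargs["arrayProperties"] = array_properties
    return kwargs
```

The resulting dictionary could then be passed to a boto3 Batch client as client.submit_job(**kwargs); the operator would need a new argument (for example a node_overrides argument, again hypothetical) to feed it.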
The get_job_awslogs_info method in the batch client hook is also hardcoded for the container type job:
    job_container_desc = self.get_job_description(job_id=job_id).get("container", {})
(airflow/airflow/providers/amazon/aws/hooks/batch_client.py, line 425 in 3c08cef)
To support multinode jobs, the get_job_awslogs_info method would need to access nodeProperties from the describe_jobs response.
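A rough sketch of what reading nodeProperties could look like. It assumes that, for multinode jobs, the describe_jobs job description carries one logConfiguration per entry of nodeProperties.nodeRangeProperties and one logStreamName per entry of attempts; the function name collect_log_sources and its return shape are hypothetical, and the sample dictionary in the usage note is made up for illustration rather than copied from a real response.

```python
from typing import Any, Dict, List


def collect_log_sources(job_desc: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Hypothetical helper: gather log configs and stream names for a job."""
    if "nodeProperties" in job_desc:
        # Multinode job: assume one log configuration per node range.
        node_ranges = job_desc["nodeProperties"].get("nodeRangeProperties", [])
        log_configs = [
            node_range.get("container", {}).get("logConfiguration", {})
            for node_range in node_ranges
        ]
        # Assume one log stream name per attempt.
        stream_names = [
            attempt.get("container", {}).get("logStreamName")
            for attempt in job_desc.get("attempts", [])
        ]
    else:
        # Container job: same lookup the hook performs today.
        container = job_desc.get("container", {})
        log_configs = [container.get("logConfiguration", {})]
        stream_names = [container.get("logStreamName")]

    # Drop attempts or containers that have no log stream yet.
    return {
        "log_configs": log_configs,
        "stream_names": [name for name in stream_names if name],
    }
```

For example, calling collect_log_sources({"nodeProperties": {"nodeRangeProperties": [{"targetNodes": "0:", "container": {"logConfiguration": {"logDriver": "awslogs"}}}]}, "attempts": [{"container": {"logStreamName": "example/stream"}}]}) would return one log configuration and one stream name, while a plain container job description falls through to the existing "container" lookup.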
Use case/motivation
Multinode jobs are a supported job type of AWS Batch, are supported by the underlying boto3 library, and should also be manageable from Airflow. I've extended the AWS Batch Operator for our own use cases, but would prefer not to maintain a separate operator.
Related issues
No response
Are you willing to submit a PR?
Code of Conduct