Start porting mapped task to SDK#45627
Merged
Conversation
Member
Author
Mypy is seriously unhappy. Oh well.
6afcde8 to
7178c24
Compare
7178c24 to
29e8600
Compare
Member
Author
Oh also, singledispatch and singledispatchmethod don't play great with type hints in 3.9. Worked around that easily enough now though.
kaxil
reviewed
Jan 14, 2025
kaxil
reviewed
Jan 14, 2025
kaxil
left a comment
Member
First pass, will do a more detailed look in an hour
kaxil
reviewed
Jan 14, 2025
kaxil
approved these changes
Jan 14, 2025
kaxil
left a comment
Member
Few comments but the code looks good, minor adjustments needed to get tests passing
5b2702f to
8967c4b
Compare
8967c4b to
6f57645
Compare
029181b to
f239adf
Compare
eb93960 to
387c125
Compare
387c125 to
c6c52f0
Compare
jscheffl
reviewed
Jan 20, 2025
jscheffl
reviewed
Jan 20, 2025
c6c52f0 to
183bb00
Compare
kaxil
reviewed
Jan 21, 2025
This PR restructures the Mapped Operator and Mapped Task Group code to live in the Task SDK at definition time.

The big thing this change _does not do_ is make it possible to execute mapped tasks via the Task Execution API server etc -- that is up next.

There were some unavoidable changes to the scheduler/expansion part of mapped tasks here. Of note:

`BaseOperator.get_mapped_ti_count` has moved from an instance method on BaseOperator to a class method. The reason is that, with more and more of the "definition time" code moving into the TaskSDK BaseOperator and AbstractOperator, it is no longer possible to add DB-accessing code to a base class and have it apply to the subclasses (i.e. `airflow.models.abstractoperator.AbstractOperator` is now _not always_ in the MRO for tasks. Eventually that class will be deleted, but not yet).

In a similar vein, XComArg's `get_task_map_length` has also moved to a single dispatch class method on the TaskMap model, since the definition-time objects now live in the TaskSDK and there is no realistic way to get a per-type subclass with DB logic (i.e. it's very complex to end up with a PlainDBXComArg, a MapDBXComArg, etc. that we can attach the method to).

For those who aren't aware, singledispatch (and singledispatchmethod) are part of the standard library's `functools` module: the type of the first argument is used to determine which implementation to call. If you are familiar with C++ or Java, this is very similar to method overloading; the one caveat is that it _only_ examines the type of the first argument, not the full signature.
183bb00 to
0b4b7a0
Compare
Member
#protm
dauinh
pushed a commit
to dauinh/airflow
that referenced
this pull request
Jan 24, 2025
gpathak128
pushed a commit
to gpathak128/airflow
that referenced
this pull request
Jan 29, 2025
got686-yandex
pushed a commit
to got686-yandex/airflow
that referenced
this pull request
Jan 30, 2025
niklasr22
pushed a commit
to niklasr22/airflow
that referenced
this pull request
Feb 8, 2025

This PR restructures the Mapped Operator and Mapped Task Group code to live in
the Task SDK at definition time.
The big thing this change does not do is make it possible to execute mapped
tasks via the Task Execution API server etc -- that is up next (#44360).
There were some unavoidable changes to the scheduler/expansion part of mapped
tasks here. Of note:
- `BaseOperator.get_mapped_ti_count` has moved from an instance method on
  `BaseOperator` to a class method. The reason is that, with more and more
  of the "definition time" code moving into the TaskSDK BaseOperator
  and AbstractOperator, it is no longer possible to add DB-accessing code to a
  base class and have it apply to the subclasses (i.e.
  `airflow.models.abstractoperator.AbstractOperator` is now _not always_ in the
  MRO for tasks. Eventually that class will be deleted, but not yet).
- In a similar vein, XComArg's `get_task_map_length` has also moved to a single
  dispatch class method on the TaskMap model, since the definition-time
  objects now live in the TaskSDK and there is no realistic way to get a per-type
  subclass with DB logic (i.e. it's very complex to end up with a
  PlainDBXComArg, a MapDBXComArg, etc. that we can attach the method to).
For those who aren't aware, `singledispatch` (and `singledispatchmethod`) are
part of the standard library's `functools` module: the type of the first
argument (for `singledispatchmethod`, the first argument after `self` or `cls`)
is used to determine which implementation to call. If you are familiar with
C++ or Java, this is very similar to method overloading; the one caveat is that
it only examines the type of the first argument, not the full signature.
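For anyone who wants to see the first-argument-only behaviour concretely, here is a minimal standalone sketch using plain `functools.singledispatch`. The `describe` function and its registered types are made up for illustration and are not part of this PR.

```python
from functools import singledispatch


@singledispatch
def describe(value, extra=None):
    """Fallback used when no more specific implementation is registered."""
    return "object"


@describe.register(int)
def _(value, extra=None):
    return "int"


@describe.register(list)
def _(value, extra=None):
    return "list"


# The implementation is chosen from the type of the first argument.
results = [describe(1), describe([1, 2]), describe("text")]

# Subclasses fall back to the closest registered base class: bool -> int.
bool_result = describe(True)

# The type of the second argument is never consulted, so passing a list
# there does not select the list implementation.
second_arg_ignored = describe(1.5, [1, 2])
```

Here `results` ends up as `["int", "list", "object"]`, `bool_result` is `"int"`, and `second_arg_ignored` is `"object"`, because `float` has no registered implementation and the second argument plays no part in dispatch.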
The long term goal here is to have a clean separation between "runtime/definition time" behaviour (i.e. creating mapped tasks, or running a mapped task) and expanding a mapped task (which is a scheduling-time operation only)
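As a rough illustration of that separation (not the code in this PR), the sketch below keeps two definition-time classes free of any lookup logic and puts the per-type behaviour on a separate scheduler-side class via a single dispatch class method. All names (`PlainArg`, `MapArg`, `LengthStore`, `length_of`) are hypothetical, and a dict stands in for the database. The explicit `register(SomeType)` form is used rather than annotation-based registration, which is an assumption made here to sidestep the type-hint issues mentioned earlier in the conversation.

```python
from functools import singledispatchmethod


# Hypothetical definition-time objects: plain data, no DB access.
class PlainArg:
    def __init__(self, task_id):
        self.task_id = task_id


class MapArg:
    def __init__(self, inner):
        self.inner = inner


class LengthStore:
    """Hypothetical scheduler-side class that owns the lookup logic."""

    # A dict stands in for a database table in this sketch.
    lengths = {"extract": 3}

    @singledispatchmethod
    @classmethod
    def length_of(cls, arg):
        # Fallback for types with no registered implementation.
        raise NotImplementedError(f"No length lookup for {type(arg).__name__}")

    @length_of.register(PlainArg)
    @classmethod
    def _(cls, arg):
        return cls.lengths.get(arg.task_id)

    @length_of.register(MapArg)
    @classmethod
    def _(cls, arg):
        # In this sketch, a mapped argument has the length of what it wraps.
        return cls.length_of(arg.inner)


plain_length = LengthStore.length_of(PlainArg("extract"))
mapped_length = LengthStore.length_of(MapArg(PlainArg("extract")))
```

The point of the design is that the definition-time classes never need a DB-aware subclass: the scheduler-side class picks the right implementation from the type of the object it is handed.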