Change default log filename template to include map_index#21495
Example of current layout: |
uranusjr
left a comment
Generally lgtm except two minor issues
No, cos of this in BaseOperator init:
self.inlets: List = []
self.outlets: List = []
Or not. I was wrong and these aren't needed at all actually.
Was setting task.params here unintended or simply unnecessary?
I forget the error, but setting it here was giving some kind of error (I think from it being set twice on the same in-memory object, once in the supervisor, and once in the actual runner.)
So I took the approach that get_* should never have any side effects!
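A minimal sketch of the "getters have no side effects" idea, assuming a hypothetical `Task` class and a simplified `get_template_context` signature (neither is the real Airflow API). The merged params are built on a copy, so calling the function twice on the same in-memory object gives the same result and leaves `task.params` alone:

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Task:
    """Hypothetical stand-in for an operator; only `params` matters here."""

    params: Dict[str, Any] = field(default_factory=dict)


def get_template_context(task: Task, run_conf: Dict[str, Any]) -> Dict[str, Any]:
    """Build a context dict without touching `task.params`."""
    # Merge into a copy so the caller's task object is left unchanged.
    merged = deepcopy(task.params)
    merged.update(run_conf)
    return {"task": task, "params": merged}


task = Task(params={"retries": 1})
# Simulates the supervisor and the runner both asking for the context.
first = get_template_context(task, {"retries": 3})
second = get_template_context(task, {"retries": 3})
```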
With the recently added LogTemplate mechanism, old TIs will still use the format they had at creation time (with the change here to ensure that we create a LogTemplate row for the just-upgraded-in-place) so the logs can still be viewed in the UI. And since it was now getting quite "deep", I have chosen to "label" the components in the "hive partition style".
The `except AttributeError` was catching more than just "this handler doesn't have a set_context attribute": it _also_ caught errors raised while calling that function, which led to hard-to-track-down errors (missing inlets/outlets). I have also changed `get_template_context` to a side-effect-free function, so it no longer mutates task.params!
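The over-broad `except AttributeError` can be illustrated roughly as follows. The handler classes and both helper functions are hypothetical, not the actual Airflow code; the point is only that looking the method up first lets errors from the call itself surface:

```python
class BrokenHandler:
    """Hypothetical handler whose set_context fails internally."""

    def set_context(self, ti):
        # Simulates a bug inside the handler: `ti` has no `inlets`.
        return ti.inlets


class PlainHandler:
    """Hypothetical handler with no set_context method at all."""


def set_context_broad(handler, ti):
    # Anti-pattern: this also swallows AttributeErrors raised
    # *inside* set_context, hiding real bugs.
    try:
        handler.set_context(ti)
    except AttributeError:
        pass


def set_context_narrow(handler, ti):
    # Look the method up first; only a missing method is tolerated.
    set_context = getattr(handler, "set_context", None)
    if set_context is None:
        return
    set_context(ti)  # errors from the call itself propagate


ti = object()
set_context_broad(BrokenHandler(), ti)  # bug is silently hidden
set_context_narrow(PlainHandler(), ti)  # fine: nothing to call
try:
    set_context_narrow(BrokenHandler(), ti)
    surfaced = False
except AttributeError:
    surfaced = True  # the real error is now visible
```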
And instead of having to duplicate the default config value into configuration.py for the update process, change it to get the new default value out of the loaded default_airflow.cfg.
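As a rough sketch of that approach, using only the standard-library `configparser`: the config text, the old default value, and the `upgrade_value` helper below are all made up for illustration and do not reflect the real contents of default_airflow.cfg or configuration.py.

```python
from configparser import ConfigParser

# Hypothetical stand-in for the text of the bundled default config file.
DEFAULT_CFG = """
[logging]
log_filename_template = dag_id={dag_id}/attempt={try_number}.log
"""

# Hypothetical value a user might still carry from an older release.
OLD_DEFAULT = "{dag_id}/{try_number}.log"


def load_defaults() -> ConfigParser:
    # interpolation=None keeps characters such as '%' in templates literal.
    parser = ConfigParser(interpolation=None)
    parser.read_string(DEFAULT_CFG)
    return parser


def upgrade_value(current: str, defaults: ConfigParser) -> str:
    """Swap a known old default for whatever the loaded defaults say."""
    if current == OLD_DEFAULT:
        # The new default is read from the loaded file, not hard-coded here.
        return defaults.get("logging", "log_filename_template")
    return current  # user-customised values are left alone


defaults = load_defaults()
```

The new default then lives in exactly one place, so the upgrade logic cannot drift out of sync with the shipped config.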
It has been broken in main for a while (it works fine in 2.2.x series), but because we were catching _all_ AttributeErrors we never noticed.
Since the log_id is never really visible to users I have taken the easier approach of just always including the map_index, even for unmapped tasks.
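For illustration, a log_id that always carries the map_index might be built like this. The template string, its field order, and the use of a negative map_index for unmapped tasks are assumptions made for this sketch, not values taken from the PR:

```python
# Assumed shape of a log_id template; the field order is illustrative only.
LOG_ID_TEMPLATE = "{dag_id}-{task_id}-{run_id}-{map_index}-{try_number}"


def make_log_id(dag_id: str, task_id: str, run_id: str,
                map_index: int, try_number: int) -> str:
    # No conditional: map_index is included for every task instance,
    # mapped or not, which keeps the ID format uniform.
    return LOG_ID_TEMPLATE.format(
        dag_id=dag_id,
        task_id=task_id,
        run_id=run_id,
        map_index=map_index,
        try_number=try_number,
    )


unmapped_id = make_log_id("etl", "load", "manual_1", -1, 1)
mapped_id = make_log_id("etl", "load", "manual_1", 2, 1)
```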
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
With the recently added LogTemplate mechanism old TIs will still use the format they had at creation time (with the change here to ensure that we create a LogTemplate row for the just-upgraded-in-place) so the logs can still be viewed in the UI.
Unfortunately, the only way I can find to have map_index appear only for mapped tasks, while still leaving users full control of the filename, was to put a conditional in the template, which I am not happy about.
And since it was now getting quite "deep" I have chosen to "label" the components in the "hive partition style".
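The resulting layout can be sketched in plain Python. The PR itself puts the conditional in the filename template; the function below only mimics the outcome, and the component names, their order, and the "negative map_index means unmapped" convention are assumptions for this sketch:

```python
def render_log_path(dag_id: str, run_id: str, task_id: str,
                    map_index: int, try_number: int) -> str:
    """Build a log path with labelled, hive-partition-style components."""
    parts = [f"dag_id={dag_id}", f"run_id={run_id}", f"task_id={task_id}"]
    # The conditional: only mapped tasks get a map_index component.
    if map_index >= 0:
        parts.append(f"map_index={map_index}")
    parts.append(f"attempt={try_number}.log")
    return "/".join(parts)


unmapped_path = render_log_path("etl", "manual_1", "load", -1, 1)
mapped_path = render_log_path("etl", "manual_1", "load", 2, 1)
```

Labelling each level as `key=value` makes a deep path self-describing, at the cost of longer filenames.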
I'll need to test this change with Elasticsearch too. ES working now.