Skip to content

localize the run_id in dagrun#17502

Closed
LeonZhao28 wants to merge 6 commits into
apache:mainfrom
LeonZhao28:localize_run_id
Closed

localize the run_id in dagrun#17502
LeonZhao28 wants to merge 6 commits into
apache:mainfrom
LeonZhao28:localize_run_id

Conversation

@LeonZhao28

@LeonZhao28 LeonZhao28 commented Aug 9, 2021

Copy link
Copy Markdown
Contributor

Why I commit this change:

  • When checking the airflow features, I found that the dag run_id is generated using the UTC time, but actually, it is not convenient for us to use the UTC time because we are using local time.
  • For example, I am in Beijing now, I ran a DAG at 19:01, the default run_id is manual__2021-09-08T11:01:02.022226+08:00, but after a few hours, I want to find this run, I have to parse the local time to UTC time to get the specific dag_run. If the run_id is generated using local time, the run_id will be manual__2021-09-08T19:01:02.022226+08:00, I can easily find this dag_run.

How do I change:

  • Added a config localize_dag_run_id in the core config and the default value is False. It takes no effect if the user doesn't care about the dag run_id format. If the users who just like me want to use the local time to generate the run_id, they can set it to True

Result:

image

@boring-cyborg

boring-cyborg Bot commented Aug 9, 2021

Copy link
Copy Markdown

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@uranusjr

uranusjr commented Aug 9, 2021

Copy link
Copy Markdown
Member

I kind of feel maybe we should just use this behaviour all the time. The run ID is only for identifying the run, and using UTC has no advantage beside being easy to implement.

@uranusjr

uranusjr commented Aug 9, 2021

Copy link
Copy Markdown
Member

Either way though, you need some tests to ensure this works.

@LeonZhao28

LeonZhao28 commented Aug 9, 2021

Copy link
Copy Markdown
Contributor Author

I kind of feel maybe we should just use this behaviour all the time. The run ID is only for identifying the run, and using UTC has no advantage beside being easy to implement.

I agree that the run ID is only for identifying the run, but for the user, they may want to locate the specific run to check the dag logs, especially when debugging or troubleshooting, they may run the dag serval times manually. If the dag run ID is generated using the local time, it is really convenient for them.
It will not increase the compatibility for our system to allow the user to localize the run_id, we will be very happy to see this change.
I've talked with some friends who are using the airflow, actually, most of them change the sources codes here and re-compile the project to launch.

Either way though, you need some tests to ensure this works.

I've tested the code changes, setting this value to True, False, and with this config missing. It works fine.

@potiuk

potiuk commented Aug 9, 2021

Copy link
Copy Markdown
Member

@uranusjr wrote about adding unit tests.

And also there is one serious caveat (and one that should be covered by the unit tests) as well for sure is the DST change. UTC is not only easy to implement but guarantees conflict avoidance. With local time, if you have hourly scheduled DAG, you have a guaranteed conflict once a year - you will get the same run id conflict (when you move clock backwards). So unless you make sure that timezone is included and reflected properly, you might have a problem.

I think in this case, the time zone offset will be different when you use isoformat, but it definitely needs testing (and what I mean by that unit testing) - so that we can avoid regressions and you actually have shown that you thought and tested that case consciously.

Also i think it would be great to have parameterized unit
test with a number of more or less probable and 'weird' timezones. There are certain timezones (with half or even quarter-hour shifts) so it would.be great to see those tested as well.

@LeonZhao28

Copy link
Copy Markdown
Contributor Author

And also there is one serious caveat (and one that should be covered by the unit tests) as well for sure is the DST change. UTC is not only easy to implement but guarantees conflict avoidance. With local time, if you have hourly scheduled DAG, you have a guaranteed conflict once a year - you will get the same run id conflict (when you move clock backwards). So unless you make sure that timezone is included and reflected properly, you might have a problem.

I think in this case, the time zone offset will be different when you use isoformat, but it definitely needs testing (and what I mean by that unit testing) - so that we can avoid regressions and you actually have shown that you thought and tested that case consciously.

Also i think it would be great to have parameterized unit
test with a number of more or less probable and 'weird' timezones. There are certain timezones (with half or even quarter-hour shifts) so it would.be great to see those tested as well.

I see, it makes sense that the run_id may be conflicting if the system timezone is changed. But actually, for our usage, we really need a local run_id to show to find the specific dag run.
But I've got another solution that we can add another filed name local_run_id, it is transformed from UTC to local time only on the page, the local time is the same as the timezone that the user setting on the right top on the page. So we only need to change the js or transform view logic.
Do you think it is much better?

@potiuk

potiuk commented Aug 9, 2021

Copy link
Copy Markdown
Member

I think local_run_id might be misleading - if people will see different value in the UI and in the DB that might be quite confusing. Plus it adds unnecesary field in the DB. The run_id is used throughout the whole UI and It would be quite a change to use 'local' version all over the UI. And it will be even harder to test. I agree with @uranusjr that using local timezone settings always is a good idea, but it has to be thoroughly unit-tested.

@LeonZhao28 LeonZhao28 marked this pull request as draft August 10, 2021 01:59
@LeonZhao28

LeonZhao28 commented Aug 10, 2021

Copy link
Copy Markdown
Contributor Author

I think local_run_id might be misleading - if people will see different value in the UI and in the DB that might be quite confusing. Plus it adds unnecesary field in the DB. The run_id is used throughout the whole UI and It would be quite a change to use 'local' version all over the UI. And it will be even harder to test. I agree with @uranusjr that using local timezone settings always is a good idea, but it has to be thoroughly unit-tested.

Agree.
I've set this PR to WIP.
I will add the unit test about it.
Because we don't use the daylight saving time in China, I will be very glad if you can help me to do this if possible. Thanks a lot.

@LeonZhao28

Copy link
Copy Markdown
Contributor Author

I want to know, in your local life, if the time is conflict, how do you distinguish them during the daylight saving time switch.
For example, let us meet at 2:00 am, well, which 2:00 am?
😓

@uranusjr

Copy link
Copy Markdown
Member

I think the timezone part will change. For example, if you’re in England, 2am without DST (Greenwich Mean Time) would be 02:00:00+00:00, and 2am with DST (British Summer Time) would be 02:00:00+01:00.

@LeonZhao28

LeonZhao28 commented Aug 10, 2021

Copy link
Copy Markdown
Contributor Author

I think the timezone part will change. For example, if you’re in England, 2am without DST (Greenwich Mean Time) would be 02:00:00+00:00, and 2am with DST (British Summer Time) would be 02:00:00+01:00.

If that, I think it will work fine if we can make sure the DST offset is in the run_id.
Or, we can add another filed to show whether the run_id is generated in DST, just like "manual__2021-08-08T08:08:08.217033__dst",if it is not in DST, the run_id is manual__2021-08-08T08:08:08.217033.
For the area that doesn't use DST, we can add another config to disable the __dst appending.
Do you think it is ok?

@uranusjr

Copy link
Copy Markdown
Member

whether the run_id is generated in DST

This would be a bit difficult to determine unless the timezone implementation correctly implements dst, or if the implementation uses a different timezone altogether. Maybe pendulum is robust enough to make this viable, maybe not, you’ll need to do some tests on this.

@LeonZhao28 LeonZhao28 marked this pull request as ready for review August 10, 2021 09:32
@LeonZhao28

Copy link
Copy Markdown
Contributor Author

I think pendulum works fine when dealing with the DST time.
I've changed the codes and add the unit test for the local run_id

@LeonZhao28 LeonZhao28 changed the title localize the dag_run localize the run_id in dagrun Aug 10, 2021
@LeonZhao28

Copy link
Copy Markdown
Contributor Author

@uranusjr I think you may miss the PR comment message.
I've added the unit test for local run_id, please review it when you are free.
Thanks a lot

@LeonZhao28

Copy link
Copy Markdown
Contributor Author

@ashb @kaxil @XD-DENG The PR is ready for review.
I think this change is quite useful for us who are not using UTC time.
Please review it when you are free. Thanks a lot.

Comment thread airflow/models/dagrun.py Outdated
Comment on lines 290 to 307

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can have unintended consequences -- we will have to look closely.

@LeonZhao28 LeonZhao28 Aug 17, 2021

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Post the checks before I determine changing these

As I checked the source codes, there may be two risks after changing these codes:

  1. As there is a UNIQUE_KEY(dag_id,run_id) in the database, we must make sure that the two values are unique.
  2. If the user changes the server config localize_dag_run_id from True to False or they change the default_timezone frequently, the dag run_id may be conflict.

For the first risk issue, I use pendulum to make sure the local_times are unique during the DST changing. And I tried to did the full test and add the test codes into PR.
For the second risk issue, I think it is only a low-rate case. Even the user changes the default_timezone, the dag run_id is accurate to six decimal places, it is really hard to make the run_id conflict.

@kaxil kaxil requested a review from ephraimbuddy August 16, 2021 15:13
Comment thread airflow/models/dagrun.py Outdated

@ashb ashb Aug 17, 2021

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be based on the dag timezone, not the default timezone.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it makes sense.
But we have to change the method from static_method to class method.
I will change it tomorrow

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh hmmm. It can't even be a class method. Maybe we should just pass the date in the "right" TZ here already?

@LeonZhao28 LeonZhao28 Aug 17, 2021

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashb
Just confirm that the dag timezone is the value that is inited here?

if start_date and start_date.tzinfo:

Can we use the tzinfo in the execution_date directly?
Or we need to add a parameter dag_timezone in the method generate_run_id and set the value when calling this method?
Or transform the date to dag timezone datetime outside?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashb
I checked the codes, we can not use the tzinfo in the execution_date directly, because it has been transformed into UTC already.
image

So we have two ways to implement:

  1. change the method generate_run_id from static method to class method
  2. add a parameter timezone in the method generate_run_id and the default value is settings.TIMEZONE.

Which one do you prefer? Or there is something I miss that we can have a better choice?
Let me know what do you think.
Thanks

Comment thread airflow/models/dagrun.py Outdated
Comment on lines 295 to 296

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the need for this suffix -- when it's in DST the timezone offset will already be different (+00:00 vs +01:00__DST in your example for UK)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I don't know the living habits who are under DST, because we don't use the DST in China.
I just think it will be much clearer if we add __DST.
I will remove it tomorrow

@LeonZhao28 LeonZhao28 requested a review from mik-laj as a code owner August 18, 2021 08:26
@LeonZhao28

LeonZhao28 commented Aug 18, 2021

Copy link
Copy Markdown
Contributor Author

Commit the new changes.
There are two new changes:

  1. remove __DST suffix in the run_id as the offset will change when changing the DST,
  2. After discussing @ashb , use dag timezone instead of default_timezone make sense, we have three ways to do this:
    1. We use the execution_date timezone directly. But I found the execution_date is UTC already, and if we transform it before calling the method DagRun.generate_run_id, the changes will have a big risk because it may affect the task scheduler. So give it up
    2. change the method DagRun.generate_run_id from staticmethod to classmethod, and it leads that there must be a dag model before calling the method. So give it up
    3. add a parameter dag_timezone in the method generate_run_id. I scan the usage of this method in the project and we can get the dag_time in all the calling codes, and there are not too many actually. So I follow this way.

Actually, I'm still not sure about the second change, we can continue the discussion about it here .

Thanks a lot

@LeonZhao28 LeonZhao28 requested review from ashb and kaxil August 18, 2021 08:38
@LeonZhao28

Copy link
Copy Markdown
Contributor Author

rebased to latest main

@github-actions

Copy link
Copy Markdown
Contributor

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions Bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Oct 15, 2021
@LeonZhao28 LeonZhao28 closed this Oct 18, 2021
@LeonZhao28

Copy link
Copy Markdown
Contributor Author

rebase to the latest main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale Stale PRs per the .github/workflows/stale.yml policy file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants