Exit asset compilation fork in start-airflow after it completed #32114
Merged
Conversation
The `start-airflow` command uses fork to run asset compilation in parallel in the background, so that it can happen while docker compose initializes. Unfortunately, the forked child did not call sys.exit(), so it returned from the function and went on to run a second docker-compose in the background. When asset compilation was not needed, this happened in parallel with the parent, and both processes tried to start docker-compose at the same time.

This was not visible in "dev" mode, because asset compilation never completes there. It was also not visible when asset compilation was needed: compilation took some time to complete, and by then the forked process had no terminal output (the terminal had already been taken over by tmux) and could not grab the forwarded ports, so it was running but largely invisible.

However, when asset compilation was not needed, the two processes did the same things at the same time. Much of the output was duplicated, and line output was broken, because the same messages were written over each other and cancelled the effect of the EOL printed to the terminal.

With this change, the forked process exits as soon as asset compilation completes and does not repeat the steps that the parent process is doing.
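For readers unfamiliar with the failure mode, here is a minimal sketch of the pattern described above. It is not the code from this PR: `compile_assets` and `run_compile_in_background` are hypothetical names, and the sketch uses `os._exit()` in place of the `sys.exit()` mentioned in the description, so that the child leaves immediately without raising `SystemExit` through the caller.

```python
import os


def compile_assets() -> None:
    """Hypothetical stand-in for the real asset compilation step."""


def run_compile_in_background() -> int:
    """Fork a child that compiles assets; only the parent returns (with the child's pid)."""
    pid = os.fork()
    if pid:
        # Parent: return right away and carry on, e.g. start docker compose.
        return pid

    # Child: do the background work, then exit explicitly.
    # Without an explicit exit the child would return from this function
    # and repeat every step the parent runs afterwards.
    exit_code = 0
    try:
        compile_assets()
    except Exception:
        exit_code = 1
    os._exit(exit_code)
```

The key point is that the child branch never reaches a `return`: whatever the caller does after `run_compile_in_background()` runs in the parent only.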
cc: @vandonr-amz
eladkal approved these changes Jun 24, 2023
potiuk added a commit to potiuk/airflow that referenced this pull request Jun 24, 2023
After apache#32114, the killpg call registered with atexit printed a Permission Error to stdout when the process group was missing (basically, when asset compilation had already stopped). Changing it to ignore the error keeps this spurious error out of the output.
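A rough sketch of what "ignore the error" can look like for an atexit-registered `killpg`. The function names, the choice of `SIGTERM`, and the exact set of exceptions caught are assumptions for illustration; the follow-up commit may differ in all three.

```python
import atexit
import os
import signal


def kill_process_group(pgid: int) -> bool:
    """Signal a process group; return False instead of raising if it is already gone."""
    try:
        os.killpg(pgid, signal.SIGTERM)
    except (ProcessLookupError, PermissionError):
        # The group no longer exists (or can no longer be signalled),
        # which is expected once the background work has finished.
        return False
    return True


def register_cleanup(pgid: int) -> None:
    """Hypothetical helper: ask atexit to clean up the group on interpreter exit."""
    atexit.register(kill_process_group, pgid)
```

Because the handler swallows only these two exceptions, any other failure would still surface at exit.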
ferruzzi pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Jun 27, 2023
…he#32114)
ferruzzi pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Jun 27, 2023
apache#32116)
potiuk added a commit that referenced this pull request Jul 2, 2023
(cherry picked from commit 5dddf57)