Bug fix GCSToS3Operator: avoid ValueError when replace=False with files already in S3 #32322

Merged
vincbeck merged 3 commits into apache:main from Adaverse:gcs_s3_bug on Jul 4, 2023
Conversation

Adaverse (Contributor) commented Jul 2, 2023

When some or all of the files to be transferred from GCS to S3 already exist in S3, running the operator with replace=False and an s3_dest_url that does not end with / fails with the ValueError shown below:

[Screenshot of the error: bug_ing]

The cause is that the filter snippet below fails to remove the already existing files: the prefix is missing its trailing /, so the stripped S3 keys never match the GCS file names.

# remove the prefix for the existing files to allow the match
existing_files = [file.replace(prefix, "", 1) for file in existing_files] # <---- prefix is missing /
files = list(set(files) - set(existing_files))
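
To make the mismatch concrete, here is a minimal standalone sketch of the same filter logic. The file names, the prefix value, and the variable names are hypothetical and chosen only for illustration; this is not the operator's actual code.

```python
# Hypothetical values, not taken from the PR.
prefix = "data"  # as derived from an s3_dest_url that does not end with "/"
gcs_files = ["a.csv", "b.csv"]
existing_s3_keys = ["data/a.csv", "data/b.csv"]

# Stripping a prefix without the trailing "/" leaves a leading "/" on every key.
stripped = [key.replace(prefix, "", 1) for key in existing_s3_keys]

# None of the stripped keys equals a GCS file name, so nothing is filtered out.
remaining = sorted(set(gcs_files) - set(stripped))

print(stripped)   # ['/a.csv', '/b.csv']
print(remaining)  # ['a.csv', 'b.csv']
```

Because both files stay in the list, they are still treated as files to upload even though they already exist in S3.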

The tests do not catch it because this scenario is not covered.

To fix this, the PR makes the following changes:

  • Append / to the prefix if it is not already present (skipped when the prefix is empty)
  • Increase test coverage to include the above scenario
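
The first change can be sketched roughly as follows. This is an illustrative approximation under assumptions, not the merged implementation: `normalize_prefix` and `files_to_upload` are hypothetical helper names, and the inputs are made up.

```python
def normalize_prefix(prefix: str) -> str:
    """Append "/" unless the prefix is empty or already ends with "/"."""
    if prefix and not prefix.endswith("/"):
        return prefix + "/"
    return prefix


def files_to_upload(gcs_files, existing_s3_keys, prefix):
    """Return the GCS files that are not yet present under the S3 prefix."""
    prefix = normalize_prefix(prefix)
    # Strip the normalized prefix so S3 keys compare equal to GCS file names.
    existing = [key.replace(prefix, "", 1) for key in existing_s3_keys]
    return sorted(set(gcs_files) - set(existing))


# Hypothetical inputs: only b.csv is missing from S3.
print(files_to_upload(["a.csv", "b.csv"], ["data/a.csv"], "data"))   # ['b.csv']
print(files_to_upload(["a.csv", "b.csv"], ["data/a.csv"], "data/"))  # ['b.csv']
print(files_to_upload(["a.csv", "b.csv"], ["a.csv"], ""))            # ['b.csv']
```

The empty-prefix check matters: turning an empty prefix into "/" would prevent keys stored at the bucket root from matching.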

@boring-cyborg boring-cyborg Bot added area:providers provider:amazon AWS/Amazon - related issues labels Jul 2, 2023
Comment thread tests/providers/amazon/aws/transfers/test_gcs_to_s3.py Outdated
Comment thread tests/providers/amazon/aws/transfers/test_gcs_to_s3.py Outdated
Comment thread airflow/providers/amazon/aws/transfers/gcs_to_s3.py Outdated
Comment thread tests/providers/amazon/aws/transfers/test_gcs_to_s3.py Outdated
@Adaverse Adaverse requested a review from uranusjr July 3, 2023 06:43
Comment thread tests/providers/amazon/aws/transfers/test_gcs_to_s3.py Outdated
@Adaverse Adaverse requested a review from uranusjr July 3, 2023 07:49
Labels

area:providers provider:amazon AWS/Amazon - related issues

3 participants