run_grant_dataset_view_access fails if view already exists #35795
Description
Apache Airflow version
Other Airflow 2 version (please specify below)
What happened
The function run_grant_dataset_view_access does not correctly check whether the view has already been granted access to the dataset, which contradicts the behaviour described in the docs:
If this view has already been granted access to the dataset, do nothing
What you think should happen instead
The function should skip granting access if the view already has access to the dataset
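A minimal sketch of the kind of check that would give this behaviour, assuming access entries are compared in their dict form and that a missing role should be treated the same as role=None. The helper names (`_normalize`, `view_already_authorized`) and the sample project, dataset and table IDs are hypothetical and are not part of Airflow or google-cloud-bigquery:

```python
from __future__ import annotations

from typing import Any, Iterable, Mapping


def _normalize(entry: Mapping[str, Any]) -> dict[str, Any]:
    """Drop keys whose value is None so a missing role equals role=None."""
    return {key: value for key, value in entry.items() if value is not None}


def view_already_authorized(
    existing_entries: Iterable[Mapping[str, Any]],
    candidate: Mapping[str, Any],
) -> bool:
    """Return True if an equivalent entry is already present (hypothetical helper)."""
    wanted = _normalize(candidate)
    return any(_normalize(entry) == wanted for entry in existing_entries)


# Hypothetical identifiers, used only for illustration.
view_ref = {"projectId": "my-project", "datasetId": "dataset_a", "tableId": "my_view"}

# Entry as fetched from the dataset: only a "view" key.
fetched = [{"view": dict(view_ref)}]
# Entry as created locally: "view" plus an explicit "role" of None.
created = {"role": None, "view": dict(view_ref)}

naive_match = created in fetched  # plain equality sees two different dicts
normalized_match = view_already_authorized(fetched, created)
```

With a comparison along these lines, the grant step could be skipped whenever the normalized entry is already present, instead of relying on plain equality between the two objects.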
How to reproduce
On GCP Composer, create a DAG that uses a BigQueryHook and calls run_grant_dataset_view_access to grant a view in dataset A access to dataset B when that view already has access to dataset B.
Operating System
composer-2.5.1-airflow-2.6.3
Versions of Apache Airflow Providers
The ones included in the above installation (the provider in question is google-cloud-bigquery==3.12.0)
Deployment
Google Cloud Composer
Deployment details
No response
Anything else
Investigation:
The logs include "Granting table xxx authorized view access to xxx dataset.", which indicates that this if-statement is True.
When running a Python script locally with the same version of google-cloud-bigquery, I found that the AccessEntry object fetched from the dataset does not match the created AccessEntry object. The mismatch occurs in the _properties dict: the fetched AccessEntry contains only one entry, view, whereas the created one contains two entries, view and role. The following script shows the problem:
from copy import deepcopy

from google.cloud import bigquery
from google.cloud.bigquery.dataset import AccessEntry


def main():
    client = bigquery.Client(project=<project>, location=<location>)
    dataset = client.get_dataset(<dataset>)
    access_entries = dataset.access_entries

    # Created entry: _properties contains both "view" and "role".
    view_access = AccessEntry(
        role=None,
        entity_type="view",
        entity_id={
            "projectId": <project>,
            "datasetId": <dataset>,
            "tableId": <table>,
        },
    )

    # Same entry with "role" removed, matching the shape of the fetched entry.
    view_access_no_role = AccessEntry(
        role=None,
        entity_type="view",
        entity_id={
            "projectId": <project>,
            "datasetId": <dataset>,
            "tableId": <table>,
        },
    )
    del view_access_no_role._properties['role']

    isInAccessEntries_v1 = view_access in access_entries  # False
    isInAccessEntries_v2 = view_access_no_role in access_entries  # True

    # Adding "role" to a copy of the fetched entry makes the created entry match.
    existing_view_access = access_entries[<index of existing view>]
    existing_view_access_fixed = deepcopy(existing_view_access)
    existing_view_access_fixed._properties['role'] = None
    access_entries.append(existing_view_access_fixed)
    isInAccessEntries_v3 = view_access in access_entries  # True


if __name__ == "__main__":
    main()
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct