run_grant_dataset_view_access fails if view already exists #35795

@TobiasHammarstrom

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

The function run_grant_dataset_view_access does not correctly check whether the view already has access to the dataset, which contradicts the behavior specified in the docs:

If this view has already been granted access to the dataset, do nothing

What you think should happen instead

The function should skip granting access if the view already has access to the dataset.

How to reproduce

On GCP Composer, create a DAG that uses a BigQueryHook and call run_grant_dataset_view_access to grant a view in dataset A access to dataset B when that view already has access to dataset B.

Operating System

composer-2.5.1-airflow-2.6.3

Versions of Apache Airflow Providers

The ones included in the above installation (the package in question is google-cloud-bigquery==3.12.0)

Deployment

Google Cloud Composer

Deployment details

No response

Anything else

Investigation:

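The logs include "Granting table xxx authorized view access to xxx dataset.", which indicates that the if-statement guarding the grant evaluated to True, i.e. the hook did not find the view among the dataset's existing access entries.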

When running a Python script locally with the same version of google-cloud-bigquery, I found that the AccessEntry object fetched from the dataset does not match the newly created AccessEntry object. The mismatch is in the _properties dict: the fetched AccessEntry contains only one entry, view, whereas the created one contains two entries, view and role. The following script shows the problem:

from copy import deepcopy

from google.cloud import bigquery
from google.cloud.bigquery.dataset import AccessEntry

# Placeholders: replace with real values before running.
PROJECT = "<project>"
LOCATION = "<location>"
DATASET = "<dataset>"
TABLE = "<table>"
EXISTING_VIEW_INDEX = 0  # placeholder: index of the existing view in access_entries


def main():
    client = bigquery.Client(project=PROJECT, location=LOCATION)
    dataset = client.get_dataset(DATASET)
    access_entries = dataset.access_entries

    view_id = {"projectId": PROJECT, "datasetId": DATASET, "tableId": TABLE}

    # Entry as created locally: _properties holds both "view" and "role".
    view_access = AccessEntry(role=None, entity_type="view", entity_id=view_id)

    # Same entry with the "role" key removed, matching what was fetched.
    view_access_no_role = AccessEntry(role=None, entity_type="view", entity_id=view_id)
    del view_access_no_role._properties["role"]

    is_in_access_entries_v1 = view_access in access_entries  # False
    is_in_access_entries_v2 = view_access_no_role in access_entries  # True

    # Adding "role": None to a copy of the fetched entry makes it match.
    existing_view_access = access_entries[EXISTING_VIEW_INDEX]
    existing_view_access_fixed = deepcopy(existing_view_access)
    existing_view_access_fixed._properties["role"] = None
    access_entries.append(existing_view_access_fixed)

    is_in_access_entries_v3 = view_access in access_entries  # True


if __name__ == "__main__":
    main()
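
For discussion, here is a minimal sketch of one possible way to make the membership check tolerant of the missing role key. It needs no GCP access: StubAccessEntry is a hypothetical, simplified stand-in that only mimics the _properties behavior described above, not the real AccessEntry class, and normalized and has_access are hypothetical helper names, not part of Airflow or google-cloud-bigquery. The project, dataset and table names are made up.

```python
from typing import Any, Dict, Iterable, Optional


class StubAccessEntry:
    """Hypothetical stand-in for AccessEntry; equality compares _properties."""

    def __init__(self, role: Optional[str], entity_type: str, entity_id: Dict[str, str]):
        # A locally created entry always stores a "role" key, even if it is None.
        self._properties: Dict[str, Any] = {entity_type: entity_id, "role": role}

    @classmethod
    def from_api(cls, resource: Dict[str, Any]) -> "StubAccessEntry":
        # A fetched entry keeps only the keys present in the resource.
        entry = cls.__new__(cls)
        entry._properties = dict(resource)
        return entry

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, StubAccessEntry):
            return NotImplemented
        return self._properties == other._properties


def normalized(entry: StubAccessEntry) -> Dict[str, Any]:
    """Return the entry's properties without keys whose value is None."""
    return {k: v for k, v in entry._properties.items() if v is not None}


def has_access(entry: StubAccessEntry, existing: Iterable[StubAccessEntry]) -> bool:
    """Membership check that treats a missing role and role=None as equal."""
    wanted = normalized(entry)
    return any(normalized(e) == wanted for e in existing)


# Made-up identifiers, for illustration only.
view_ref = {"projectId": "my-project", "datasetId": "dataset_a", "tableId": "my_view"}
other_ref = {"projectId": "my-project", "datasetId": "dataset_a", "tableId": "other_view"}

fetched = [StubAccessEntry.from_api({"view": dict(view_ref)})]
created = StubAccessEntry(role=None, entity_type="view", entity_id=view_ref)
unrelated = StubAccessEntry(role=None, entity_type="view", entity_id=other_ref)

strict_match = created in fetched                # plain equality, as in the script
tolerant_match = has_access(created, fetched)    # ignores the None role
unrelated_match = has_access(unrelated, fetched)
```

With the stand-in, the strict check misses the existing entry while the tolerant check finds it, and a different view is still reported as not having access. Whether the real fix belongs in the hook or in the library's equality logic is an open question.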

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
