Fix OpenLineage DAG & Task facets field types#51165

Closed
dolfinus wants to merge 1 commit into
apache:mainfrom
dolfinus:bugfix/openlineage-inlets-outlets-wrong-type

@dolfinus dolfinus commented May 28, 2025

Contributor

The OpenLineage facet for Task describes inlets & outlets as strings, but they are actually lists of strings:

{
    "run": {
        "facets": {
            "airflow": {
                "_producer": "https://github.com/apache/airflow/tree/providers-openlineage/1.11.0",
                "_schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunFacet",
                "task": {
                    "depends_on_past": false,
                    "downstream_task_ids": "['add_period_in_hive']",
                    "executor_config": {},
                    "ignore_first_depends_on_past": true,
                    "inlets": [],
                    "is_setup": false,
                    "is_teardown": false,
                    "mapped": false,
                    "multiple_outputs": false,
                    "operator_class": "PythonOperator",
                    "operator_class_path": "***.operators.python.PythonOperator",
                    "outlets": [],
                    "owner": "***",
                    "priority_weight": 1,
                    "queue": "default",
                    "retries": 2,
                    "retry_exponential_backoff": false,
                    "task_id": "add_params_in_xcom",
                    "trigger_rule": "all_success",
                    "upstream_task_ids": "['Sensors.Sensor__hdp2gp_oebsap_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsar_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsfa_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__postgres2gp_1c_sales_report__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsgl_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebspa_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsinv_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsxtr_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebspay_dm__oebs-dm-replication-to-greenplum', 'Sensors.Sensor__hdp2gp_oebsce_dm__oebs-dm-replication-to-greenplum']",
                    "wait_for_downstream": false,
                    "wait_for_past_depends_before_skipping": false,
                    "weight_rule": "<<non-serializable: _DownstreamPriorityWeightStrategy>>"
                }
            }
        }
    }
}

Same for DAG tags:

{
    "run": {
        "facets": {
            "airflow": {
                "_producer": "https://github.com/apache/airflow/tree/providers-openlineage/1.11.0",
                "_schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunFacet",
                "dag": {
                    "dag_id": "control_dm__oebsstatus",
                    "fileloc": "/data/airflow/dags/oebsstatus/master/control_dm.py",
                    "owner": "airflow",
                    "schedule_interval": "00 4 * * *",
                    "tags": [
                        "oebsstatus",
                        "master"
                    ],
                    "timetable": {
                        "expression": "00 4 * * *",
                        "timezone": "UTC"
                    }
                }
            }
        }
    }
}

@mobuchowski

Contributor

@dolfinus this was true in 1.11.0, but not since; see #41786

@dolfinus

Contributor Author

But why should the tags list be serialized to JSON?

@dolfinus dolfinus closed this May 28, 2025
@dolfinus dolfinus deleted the bugfix/openlineage-inlets-outlets-wrong-type branch May 28, 2025 15:33
@mobuchowski

Contributor

@dolfinus they shouldn't be, but they are, and a breaking change would cause problems for some consumers. Unfortunately, I feel that at this point it is easier to deal with this than to change it.
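
For consumers that have to "deal with this", a minimal sketch of a tolerant reader is shown below. It assumes the stringified fields look like the `downstream_task_ids` value in the example above, which is a Python-style list literal with single quotes (so `json.loads` would reject it). The helper name `as_str_list` is hypothetical and is not part of Airflow or OpenLineage.

```python
import ast
from typing import Any, List


def as_str_list(value: Any) -> List[str]:
    """Normalize a facet field that may be a real list or a stringified list.

    Hypothetical consumer-side helper; not part of Airflow or OpenLineage.
    """
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        # Already a proper JSON array, e.g. "tags": ["oebsstatus", "master"].
        return [str(item) for item in value]
    if isinstance(value, str):
        text = value.strip()
        if text.startswith("[") and text.endswith("]"):
            # Looks like "['a', 'b']": parse it as a literal, never eval().
            try:
                parsed = ast.literal_eval(text)
            except (ValueError, SyntaxError):
                return [value]
            if isinstance(parsed, (list, tuple)):
                return [str(item) for item in parsed]
        # A plain non-empty string is treated as a single item.
        return [value] if text else []
    return [str(value)]


task_facet = {
    "downstream_task_ids": "['add_period_in_hive']",
    "inlets": [],
}
downstream = as_str_list(task_facet["downstream_task_ids"])
inlets = as_str_list(task_facet["inlets"])
```

Accepting both shapes means the same consumer code can read events from provider versions that emit real arrays and from versions that emit stringified lists.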

@kacpermuda

Collaborator

I also wanted to improve that recently in #50399, and we decided not to do it. My idea for now is to use TagsJobFacet instead. I think it's a good idea to move some information from airflowRunFacet into more generic facets. @dolfinus WDYT? I'll probably work on it next week.

@dolfinus

Contributor Author

I agree
