MSSQL Asset URI: support 5-part path with optional instance name #67999

Merged
dabla merged 1 commit into apache:main from dabla:feature/add-azure-fabric-asset-uri-sanitation
Jun 4, 2026
Conversation

@dabla dabla commented Jun 4, 2026

Contributor

Was generative AI tooling used to co-author this PR?
  • [ x ] Yes (please specify the tool below)

Claude Opus 4.6

Summary

This PR updates MSSQL asset URI sanitization to support both URI path formats (with and without SQL instance) and to properly support Azure SQL and Microsoft Fabric hosts.

Supported URI formats after this change:

  1. mssql://<host>[:port]/<database>/<schema>/<table>
  2. mssql://<host>[:port]/<instance>/<database>/<schema>/<table>

Examples of supported hosts include:

  • Azure SQL Database: <server-unique-identifier>.database.windows.net
  • Microsoft Fabric SQL endpoint: <server-unique-identifier>.<tenant>.fabric.microsoft.com

Changes

sanitize_uri

  • Keeps host validation (mssql:// must include a host).
  • Keeps default port normalization to 1433 when omitted.
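  • Accepts both valid path shapes, counted as the parts produced by splitting the path on "/" (the leading slash contributes one empty part):
    • 4 parts (/database/schema/table)
    • 5 parts (/instance/database/schema/table)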
  • Rejects other path shapes with a clear ValueError.
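
The rules listed above can be sketched roughly as follows. This is a minimal illustration and not the merged implementation: the error messages, the DEFAULT_PORT constant, and the rejection of empty path parts are assumptions made for the example.

```python
from urllib.parse import SplitResult, urlsplit

DEFAULT_PORT = 1433  # default port named in the PR description


def sanitize_uri(uri: SplitResult) -> SplitResult:
    """Validate an mssql:// asset URI and fill in the default port."""
    # Host validation: mssql:// must include a host.
    if not uri.netloc:
        raise ValueError("URI format mssql:// must contain a host")

    # Default port normalization when the port is omitted.
    if uri.port is None:
        host = uri.netloc.rstrip(":")
        uri = uri._replace(netloc=f"{host}:{DEFAULT_PORT}")

    # Splitting "/database/schema/table" on "/" yields 4 parts because the
    # leading slash produces an empty first element; an instance adds a 5th.
    parts = uri.path.split("/")
    # Assumption: empty names (for example a trailing slash) are rejected too.
    if len(parts) not in (4, 5) or not all(parts[1:]):
        raise ValueError(
            "URI format mssql:// must contain database, schema, and table names, "
            "optionally preceded by an instance name"
        )
    return uri


if __name__ == "__main__":
    print(sanitize_uri(urlsplit("mssql://host/database/schema/table")).geturl())
    print(sanitize_uri(urlsplit("mssql://host:1444/instance/database/schema/table")).geturl())
```

Counting parts after a plain split keeps the check to a single length comparison, which is why the two accepted shapes are described as 4 and 5 parts even though they carry 3 and 4 names.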

convert_asset_to_openlineage

  • Now sanitizes/parses URI via sanitize_uri(...).
  • Supports both URI path variants when extracting dataset name.
  • Produces consistent OpenLineage output:
    • namespace = mssql://<host>:<port>
    • name = <database>.<schema>.<table>
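
As a hedged sketch of the mapping described above, the snippet below derives the namespace and name from either path variant. DatasetRef and dataset_ref_from_uri are hypothetical stand-ins so that the example runs without the OpenLineage client installed; the provider's actual function and return type are not reproduced here.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DatasetRef:
    """Hypothetical stand-in for an OpenLineage dataset (namespace + name)."""

    namespace: str
    name: str


def dataset_ref_from_uri(uri: str) -> DatasetRef:
    """Map an mssql:// asset URI to a namespace/name pair."""
    parsed = urlsplit(uri)
    if not parsed.hostname:
        raise ValueError("URI format mssql:// must contain a host")
    port = parsed.port or 1433  # default port when omitted

    # Drop the empty element produced by the leading slash.
    names = parsed.path.split("/")[1:]
    if len(names) not in (3, 4) or not all(names):
        raise ValueError("expected [instance/]database/schema/table")

    # The last three names are always database, schema, table; an optional
    # instance name in front of them is not part of the dataset name.
    database, schema, table = names[-3:]
    return DatasetRef(
        namespace=f"mssql://{parsed.hostname}:{port}",
        name=f"{database}.{schema}.{table}",
    )


if __name__ == "__main__":
    print(dataset_ref_from_uri("mssql://host/database/schema/table"))
    print(dataset_ref_from_uri("mssql://host:1433/instance/database/schema/table"))
```

Taking the last three names is what lets both URI variants produce the same dataset name.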

Tests

Updated unit tests in
providers/microsoft/mssql/tests/unit/microsoft/mssql/assets/test_mssql.py:

  • Added passing cases for:
    • standard host without instance
    • standard host with instance
    • Azure SQL host
    • Fabric host
    • Fabric host with instance
  • Added failing cases for invalid segment counts in both formats.
  • Added OpenLineage conversion test for URI including instance.

Outcome

MSSQL assets now accept both SQL Server-style instance paths and non-instance cloud paths (Azure SQL / Fabric) while keeping OpenLineage conversion consistent and backward compatible.


  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@dabla dabla force-pushed the feature/add-azure-fabric-asset-uri-sanitation branch 3 times, most recently from d2cbc6d to d212e5d Compare June 4, 2026 08:21
@dabla dabla force-pushed the feature/add-azure-fabric-asset-uri-sanitation branch from d212e5d to accf773 Compare June 4, 2026 11:45
@dabla dabla changed the title from "Added Azure, Fabric and instance MSSQL Asset URI validation" to "MSSQL Asset URI: support 5-part path with optional instance name" Jun 4, 2026
@dabla dabla merged commit 15a4048 into apache:main Jun 4, 2026
93 checks passed

2 participants