PostgresHook: Add configurable UPSERT update fields #67045
justinpakzad left a comment
Nice PR. Left a couple of comments. I know it's still in draft but figured I'd leave some feedback anyways.
That’s perfectly fine. I will address the feedback once the dependency is merged.
Upserting rows is already supported in PostgresHook via insert_rows and insert_args={"replace": True}
I will convert this to draft. This either needs to be reworked or closed. Sorry for overlooking that.
You could still implement this method in the PostgresHook as a shortcut convenience method and delegate the call to the insert_rows method with replace=True. Then just test the delegation; I still think this is a good idea.
Extend PostgreSQL ON CONFLICT support by allowing callers to specify which columns are updated when conflicts occur. Preserve existing behavior when no update fields are provided, support DO NOTHING semantics via an empty update field list, and add an upsert_rows convenience wrapper built on top of the existing insert_rows(replace=True) implementation.
I have re-scoped the PR down to a convenience method i.e.
Please elaborate. This is important since we're in the process of switching the default driver for postgres (both sync and async) to psycopg v3.
I have left a comment on the relevant part of the diff.
To be clear, this concerns the Airflow postgres dialect, not the SQLAlchemy one; normally this should have zero effect on the driver used.
@SameerMesiah97 please resolve all comments that have been addressed, this makes review easier.
Congratulations @SameerMesiah97, thank you for the great work!
Extend PostgreSQL ON CONFLICT support by allowing callers to specify which columns are updated when conflicts occur. Preserve existing behavior when no update fields are provided, support DO NOTHING semantics via an empty update field list, and add an upsert_rows convenience wrapper built on top of the existing insert_rows(replace=True) implementation. Co-authored-by: Sameer Mesiah <smesiah971@gmail.com>
Description
This change extends PostgreSQL UPSERT support by introducing configurable update fields for INSERT ... ON CONFLICT operations.

A new replace_target option is added to PostgresDialect.generate_replace_sql, allowing callers to specify which columns are updated when a conflict occurs. When omitted, the existing behavior of updating all non-conflict columns is preserved. When an empty list is provided, DO NOTHING semantics are generated.

The change also adds a convenience PostgresHook.upsert_rows method that wraps the existing insert_rows(replace=True) implementation and exposes PostgreSQL UPSERT concepts through conflict_fields and update_fields.

Rationale
While PostgreSQL UPSERT support already exists through insert_rows(replace=True), callers cannot currently control which columns are updated when conflicts occur. This change adds that flexibility while preserving existing behavior by default.

The upsert_rows convenience method provides a more intuitive PostgreSQL-specific API without requiring callers to use lower-level dialect arguments.

Tests
Added unit tests verifying that:
upsert_rows correctly delegates to the existing insert_rows(replace=True) implementation.

Notes
The entry for the parameter replace in the docstring for generate_replace_sql has been removed as the argument for it is not being used within the function.

Backwards Compatibility
This change is fully backwards compatible.
Existing insert_rows(replace=True) behavior is preserved. Callers that do not specify update fields continue to update all non-conflict columns exactly as before.
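
Illustration
For reviewers who want to see the three cases from the description side by side, here is a minimal standalone sketch of the ON CONFLICT clause selection. It is not the code in this PR and does not call Airflow: build_upsert_sql and its parameters are hypothetical names chosen for illustration, the %s placeholder style is assumed, and identifiers are interpolated without quoting or escaping to keep the example short.

```python
from typing import Optional, Sequence


def build_upsert_sql(
    table: str,
    target_fields: Sequence[str],
    conflict_fields: Sequence[str],
    update_fields: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical helper (not the Airflow implementation).

    update_fields=None -> update every non-conflict column (existing behavior)
    update_fields=[]   -> ON CONFLICT ... DO NOTHING
    update_fields=[..] -> update only the listed columns
    """
    if not target_fields:
        raise ValueError("target_fields must not be empty")
    if not conflict_fields:
        raise ValueError("conflict_fields must not be empty")

    columns = ", ".join(target_fields)
    placeholders = ", ".join(["%s"] * len(target_fields))
    conflict = ", ".join(conflict_fields)
    sql = (
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders}) "
        f"ON CONFLICT ({conflict})"
    )

    # Default: every inserted column that is not part of the conflict target.
    if update_fields is None:
        update_fields = [f for f in target_fields if f not in conflict_fields]

    # Nothing to update (explicit empty list, or only conflict columns inserted).
    if not update_fields:
        return f"{sql} DO NOTHING"

    # "excluded" refers to the row that was proposed for insertion.
    assignments = ", ".join(f"{f} = excluded.{f}" for f in update_fields)
    return f"{sql} DO UPDATE SET {assignments}"


if __name__ == "__main__":
    fields = ["id", "name", "email"]
    print(build_upsert_sql("users", fields, ["id"]))
    print(build_upsert_sql("users", fields, ["id"], update_fields=["email"]))
    print(build_upsert_sql("users", fields, ["id"], update_fields=[]))
```

One design choice in this sketch: when update_fields is omitted and every inserted column belongs to the conflict target, it falls back to DO NOTHING rather than emitting an empty SET clause. Whether the PR handles that edge case the same way is not stated in the description.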