Notes
This is an issue created as a response to the discussion of omop-alchemy (see AustralianCancerDataNetwork/OMOP_Alchemy#14 (comment)). It therefore depends on PR 14 of `omop-alchemy to complete!
Context
Staging tables are disambiguated purely by name, never by schema. staging_tablename() (src/orm_loader/tables/loadable_table.py:71-86) returns _staging_{tablename}, and every statement in src/orm_loader/backends/postgres.py (create_staging_table, drop_staging_table, load_staging_fast, the merge methods) uses that name unqualified, e.g. CREATE UNLOGGED TABLE "{staging_name}" (LIKE "{target_name}" ...). orm_loader has no concept of a schema anywhere in this path.
Problem
A consumer that wants staging tables physically isolated in their own schema (kept out of backups, migrations, and introspection of the real schema) has no API for it. The only workaround today, used in OMOP_Alchemy's cli_vocab.py, is to put a dedicated staging schema first in search_path and rely on the _staging_ prefix never colliding with a real table name so the unqualified SQL resolves to the right schema by accident. That's fragile, and required a full search_path walkthrough in PR review for someone to understand which table an unqualified TRUNCATE was actually hitting.
Proposal
Thread an explicit schema through the staging table lifecycle (e.g. staging_schema: str | None on create_staging_table/drop_staging_table/load_staging_fast/the merge methods, or a class-level override alongside _staging_tablename), and qualify identifiers ("{schema}"."{name}") instead of leaving resolution to search_path. Consumers that don't pass a schema keep today's behavior.
Decision
We considered dropping the separate schema and leaning entirely on the _staging_ prefix instead, but went with making orm_loaderschema-aware so the schema separation OMOP_Alchemy wants doesn't depend on name non-collision.
Notes
This is an issue created as a response to the discussion of
omop-alchemy(see AustralianCancerDataNetwork/OMOP_Alchemy#14 (comment)). It therefore depends on PR 14 of `omop-alchemy to complete!Context
Staging tables are disambiguated purely by name, never by schema.
staging_tablename()(src/orm_loader/tables/loadable_table.py:71-86) returns_staging_{tablename}, and every statement insrc/orm_loader/backends/postgres.py(create_staging_table,drop_staging_table,load_staging_fast, the merge methods) uses that name unqualified, e.g.CREATE UNLOGGED TABLE "{staging_name}" (LIKE "{target_name}" ...). orm_loader has no concept of a schema anywhere in this path.Problem
A consumer that wants staging tables physically isolated in their own schema (kept out of backups, migrations, and introspection of the real schema) has no API for it. The only workaround today, used in OMOP_Alchemy's
cli_vocab.py, is to put a dedicatedstagingschema first insearch_pathand rely on the_staging_prefix never colliding with a real table name so the unqualified SQL resolves to the right schema by accident. That's fragile, and required a full search_path walkthrough in PR review for someone to understand which table an unqualifiedTRUNCATEwas actually hitting.Proposal
Thread an explicit schema through the staging table lifecycle (e.g.
staging_schema: str | Noneoncreate_staging_table/drop_staging_table/load_staging_fast/the merge methods, or a class-level override alongside_staging_tablename), and qualify identifiers ("{schema}"."{name}") instead of leaving resolution tosearch_path. Consumers that don't pass a schema keep today's behavior.Decision
We considered dropping the separate schema and leaning entirely on the
_staging_prefix instead, but went with makingorm_loaderschema-aware so the schema separation OMOP_Alchemy wants doesn't depend on name non-collision.