CLI Overhaul and improved vocabulary ingestion #14

Merged: nicoloesch merged 40 commits into main from 12-cli-vocab-load on Jun 16, 2026
@nicoloesch nicoloesch commented May 25, 2026

Collaborator

Summary

This PR delivers a full structural overhaul of the omop-alchemy maintenance CLI, motivated by accumulated complexity, duplicated boilerplate across every command, raw SQL scattered through CLI files, and documentation that was insufficient to understand how commands should be used.

Fixes #12, Fixes #16


Part 1: CLI cleanup

Backend abstraction layer

A new omop_alchemy/backends/ package introduces a Backend ABC with PostgresBackend and SQLiteBackend implementations. All dialect-specific SQL (FK trigger toggling, ANALYZE/TRUNCATE/sequence management, CLUSTER, full-text sidecar management, backup/restore command assembly) is extracted from the CLI files and centralised in the backend implementations. CLI files are now orchestration and rendering only.

  • backend_support.py deleted and replaced by backend_supports / require_backend_support / BackendNotSupportedError on the Backend base class
  • cdm/handlers/fulltext/ deleted; full-text logic absorbed into backends/fulltext.py and PostgresBackend
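
The capability gating described above can be illustrated with a minimal sketch. Only the names Backend, PostgresBackend, SQLiteBackend, backend_supports, require_backend_support, and BackendNotSupportedError come from this PR; the capability strings, the `capabilities` attribute, and the `analyze_sql` / `truncate_sql` methods are assumptions made for illustration and may not match backends/base.py.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class BackendNotSupportedError(RuntimeError):
    """Raised when a backend lacks a requested capability."""


class Backend(ABC):
    # Hypothetical attributes: the real base class may model these differently.
    name: str = "generic"
    capabilities: frozenset[str] = frozenset()

    def backend_supports(self, capability: str) -> bool:
        return capability in self.capabilities

    def require_backend_support(self, capability: str) -> None:
        if not self.backend_supports(capability):
            raise BackendNotSupportedError(
                f"{self.name} backend does not support {capability!r}"
            )

    @abstractmethod
    def analyze_sql(self, table: str) -> str:
        """Return the dialect-specific statement that refreshes statistics."""

    def truncate_sql(self, table: str) -> str:
        # Gate first, so unsupported backends fail with one uniform error.
        self.require_backend_support("truncate")
        return self._truncate_sql(table)

    def _truncate_sql(self, table: str) -> str:
        raise NotImplementedError


class PostgresBackend(Backend):
    name = "postgresql"
    capabilities = frozenset({"analyze", "truncate"})

    def analyze_sql(self, table: str) -> str:
        return f'ANALYZE "{table}"'

    def _truncate_sql(self, table: str) -> str:
        return f'TRUNCATE TABLE "{table}"'


class SQLiteBackend(Backend):
    name = "sqlite"
    capabilities = frozenset({"analyze"})

    def analyze_sql(self, table: str) -> str:
        return f'ANALYZE "{table}"'
```

With this shape a CLI command only calls `backend.truncate_sql(...)` and renders either the result or the BackendNotSupportedError, without branching on the dialect itself.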

@omop_command decorator

An @omop_command decorator in _cli_utils.py eliminates three repeated patterns from all 18 CLI command functions: connection parameter declarations (--dotenv, --engine-schema, --db-schema), setup_cli_cmd boilerplate, and try/except handle_error wrappers. Command bodies now only see a resolved conn object and a ready engine.
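
As a rough sketch of the pattern (not the code in _cli_utils.py), a decorator of this kind could look like the following. The `Conn` dataclass, the `handle_error` body, the placeholder engine string, and the `analyze` command are all hypothetical; the real decorator presumably also registers the connection options with the CLI framework, which is omitted here.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Conn:
    """Hypothetical stand-in for the resolved connection object."""

    dotenv: Optional[str]
    engine_schema: Optional[str]
    db_schema: Optional[str]


def handle_error(exc: Exception) -> int:
    """Hypothetical error renderer: print the message, return an exit code."""
    print(f"error: {exc}")
    return 1


def omop_command(func: Callable[..., Any]) -> Callable[..., Any]:
    """Inject (conn, engine) and wrap the body in shared error handling."""

    @functools.wraps(func)
    def wrapper(*, dotenv=None, engine_schema=None, db_schema=None, **kwargs):
        try:
            conn = Conn(dotenv, engine_schema, db_schema)
            # Placeholder only: a real implementation would build an engine here.
            engine = f"engine-for:{db_schema or 'default'}"
            return func(conn, engine, **kwargs)
        except Exception as exc:
            return handle_error(exc)

    return wrapper


@omop_command
def analyze(conn: Conn, engine: str, table: str) -> str:
    # The body declares no connection options and no try/except of its own.
    if not table:
        raise ValueError("table name required")
    return f"ANALYZE {table} in {conn.db_schema}"
```

Calling `analyze(table="concept", db_schema="cdm")` runs the body with a resolved `conn`, while an exception raised inside the body is routed through `handle_error` instead of propagating.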

cli_schema.py decomposed

The 1,500-line monolithic file was split into five single-responsibility domain modules (cli_schema_info.py, cli_schema_doctor.py, cli_schema_reconcile.py, cli_schema_tables.py, cli_schema_summary.py). The original file is retained as a thin re-export shim so all existing callers continue to work unchanged.

Documentation

  • Module-level docstrings added to all 14 CLI source files
  • Docstring style standardised (no semicolons as clause joiners)
  • New docs/cli/index.md: CLI architecture overview covering the @omop_command decorator, connection resolution, and the conn object
  • New docs/cli/reference.md: full command reference with parameter tables for all 25 commands
  • MkDocs nav corrected: all orphaned model, API, and advanced pages added; dead reference removed

Part 2: Accelerated ingestion of vocabularies

**To be done**


Part 3: OA Configurator integration

PENDING

Integration with the OA Configurator will absorb the engine-creation and configuration machinery (db.py, config.py, logger_config.py) from this package into the shared configuration tool. The backends/ layer is configuration-agnostic by design and will slot in unchanged when this work is complete.

This section will be updated when the integration is ready.

@nicoloesch nicoloesch force-pushed the 12-cli-vocab-load branch from 49d3ee0 to 0533caa Compare May 27, 2026 00:36

Copilot AI left a comment

Contributor

Pull request overview

This PR restructures the omop-alchemy maintenance CLI into smaller, domain-focused modules with a new backend abstraction layer, and migrates configuration/connection resolution to oa-configurator. It also updates tests and documentation to reflect the new CLI architecture and configuration flow.

Changes:

  • Introduces omop_alchemy/backends/ with Backend + dialect implementations and shifts dialect-specific behavior out of CLI modules.
  • Adds @omop_command decorator to centralize CLI engine/config resolution, rendering, and error handling.
  • Updates docs, packaging (Hatchling), and tests to align with oa-configurator-based configuration.

Reviewed changes

Copilot reviewed 82 out of 87 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| tests/test_truncate_tables.py | Updates truncate tests to new CLI/table modules and oa-configurator config injection. |
| tests/test_load_vocab.py | Minor typing + SQLAlchemy NULL-comparison fix. |
| tests/test_load_vocab_source.py | Updates vocab-load tests to new CLI module and oa-configurator config injection. |
| tests/test_load_vocab_postgres.py | Marks Postgres integration tests as requiring a configured test DB resource. |
| tests/test_indexes.py | Updates index tests to new CLI indexes module + oa-configurator config injection. |
| tests/test_fulltext.py | Updates fulltext tests to backend-based fulltext implementation and new CLI module. |
| tests/test_data_summary.py | Updates imports to CLI schema shim. |
| tests/test_create_tables.py | Updates imports to CLI schema shim. |
| tests/test_config_driver.py | Updates tests to new config/engine creation API (create_cdm_engine). |
| tests/test_conditions_basic.py | Removes duplicate import. |
| tests/test_cli_config.py | Replaces removed config subcommand tests with smoke tests. |
| tests/test_analyze_tables.py | Updates analyze tests to new CLI tables module and backend error messages. |
| tests/conftest.py | Moves Postgres test DB resolution to oa-configurator pytest plugin; adds safety guard. |
| README.md | Documents oa-configurator-based configuration and Docker Compose workflow. |
| pyproject.toml | Swaps dotenv for oa-configurator, switches build to Hatchling, adds omop.config entry-point. |
| omop_alchemy/py.typed | Marks package as typed. |
| omop_alchemy/maintenance/ui.py | Updates UI rendering types/labels to backend-based results and new config fields. |
| omop_alchemy/maintenance/truncate_tables.py | Removes legacy truncate implementation (moved to CLI domain modules). |
| omop_alchemy/maintenance/tables.py | Adds typing ignore for schema argument type mismatch. |
| omop_alchemy/maintenance/reset_sequences.py | Removes legacy sequence reset implementation (moved to CLI domain modules). |
| omop_alchemy/maintenance/indexes.py | Removes legacy index management implementation (moved to cli_indexes). |
| omop_alchemy/maintenance/help.py | Changes help backend-support note parsing to new marker scheme. |
| omop_alchemy/maintenance/foreign_keys.py | Removes legacy FK management implementation (moved to CLI domain modules). |
| omop_alchemy/maintenance/defaults.py | Removes legacy defaults-file config mechanism. |
| omop_alchemy/maintenance/cli_schema.py | Adds schema subapp shim + wires schema-related commands. |
| omop_alchemy/maintenance/cli_schema_tables.py | New table creation domain module (create missing tables). |
| omop_alchemy/maintenance/cli_schema_summary.py | New data summary domain module. |
| omop_alchemy/maintenance/cli_schema_reconcile.py | New schema reconciliation domain module using backend capabilities. |
| omop_alchemy/maintenance/cli_schema_info.py | New environment/connection readiness inspection using oa-configurator. |
| omop_alchemy/maintenance/cli_schema_doctor.py | New doctor/healthcheck module built on new info + FK + reconcile modules. |
| omop_alchemy/maintenance/cli_indexes.py | New index management commands + backend-driven clustering support. |
| omop_alchemy/maintenance/cli_fulltext.py | New fulltext sidecar management commands using backend abstraction. |
| omop_alchemy/maintenance/cli_backup.py | New backup/restore commands delegating command assembly to backends. |
| omop_alchemy/maintenance/backup.py | Removes legacy backup implementation (moved to cli_backup). |
| omop_alchemy/maintenance/analyze_tables.py | Removes legacy analyze implementation (moved to CLI tables module). |
| omop_alchemy/maintenance/_cli_utils.py | Adds @omop_command decorator + shared CLI helpers/error handling. |
| omop_alchemy/maintenance/__init__.py | Re-exports updated CLI domain APIs and types. |
| omop_alchemy/config.py | Replaces env-based config with oa-configurator PackageConfigBase + engine creation. |
| omop_alchemy/cdm/model/vocabulary/vocabulary.py | Removes unused import. |
| omop_alchemy/cdm/model/unstructured/note_nlp.py | Cleans unused imports/TYPE_CHECKING block. |
| omop_alchemy/cdm/model/typing.py | Removes unused imports. |
| omop_alchemy/cdm/handlers/vocabs_and_mappers/concept_resolver.py | Removes unused imports. |
| omop_alchemy/cdm/handlers/vocabs_and_mappers/concept_registry.py | Removes unused import. |
| omop_alchemy/cdm/handlers/timeline/event_timeline.py | Cleans imports and adjusts typing ignores for mixin usage. |
| omop_alchemy/cdm/handlers/fulltext/fulltext.py | Removes legacy fulltext handler implementation (moved to backends). |
| omop_alchemy/cdm/handlers/fulltext/__init__.py | Removes legacy fulltext exports. |
| omop_alchemy/cdm/handlers/__init__.py | Removes legacy fulltext exports from handlers package. |
| omop_alchemy/cdm/base/typing.py | Minor whitespace cleanup. |
| omop_alchemy/cdm/base/reference_context.py | Removes unused imports. |
| omop_alchemy/cdm/base/indexing.py | Removes duplicate Mapping import. |
| omop_alchemy/cdm/base/domain_validation.py | Adds typing ignore for union attribute access. |
| omop_alchemy/cdm/base/decorators.py | Removes unused typing import. |
| omop_alchemy/cdm/base/column_mixins.py | Removes unused imports. |
| omop_alchemy/cdm/base/column_helpers.py | Removes unused import. |
| omop_alchemy/backends/sqlite.py | Adds SQLite backend implementation (currently analyze only). |
| omop_alchemy/backends/resolve.py | Adds dialect→backend resolution. |
| omop_alchemy/backends/base.py | Adds Backend ABC, support checks, fulltext types, and capability gating. |
| omop_alchemy/backends/__init__.py | Exposes backend API surface. |
| omop_alchemy/backend_support.py | Removes legacy backend support helpers (replaced by backends layer). |
| omop_alchemy/__init__.py | Exposes OmopAlchemyConfig as package-level API. |
| mkdocs.yml | Updates nav + theme configuration; adds CLI reference section. |
| docs/index.md | Updates docs landing-page links. |
| docs/getting-started/quickstart.md | Adds Postgres test-running instructions. |
| docs/getting-started/installation.md | Removes env-based config docs; points to oa-configurator configuration. |
| docs/getting-started/configuration.md | New oa-configurator configuration guide. |
| docs/cli/reference.md | New CLI command reference. |
| docs/cli/index.md | New CLI architecture overview (decorator + conn context). |
| docs/api/typing.md | New typing API documentation. |
| docs/api/configuration.md | (Touched/placeholder) API configuration docs presence. |
| docs/advanced/vocabulary_load_performance.md | New performance-tuning guide for vocabulary loading. |
| docs/advanced/views.md | (Touched/placeholder) advanced views docs presence. |
| docs/advanced/timelines.md | Adds/renames timelines documentation. |
| docs/advanced/query_patterns.md | (Touched/placeholder) query patterns docs presence. |
| docs/advanced/index.md | Updates advanced index to match new advanced pages. |
| Dockerfile | Adds simple image for local CLI usage with postgres extra. |
| docker-compose.yaml | Adds Postgres + python-alchemy service and auto-configure on startup. |
| .omop-maint.toml | Removes legacy defaults file. |
| .env.example | Adds Docker Compose override example vars. |
Comments suppressed due to low confidence (1)

pyproject.toml:40

  • omop_alchemy/config.py now imports pydantic.Field, but pydantic is not declared as a direct dependency. This can break installation if oa-configurator stops depending on pydantic (or pins an incompatible version). Add an explicit pydantic requirement.

Comment thread omop_alchemy/maintenance/_cli_utils.py
Comment thread omop_alchemy/maintenance/help.py
Comment thread tests/test_truncate_tables.py Outdated
Comment thread tests/test_truncate_tables.py Outdated
Comment thread tests/test_load_vocab_source.py Outdated
Comment thread tests/test_load_vocab_source.py Outdated
Comment thread tests/test_fulltext.py Outdated
Comment thread tests/test_indexes.py Outdated
Comment thread docs/getting-started/quickstart.md Outdated
Comment thread docs/cli/reference.md Outdated
@nicoloesch nicoloesch marked this pull request as ready for review June 15, 2026 23:11
@nicoloesch nicoloesch requested a review from gkennos June 15, 2026 23:12
gkennos commented Jun 16, 2026

Member

In the fulltext.md docs, the references to importing register_optional_fulltext_columns, unregister_optional_fulltext_columns, and concept_name_tsvector_expression are now wrong (they moved to PostgresBackend).

Comment thread omop_alchemy/maintenance/cli_vocab.py Outdated
Comment thread omop_alchemy/maintenance/cli_vocab.py Outdated
Comment thread omop_alchemy/backends/postgres.py Outdated
Comment thread omop_alchemy/backends/resolve.py Outdated
Comment thread tests/conftest.py Outdated
@nicoloesch

Collaborator Author

> In the fulltext.md docs, the references to importing register_optional_fulltext_columns, unregister_optional_fulltext_columns, and concept_name_tsvector_expression are now wrong (they moved to PostgresBackend).

concept_name_tsvector_expression, register_optional_fulltext_columns, and unregister_optional_fulltext_columns aren't importable from omop_alchemy.cdm.handlers anymore; they're methods on Backend/PostgresBackend now (see backends/base.py:212-230). Updated every snippet in the doc to go through resolve_backend(engine) first, e.g.:

from omop_alchemy.backends import resolve_backend

backend = resolve_backend(engine)
backend.register_fulltext_metadata()
vector = backend.concept_name_tsvector_expression()

Checked the rest of docs/. This was the only file referencing the old import path.
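
For readers unfamiliar with the pattern, dialect-to-backend resolution of this kind is commonly a lookup on `engine.dialect.name`. The sketch below is an assumption about the general shape, not the contents of backends/resolve.py: the `_BACKENDS` registry, the backend constructors taking an engine, the ValueError, and the SimpleNamespace stand-in for a SQLAlchemy engine are all illustrative.

```python
from types import SimpleNamespace


class PostgresBackend:
    def __init__(self, engine):
        self.engine = engine


class SQLiteBackend:
    def __init__(self, engine):
        self.engine = engine


# Hypothetical registry keyed by SQLAlchemy dialect name.
_BACKENDS = {"postgresql": PostgresBackend, "sqlite": SQLiteBackend}


def resolve_backend(engine):
    """Pick the backend class that matches the engine's dialect."""
    dialect = engine.dialect.name
    try:
        backend_cls = _BACKENDS[dialect]
    except KeyError:
        raise ValueError(f"No backend registered for dialect {dialect!r}") from None
    return backend_cls(engine)


# Stand-in for a SQLAlchemy Engine, which exposes the same `.dialect.name`.
fake_engine = SimpleNamespace(dialect=SimpleNamespace(name="sqlite"))
backend = resolve_backend(fake_engine)
```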

@nicoloesch nicoloesch merged commit cb0ae87 into main Jun 16, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

  • Include oa-configurator for centralised configuration
  • fix: Overhaul the CLI, improve vocab ingestion

3 participants