CLI Overhaul and improved vocabulary ingestion #14
49d3ee0 to 0533caa
… CLIs, alter docs slightly
Pull request overview
This PR restructures the omop-alchemy maintenance CLI into smaller, domain-focused modules with a new backend abstraction layer, and migrates configuration/connection resolution to oa-configurator. It also updates tests and documentation to reflect the new CLI architecture and configuration flow.
Changes:
- Introduces `omop_alchemy/backends/` with `Backend` + dialect implementations and shifts dialect-specific behavior out of CLI modules.
- Adds `@omop_command` decorator to centralize CLI engine/config resolution, rendering, and error handling.
- Updates docs, packaging (Hatchling), and tests to align with `oa-configurator`-based configuration.
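The decorator idea in the list above can be illustrated with a small standalone sketch. The real decorator's signature and option names are not shown on this page, so `Conn`, `resolve_conn`, `CommandError`, and `omop_command_sketch` are hypothetical stand-ins. The sketch only demonstrates the pattern of moving connection resolution, rendering, and error handling out of each command body.

```python
from __future__ import annotations

import functools
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Conn:
    """Hypothetical resolved connection context handed to command bodies."""

    url: str
    db_schema: str | None = None


class CommandError(Exception):
    """Hypothetical error type for failures that should be reported cleanly."""


def resolve_conn(url: str | None, db_schema: str | None) -> Conn:
    # Stand-in for the real configuration lookup (an assumption):
    # a missing URL is the only failure modelled here.
    if not url:
        raise CommandError("no database URL configured")
    return Conn(url=url, db_schema=db_schema)


def omop_command_sketch(func: Callable[..., Any]) -> Callable[..., int]:
    """Wrap a command body so it receives `conn` and never handles errors itself."""

    @functools.wraps(func)
    def wrapper(
        *args: Any,
        url: str | None = None,
        db_schema: str | None = None,
        **kwargs: Any,
    ) -> int:
        try:
            conn = resolve_conn(url, db_schema)
            result = func(conn, *args, **kwargs)
        except CommandError as exc:
            # Shared error handling: report and return a non-zero exit code.
            print(f"error: {exc}")
            return 1
        if result is not None:
            # Shared rendering step for whatever the command returned.
            print(result)
        return 0

    return wrapper


@omop_command_sketch
def analyze(conn: Conn, table: str) -> str:
    # The body contains only command logic; no option parsing or try/except.
    return f"ANALYZE {table} on {conn.url}"
```

Calling `analyze("concept", url="sqlite://")` runs the body with a resolved `Conn`, while `analyze("concept")` takes the shared error path without the body running at all.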
Reviewed changes
Copilot reviewed 82 out of 87 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| tests/test_truncate_tables.py | Updates truncate tests to new CLI/table modules and oa-configurator config injection. |
| tests/test_load_vocab.py | Minor typing + SQLAlchemy NULL-comparison fix. |
| tests/test_load_vocab_source.py | Updates vocab-load tests to new CLI module and oa-configurator config injection. |
| tests/test_load_vocab_postgres.py | Marks Postgres integration tests as requiring a configured test DB resource. |
| tests/test_indexes.py | Updates index tests to new CLI indexes module + oa-configurator config injection. |
| tests/test_fulltext.py | Updates fulltext tests to backend-based fulltext implementation and new CLI module. |
| tests/test_data_summary.py | Updates imports to CLI schema shim. |
| tests/test_create_tables.py | Updates imports to CLI schema shim. |
| tests/test_config_driver.py | Updates tests to new config/engine creation API (create_cdm_engine). |
| tests/test_conditions_basic.py | Removes duplicate import. |
| tests/test_cli_config.py | Replaces removed config subcommand tests with smoke tests. |
| tests/test_analyze_tables.py | Updates analyze tests to new CLI tables module and backend error messages. |
| tests/conftest.py | Moves Postgres test DB resolution to oa-configurator pytest plugin; adds safety guard. |
| README.md | Documents oa-configurator-based configuration and Docker Compose workflow. |
| pyproject.toml | Swaps dotenv for oa-configurator, switches build to Hatchling, adds omop.config entry-point. |
| omop_alchemy/py.typed | Marks package as typed. |
| omop_alchemy/maintenance/ui.py | Updates UI rendering types/labels to backend-based results and new config fields. |
| omop_alchemy/maintenance/truncate_tables.py | Removes legacy truncate implementation (moved to CLI domain modules). |
| omop_alchemy/maintenance/tables.py | Adds typing ignore for schema argument type mismatch. |
| omop_alchemy/maintenance/reset_sequences.py | Removes legacy sequence reset implementation (moved to CLI domain modules). |
| omop_alchemy/maintenance/indexes.py | Removes legacy index management implementation (moved to cli_indexes). |
| omop_alchemy/maintenance/help.py | Changes help backend-support note parsing to new marker scheme. |
| omop_alchemy/maintenance/foreign_keys.py | Removes legacy FK management implementation (moved to CLI domain modules). |
| omop_alchemy/maintenance/defaults.py | Removes legacy defaults-file config mechanism. |
| omop_alchemy/maintenance/cli_schema.py | Adds schema subapp shim + wires schema-related commands. |
| omop_alchemy/maintenance/cli_schema_tables.py | New table creation domain module (create missing tables). |
| omop_alchemy/maintenance/cli_schema_summary.py | New data summary domain module. |
| omop_alchemy/maintenance/cli_schema_reconcile.py | New schema reconciliation domain module using backend capabilities. |
| omop_alchemy/maintenance/cli_schema_info.py | New environment/connection readiness inspection using oa-configurator. |
| omop_alchemy/maintenance/cli_schema_doctor.py | New doctor/healthcheck module built on new info + FK + reconcile modules. |
| omop_alchemy/maintenance/cli_indexes.py | New index management commands + backend-driven clustering support. |
| omop_alchemy/maintenance/cli_fulltext.py | New fulltext sidecar management commands using backend abstraction. |
| omop_alchemy/maintenance/cli_backup.py | New backup/restore commands delegating command assembly to backends. |
| omop_alchemy/maintenance/backup.py | Removes legacy backup implementation (moved to cli_backup). |
| omop_alchemy/maintenance/analyze_tables.py | Removes legacy analyze implementation (moved to CLI tables module). |
| omop_alchemy/maintenance/_cli_utils.py | Adds @omop_command decorator + shared CLI helpers/error handling. |
| `omop_alchemy/maintenance/__init__.py` | Re-exports updated CLI domain APIs and types. |
| omop_alchemy/config.py | Replaces env-based config with oa-configurator PackageConfigBase + engine creation. |
| omop_alchemy/cdm/model/vocabulary/vocabulary.py | Removes unused import. |
| omop_alchemy/cdm/model/unstructured/note_nlp.py | Cleans unused imports/TYPE_CHECKING block. |
| omop_alchemy/cdm/model/typing.py | Removes unused imports. |
| omop_alchemy/cdm/handlers/vocabs_and_mappers/concept_resolver.py | Removes unused imports. |
| omop_alchemy/cdm/handlers/vocabs_and_mappers/concept_registry.py | Removes unused import. |
| omop_alchemy/cdm/handlers/timeline/event_timeline.py | Cleans imports and adjusts typing ignores for mixin usage. |
| omop_alchemy/cdm/handlers/fulltext/fulltext.py | Removes legacy fulltext handler implementation (moved to backends). |
| `omop_alchemy/cdm/handlers/fulltext/__init__.py` | Removes legacy fulltext exports. |
| `omop_alchemy/cdm/handlers/__init__.py` | Removes legacy fulltext exports from handlers package. |
| omop_alchemy/cdm/base/typing.py | Minor whitespace cleanup. |
| omop_alchemy/cdm/base/reference_context.py | Removes unused imports. |
| omop_alchemy/cdm/base/indexing.py | Removes duplicate Mapping import. |
| omop_alchemy/cdm/base/domain_validation.py | Adds typing ignore for union attribute access. |
| omop_alchemy/cdm/base/decorators.py | Removes unused typing import. |
| omop_alchemy/cdm/base/column_mixins.py | Removes unused imports. |
| omop_alchemy/cdm/base/column_helpers.py | Removes unused import. |
| omop_alchemy/backends/sqlite.py | Adds SQLite backend implementation (currently analyze only). |
| omop_alchemy/backends/resolve.py | Adds dialect→backend resolution. |
| omop_alchemy/backends/base.py | Adds Backend ABC, support checks, fulltext types, and capability gating. |
| `omop_alchemy/backends/__init__.py` | Exposes backend API surface. |
| omop_alchemy/backend_support.py | Removes legacy backend support helpers (replaced by backends layer). |
| `omop_alchemy/__init__.py` | Exposes OmopAlchemyConfig as package-level API. |
| mkdocs.yml | Updates nav + theme configuration; adds CLI reference section. |
| docs/index.md | Updates docs landing-page links. |
| docs/getting-started/quickstart.md | Adds Postgres test-running instructions. |
| docs/getting-started/installation.md | Removes env-based config docs; points to oa-configurator configuration. |
| docs/getting-started/configuration.md | New oa-configurator configuration guide. |
| docs/cli/reference.md | New CLI command reference. |
| docs/cli/index.md | New CLI architecture overview (decorator + conn context). |
| docs/api/typing.md | New typing API documentation. |
| docs/api/configuration.md | (Touched/placeholder) API configuration docs presence. |
| docs/advanced/vocabulary_load_performance.md | New performance-tuning guide for vocabulary loading. |
| docs/advanced/views.md | (Touched/placeholder) advanced views docs presence. |
| docs/advanced/timelines.md | Adds/renames timelines documentation. |
| docs/advanced/query_patterns.md | (Touched/placeholder) query patterns docs presence. |
| docs/advanced/index.md | Updates advanced index to match new advanced pages. |
| Dockerfile | Adds simple image for local CLI usage with postgres extra. |
| docker-compose.yaml | Adds Postgres + python-alchemy service and auto-configure on startup. |
| .omop-maint.toml | Removes legacy defaults file. |
| .env.example | Adds Docker Compose override example vars. |
Comments suppressed due to low confidence (1)
pyproject.toml:40
`omop_alchemy/config.py` now imports `pydantic.Field`, but `pydantic` is not declared as a direct dependency. This can break installation if `oa-configurator` stops depending on pydantic (or pins an incompatible version). Add an explicit `pydantic` requirement.
In the fulltext.md docs, the references to importing `register_optional_fulltext_columns`, `unregister_optional_fulltext_columns`, and `concept_name_tsvector_expression` are now wrong (they moved to `PostgresBackend`):

```python
from omop_alchemy.backends import resolve_backend

backend = resolve_backend(engine)
backend.register_fulltext_metadata()
vector = backend.concept_name_tsvector_expression()
```

Checked the rest of
Summary
This PR delivers a full structural overhaul of the `omop-alchemy` maintenance CLI, motivated by accumulated complexity, duplicated boilerplate across every command, raw SQL scattered through CLI files, and documentation that was insufficient to understand how commands should be used.

Fixes #12, Fixes #16

Part 1: CLI cleanup

Backend abstraction layer
A new `omop_alchemy/backends/` package introduces a `Backend` ABC with `PostgresBackend` and `SQLiteBackend` implementations. All dialect-specific SQL (FK trigger toggling, ANALYZE/TRUNCATE/sequence management, CLUSTER, full-text sidecar management, backup/restore command assembly) is extracted from the CLI files and centralised in the backend implementations. CLI files are now orchestration and rendering only.
- `backend_support.py` deleted and replaced by `backend_supports` / `require_backend_support` / `BackendNotSupportedError` on the `Backend` base class
- `cdm/handlers/fulltext/` deleted; full-text logic absorbed into `backends/fulltext.py` and `PostgresBackend`

`@omop_command` decorator
An `@omop_command` decorator in `_cli_utils.py` eliminates three repeated patterns from all 18 CLI command functions: connection parameter declarations (`--dotenv`, `--engine-schema`, `--db-schema`), `setup_cli_cmd` boilerplate, and `try/except handle_error` wrappers. Command bodies now only see a resolved `conn` object and a ready `engine`.

`cli_schema.py` decomposed
The 1,500-line monolithic file was split into five single-responsibility domain modules (`cli_schema_info.py`, `cli_schema_doctor.py`, `cli_schema_reconcile.py`, `cli_schema_tables.py`, `cli_schema_summary.py`). The original file is retained as a thin re-export shim so all existing callers continue to work unchanged.

Documentation
- `docs/cli/index.md`: CLI architecture overview covering the `@omop_command` decorator, connection resolution, and the `conn` object
- `docs/cli/reference.md`: full command reference with parameter tables for all 25 commands

Part 2: Accelerated ingestion of vocabularies
**To be done**
Part 3: OA Configurator integration
PENDING
Integration with the OA Configurator will absorb the engine-creation and configuration machinery (`db.py`, `config.py`, `logger_config.py`) from this package into the shared configuration tool. The `backends/` layer is configuration-agnostic by design and will slot in unchanged when this work is complete.

This section will be updated when the integration is ready.
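Because the backend layer is described as depending only on the database dialect, its general shape can be sketched without any configuration code. The class names and the `backend_supports` / `require_backend_support` / `BackendNotSupportedError` names come from this PR description. The method signatures, the capability strings, `analyze_sql`, and `resolve_backend_by_name` are assumptions made for illustration. The review comment above shows that the real `resolve_backend` takes an engine, not a dialect name.

```python
from abc import ABC, abstractmethod


class BackendNotSupportedError(RuntimeError):
    """Raised when a backend lacks a requested capability."""


class Backend(ABC):
    """Illustrative base class; the capability set is a hypothetical mechanism."""

    name: str = ""
    capabilities: frozenset = frozenset()

    def backend_supports(self, capability: str) -> bool:
        return capability in self.capabilities

    def require_backend_support(self, capability: str) -> None:
        # Capability gating: fail early with a clear message rather than
        # emitting SQL the dialect cannot run.
        if not self.backend_supports(capability):
            raise BackendNotSupportedError(
                f"{self.name} backend does not support {capability!r}"
            )

    @abstractmethod
    def analyze_sql(self, table: str) -> str:
        """Return the dialect-specific statement for refreshing statistics."""


class PostgresBackend(Backend):
    name = "postgresql"
    capabilities = frozenset({"analyze", "cluster", "fulltext"})

    def analyze_sql(self, table: str) -> str:
        # Naive quoting, for illustration only.
        return f'ANALYZE "{table}"'


class SQLiteBackend(Backend):
    name = "sqlite"
    capabilities = frozenset({"analyze"})  # the file table lists analyze only

    def analyze_sql(self, table: str) -> str:
        return f'ANALYZE "{table}"'


_BACKENDS = {"postgresql": PostgresBackend, "sqlite": SQLiteBackend}


def resolve_backend_by_name(dialect_name: str) -> Backend:
    """Hypothetical simplification: map a dialect name to a backend instance."""
    try:
        return _BACKENDS[dialect_name]()
    except KeyError:
        raise BackendNotSupportedError(
            f"no backend for dialect {dialect_name!r}"
        ) from None
```

In this sketch nothing reads configuration. A caller that already holds an engine would pass only its dialect name, which is why a layer shaped like this could stay unchanged while the configuration machinery moves elsewhere.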