Collection : with assistant flag and additional enhancements#412
Collection : with assistant flag and additional enhancements#412
Conversation
WalkthroughRefactors collection endpoints into job-driven create/delete flows with callback support, consolidates document retrieval into collection info (with pagination), introduces public collection/job models, routes remote ops between assistant and vector-store backends, updates CRUD/services/helpers for cleanup, and adds broad test coverage. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant API
participant Scheduler
participant Worker
participant RemoteSvc
participant DB
Client->>API: POST /collections/ (CreationRequest JSON)
Note right of API: compute with_assistant = model && instructions
API->>Scheduler: start_job(project_id, org_id, with_assistant, request_json)
Scheduler->>DB: insert CollectionJob(status=PENDING)
Scheduler-->>Client: 200 (CollectionJobImmediatePublic)
rect rgb(240,255,240)
Note over Worker: async execute_job(job_id, with_assistant, request)
Worker->>DB: update CollectionJob -> PROCESSING
alt with_assistant == true
Worker->>RemoteSvc: create vector store & upload vectors
Worker->>RemoteSvc: create assistant (model+instructions)
else
Worker->>RemoteSvc: upload vectors only
end
Worker->>DB: create Collection + link documents
Worker->>DB: update CollectionJob -> SUCCESSFUL
alt callback_url present
Worker->>API: send_callback(success_payload)
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Areas needing extra attention:
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (7)
backend/app/models/collection.py (1)
120-124: Pydantic v2 break: usemodel_fieldsinstead of__fields__inCreationRequest.extract_super_type.This will fail on v2 (no
__fields__). It’s used byexecute_job, so collection creation can break at runtime.Apply this robust v1/v2‑compatible diff:
- def extract_super_type(self, cls: "CreationRequest"): - for field_name in cls.__fields__.keys(): - field_value = getattr(self, field_name) - yield (field_name, field_value) + def extract_super_type(self, cls: type[SQLModel]): + # Pydantic v2: `model_fields`; v1 fallback: `__fields__` + fields = getattr(cls, "model_fields", None) or getattr(cls, "__fields__", {}) + for field_name in fields.keys(): + yield (field_name, getattr(self, field_name))backend/app/services/collections/helpers.py (1)
48-67: Validatebatch_sizeto avoid infinite loops and misuse.Guard against zero/negative values.
def batch_documents( document_crud: DocumentCrud, documents: List[UUID], batch_size: int ): @@ - docs_batches = [] + if batch_size <= 0: + raise ValueError("batch_size must be a positive integer") + docs_batches = []backend/app/tests/crud/collections/collection/test_crud_collection_read_one.py (1)
3-4: Remove unused OpenAI client and import.Cleans up dead code and silences F841.
-from openai import OpenAI from openai_responses import OpenAIMock @@ - client = OpenAI(api_key="sk-test-key") collection = get_collection(db, project=project)Also applies to: 18-19
backend/app/tests/api/routes/collections/test_collection_info.py (1)
26-33: Fix Enum-vs-string comparisons in assertions.API returns strings; tests compare against Enum instances. Compare to
.value(or cast) to avoid false failures.- assert data["status"] == CollectionJobStatus.PENDING + assert data["status"] == CollectionJobStatus.PENDING.value @@ - assert data["status"] == CollectionJobStatus.SUCCESSFUL - assert data["action_type"] == CollectionActionType.CREATE + assert data["status"] == CollectionJobStatus.SUCCESSFUL.value + assert data["action_type"] == CollectionActionType.CREATE.value @@ - assert data["status"] == CollectionJobStatus.FAILED + assert data["status"] == CollectionJobStatus.FAILED.valueAlso applies to: 54-58, 84-85
backend/app/crud/collection/collection.py (1)
22-31: Ensure updated_at is bumped on updates.set updated_at before commit so soft-deletes and other updates record a fresh timestamp.
def _update(self, collection: Collection): + collection.updated_at = now() self.session.add(collection) self.session.commit() self.session.refresh(collection)backend/app/services/collections/delete_collection.py (1)
23-52: Return type mismatch in start_job.Signature says
-> strbut returns a UUID. Return a string (or change the annotation), to avoid downstream type surprises.- return collection_job_id + return str(collection_job_id)backend/app/services/collections/create_collection.py (1)
39-70: Return type mismatch in start_job.Signature is
-> strbut returns a UUID. Align the return.- return collection_job_id + return str(collection_job_id)
🧹 Nitpick comments (10)
backend/app/models/collection.py (2)
6-6: Migrate deprecatedroot_validatorto v2model_validator(mode="before").Removes deprecation and aligns with v2 semantics. Behavior remains the same.
Based on learnings
-from pydantic import HttpUrl, root_validator +from pydantic import HttpUrl, model_validator @@ - @root_validator(pre=True) - def _assistant_fields_all_or_none(cls, values): + @model_validator(mode="before") + def _assistant_fields_all_or_none(cls, values: dict): def norm(x): return x.strip() if isinstance(x, str) and x.strip() else NoneAlso applies to: 89-105
61-69: Small copy/consistency fixes in field descriptions.
- “compatable” → “compatible”.
- Consider clarifying “[To Be Deprecated]” timeline or linking to the replacement API.
- "must be compatable with the assistants API; see the " + "must be compatible with the Assistants API; see the "Also applies to: 71-78, 80-87
backend/app/services/collections/helpers.py (1)
71-73: Add type hints to callbacks and align_backoutdocstring with generic resource.Improves readability and linting; no behavior change.
-from typing import List +from typing import List, Any, Optional @@ -class CallbackHandler: - def __init__(self, collection_job): - self.collection_job = collection_job +class CallbackHandler: + def __init__(self, collection_job: Any): + self.collection_job = collection_job @@ -class WebHookCallback(CallbackHandler): - def __init__(self, url: HttpUrl, collection_job): +class WebHookCallback(CallbackHandler): + def __init__(self, url: HttpUrl, collection_job: Any): super().__init__(collection_job) self.url = url @@ - def __call__(self, response: APIResponse): + def __call__(self, response: APIResponse) -> None: logger.info( f"[WebHookCallback.call] Posting callback | {{'url': '{self.url}'}}" ) - post_callback(self.url, response) + ok = post_callback(self.url, response) + if not ok: + logger.warning("[WebHookCallback.call] Callback POST failed") @@ -def _backout(crud, llm_service_id: str): - """Best-effort cleanup: attempt to delete the assistant by ID""" +def _backout(crud, llm_service_id: str): + """Best-effort cleanup: attempt to delete the remote resource by ID"""Also applies to: 92-119
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py (1)
71-75: Remove unused APIKey lookup.
api_keyisn’t used; delete dead code to keep tests lean.- stmt = select(APIKey).where( - APIKey.project_id == project.id, APIKey.is_deleted == False - ) - api_key = db.exec(stmt).first()backend/app/api/routes/collections.py (1)
55-58: Make with_assistant detection robust to “provided vs empty”.Use Pydantic v2’s
model_fields_setto detect fields provided in the body and keep truthiness checks explicit.- # True iff both model and instructions were provided in the request body - with_assistant = bool( - getattr(request, "model", None) and getattr(request, "instructions", None) - ) + # True iff both fields were present in the payload and non-empty + provided = getattr(request, "model_fields_set", set()) + with_assistant = ( + {"model", "instructions"}.issubset(provided) + and bool(getattr(request, "model")) + and bool(getattr(request, "instructions")) + )If you intend “present regardless of emptiness,” drop the last two
bool(...)checks.backend/app/crud/collection/collection.py (1)
1-6: Duplicate import.Drop the second
import logging.-import logging +import loggingbackend/app/services/collections/delete_collection.py (3)
83-96: Brittle service detection by display name.String-equality on
llm_service_nameis fragile. At minimum, normalize withcasefold()and accept a small set of aliases; longer term, persist a canonical enum/boolean (e.g.,service_typeorwith_assistant) on Collection.- service = (collection.llm_service_name or "").strip().lower() - is_vector = service == "openai vector store" + service = (collection.llm_service_name or "").casefold().strip() + is_vector = service in {"openai vector store", "vector_store", "vector-store"}Please confirm all producers set
llm_service_nameconsistently; otherwise, consider adding aservice_typeenum.
85-95: Unnecessary getattr fallbacks; simplify to the modeled field.
Collectionexposesllm_service_id. Thevector_store_id/assistant_idattributes aren’t part of the model here, so the fallbacks add noise.- llm_service_id = ( - ( - getattr(collection, "vector_store_id", None) - or getattr(collection, "llm_service_id", None) - ) - if is_vector - else ( - getattr(collection, "assistant_id", None) - or getattr(collection, "llm_service_id", None) - ) - ) + llm_service_id = getattr(collection, "llm_service_id", None)
54-62: Unused parametertask_instance.Prefix with
_(or remove) to satisfy linters.-def execute_job( +def execute_job( request: dict, project_id: int, organization_id: int, task_id: str, job_id: str, collection_id: str, - task_instance, + _task_instance, ) -> None:backend/app/services/collections/create_collection.py (1)
72-80: Unused parametertask_instance.Rename to
_task_instance(or drop if not required by the worker runner).-def execute_job( +def execute_job( request: dict, project_id: int, organization_id: int, task_id: str, job_id: str, - with_assistant: bool, - task_instance, + with_assistant: bool, + _task_instance, ) -> None:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
backend/app/api/routes/collections.py(3 hunks)backend/app/crud/collection/collection.py(3 hunks)backend/app/models/collection.py(2 hunks)backend/app/services/collections/create_collection.py(3 hunks)backend/app/services/collections/delete_collection.py(3 hunks)backend/app/services/collections/helpers.py(2 hunks)backend/app/tests/api/routes/collections/test_collection_info.py(3 hunks)backend/app/tests/api/routes/collections/test_create_collections.py(1 hunks)backend/app/tests/crud/collections/collection/test_crud_collection_create.py(2 hunks)backend/app/tests/crud/collections/collection/test_crud_collection_delete.py(1 hunks)backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py(2 hunks)backend/app/tests/crud/collections/collection/test_crud_collection_read_one.py(1 hunks)backend/app/tests/services/collections/test_create_collection.py(7 hunks)backend/app/tests/services/collections/test_delete_collection.py(6 hunks)backend/app/tests/services/collections/test_helpers.py(1 hunks)backend/app/tests/utils/collection.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/crud/collection/collection.pybackend/app/tests/utils/collection.pybackend/app/tests/api/routes/collections/test_collection_info.pybackend/app/tests/crud/collections/collection/test_crud_collection_read_all.pybackend/app/tests/api/routes/collections/test_create_collections.pybackend/app/api/routes/collections.pybackend/app/services/collections/delete_collection.pybackend/app/models/collection.pybackend/app/services/collections/create_collection.pybackend/app/services/collections/helpers.pybackend/app/tests/services/collections/test_create_collection.pybackend/app/tests/crud/collections/collection/test_crud_collection_read_one.pybackend/app/tests/services/collections/test_helpers.pybackend/app/tests/crud/collections/collection/test_crud_collection_delete.pybackend/app/tests/services/collections/test_delete_collection.pybackend/app/tests/crud/collections/collection/test_crud_collection_create.py
backend/app/crud/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement database access operations in backend/app/crud/
Files:
backend/app/crud/collection/collection.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collections.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.pybackend/app/services/collections/create_collection.pybackend/app/services/collections/helpers.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/collection.py
🧬 Code graph analysis (15)
backend/app/crud/collection/collection.py (4)
backend/app/models/collection.py (1)
Collection(13-36)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)backend/app/core/util.py (1)
now(15-16)backend/app/models/document_collection.py (1)
DocumentCollection(9-23)
backend/app/tests/utils/collection.py (5)
backend/app/models/collection.py (1)
Collection(13-36)backend/app/models/collection_job.py (3)
CollectionActionType(18-20)CollectionJob(35-64)CollectionJobStatus(11-15)backend/app/crud/collection/collection.py (2)
CollectionCrud(17-154)create(42-66)backend/app/crud/collection/collection_job.py (2)
CollectionJobCrud(19-96)create(78-96)backend/app/tests/crud/collections/collection/test_crud_collection_delete.py (1)
get_collection(16-33)
backend/app/tests/api/routes/collections/test_collection_info.py (4)
backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/collection.py (2)
get_collection(25-48)get_collection_job(77-98)backend/app/models/collection_job.py (2)
CollectionActionType(18-20)CollectionJobStatus(11-15)backend/app/tests/conftest.py (3)
db(28-45)client(65-68)user_api_key_header(90-92)
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py (1)
backend/app/tests/utils/collection.py (1)
get_collection(25-48)
backend/app/tests/api/routes/collections/test_create_collections.py (3)
backend/app/models/collection_job.py (2)
CollectionJobStatus(11-15)CollectionActionType(18-20)backend/app/models/collection.py (1)
CreationRequest(115-123)backend/app/tests/conftest.py (3)
client(65-68)user_api_key_header(90-92)user_api_key(102-104)
backend/app/api/routes/collections.py (3)
backend/app/services/collections/create_collection.py (1)
start_job(39-69)backend/app/services/collections/delete_collection.py (1)
start_job(23-51)backend/app/utils.py (2)
APIResponse(29-53)success_response(36-39)
backend/app/services/collections/delete_collection.py (7)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (2)
CollectionJobStatus(11-15)CollectionJobUpdate(74-79)backend/app/models/collection.py (1)
DeletionRequest(126-127)backend/app/utils.py (1)
get_openai_client(175-205)backend/app/crud/collection/collection_job.py (3)
CollectionJobCrud(19-96)read_one(24-45)update(58-76)backend/app/crud/collection/collection.py (4)
read_one(68-92)CollectionCrud(17-154)delete(112-120)delete_by_id(105-109)backend/app/services/collections/helpers.py (8)
SilentCallback(81-88)WebHookCallback(91-118)fail(74-75)fail(82-84)fail(105-113)success(77-78)success(86-88)success(115-118)
backend/app/services/collections/create_collection.py (10)
backend/app/models/collection.py (4)
CreationRequest(115-123)extract_super_type(120-123)AssistantOptions(57-105)Collection(13-36)backend/app/crud/collection/collection_job.py (4)
CollectionJobCrud(19-96)read_one(24-45)update(58-76)create(78-96)backend/app/crud/collection/collection.py (3)
read_one(68-92)create(42-66)CollectionCrud(17-154)backend/app/models/collection_job.py (2)
CollectionJobUpdate(74-79)CollectionJobStatus(11-15)backend/app/utils.py (1)
get_openai_client(175-205)backend/app/core/cloud/storage.py (1)
get_cloud_storage(262-279)backend/app/crud/document.py (1)
DocumentCrud(14-135)backend/app/services/collections/helpers.py (10)
batch_documents(48-67)SilentCallback(81-88)WebHookCallback(91-118)success(77-78)success(86-88)success(115-118)_backout(121-129)fail(74-75)fail(82-84)fail(105-113)backend/app/crud/rag/open_ai.py (2)
OpenAIVectorStoreCrud(101-189)OpenAIAssistantCrud(192-246)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)
backend/app/services/collections/helpers.py (4)
backend/app/utils.py (3)
APIResponse(29-53)failure_response(42-53)success_response(36-39)backend/app/core/util.py (1)
post_callback(29-39)backend/app/crud/collection/collection.py (1)
delete(112-120)backend/app/tests/services/collections/test_helpers.py (2)
delete(145-146)delete(158-159)
backend/app/tests/services/collections/test_create_collection.py (6)
backend/app/models/collection.py (1)
CreationRequest(115-123)backend/app/api/routes/collections.py (1)
create_collection(41-79)backend/app/services/collections/create_collection.py (2)
start_job(39-69)execute_job(72-238)backend/app/tests/utils/openai.py (1)
get_mock_openai_client_with_vector_store(90-117)backend/app/tests/utils/collection.py (1)
get_collection_job(77-98)backend/app/tests/conftest.py (1)
db(28-45)
backend/app/tests/crud/collections/collection/test_crud_collection_read_one.py (3)
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py (1)
get_collection(16-33)backend/app/tests/utils/collection.py (1)
get_collection(25-48)backend/app/tests/utils/utils.py (1)
get_project(54-73)
backend/app/tests/services/collections/test_helpers.py (2)
backend/app/services/collections/helpers.py (11)
extract_error_message(19-45)batch_documents(48-67)SilentCallback(81-88)success(77-78)success(86-88)success(115-118)fail(74-75)fail(82-84)fail(105-113)WebHookCallback(91-118)_backout(121-129)backend/app/utils.py (1)
APIResponse(29-53)
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py (5)
backend/app/crud/collection/collection.py (2)
CollectionCrud(17-154)create(42-66)backend/app/models/collection.py (1)
Collection(13-36)backend/app/crud/rag/open_ai.py (1)
OpenAIAssistantCrud(192-246)backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/collection.py (1)
get_collection(25-48)
backend/app/tests/services/collections/test_delete_collection.py (6)
backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/crud/collection/collection_job.py (2)
CollectionJobCrud(19-96)read_one(24-45)backend/app/models/collection_job.py (2)
CollectionJobStatus(11-15)CollectionActionType(18-20)backend/app/tests/utils/collection.py (2)
get_collection(25-48)get_collection_job(77-98)backend/app/services/collections/delete_collection.py (2)
start_job(23-51)execute_job(54-194)backend/app/crud/collection/collection.py (5)
_(123-130)_(133-154)delete(112-120)delete_by_id(105-109)read_one(68-92)
backend/app/tests/crud/collections/collection/test_crud_collection_create.py (4)
backend/app/crud/collection/collection.py (1)
CollectionCrud(17-154)backend/app/models/document_collection.py (1)
DocumentCollection(9-23)backend/app/models/collection.py (1)
Collection(13-36)backend/app/tests/utils/document.py (1)
project(65-66)
🪛 Ruff (0.14.0)
backend/app/tests/utils/collection.py
30-30: Unused function argument: model
(ARG001)
56-56: Unused function argument: model
(ARG001)
backend/app/tests/api/routes/collections/test_create_collections.py
5-5: Redefinition of unused patch from line 2
Remove definition: patch
(F811)
backend/app/services/collections/create_collection.py
79-79: Unused function argument: task_instance
(ARG001)
backend/app/tests/services/collections/test_create_collection.py
220-220: Local variable mock_client is assigned to but never used
Remove assignment to unused variable mock_client
(F841)
backend/app/tests/crud/collections/collection/test_crud_collection_read_one.py
18-18: Local variable client is assigned to but never used
Remove assignment to unused variable client
(F841)
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py
16-16: Redefinition of unused get_collection from line 13
(F811)
🔇 Additional comments (19)
backend/app/tests/crud/collections/collection/test_crud_collection_create.py (1)
18-25: LGTM — directCollectionconstruction matches the model and CRUD flow.backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py (1)
14-29: The review comment is technically incorrect about the validation behavior.The Collection model does not define
assistant_idorvector_store_idfields, but this won't cause validation errors. By default, SQLModel/Pydantic v2 usesextra='ignore'behavior, so extra fields are silently ignored rather than forbidden. The Collection model has no custommodel_configorConfigclass that would change this.While passing undefined parameters is poor practice and suggests a code organization issue, it will not error under Pydantic v2 as claimed. The actual concern is code clarity: these extra parameters are silently dropped, making the helper's intent unclear.
The unused
client = OpenAI(...)variable on line 19 is a valid minor concern but unrelated to the Collection model issue.Likely an incorrect or invalid review comment.
backend/app/api/routes/collections.py (1)
69-79: Nice UX: metadata explains vector-store-only path.Clear guidance in responses when
with_assistantis False. LGTM.backend/app/tests/services/collections/test_delete_collection.py (3)
16-21: start_job test aligns with new scheduling contract.Asserts on kwargs (function_path, ids, payload via
model_dump(mode="json")) look correct.Also applies to: 61-73
98-137: execute_job success: verifies remote delete and local soft-delete.Good use of patches; asserts on
delete_by_id, task/status updates, and assistant id match are accurate.
160-205: execute_job failure: correct FAILED status and no local delete.Mocked exception path and assertions look right, including error_message contents.
backend/app/tests/services/collections/test_helpers.py (5)
10-35: extract_error_message tests cover JSON, dict, fallback, truncation.Concise and robust. LGTM.
41-66: batch_documents tests validate chunking and outputs.Even/ragged/empty paths well covered.
86-91: SilentCallback no-ops confirmed.Simple and sufficient.
93-135: WebHookCallback success/failure flows verified with metadata.Good assertions on APIResponse shape and job id propagation.
140-161: _backout cleanup behavior validated.Delete called once; OpenAIError swallowed as intended.
backend/app/crud/collection/collection.py (1)
140-143: Good: filter out already-deleted collections when deleting by Document.This avoids double-deletes and keeps semantics consistent.
backend/app/tests/api/routes/collections/test_create_collections.py (4)
12-14: Helper for metadata is tidy.Nicely handles both
metadataandmeta.
16-66: End-to-end assertions look solid.Good coverage: enums, ids, trace, and
with_assistant=Truepropagation tostart_job.
68-110: Vector-only path checks are appropriate.Validates metadata note and
with_assistant=Falseflow.
112-137: Validation error test is precise.Matches the model’s “both or none” requirement.
backend/app/services/collections/create_collection.py (3)
125-153: Resource creation flow looks correct.Vector store created/updated; assistant created conditionally; IDs wired into Collection. Logs are informative.
154-195: Metrics/logging block is useful.File types, sizes, and timing aid observability.
220-238: Backout and failure marking are robust.Best-effort cleanup plus job status update and callback are handled.
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py
Outdated
Show resolved
Hide resolved
backend/app/tests/services/collections/test_create_collection.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/app/api/docs/collections/job_info.md (1)
9-9: Improve clarity of grammar on collection deletion response.Line 9 uses awkward phrasing: "nothing will be returned as collection" is unclear.
Apply this diff to improve clarity:
-* If the job of delete collection was successful, we will get the status as successful and nothing will be returned as collection. +* If the collection deletion job was successful, the status will be marked as successful and no collection object will be returned.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
backend/app/api/docs/collections/create.md(2 hunks)backend/app/api/docs/collections/delete.md(1 hunks)backend/app/api/docs/collections/job_info.md(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- backend/app/api/docs/collections/create.md
🧰 Additional context used
🪛 LanguageTool
backend/app/api/docs/collections/delete.md
[grammar] ~13-~13: Ensure spelling is correct
Context: ...deleted and marked as deleted. You will recieve a message in the callback_url about t...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
backend/app/crud/collection/collection.py (3)
1-5: Remove duplicateloggingimport.The
loggingmodule is imported on both lines 1 and 5.import logging import functools as ft from uuid import UUID from typing import Optional -import logging
22-30: Add return type hint to_updatemethod.Per the coding guidelines for Python 3.11+ projects, type hints should be used consistently.
As per coding guidelines.
- def _update(self, collection: Collection): + def _update(self, collection: Collection) -> Collection: self.session.add(collection) self.session.commit() self.session.refresh(collection)
94-103: Add return type hint toread_allmethod.Per the coding guidelines for Python 3.11+ projects, type hints should be used consistently.
As per coding guidelines.
- def read_all(self): + def read_all(self) -> list[Collection]: statement = select(Collection).where( and_( Collection.project_id == self.project_id,
♻️ Duplicate comments (2)
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py (1)
15-34: Org/project mismatch: helper ignores passedproject_idwhen fetching project.The helper calls
get_project(db)which retrieves a random active project, but then uses the passedproject_idparameter directly when constructing the Collection. This causesorganization_idandproject_idto reference mismatched entities when the caller passes a specificproject_id.Accept an explicit
projectparameter instead:def get_collection_for_delete( - db: Session, client=None, project_id: int = None + db: Session, project, client=None ) -> Collection: - project = get_project(db) if client is None: client = OpenAI(api_key="test_api_key") vector_store = client.vector_stores.create() assistant = client.beta.assistants.create( model="gpt-4o", tools=[{"type": "file_search"}], tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}}, ) return Collection( organization_id=project.organization_id, - project_id=project_id, + project_id=project.id, llm_service_id=assistant.id, llm_service_name="gpt-4o", )Then update call sites:
-collection = get_collection_for_delete(db, client, project_id=project.id) +collection = get_collection_for_delete(db, project, client)backend/app/crud/collection/collection.py (1)
42-66: PK collision risk with soft-deleted rows remains unresolved.The past review identified that
read_one()filters out soft-deleted collections (line 73:Collection.deleted_at.is_(None)). If a soft-deleted collection with the sameidexists,read_one()raises 404, and the subsequentsession.add()at line 59 attempts an insert with a duplicate primary key, causing anIntegrityError.The suggested fix from the previous review was not applied. Additionally, the
existing = Noneinitialization on line 45 is redundant.Apply the corrected diff from the past review:
def create( self, collection: Collection, documents: Optional[list[Document]] = None ) -> Collection: - existing = None - try: - existing = self.read_one(collection.id) - except HTTPException as e: - if e.status_code != 404: - raise - - if existing is not None: + # First, see if any row exists for this id (deleted or not) + any_row = self.session.exec( + select(Collection).where(Collection.id == collection.id) + ).one_or_none() + if any_row is not None: + if any_row.deleted_at is None: + logger.warning( + "[CollectionCrud.create] Collection already present | {'collection_id': '%s'}", + collection.id, + ) + return any_row + # id is already used by a soft-deleted row + raise HTTPException(status_code=409, detail="Collection id already exists (soft-deleted)") - logger.warning( - "[CollectionCrud.create] Collection already present | " - f"{{'collection_id': '{collection.id}'}}" - ) - return existing self.session.add(collection) self.session.commit() self.session.refresh(collection)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
backend/app/api/docs/collections/delete.md(1 hunks)backend/app/crud/collection/collection.py(3 hunks)backend/app/models/collection.py(3 hunks)backend/app/tests/api/routes/collections/test_create_collections.py(1 hunks)backend/app/tests/crud/collections/collection/test_crud_collection_delete.py(4 hunks)backend/app/tests/services/collections/test_create_collection.py(7 hunks)backend/app/tests/utils/collection.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/api/docs/collections/delete.md
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/tests/crud/collections/collection/test_crud_collection_delete.pybackend/app/tests/services/collections/test_create_collection.pybackend/app/models/collection.pybackend/app/crud/collection/collection.pybackend/app/tests/api/routes/collections/test_create_collections.pybackend/app/tests/utils/collection.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/collection.py
backend/app/crud/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement database access operations in backend/app/crud/
Files:
backend/app/crud/collection/collection.py
🧬 Code graph analysis (5)
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py (5)
backend/app/crud/collection/collection.py (2)
CollectionCrud(17-154)create(42-66)backend/app/models/collection.py (1)
Collection(13-36)backend/app/crud/rag/open_ai.py (1)
OpenAIAssistantCrud(192-246)backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/document.py (2)
DocumentStore(53-80)project(65-66)
backend/app/tests/services/collections/test_create_collection.py (4)
backend/app/services/collections/create_collection.py (2)
start_job(39-69)execute_job(72-238)backend/app/tests/utils/openai.py (1)
get_mock_openai_client_with_vector_store(90-117)backend/app/tests/utils/collection.py (1)
get_collection_job(74-95)backend/app/crud/collection/collection_job.py (4)
create(78-96)update(58-76)CollectionJobCrud(19-96)read_one(24-45)
backend/app/crud/collection/collection.py (4)
backend/app/models/collection.py (1)
Collection(13-36)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)backend/app/core/util.py (1)
now(15-16)backend/app/models/document_collection.py (1)
DocumentCollection(9-23)
backend/app/tests/api/routes/collections/test_create_collections.py (2)
backend/app/models/collection_job.py (2)
CollectionJobStatus(11-15)CollectionActionType(18-20)backend/app/tests/conftest.py (3)
client(65-68)user_api_key_header(90-92)user_api_key(102-104)
backend/app/tests/utils/collection.py (5)
backend/app/models/collection.py (1)
Collection(13-36)backend/app/models/collection_job.py (3)
CollectionActionType(18-20)CollectionJob(35-64)CollectionJobStatus(11-15)backend/app/crud/collection/collection.py (2)
CollectionCrud(17-154)create(42-66)backend/app/crud/collection/collection_job.py (2)
CollectionJobCrud(19-96)create(78-96)backend/app/tests/conftest.py (1)
db(28-45)
🔇 Additional comments (8)
backend/app/tests/api/routes/collections/test_create_collections.py (1)
1-136: LGTM!The test suite comprehensively covers the new
with_assistantflow, including:
- Assistant-enabled path with proper callback verification
- Vector-only path with metadata validation
- Validation error for incomplete assistant options
The assertions correctly verify job status, metadata content, and all callback arguments.
backend/app/tests/services/collections/test_create_collection.py (1)
1-255: LGTM!The service tests correctly verify the updated
start_jobandexecute_jobflows withwith_assistant=True. The new failure test at lines 194-255 properly validates error handling and vector store cleanup on assistant creation failure.backend/app/models/collection.py (1)
120-123: LGTM!Correctly updated to use
model_fields(Pydantic v2 API) instead of the deprecated__fields__.backend/app/tests/utils/collection.py (3)
25-47: LGTM!The
get_collectionhelper correctly creates an assistant-backed Collection with all required fields matching the schema. Themodelparameter is properly used forllm_service_name.
50-71: LGTM!The
get_vector_store_collectionhelper correctly creates a vector-store-only Collection without assistant fields. All fields match the Collection schema.
74-95: LGTM!The
get_collection_jobhelper provides a clean, reusable way to seed CollectionJob test data with configurable fields. The enum value extraction handles both enum instances and string values properly.backend/app/crud/collection/collection.py (2)
105-109: LGTM: Clean soft-delete implementation.The new
delete_by_idmethod correctly implements soft-delete by reading the collection (which verifies it exists and is not already deleted), settingdeleted_at, and persisting via_update.
140-143: Good addition: filter excludes soft-deleted collections.The updated
whereclause correctly filters out soft-deleted collections when processing document deletions, preventing redundant deletion attempts.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
backend/app/models/collection.py (2)
61-69: Surface deprecations in OpenAPI schema (don’t just say “To Be Deprecated”).Mark these fields as deprecated in the generated schema so client SDKs and docs flag them. Add json_schema_extra={"deprecated": True} to model, instructions, and temperature.
Apply:
- model: Optional[str] = Field( - default=None, + model: Optional[str] = Field( + default=None, description=( "**[To Be Deprecated]** " "OpenAI model to attach to this assistant. The model " "must be compatable with the assistants API; see the " "OpenAI [model documentation](https://platform.openai.com/docs/models/compare) for more." - ), + ), + json_schema_extra={"deprecated": True}, ) @@ - instructions: Optional[str] = Field( - default=None, + instructions: Optional[str] = Field( + default=None, description=( "**[To Be Deprecated]** " "Assistant instruction. Sometimes referred to as the " '"system" prompt.' - ), + ), + json_schema_extra={"deprecated": True}, ) @@ - temperature: float = Field( - default=1e-6, + temperature: float = Field( + default=1e-6, description=( "**[To Be Deprecated]** " "Model temperature. The default is slightly " "greater-than zero because it is [unknown how OpenAI " "handles zero](https://community.openai.com/t/clarifications-on-setting-temperature-0/886447/5)." - ), + ), + json_schema_extra={"deprecated": True}, )Based on learnings.
Also applies to: 71-78, 80-87
122-122: v2-compatible field access is correct; tighten typing and simplify iteration.Switch the parameter type to a model class and add a return type; iterating dict keys explicitly isn’t needed.
Apply:
-from typing import Any, Optional +from typing import Any, Optional, Iterator @@ - def extract_super_type(self, cls: "CreationRequest"): - for field_name in cls.model_fields.keys(): + def extract_super_type(self, cls: type[SQLModel]) -> Iterator[tuple[str, Any]]: + for field_name in cls.model_fields: field_value = getattr(self, field_name) yield (field_name, field_value)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/app/models/collection.py(3 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/models/collection.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/collection.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/models/collection.py (1)
6-6: Good move to Pydantic v2 API.Importing model_validator is correct for v2; no issues.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
backend/app/api/routes/collections.py(4 hunks)backend/app/api/routes/documents.py(3 hunks)backend/app/crud/collection/collection.py(4 hunks)backend/app/models/collection_job.py(2 hunks)backend/app/services/collections/create_collection.py(4 hunks)backend/app/services/collections/delete_collection.py(2 hunks)backend/app/services/collections/helpers.py(3 hunks)backend/app/tests/api/routes/collections/test_collection_info.py(3 hunks)backend/app/tests/api/routes/collections/test_create_collections.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/crud/collection/collection.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/models/collection_job.pybackend/app/api/routes/collections.pybackend/app/api/routes/documents.pybackend/app/tests/api/routes/collections/test_create_collections.pybackend/app/services/collections/delete_collection.pybackend/app/services/collections/create_collection.pybackend/app/services/collections/helpers.pybackend/app/tests/api/routes/collections/test_collection_info.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/collection_job.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collections.pybackend/app/api/routes/documents.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.pybackend/app/services/collections/create_collection.pybackend/app/services/collections/helpers.py
🧠 Learnings (1)
📚 Learning: 2025-10-08T12:05:01.317Z
Learnt from: CR
PR: ProjectTech4DevAI/ai-platform#0
File: CLAUDE.md:0-0
Timestamp: 2025-10-08T12:05:01.317Z
Learning: Applies to backend/app/services/response/**/*.py : Put OpenAI Responses API integration, conversation management, and job execution in backend/app/services/response/
Applied to files:
backend/app/services/collections/create_collection.py
🧬 Code graph analysis (7)
backend/app/api/routes/collections.py (6)
backend/app/models/collection_job.py (4)
CollectionJobPublic(86-95)CollectionJobCreate(71-75)CollectionActionType(18-20)CollectionJobStatus(11-15)backend/app/utils.py (2)
APIResponse(29-53)success_response(36-39)backend/app/services/collections/create_collection.py (1)
start_job(40-70)backend/app/services/collections/delete_collection.py (1)
start_job(24-52)backend/app/models/collection.py (1)
DeletionRequest(131-132)backend/app/crud/collection/collection_job.py (2)
CollectionJobCrud(19-96)create(78-96)
backend/app/api/routes/documents.py (3)
backend/app/crud/rag/open_ai.py (3)
OpenAIVectorStoreCrud(101-189)delete(174-189)delete(217-246)backend/app/services/collections/helpers.py (1)
pick_service_for_documennt(141-160)backend/app/crud/collection/collection.py (3)
CollectionCrud(18-145)delete(103-111)read_one(59-83)
backend/app/tests/api/routes/collections/test_create_collections.py (3)
backend/app/models/collection_job.py (2)
CollectionJobStatus(11-15)CollectionActionType(18-20)backend/app/models/collection.py (1)
CreationRequest(120-128)backend/app/tests/conftest.py (3)
client(65-68)user_api_key_header(90-92)user_api_key(102-104)
backend/app/services/collections/delete_collection.py (6)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (3)
CollectionJobStatus(11-15)CollectionJobUpdate(78-83)job_id(67-68)backend/app/models/collection.py (1)
DeletionRequest(131-132)backend/app/services/collections/helpers.py (8)
SilentCallback(86-93)WebHookCallback(96-123)fail(79-80)fail(87-89)fail(110-118)success(82-83)success(91-93)success(120-123)backend/app/utils.py (1)
get_openai_client(175-205)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
backend/app/services/collections/create_collection.py (10)
backend/app/models/collection.py (4)
CreationRequest(120-128)extract_super_type(125-128)AssistantOptions(57-110)Collection(13-36)backend/app/models/collection_job.py (3)
job_id(67-68)CollectionJobUpdate(78-83)CollectionJobStatus(11-15)backend/app/crud/collection/collection_job.py (4)
CollectionJobCrud(19-96)read_one(24-45)update(58-76)create(78-96)backend/app/crud/collection/collection.py (3)
read_one(59-83)create(43-57)CollectionCrud(18-145)backend/app/utils.py (1)
get_openai_client(175-205)backend/app/core/cloud/storage.py (1)
get_cloud_storage(262-279)backend/app/crud/document.py (1)
DocumentCrud(14-135)backend/app/services/collections/helpers.py (10)
batch_documents(53-72)SilentCallback(86-93)WebHookCallback(96-123)success(82-83)success(91-93)success(120-123)_backout(126-134)fail(79-80)fail(87-89)fail(110-118)backend/app/crud/rag/open_ai.py (2)
OpenAIVectorStoreCrud(101-189)OpenAIAssistantCrud(192-246)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)
backend/app/services/collections/helpers.py (4)
backend/app/core/util.py (1)
post_callback(29-39)backend/app/utils.py (3)
APIResponse(29-53)failure_response(42-53)success_response(36-39)backend/app/models/document_collection.py (1)
DocumentCollection(9-23)backend/app/models/collection.py (1)
Collection(13-36)
backend/app/tests/api/routes/collections/test_collection_info.py (4)
backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_collection_job(74-95)backend/app/models/collection_job.py (2)
CollectionActionType(18-20)CollectionJobStatus(11-15)backend/app/tests/conftest.py (3)
db(28-45)client(65-68)user_api_key_header(90-92)
🪛 Ruff (0.14.2)
backend/app/services/collections/delete_collection.py
62-62: Unused function argument: task_instance
(ARG001)
backend/app/services/collections/create_collection.py
80-80: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/models/collection_job.py (1)
66-88: job_id alias keeps the API contract consistent.Glad to see the model now exposes
job_idwithout leaking the internalid, which keeps client payloads stable while preserving the DB primary key.
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/tests/services/collections/test_create_collection.py (1)
345-402: Remove the duplicatetest_execute_job_success_flow_updates_job_and_creates_collection.This redefinition overwrites the earlier test, so those assertions never run and ruff raises F811. Please drop the duplicate block (or consolidate any differences) so we keep a single definition.
🧹 Nitpick comments (1)
backend/app/api/routes/collections.py (1)
94-97: Simplify flag determination.Since
requestis typed asCreationRequestwithmodelandinstructionsas direct attributes, you can simplify this withoutgetattr:- # True iff both model and instructions were provided in the request body - with_assistant = bool( - getattr(request, "model", None) and getattr(request, "instructions", None) - ) + # True iff both model and instructions were provided in the request body + with_assistant = bool(request.model and request.instructions)Note: The validation in
AssistantOptions._assistant_fields_all_or_noneensures that if one is provided, the other must be too (or the request fails with 422), so this logic is safe.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
backend/app/api/docs/collections/create.md(2 hunks)backend/app/api/docs/collections/delete.md(1 hunks)backend/app/api/docs/collections/docs.md(0 hunks)backend/app/api/docs/collections/info.md(1 hunks)backend/app/api/routes/collection_job.py(3 hunks)backend/app/api/routes/collections.py(2 hunks)backend/app/models/__init__.py(1 hunks)backend/app/models/collection.py(6 hunks)backend/app/models/collection_job.py(4 hunks)backend/app/services/collections/create_collection.py(3 hunks)backend/app/services/collections/delete_collection.py(2 hunks)backend/app/services/collections/helpers.py(2 hunks)backend/app/tests/api/routes/collections/test_collection_delete.py(1 hunks)backend/app/tests/api/routes/collections/test_collection_info.py(1 hunks)backend/app/tests/api/routes/collections/test_collection_job_info.py(1 hunks)backend/app/tests/api/routes/collections/test_collection_list.py(1 hunks)backend/app/tests/api/routes/collections/test_create_collections.py(1 hunks)backend/app/tests/services/collections/test_create_collection.py(8 hunks)backend/app/tests/services/collections/test_delete_collection.py(6 hunks)backend/app/tests/services/collections/test_helpers.py(1 hunks)
💤 Files with no reviewable changes (1)
- backend/app/api/docs/collections/docs.md
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/app/api/docs/collections/delete.md
- backend/app/tests/services/collections/test_helpers.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/delete_collection.pybackend/app/tests/services/collections/test_delete_collection.pybackend/app/services/collections/create_collection.pybackend/app/api/routes/collections.pybackend/app/tests/api/routes/collections/test_create_collections.pybackend/app/models/collection.pybackend/app/tests/api/routes/collections/test_collection_delete.pybackend/app/tests/api/routes/collections/test_collection_job_info.pybackend/app/tests/api/routes/collections/test_collection_info.pybackend/app/api/routes/collection_job.pybackend/app/models/__init__.pybackend/app/tests/api/routes/collections/test_collection_list.pybackend/app/tests/services/collections/test_create_collection.pybackend/app/models/collection_job.pybackend/app/services/collections/helpers.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.pybackend/app/services/collections/create_collection.pybackend/app/services/collections/helpers.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collections.pybackend/app/api/routes/collection_job.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/collection.pybackend/app/models/__init__.pybackend/app/models/collection_job.py
🧠 Learnings (1)
📚 Learning: 2025-08-06T05:29:45.557Z
Learnt from: avirajsingh7
Repo: ProjectTech4DevAI/ai-platform PR: 319
File: backend/app/crud/prompt.py:111-118
Timestamp: 2025-08-06T05:29:45.557Z
Learning: In FastAPI applications using Pydantic/SQLModel, when update model fields have min_length=1 validation, empty strings are rejected at the validation layer before reaching CRUD functions. This means checking `if field:` vs `if field is not None:` in CRUD operations should consider the model validation constraints.
Applied to files:
backend/app/models/collection.py
🧬 Code graph analysis (15)
backend/app/services/collections/delete_collection.py (7)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(108-113)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(175-205)APIResponse(29-53)success_response(36-39)failure_response(42-53)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
backend/app/tests/services/collections/test_delete_collection.py (5)
backend/app/models/collection_job.py (3)
CollectionJobStatus(12-16)CollectionActionType(19-21)job_id(63-64)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_collection_job(74-95)backend/app/services/collections/delete_collection.py (2)
start_job(26-54)execute_job(146-250)backend/app/tests/conftest.py (1)
db(28-45)backend/app/crud/collection/collection.py (5)
_(114-121)_(124-145)delete(103-111)delete_by_id(96-100)read_one(59-83)
backend/app/services/collections/create_collection.py (8)
backend/app/models/collection.py (5)
CollectionPublic(143-152)CreationRequest(121-129)AssistantOptions(58-111)Collection(14-37)extract_super_type(126-129)backend/app/models/collection_job.py (5)
CollectionJobPublic(108-113)CollectionJob(24-72)job_id(63-64)CollectionJobUpdate(83-88)CollectionJobStatus(12-16)backend/app/services/collections/helpers.py (3)
_backout(72-80)batch_documents(50-69)extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(175-205)APIResponse(29-53)success_response(36-39)failure_response(42-53)backend/app/core/cloud/storage.py (1)
get_cloud_storage(262-279)backend/app/crud/document.py (1)
DocumentCrud(14-135)backend/app/crud/rag/open_ai.py (2)
OpenAIVectorStoreCrud(101-189)OpenAIAssistantCrud(192-246)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)
backend/app/api/routes/collections.py (6)
backend/app/models/collection_job.py (5)
CollectionJobPublic(108-113)CollectionJobImmediatePublic(103-105)CollectionJobCreate(76-80)CollectionActionType(19-21)CollectionJobStatus(12-16)backend/app/models/collection.py (5)
CollectionWithDocsPublic(155-156)CreationRequest(121-129)CallbackRequest(114-118)DeletionRequest(132-133)CollectionPublic(143-152)backend/app/utils.py (4)
APIResponse(29-53)load_description(248-253)success_response(36-39)_(257-258)backend/app/crud/collection/collection.py (7)
CollectionCrud(18-145)read_all(85-94)delete(103-111)_(114-121)_(124-145)read_one(59-83)create(43-57)backend/app/crud/document_collection.py (3)
create(12-23)DocumentCollectionCrud(8-48)read(25-48)backend/app/models/document.py (1)
DocumentPublic(43-60)
backend/app/tests/api/routes/collections/test_create_collections.py (3)
backend/app/models/collection_job.py (1)
CollectionJobStatus(12-16)backend/app/models/collection.py (1)
CreationRequest(121-129)backend/app/tests/conftest.py (3)
client(65-68)user_api_key_header(90-92)user_api_key(102-104)
backend/app/models/collection.py (2)
backend/app/core/util.py (1)
now(15-16)backend/app/models/document.py (1)
DocumentPublic(43-60)
backend/app/tests/api/routes/collections/test_collection_delete.py (4)
backend/app/models/collection_job.py (1)
CollectionJobStatus(12-16)backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/collection.py (1)
get_collection(25-47)backend/app/tests/conftest.py (3)
db(28-45)user_api_key_header(90-92)user_api_key(102-104)
backend/app/tests/api/routes/collections/test_collection_job_info.py (4)
backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_collection_job(74-95)backend/app/models/collection_job.py (2)
CollectionActionType(19-21)CollectionJobStatus(12-16)backend/app/tests/conftest.py (2)
db(28-45)user_api_key_header(90-92)
backend/app/tests/api/routes/collections/test_collection_info.py (5)
backend/app/tests/utils/utils.py (2)
get_project(54-73)get_document(102-123)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_vector_store_collection(50-71)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)backend/app/models/collection.py (1)
Collection(14-37)backend/app/tests/conftest.py (3)
db(28-45)client(65-68)user_api_key_header(90-92)
backend/app/api/routes/collection_job.py (6)
backend/app/models/collection_job.py (2)
CollectionActionType(19-21)CollectionJobPublic(108-113)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)CollectionPublic(143-152)backend/app/utils.py (2)
APIResponse(29-53)failure_response(42-53)backend/app/crud/collection/collection.py (2)
CollectionCrud(18-145)read_one(59-83)backend/app/crud/collection/collection_job.py (1)
read_one(24-45)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)
backend/app/models/__init__.py (2)
backend/app/models/collection.py (4)
Collection(14-37)CollectionPublic(143-152)CollectionIDPublic(139-140)CollectionWithDocsPublic(155-156)backend/app/models/collection_job.py (2)
CollectionJob(24-72)CollectionJobImmediatePublic(103-105)
backend/app/tests/api/routes/collections/test_collection_list.py (3)
backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_vector_store_collection(50-71)backend/app/tests/conftest.py (2)
db(28-45)user_api_key_header(90-92)
backend/app/tests/services/collections/test_create_collection.py (7)
backend/app/services/collections/create_collection.py (2)
start_job(43-73)execute_job(165-320)backend/app/tests/utils/openai.py (1)
get_mock_openai_client_with_vector_store(90-117)backend/app/tests/utils/collection.py (2)
get_collection_job(74-95)get_collection(25-47)backend/app/crud/collection/collection.py (6)
_(114-121)_(124-145)read_one(59-83)CollectionCrud(18-145)create(43-57)delete(103-111)backend/app/tests/utils/document.py (2)
project(65-66)DocumentStore(53-80)backend/app/crud/collection/collection_job.py (4)
CollectionJobCrud(19-96)read_one(24-45)create(78-96)update(58-76)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)
backend/app/models/collection_job.py (1)
backend/app/models/collection.py (2)
CollectionPublic(143-152)CollectionIDPublic(139-140)
backend/app/services/collections/helpers.py (3)
backend/app/models/document_collection.py (1)
DocumentCollection(9-23)backend/app/models/collection.py (1)
Collection(14-37)backend/app/tests/services/collections/test_helpers.py (2)
delete(94-95)delete(107-108)
🪛 Ruff (0.14.3)
backend/app/services/collections/delete_collection.py
153-153: Unused function argument: task_instance
(ARG001)
backend/app/services/collections/create_collection.py
172-172: Unused function argument: task_instance
(ARG001)
187-187: Local variable callback is assigned to but never used
Remove assignment to unused variable callback
(F841)
backend/app/tests/api/routes/collections/test_collection_list.py
13-13: Unused function argument: db
(ARG001)
55-55: Local variable before_ids is assigned to but never used
Remove assignment to unused variable before_ids
(F841)
112-112: Unused function argument: db
(ARG001)
backend/app/tests/services/collections/test_create_collection.py
344-344: Redefinition of unused test_execute_job_success_flow_updates_job_and_creates_collection from line 118
(F811)
424-424: Redefinition of unused test_execute_job_assistant_create_failure_marks_failed_and_deletes_vector from line 198
(F811)
575-575: Unused function argument: MockAssistantCrud
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (24)
backend/app/tests/api/routes/collections/test_collection_job_info.py (1)
1-160: LGTM! Comprehensive test coverage for collection job info endpoint.The test suite thoroughly covers all job states (PENDING, SUCCESSFUL, FAILED) and action types (CREATE, DELETE) with appropriate assertions for job metadata, timestamps, collection presence, and error messages.
backend/app/models/collection.py (5)
6-6: LGTM!The
model_validatorimport is correctly used for the Pydantic v2 validator on line 90.
9-9: LGTM!The
DocumentPublicimport is correctly used inCollectionWithDocsPublicon line 156.
62-111: LGTM! Validator correctly implements both-or-neither constraint.The
@model_validator(mode="before")properly addresses past review feedback by:
- Using Pydantic v2
model_validatorinstead of deprecatedroot_validator- Including proper type hints (
values: dict[str, Any] -> dict[str, Any])- Only normalizing strings while preserving non-strings for Pydantic coercion
- Using
is Nonechecks instead ofbool()for presence detection- Correctly handling the XOR constraint for model and instructions fields
126-129: LGTM! Correct Pydantic v2 API usage.The change from
cls.__fields__tocls.model_fields.keys()properly migrates to Pydantic v2 API.
139-156: LGTM! Well-structured public response models.The new models follow best practices:
CollectionIDPublicprovides a minimal representation for DELETE responsesCollectionPublicexposes all necessary collection metadataCollectionWithDocsPubliccleanly extends with optional document list using proper type hintsbackend/app/tests/services/collections/test_delete_collection.py (1)
1-372: LGTM! Comprehensive delete flow test coverage.The test suite properly validates:
- Job creation and task scheduling with correct parameters
- Remote deletion via
OpenAIAssistantCrud.delete()- Local soft-delete via
CollectionCrud.delete_by_id()- Success and failure callback payload structure
- Error message propagation on failures
The tests correctly use the updated test utilities and payload serialization patterns introduced in this PR.
backend/app/api/docs/collections/info.md (1)
1-6: LGTM! Clear documentation of include_docs functionality.The documentation accurately describes the optional document inclusion feature.
backend/app/api/docs/collections/create.md (1)
1-29: LGTM! Documentation accurately reflects new creation flow.The updates correctly document:
- Assistant creation requires both model and instructions
- Vector-store-only creation when neither is provided
- Job-based polling using
job_idfrom the response- Expanded failure scenarios
backend/app/models/__init__.py (1)
9-23: LGTM! Public model exports correctly updated.The new exports properly expose:
CollectionIDPublicandCollectionWithDocsPublicfrom the collection moduleCollectionJobImmediatePublicfrom the collection_job moduleThese align with the new public API surface introduced in this PR.
backend/app/api/routes/collection_job.py (3)
28-33: LGTM! Cleaner endpoint path and appropriate field exclusion.The path simplification and
response_model_excludeforerror_messageensure errors are only exposed at the top level in failure responses, not nested in data.
44-54: LGTM! Correct collection hydration logic.The implementation properly:
- Fetches full collection details for successful CREATE operations
- Provides minimal ID-only response for DELETE operations
- Uses appropriate public models (
CollectionPublicvsCollectionIDPublic)
56-65: LGTM! Clean error handling with helpful context.The failure response correctly:
- Extracts user-friendly error messages via
extract_error_message- Includes job data for context
- Maintains consistent API response structure
backend/app/tests/api/routes/collections/test_create_collections.py (4)
1-12: LGTM! Clean imports and defensive metadata helper.The
_extract_metadatahelper safely handles API response variations for the metadata field.
15-60: LGTM! Comprehensive assistant creation test.The test properly validates:
- Response structure and status
- Job timestamps presence
with_assistant=Trueflag- Request payload serialization
- Metadata absence for assistant path
62-101: LGTM! Complete vector-only creation test.The test correctly validates:
- Response structure for vector-store-only path
- Presence of descriptive metadata note
with_assistant=Falseflag- Request payload handling
103-128: LGTM! Proper validation error test.The test correctly validates that providing only one of model/instructions triggers a 422 error with a clear message, confirming the Pydantic validator works as expected.
backend/app/services/collections/create_collection.py (4)
76-94: Well-structured success payload builder.The helper cleanly encapsulates the success response construction using the new public models. The use of
exclude_none=Trueappropriately omits unset optional fields from the JSON response.
120-138: Robust cleanup with guarded fallbacks.The best-effort cleanup pattern properly handles partially created resources and logs warnings on backout failures without propagating exceptions, which is appropriate for cleanup code.
140-162: Good session isolation for failure marking.Opening a fresh session to persist the FAILED status is the correct pattern and aligns with the session management approach used throughout this module.
224-231: Assistant options extraction is clean.Filtering out
Nonevalues before passing to the OpenAI SDK prevents sending unset optional parameters, which is good practice. The validation inAssistantOptions._assistant_fields_all_or_noneensures bothmodelandinstructionsare present whenwith_assistantis True.backend/app/api/routes/collections.py (3)
42-55: Excellent callback documentation for OpenAPI.The callback router specification properly documents the expected request/response contract for async callbacks, directly addressing the past review comment about documenting callbacks in Swagger. This follows FastAPI best practices.
Based on past review comments.
108-115: Clear metadata note for vector-store-only mode.The conditional metadata note effectively communicates to API consumers when assistant creation is skipped, improving the developer experience.
186-191: Good conditional document loading with pagination.The
include_docsflag with pagination parameters provides flexible API control. Conditionally loading documents only when requested avoids unnecessary joins and data transfer.
backend/app/tests/api/routes/collections/test_collection_list.py
Outdated
Show resolved
Hide resolved
backend/app/tests/services/collections/test_create_collection.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/app/services/collections/create_collection.py (1)
317-319: Potential NameError in exception handler.If an exception occurs before line 189 (where
creation_requestis assigned), the reference at line 317 will raise aNameError. Initializecreation_request = Nonewith the other resource variables at the top of the function (around line 186):# Keep references for potential backout/cleanup on failure assistant = None assistant_crud = None vector_store = None vector_store_crud = None collection_job = None + creation_request = None try: creation_request = CreationRequest(**request)
🧹 Nitpick comments (3)
backend/app/services/collections/delete_collection.py (1)
57-96: Consider consolidating payload builders into shared helpers.The
build_success_payloadandbuild_failure_payloadfunctions are duplicated betweendelete_collection.pyandcreate_collection.py(lines 76-117) with only minor differences. Consider extracting these toapp/services/collections/helpers.pyas generic functions that accept the appropriate parameters.backend/app/tests/api/routes/collections/test_collection_list.py (1)
108-127: Unused fixture parameter.The
dbparameter is declared but never used in this test. Since the test assumes a potentially empty database, the fixture is unnecessary. Remove it:def test_list_collections_does_not_error_with_no_collections( - db: Session, client: TestClient, user_api_key_header, ):backend/app/services/collections/create_collection.py (1)
76-117: Payload builders duplicated across services.The
build_success_payloadandbuild_failure_payloadfunctions are duplicated in bothcreate_collection.pyanddelete_collection.pywith only minor signature differences. Consider consolidating these intoapp/services/collections/helpers.pyas generic utilities.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
backend/app/api/routes/collections.py(3 hunks)backend/app/services/collections/create_collection.py(3 hunks)backend/app/services/collections/delete_collection.py(2 hunks)backend/app/tests/api/routes/collections/test_collection_list.py(1 hunks)backend/app/tests/services/collections/test_create_collection.py(8 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/tests/services/collections/test_create_collection.pybackend/app/services/collections/create_collection.pybackend/app/api/routes/collections.pybackend/app/services/collections/delete_collection.pybackend/app/tests/api/routes/collections/test_collection_list.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/create_collection.pybackend/app/services/collections/delete_collection.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collections.py
🧬 Code graph analysis (5)
backend/app/tests/services/collections/test_create_collection.py (6)
backend/app/models/collection.py (1)
CreationRequest(121-129)backend/app/services/collections/create_collection.py (2)
start_job(43-73)execute_job(165-319)backend/app/tests/utils/openai.py (1)
get_mock_openai_client_with_vector_store(114-141)backend/app/tests/utils/collection.py (2)
get_collection_job(74-95)get_collection(25-47)backend/app/tests/utils/document.py (2)
project(65-66)DocumentStore(53-80)backend/app/core/cloud/storage.py (1)
AmazonCloudStorageClient(28-84)
backend/app/services/collections/create_collection.py (12)
backend/app/models/collection.py (5)
CollectionPublic(143-152)CreationRequest(121-129)AssistantOptions(58-111)Collection(14-37)extract_super_type(126-129)backend/app/models/collection_job.py (5)
CollectionJobPublic(108-113)CollectionJob(24-72)job_id(63-64)CollectionJobUpdate(83-88)CollectionJobStatus(12-16)backend/app/services/collections/helpers.py (3)
_backout(72-80)batch_documents(50-69)extract_error_message(21-47)backend/app/celery/utils.py (1)
start_low_priority_job(46-71)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/delete_collection.py (3)
build_success_payload(57-73)build_failure_payload(76-96)execute_job(146-238)backend/app/crud/collection/collection_job.py (4)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)create(78-96)backend/app/crud/collection/collection.py (3)
read_one(59-83)create(43-57)CollectionCrud(18-145)backend/app/core/cloud/storage.py (1)
get_cloud_storage(262-279)backend/app/crud/document.py (1)
DocumentCrud(14-135)backend/app/crud/rag/open_ai.py (2)
OpenAIVectorStoreCrud(101-189)OpenAIAssistantCrud(192-246)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)
backend/app/api/routes/collections.py (7)
backend/app/models/collection_job.py (5)
CollectionJobPublic(108-113)CollectionJobImmediatePublic(103-105)CollectionJobCreate(76-80)CollectionActionType(19-21)CollectionJobStatus(12-16)backend/app/models/collection.py (5)
CollectionWithDocsPublic(155-156)CreationRequest(121-129)CallbackRequest(114-118)DeletionRequest(132-133)CollectionPublic(143-152)backend/app/utils.py (4)
APIResponse(30-54)load_description(279-284)success_response(37-40)_(288-289)backend/app/crud/collection/collection.py (7)
CollectionCrud(18-145)read_all(85-94)delete(103-111)_(114-121)_(124-145)read_one(59-83)create(43-57)backend/app/services/collections/create_collection.py (1)
start_job(43-73)backend/app/services/collections/delete_collection.py (1)
start_job(26-54)backend/app/crud/document_collection.py (3)
create(12-23)DocumentCollectionCrud(8-48)read(25-48)
backend/app/services/collections/delete_collection.py (7)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(108-113)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/crud/collection/collection_job.py (2)
update(58-76)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
backend/app/tests/api/routes/collections/test_collection_list.py (3)
backend/app/tests/utils/utils.py (1)
get_project(54-73)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_vector_store_collection(50-71)backend/app/tests/conftest.py (2)
user_api_key_header(90-92)db(28-45)
🪛 Ruff (0.14.3)
backend/app/services/collections/create_collection.py
172-172: Unused function argument: task_instance
(ARG001)
backend/app/services/collections/delete_collection.py
153-153: Unused function argument: task_instance
(ARG001)
backend/app/tests/api/routes/collections/test_collection_list.py
109-109: Unused function argument: db
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (10)
backend/app/services/collections/delete_collection.py (2)
165-238: Bootstrap error handling is adequate.The outer try/except block (lines 165-238) properly catches bootstrap failures and delegates to
_mark_job_failed_and_callback, which handles job state updates and callback notifications. This addresses the previous concern about jobs getting stuck in PROCESSING state.
18-18: Magic string concern addressed.The
OPENAI_VECTOR_STOREconstant is properly imported from the helpers module, addressing the previous concern about magic strings. This ensures consistency across create and delete operations.Based on learnings
backend/app/tests/services/collections/test_create_collection.py (2)
50-113: Test structure properly updated.The test correctly uses the new
get_collection_jobhelper and validates the updatedstart_jobsignature withwith_assistantparameter and JSON serialization viarequest.model_dump(mode="json").
195-257: Good coverage of assistant creation failure scenario.This test properly validates that when assistant creation fails, the job is marked FAILED with the error message, and the vector store is cleaned up. The mock assertions confirm the cleanup path is invoked.
backend/app/services/collections/create_collection.py (1)
219-246: Clear conditional logic for assistant creation.The
with_assistantflag properly controls whether to create an assistant-backed collection or a standalone vector store. The logging messages clearly document each path.backend/app/api/routes/collections.py (5)
42-55: Callback endpoint properly documented.The callback router and notification endpoint provide clear OpenAPI documentation for consumers, showing the expected callback payload structure with
APIResponse[CollectionJobPublic].
94-97: Clean derivation of with_assistant flag.The logic correctly determines
with_assistantby checking bothmodelandinstructionsare present, aligning with the validation inAssistantOptions.
108-115: Helpful metadata note for vector-store-only creation.The metadata note clearly communicates to users when an assistant is not being created, which improves the API developer experience.
132-140: DELETE endpoint design follows REST conventions.The endpoint properly uses DELETE method with the collection ID in the path and an optional callback configuration in the body. The upfront validation via
read_oneensures the collection exists before queuing the deletion job.
174-179: Query parameters provide flexible document retrieval.The
include_docs,skip, andlimitparameters allow callers to control pagination and whether to fetch associated documents, which is good API design for performance.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
backend/app/api/docs/collections/create.md(2 hunks)backend/app/api/docs/collections/delete.md(1 hunks)backend/app/api/docs/collections/info.md(1 hunks)backend/app/api/docs/collections/job_info.md(1 hunks)backend/app/api/routes/collections.py(3 hunks)backend/app/services/collections/delete_collection.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- backend/app/api/docs/collections/delete.md
- backend/app/api/docs/collections/info.md
- backend/app/api/docs/collections/create.md
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/delete_collection.pybackend/app/api/routes/collections.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collections.py
🧬 Code graph analysis (2)
backend/app/services/collections/delete_collection.py (7)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(108-113)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
backend/app/api/routes/collections.py (6)
backend/app/models/collection_job.py (5)
CollectionJobPublic(108-113)CollectionJobImmediatePublic(103-105)CollectionJobCreate(76-80)CollectionActionType(19-21)CollectionJobStatus(12-16)backend/app/models/collection.py (5)
CollectionWithDocsPublic(155-156)CreationRequest(121-129)CallbackRequest(114-118)DeletionRequest(132-133)CollectionPublic(143-152)backend/app/utils.py (4)
APIResponse(30-54)load_description(279-284)success_response(37-40)_(288-289)backend/app/crud/collection/collection.py (7)
CollectionCrud(18-145)read_all(85-94)delete(103-111)_(114-121)_(124-145)read_one(59-83)create(43-57)backend/app/crud/document_collection.py (3)
create(12-23)DocumentCollectionCrud(8-48)read(25-48)backend/app/models/document.py (1)
DocumentPublic(43-60)
🪛 markdownlint-cli2 (0.18.1)
backend/app/api/docs/collections/job_info.md
6-6: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
7-7: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
🪛 Ruff (0.14.3)
backend/app/services/collections/delete_collection.py
153-153: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/app/tests/api/routes/collections/test_collection_info.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/tests/api/routes/collections/test_collection_info.py
🧬 Code graph analysis (1)
backend/app/tests/api/routes/collections/test_collection_info.py (5)
backend/app/tests/utils/utils.py (2)
get_project(54-73)get_document(102-123)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_vector_store_collection(50-71)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)backend/app/tests/conftest.py (3)
db(28-45)client(65-68)user_api_key_header(90-92)backend/app/crud/collection/collection.py (3)
create(43-57)_(114-121)_(124-145)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
backend/app/tests/api/routes/collections/test_collection_info.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/app/services/collections/delete_collection.py (1)
99-144: Critical: Handle bootstrap failures whencollection_jobisNone.If
get_openai_clientor any bootstrap step before line 170 throws,execute_jobcalls this function withcollection_job=None. Line 118 will raiseAttributeErroroncollection_job.id, the exception handler at line 125 swallows it, and the job stays stuck inPENDING/PROCESSINGwith no callback. This is the exact issue flagged in past reviews.Apply this diff to use the
job_idparameter as fallback:def _mark_job_failed_and_callback( *, project_id: int, - collection_job: CollectionJob, + collection_job: CollectionJob | None, collection_id: UUID, job_id: UUID, err: Exception, callback_url: str | None, ) -> None: """ Common failure handler: - mark job as FAILED with error_message - log error - send failure callback (if configured) """ try: with Session(engine) as session: collection_job_crud = CollectionJobCrud(session, project_id) + # Use job_id when collection_job is None (bootstrap failures) + target_id = collection_job.id if collection_job else job_id collection_job_crud.update( - collection_job.id, + target_id, CollectionJobUpdate( status=CollectionJobStatus.FAILED, error_message=str(err), ), ) - collection_job = collection_job_crud.read_one(collection_job.id) + collection_job = collection_job_crud.read_one(target_id) except Exception: logger.warning("[delete_collection.execute_job] Failed to mark job as FAILED")This ensures the job is marked
FAILEDand the callback fires even when bootstrap steps fail beforecollection_jobis loaded.
🧹 Nitpick comments (4)
backend/app/api/docs/collections/job_info.md (1)
5-8: Simplify the conditional phrasing on line 5 for clarity.The phrasing "If the job has finished, has been successful and it was a job of creation of collection" is verbose and difficult to parse. Consider rephrasing to be more concise and direct, similar to the pattern used in line 9.
Apply this diff:
-* If the job has finished, has been successful and it was a job of creation of collection then this endpoint will fetch the associated collection details from the collection table, including: +* If the collection-creation job succeeds, the response includes the associated collection details from the collection table:backend/app/services/collections/delete_collection.py (3)
146-154: Add return type annotation.The function is missing a return type annotation. As per coding guidelines, use type hints in Python code.
Apply this diff:
def execute_job( request: dict, project_id: int, organization_id: int, task_id: str, job_id: str, collection_id: str, task_instance, -) -> None: +) -> None:Wait, the return type is already
-> Noneon line 154 in the provided code. Let me re-check... Actually, looking at the annotated code, line 154 shows:154 ) -> None:But line 154 doesn't have a
~marker, so this was already present in the old code. Let me look again at the signature... Lines 146-154 show the full signature with) -> None:at line 154. So the return type IS present. I was mistaken.Actually, I see the issue - let me re-read the annotated code format. The code shows line 154 ending the function signature, but I don't see
) -> None:explicitly. Let me check the actual code block again...Looking at lines 146-154:
146 def execute_job( 147 request: dict, 148 project_id: int, 149 organization_id: int, 150 task_id: str, 151 job_id: str, 152 collection_id: str, 153 task_instance, 154 ) -> None:Oh I see, line 154 DOES show
) -> None:. So the return type annotation is present. I was confused. Let me revise my analysis.So the return type is fine. The only issue is the unused task_instance parameter, which is very minor.
[scratchpad_end] -->
153-153: Optional: Document or prefix unusedtask_instanceparameter.Static analysis flags
task_instanceas unused. If it's required by Celery's signature, consider prefixing with underscore (_task_instance) to signal intentional non-use.Based on coding guidelines.
187-191: Consider adding a comment explaining the intentional session closure.Lines 187-190 are intentionally outside the first
with Sessionblock to avoid holding database connections during external API calls (as noted in PR objectives). While this is correct, the indentation might confuse readers unfamiliar with the pattern.Consider adding a brief comment:
+ # Close DB session before external API calls to avoid holding connection locks if is_vector: OpenAIVectorStoreCrud(client).delete(llm_service_id) else: OpenAIAssistantCrud(client).delete(llm_service_id)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/app/api/docs/collections/job_info.md(1 hunks)backend/app/services/collections/delete_collection.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/delete_collection.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.py
🧬 Code graph analysis (1)
backend/app/services/collections/delete_collection.py (8)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(108-113)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/create_collection.py (3)
build_success_payload(76-94)build_failure_payload(97-117)execute_job(165-319)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
🪛 markdownlint-cli2 (0.18.1)
backend/app/api/docs/collections/job_info.md
6-6: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
7-7: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
🪛 Ruff (0.14.3)
backend/app/services/collections/delete_collection.py
153-153: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (5)
backend/app/services/collections/delete_collection.py (5)
1-21: LGTM: Imports align with vector store vs assistant deletion paths.The imports properly support the new dual-path deletion logic and use
OPENAI_VECTOR_STOREconstant from helpers to avoid magic strings.
26-54: LGTM: Request-driven job scheduling.The refactor to pass
request.model_dump(mode="json")and derive collection context fromrequest.collection_idaligns with the PR objectives and maintains proper separation of concerns.
57-96: LGTM: Payload builders are clean and consistent.The success and failure payload functions properly construct standardized APIResponse objects with CollectionJobPublic. Using
extract_error_messagefor error normalization is a good practice.
182-184: LGTM: Clean service type detection.The refactored logic directly uses
collection.llm_service_idwithout redundant attribute fallbacks, addressing past review feedback.
192-227: Success and callback paths are correct.The local deletion, job status update, and conditional callback sending are properly implemented. The exception handler correctly delegates to the centralized failure handler (though that handler has the critical issue flagged separately).
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/app/services/collections/delete_collection.py (1)
99-144: Critical: _mark_job_failed_and_callback still fails when collection_job is None.The past review comment remains unaddressed. When bootstrap fails (e.g.,
get_openai_clientthrows on line 177 orCollectionJobCrud.read_onefails on line 168),collection_jobisNonewhen this handler is called from line 220. Line 118'scollection_job.idraisesAttributeError, the job staysPROCESSING, and clients poll forever.Apply this diff to handle None collection_job:
def _mark_job_failed_and_callback( *, project_id: int, - collection_job: CollectionJob, + collection_job: CollectionJob | None, collection_id: UUID, job_id: UUID, err: Exception, callback_url: str | None, ) -> None: """ Common failure handler: - mark job as FAILED with error_message - log error - send failure callback (if configured) """ try: with Session(engine) as session: collection_job_crud = CollectionJobCrud(session, project_id) + job_pk = collection_job.id if collection_job else job_id collection_job_crud.update( - collection_job.id, + job_pk, CollectionJobUpdate( status=CollectionJobStatus.FAILED, error_message=str(err), ), ) - collection_job = collection_job_crud.read_one(collection_job.id) + collection_job = collection_job_crud.read_one(job_pk) except Exception: logger.warning("[delete_collection.execute_job] Failed to mark job as FAILED")
🧹 Nitpick comments (1)
backend/app/services/collections/delete_collection.py (1)
153-153: Prefix unused parameter with underscore.The
task_instanceparameter is never used. If it's required by the Celery signature, prefix it with an underscore to signal intentional non-use.- task_instance, + _task_instance, ) -> None:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/app/services/collections/delete_collection.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/delete_collection.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.py
🧬 Code graph analysis (1)
backend/app/services/collections/delete_collection.py (8)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(108-113)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/create_collection.py (2)
build_success_payload(76-94)build_failure_payload(97-117)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
🪛 Ruff (0.14.3)
backend/app/services/collections/delete_collection.py
153-153: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
backend/app/services/collections/delete_collection.py (1)
99-143: Keep the failure handler working when the job never loaded.If
CollectionJobCrud.read_one(job_uuid)raises (e.g., the job row was deleted before the worker ran), we enter theexceptblock withcollection_jobstillNone._mark_job_failed_and_callbackthen dereferencescollection_job.id, raisingAttributeError, so the job never transitions toFAILEDand no callback fires—the worker logs the warning and exits, leaving clients polling forever. Please fall back tojob_idwhenever the in-memory job object is missing so the failure path can still mark the job and dispatch the callback.-def _mark_job_failed_and_callback( - *, - project_id: int, - collection_job: CollectionJob, +def _mark_job_failed_and_callback( + *, + project_id: int, + collection_job: CollectionJob | None, collection_id: UUID, job_id: UUID, err: Exception, callback_url: str | None, ) -> None: @@ - with Session(engine) as session: - collection_job_crud = CollectionJobCrud(session, project_id) - collection_job_crud.update( - collection_job.id, - CollectionJobUpdate( - status=CollectionJobStatus.FAILED, - error_message=str(err), - ), - ) - collection_job = collection_job_crud.read_one(collection_job.id) + with Session(engine) as session: + collection_job_crud = CollectionJobCrud(session, project_id) + job_pk = collection_job.id if collection_job else job_id + collection_job = collection_job_crud.update( + job_pk, + CollectionJobUpdate( + status=CollectionJobStatus.FAILED, + error_message=str(err), + ), + )
🧹 Nitpick comments (1)
backend/app/services/collections/delete_collection.py (1)
146-218: Silence the unused Celery task argument.Ruff (ARG001) flags
task_instanceas unused. Drop it after the docstring so we keep the signature Celery expects while satisfying lint.def execute_job( request: dict, project_id: int, organization_id: int, task_id: str, job_id: str, collection_id: str, task_instance, ) -> None: """Celery worker entrypoint for deleting a collection (both remote and local).""" + del task_instance + deletion_request = DeletionRequest(**request)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
backend/app/api/docs/collections/job_info.md(1 hunks)backend/app/services/collections/delete_collection.py(2 hunks)backend/app/tests/api/routes/collections/test_collection_info.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/api/docs/collections/job_info.md
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/delete_collection.pybackend/app/tests/api/routes/collections/test_collection_info.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.py
🧬 Code graph analysis (2)
backend/app/services/collections/delete_collection.py (8)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(108-113)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/create_collection.py (3)
build_success_payload(76-94)build_failure_payload(97-117)execute_job(165-319)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
backend/app/tests/api/routes/collections/test_collection_info.py (6)
backend/app/tests/utils/utils.py (2)
get_project(54-73)get_document(102-123)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_vector_store_collection(50-71)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)backend/app/models/collection.py (1)
Collection(14-37)backend/app/models/document.py (1)
Document(20-40)backend/app/tests/conftest.py (2)
db(28-45)user_api_key_header(90-92)
🪛 Ruff (0.14.3)
backend/app/services/collections/delete_collection.py
153-153: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/app/services/collections/delete_collection.py (1)
152-152: Consider prefixing unused parameter with underscore.The
task_instanceparameter is flagged as unused by Ruff. If it's required by the Celery framework signature but not used in the function body, consider renaming it to_task_instanceto signal intentional non-use.def execute_job( request: dict, project_id: int, organization_id: int, task_id: str, job_id: str, collection_id: str, - task_instance, + _task_instance, ) -> None:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/app/services/collections/delete_collection.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/delete_collection.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.py
🧬 Code graph analysis (1)
backend/app/services/collections/delete_collection.py (8)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(108-113)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/create_collection.py (3)
build_success_payload(76-94)build_failure_payload(97-117)execute_job(165-319)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (3)
read_one(59-83)delete(103-111)delete_by_id(96-100)
🪛 Ruff (0.14.3)
backend/app/services/collections/delete_collection.py
152-152: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (4)
backend/app/services/collections/delete_collection.py (4)
26-54: LGTM!The refactor to use
request.collection_idand serialize the request withmode="json"aligns well with the updated job-driven architecture.
57-96: LGTM!The payload builders follow the same pattern as
create_collection.pyand appropriately return only the collection ID for deletion operations. Error extraction viaextract_error_messageensures consistent error formatting.
164-224: Excellent fix of the critical indentation bug!The local deletion block (lines 191-202) is now correctly dedented and executes for both vector stores and assistants. Session management is also well-designed: the first session closes before external API calls, and a fresh session opens for local persistence.
181-183: LGTM on service type detection!Using the
OPENAI_VECTOR_STOREconstant for service type comparison addresses the past review concern about magic strings and improves maintainability.
| "/jobs/{job_id}", | ||
| description=load_description("collections/job_info.md"), | ||
| response_model=APIResponse[CollectionJobPublic], | ||
| response_model_exclude={"data": {"error_message"}}, |
There was a problem hiding this comment.
Why response_model_exclude here?
| raw_error = getattr(collection_job, "error_message", None) | ||
| error_message = extract_error_message(raw_error) | ||
|
|
||
| return APIResponse.failure_response( |
There was a problem hiding this comment.
I believe there is some confusion here,
This endpoint gives info of job status,
If job is failure it should not return the failure_response,
Actually your request to this particular endpoint is succesful,
data should contain whatever a job status is (refer doc_transformation endpoints)
There was a problem hiding this comment.
This was my point when all three of us had a discussion about it, that we don't need to return back a failure response here for a failed job, but @kartpop and you were in the support of of returning "false" for success for a failed job, with the error message being returned to the error key of failure response body, or was that just for callback, that when a job has failed then we send success as false to the callback
There was a problem hiding this comment.
The failed job we were referring to earlier was specifically about the callback response sent when a failure occurs. That is a completely different endpoint
There was a problem hiding this comment.
we had a similar discussion for this as well, where i had a similar point that job execution in its whole has not failed, just the job failed, the reason why i updated all the response bodies here , nonethless i still agree with this point that we do not need to show failure response in this endpoint for a failed job
There was a problem hiding this comment.
Updated this endpoint to return a success response even for failed jobs, because a failure response should only be used when the execution of the endpoint itself fails (e.g., an exception or server error).
If the endpoint successfully retrieves the job for the given job ID, it has fulfilled its responsibility — that is not considered an endpoint failure even if the job failed.
Similarly, the top-level "error" field in the response is reserved for issues that occur during the execution of the endpoint itself. If the job itself failed, that "error_message" should be represented within the job object rather than in the top-level error field.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
backend/app/services/collections/delete_collection.py (1)
192-203: Usejob_uuidinstead ofcollection_job.idfor consistency.Lines 197 and 203 reference
collection_job.id, which is an attribute from an object loaded in the first session (now closed). While accessing the ID attribute is safe, usingjob_uuiddirectly is cleaner and avoids any potential confusion about session boundaries.Apply this diff:
with Session(engine) as session: CollectionCrud(session, project_id).delete_by_id(collection_id) collection_job_crud = CollectionJobCrud(session, project_id) collection_job_crud.update( - collection_job.id, + job_uuid, CollectionJobUpdate( status=CollectionJobStatus.SUCCESSFUL, error_message=None, ), ) - collection_job = collection_job_crud.read_one(collection_job.id) + collection_job = collection_job_crud.read_one(job_uuid)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
backend/app/api/routes/collection_job.py(3 hunks)backend/app/services/collections/delete_collection.py(2 hunks)backend/app/tests/api/routes/collections/test_collection_job_info.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/tests/api/routes/collections/test_collection_job_info.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/delete_collection.pybackend/app/api/routes/collection_job.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/delete_collection.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collection_job.py
🧬 Code graph analysis (2)
backend/app/services/collections/delete_collection.py (6)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/create_collection.py (2)
build_success_payload(76-94)build_failure_payload(97-117)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
backend/app/api/routes/collection_job.py (5)
backend/app/models/collection_job.py (3)
CollectionJobStatus(12-16)CollectionActionType(19-21)CollectionJobPublic(108-113)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)CollectionPublic(143-152)backend/app/crud/collection/collection.py (2)
CollectionCrud(18-145)read_one(59-83)backend/app/crud/collection/collection_job.py (1)
read_one(24-45)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)
🪛 Ruff (0.14.3)
backend/app/services/collections/delete_collection.py
153-153: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (6)
backend/app/services/collections/delete_collection.py (6)
26-54: LGTM: Clean job scheduling with proper type hints.The function correctly schedules the deletion job using the collection ID from the request and passes the serialized request to the worker. Type hints are properly used throughout, adhering to the coding guidelines.
57-96: LGTM: Consistent payload builders for success and failure cases.The payload builders follow the same pattern as
create_collection.pyand appropriately use only the collection ID (rather than the full object) since the collection may be deleted. Error message extraction is properly delegated to the helper utility.
99-143: LGTM: Robust centralized failure handler.The failure handler correctly:
- Initializes
collection_job = Noneto prevent NameError if DB operations fail (addresses past review concern)- Opens a fresh session to mark the job as FAILED
- Gracefully degrades if the DB update fails (logs but continues)
- Only sends callback when both the URL and the loaded job are available
146-164: Note: Unused parameter is likely required by Celery signature.The
task_instanceparameter (line 153) is flagged as unused by static analysis. This is typically required by Celery's worker signature even when not used in the function body. You can silence the warning by prefixing with an underscore (_task_instance) if desired.
165-190: LGTM: Proper session management before external API calls.The code correctly closes the DB session (line 166-186) before making external deletion calls (lines 187-190). This avoids holding database connections and transaction locks during potentially long-running external API operations, which is a best practice for async job processing.
165-225: LGTM: Comprehensive error handling for the deletion flow.The outer try-except (lines 165-225) properly catches failures from:
- Bootstrap phase (getting client, reading job/collection)
- External deletion (OpenAI API calls)
- Local deletion (database operations)
All failures are routed to the centralized
_mark_job_failed_and_callbackhandler, which ensures the job status is updated and callbacks are triggered. This addresses the bootstrap failure concerns from previous reviews.
kartpop
left a comment
There was a problem hiding this comment.
approved with comments - please resolve
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/api/docs/collections/create.md (1)
14-30: Tighten wording and punctuation in async response note.Please add the comma after
"instruction"(“..., otherwise only a vector store …”), insert a space in “status. Once …”, and say “the creation of the collection” at the end. These small edits keep the guidance easy to read.
♻️ Duplicate comments (1)
backend/app/services/collections/create_collection.py (1)
187-318: Initializecreation_requestbefore entering the try block.If
CreationRequest(**request)raises (e.g., payload drift, schema change), theexceptblock hitsif creation_request …and raisesUnboundLocalError, bypassing failure handling and leaving the job stuck inPENDING/PROCESSING. Setcreation_request: CreationRequest | None = Nonebefore thetry, then assign inside it. Example:- # Keep references for potential backout/cleanup on failure + # Keep references for potential backout/cleanup on failure + creation_request: CreationRequest | None = None assistant = None ... try: creation_request = CreationRequest(**request)This guarantees the cleanup path always runs.
🧹 Nitpick comments (2)
backend/app/api/routes/collections.py (2)
78-78: Consider removingresponse_model_exclude_nonefor API consistency.The
create_collectionanddelete_collectionendpoints useresponse_model_exclude_none=True, whilecollection_infodoes not. This creates inconsistent response shapes across the API:
- With
exclude_none=True: Fields likeerrorandmetadatawill be absent whenNone- Without it: All fields are always present
A past review comment noted: "In a REST API, response structures should remain consistent...if we omit them conditionally, the shape of the response changes. That leads to validation issues on the client side."
Consider removing
response_model_exclude_none=Truefrom lines 78 and 127 to maintain consistent response shapes, or apply it to all endpoints includingcollection_info.Also applies to: 127-127, 168-168
135-135: Consider adding a clarifying comment for the validation check.The underscore assignment pattern is correct for validating that the collection exists (will raise 404 if not found), but a brief comment would improve readability for future maintainers.
+ # Validate collection exists and user has access (raises 404 if not) _ = CollectionCrud(session, current_user.project_id).read_one(collection_id)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
backend/app/api/docs/collections/create.md(2 hunks)backend/app/api/docs/collections/list.md(1 hunks)backend/app/api/routes/collections.py(3 hunks)backend/app/models/collection_job.py(4 hunks)backend/app/services/collections/create_collection.py(3 hunks)backend/app/services/collections/delete_collection.py(2 hunks)backend/app/tests/api/routes/collections/test_collection_info.py(1 hunks)backend/app/tests/api/routes/collections/test_collection_job_info.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/tests/api/routes/collections/test_collection_job_info.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/services/collections/create_collection.pybackend/app/api/routes/collections.pybackend/app/tests/api/routes/collections/test_collection_info.pybackend/app/models/collection_job.pybackend/app/services/collections/delete_collection.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/collections/create_collection.pybackend/app/services/collections/delete_collection.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collections.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/collection_job.py
🧠 Learnings (1)
📚 Learning: 2025-10-08T12:05:01.317Z
Learnt from: CR
Repo: ProjectTech4DevAI/ai-platform PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-10-08T12:05:01.317Z
Learning: Applies to backend/app/services/response/**/*.py : Put OpenAI Responses API integration, conversation management, and job execution in backend/app/services/response/
Applied to files:
backend/app/models/collection_job.py
🧬 Code graph analysis (5)
backend/app/services/collections/create_collection.py (10)
backend/app/models/collection.py (5)
CollectionPublic(143-152)CreationRequest(121-129)AssistantOptions(58-111)Collection(14-37)extract_super_type(126-129)backend/app/models/collection_job.py (5)
CollectionJobPublic(102-105)CollectionJob(24-72)job_id(63-64)CollectionJobUpdate(83-88)CollectionJobStatus(12-16)backend/app/services/collections/helpers.py (3)
_backout(72-80)batch_documents(50-69)extract_error_message(21-47)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/delete_collection.py (3)
start_job(26-54)build_success_payload(57-72)build_failure_payload(75-91)backend/app/crud/collection/collection_job.py (4)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)create(78-96)backend/app/core/cloud/storage.py (1)
get_cloud_storage(262-279)backend/app/crud/document.py (1)
DocumentCrud(14-135)backend/app/crud/rag/open_ai.py (2)
OpenAIVectorStoreCrud(101-189)OpenAIAssistantCrud(192-246)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)
backend/app/api/routes/collections.py (9)
backend/app/models/collection_job.py (5)
CollectionJobPublic(102-105)CollectionJobImmediatePublic(97-99)CollectionJobCreate(76-80)CollectionActionType(19-21)CollectionJobStatus(12-16)backend/app/models/collection.py (5)
CollectionWithDocsPublic(155-156)CreationRequest(121-129)CallbackRequest(114-118)DeletionRequest(132-133)CollectionPublic(143-152)backend/app/utils.py (4)
APIResponse(30-54)load_description(288-293)success_response(37-40)_(297-298)backend/app/crud/collection/collection.py (7)
CollectionCrud(18-145)read_all(85-94)delete(103-111)_(114-121)_(124-145)read_one(59-83)create(43-57)backend/app/crud/collection/collection_job.py (4)
read_all(47-56)read_one(24-45)CollectionJobCrud(19-96)create(78-96)backend/app/services/collections/create_collection.py (1)
start_job(43-73)backend/app/services/collections/delete_collection.py (1)
start_job(26-54)backend/app/crud/document_collection.py (3)
create(12-23)DocumentCollectionCrud(8-48)read(25-48)backend/app/models/document.py (1)
DocumentPublic(43-60)
backend/app/tests/api/routes/collections/test_collection_info.py (4)
backend/app/tests/utils/utils.py (2)
get_project(54-73)get_document(102-123)backend/app/tests/utils/collection.py (2)
get_collection(25-47)get_vector_store_collection(50-71)backend/app/crud/document_collection.py (1)
DocumentCollectionCrud(8-48)backend/app/tests/conftest.py (2)
db(28-45)user_api_key_header(90-92)
backend/app/models/collection_job.py (2)
backend/app/core/util.py (1)
now(13-14)backend/app/models/collection.py (2)
CollectionPublic(143-152)CollectionIDPublic(139-140)
backend/app/services/collections/delete_collection.py (9)
backend/app/crud/rag/open_ai.py (2)
OpenAIAssistantCrud(192-246)OpenAIVectorStoreCrud(101-189)backend/app/models/collection_job.py (5)
CollectionJobStatus(12-16)CollectionJobUpdate(83-88)CollectionJob(24-72)CollectionJobPublic(102-105)job_id(63-64)backend/app/models/collection.py (2)
CollectionIDPublic(139-140)DeletionRequest(132-133)backend/app/services/collections/helpers.py (1)
extract_error_message(21-47)backend/app/celery/utils.py (1)
start_low_priority_job(46-71)backend/app/utils.py (4)
get_openai_client(176-206)APIResponse(30-54)success_response(37-40)failure_response(43-54)backend/app/services/collections/create_collection.py (3)
build_success_payload(76-94)build_failure_payload(97-116)execute_job(164-318)backend/app/crud/collection/collection_job.py (3)
update(58-76)CollectionJobCrud(19-96)read_one(24-45)backend/app/crud/collection/collection.py (4)
read_one(59-83)CollectionCrud(18-145)delete(103-111)delete_by_id(96-100)
🪛 LanguageTool
backend/app/api/docs/collections/create.md
[uncategorized] ~14-~14: Possible missing comma found.
Context: ...model" and "instruction" in the request body otherwise only a vector store will be ...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~25-~25: Possible missing comma found.
Context: ... from this endpoint is collection_job object which is going to contain the collectio...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~30-~30: Possible missing article found.
Context: ...e the information about the creation of collection.
(AI_HYDRA_LEO_MISSING_THE)
backend/app/api/docs/collections/list.md
[uncategorized] ~4-~4: Possible missing comma found.
Context: ... but not deleted If a vector store was created - llm_service_name and `llm_service_i...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~4-~4: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...se denote the name of the vector store (eg. 'openai vector store') and its id. [To...
(E_G)
[uncategorized] ~6-~6: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...me of the model used int the assistant (eg. 'gpt-4o') and assistant id.
(E_G)
🪛 Ruff (0.14.3)
backend/app/services/collections/create_collection.py
171-171: Unused function argument: task_instance
(ARG001)
backend/app/services/collections/delete_collection.py
148-148: Unused function argument: task_instance
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (5)
backend/app/api/routes/collections.py (5)
5-5: LGTM: Import additions are appropriate.The new imports support the callback routing, job response models, and document inclusion features introduced in this PR.
Also applies to: 19-21, 25-25
37-56: LGTM: Callback specification is well-documented.The callback router and notification specification clearly document the expected callback behavior for collection operations, addressing previous feedback about callback documentation.
58-70: LGTM: List endpoint is correctly implemented.The endpoint properly retrieves all non-deleted collections for the current project and wraps them in the standard API response format.
133-133: LGTM: Optional request body properly annotated.The explicit
Body(default=None)annotation correctly addresses the previous review feedback, making it clear that the request body is optional for DELETE operations.
174-191: LGTM: Document pagination properly implemented.The endpoint correctly:
- Validates pagination parameters with appropriate constraints (skip ≥ 0, 0 < limit ≤ 100)
- Conditionally fetches documents based on
include_docsflag- Passes pagination parameters to
DocumentCollectionCrud.read()which handles them safely (validates for negative values per the relevant code snippets)- Sets
documentstoNonewheninclude_docs=False, avoiding user confusion about empty vs. not-requested data
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
backend/app/api/routes/collections.py (3)
93-96: Consider simplifying attribute access.Since
CreationRequestinherits fromAssistantOptions(which definesmodelandinstructions), and validation already happens at the model level, direct attribute access is clearer thangetattr.- # True iff both model and instructions were provided in the request body - with_assistant = bool( - getattr(request, "model", None) and getattr(request, "instructions", None) - ) + # True iff both model and instructions were provided in the request body + with_assistant = bool(request.model and request.instructions)
133-133: Consider adding a clarifying comment.The unused assignment
_indicates intentional validation that the collection exists. A brief comment would make this intent explicit.+ # Validate collection exists (raises 404 if not found) _ = CollectionCrud(session, current_user.project_id).read_one(collection_id)
172-177: Consider documenting pagination parameter relevance.The
skipandlimitparameters are accepted even wheninclude_docs=False, where they have no effect. While not a bug, this could be clarified in the endpoint description or the parameter descriptions.skip: int = Query(0, ge=0), - limit: int = Query(100, gt=0, le=100), + limit: int = Query(100, gt=0, le=100, description="Applies only when include_docs=True"),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/app/api/docs/collections/list.md(1 hunks)backend/app/api/routes/collections.py(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/api/docs/collections/list.md
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/api/routes/collections.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/routes/collections.py
🧬 Code graph analysis (1)
backend/app/api/routes/collections.py (9)
backend/app/models/collection_job.py (3)
CollectionJobPublic(102-105)CollectionJobImmediatePublic(97-99)CollectionActionType(19-21)backend/app/models/collection.py (5)
CollectionWithDocsPublic(155-156)CreationRequest(121-129)CallbackRequest(114-118)DeletionRequest(132-133)CollectionPublic(143-152)backend/app/utils.py (3)
APIResponse(30-54)success_response(37-40)_(297-298)backend/app/crud/collection/collection.py (7)
CollectionCrud(18-145)read_all(85-94)delete(103-111)_(114-121)_(124-145)read_one(59-83)create(43-57)backend/app/crud/collection/collection_job.py (4)
read_all(47-56)read_one(24-45)CollectionJobCrud(19-96)create(78-96)backend/app/services/collections/delete_collection.py (1)
start_job(26-54)backend/app/services/collections/create_collection.py (1)
start_job(43-73)backend/app/crud/document_collection.py (3)
create(12-23)DocumentCollectionCrud(8-48)read(25-48)backend/app/models/document.py (1)
DocumentPublic(43-60)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (6)
backend/app/api/routes/collections.py (6)
5-5: LGTM! Imports are well-organized.All new imports are used appropriately throughout the file and support the callback router functionality and new public response models.
Also applies to: 19-21, 25-25, 39-39
42-55: LGTM! Callback specification properly implemented.The callback endpoint specification correctly defines the contract for external webhook receivers and addresses the previous review request for callback documentation.
58-70: LGTM! List endpoint is clean and straightforward.The implementation properly uses dependency injection, respects project boundaries, and returns a consistent response structure.
107-118: LGTM! Helpful metadata and consistent response structure.The metadata note clearly communicates to users when a vector-store-only collection will be created, and the response model properly represents the immediate job state.
135-160: LGTM! Delete endpoint properly structured.The endpoint correctly validates collection existence, handles optional callback URLs, and follows the established job pattern. The explicit
Body(default=None)annotation addresses the previous review feedback.
182-191: LGTM! Collection info endpoint properly handles conditional document loading.The implementation correctly:
- Maintains consistent response structure (documents field is None when
include_docs=False, avoiding theresponse_model_exclude_noneissue from past reviews)- Applies proper pagination with reasonable defaults
- Uses appropriate validation for both collection and documents
Summary
Target issue is #398 and also covering this issue - #231 since this is just for writing test cases for existing router endpoints
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Notes
for reference of request bodies of the job related endpoints of this module you can check this document out.
Create job (create_collection.execute_job) :
Introduced with_assistant flag: when model & instructions are both present, create an Assistant backed by a Vector Store; otherwise only create a Vector Store.
We now close the DB session before making external API calls—and reopen a fresh session only to persist results—so we don’t hold connections or risk transaction locks. Same is being done during deletion of collection.
Delete job (delete_collection.execute_job) :
Detects service type from collection.llm_service_name (Assistant vs Vector Store) and chooses the correct CRUD (OpenAIAssistantCrud / OpenAIVectorStoreCrud) to use delete method of that specific CRUD class.
Calls local deletion via CollectionCrud.delete_by_id() once remote deletion succeeds.
CRUD refactors :
CollectionCrud.delete_by_id() sets deleted_at=now() and persists.
Document→Collection cascade delete now excludes already-deleted collections (deleted_at IS NULL).
API route :
/collections/create: determines with_assistant by checking both model and instructions. Returns metadata note when creating vector-store-only.
Validation: if only one of model or instructions is provided, responds 422 with a clear message.
Additional :
Removed ResponsePayload class because we are now returning collection job object in response of create and delete collections, we use the collection job object for the callback URL as well.
Rephrased the endpoint docs to remove the indication that user can use any specific endpoint for polling.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests