ProjectTech4DevAI · nishika26 · Nov 7, 2025 · Oct 20, 2025
diff --git a/backend/app/api/docs/collections/create.md b/backend/app/api/docs/collections/create.md
@@ -7,10 +7,12 @@ pipeline:
 * Create an OpenAI [Vector
   Store](https://platform.openai.com/docs/api-reference/vector-stores)
   based on those File's.
-* Attach the Vector Store to an OpenAI
+* [To be deprecated] Attach the Vector Store to an OpenAI
   [Assistant](https://platform.openai.com/docs/api-reference/assistants). Use
   parameters in the request body relevant to an Assistant to flesh out
-  its configuration.
+  its configuration. Note that an assistant is created only when you pass both
+  "model" and "instructions" in the request body; otherwise only a vector store is
+  created from the given documents.
 
 If any one of the OpenAI interactions fail, all OpenAI resources are
 cleaned up. If a Vector Store is unable to be created, for example,
@@ -19,9 +21,10 @@ OpenAI. Failure can occur from OpenAI being down, or some parameter
 value being invalid. It can also fail due to document types not be
 accepted. This is especially true for PDFs that may not be parseable.
 
-The immediate response from the endpoint is `collection_job` object which is
-going to contain the collection "job ID", status and action type ("CREATE").
-Once the collection has been created, information about the collection will
-be returned to the user via the callback URL. If a callback URL is not provided,
-clients can poll the `collection job info` endpoint with the `id` in the
-`collection_job` object returned as it is the `job id`, to retrieve the same information.
+The vector store (and the assistant, if requested) is created asynchronously. The
+immediate response from this endpoint is a `collection_job` object containing
+the collection job ID and status. Once the collection has been created,
+information about the collection is returned to the user via the
+callback URL. If a callback URL is not provided, clients can poll the
+`collection job info` endpoint with the `job_id` to retrieve
+information about the creation of the collection.
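
Reviewer note: a client without a callback URL has to poll the collection job info endpoint itself. Below is a minimal sketch of such a loop, not part of this PR. The `fetch_job` callable, the function name, and the serialized status strings (`"SUCCESSFUL"`, `"FAILED"`) are assumptions; the diff only shows the enum member names, not how they appear in the JSON response.

```python
import time
from typing import Any, Callable, Dict

# Assumed serialized values of the terminal job states (hypothetical).
TERMINAL_STATUSES = {"SUCCESSFUL", "FAILED"}


def poll_collection_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Call fetch_job(job_id) until the job reports a terminal status.

    fetch_job is a hypothetical stand-in for an HTTP GET against the
    collection job info endpoint; it returns the job object as a dict.
    """
    for attempt in range(max_attempts):
        job = fetch_job(job_id)
        if str(job.get("status", "")).upper() in TERMINAL_STATUSES:
            return job
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} attempts")
```

Passing the fetcher in as a callable keeps the loop independent of any HTTP client, so it can be exercised with a fake in tests.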
diff --git a/backend/app/api/docs/collections/delete.md b/backend/app/api/docs/collections/delete.md
@@ -7,7 +7,8 @@ Remove a collection from the platform. This is a two step process:
 No action is taken on the documents themselves: the contents of the
 documents that were a part of the collection remain unchanged, those
 documents can still be accessed via the documents endpoints. The response from this
-endpoint will be a `collection_job` object which will contain the collection `job ID`,
-status and action type ("DELETE"). when you take the id returned and use the collection job
-info endpoint, if the job is successful, you will get the status as successful and nothing will
-be returned as the collection as it has been deleted and marked as deleted.
+endpoint will be a `collection_job` object containing the collection `job_id` and
+status. When you pass the returned ID to the collection job
+info endpoint and the job has succeeded, the status is reported as successful.
+Additionally, if a `callback_url` was provided in the request body,
+you will receive a message indicating whether the deletion succeeded or failed.
diff --git a/backend/app/api/docs/collections/docs.md b/backend/app/api/docs/collections/docs.md
diff --git a/backend/app/api/docs/collections/info.md b/backend/app/api/docs/collections/info.md
@@ -1,4 +1,4 @@
-Retrieve detailed information about a specific collection by its ID from the collection table. Note that this endpoint CANNOT be used as a polling endpoint for collection creation because an entry will be made in the collection table only after the resource creation and association has been successful.
-
-This endpoint returns metadata for the collection, including its project, organization,
+Retrieve detailed information about a specific collection, by its ID, from the collection table. This endpoint returns the collection object, including its project, organization,
 timestamps, and associated LLM service details (`llm_service_id`).
+
+Additionally, if the `include_docs` query parameter is true (the default), the response also includes the documents associated with the collection, paginated with `skip` and `limit`. These documents are stored not only by the AI platform but also by OpenAI.
diff --git a/backend/app/api/docs/collections/job_info.md b/backend/app/api/docs/collections/job_info.md
@@ -1,12 +1,9 @@
-Retrieve information about a collection job by the collection job ID. This endpoint can be considered the polling endpoint for collection creation job. This endpoint provides detailed status and metadata for a specific collection job
-in the AI platform. It is especially useful for:
+Retrieve information about a collection job by the collection job ID. This endpoint provides detailed status and metadata for a specific collection job in the AI platform. It is especially useful for:
 
-* Fetching the collection job object containing the ID which will be collection job id, collection ID, status of the job as well as error message.
+* Fetching the collection job object, including the collection job ID, the current status, and the associated collection details.
 
 * If the job has finished, has been successful and it was a job of creation of collection then this endpoint will fetch the associated collection details from the collection table, including:
-    - `llm_service_id`: The OpenAI assistant or model used for the collection.
-    - Collection metadata such as ID, project, organization, and timestamps.
+  - `llm_service_id`: The OpenAI assistant or model used for the collection.
+  - Collection metadata such as ID, project, organization, and timestamps.
 
-* If the job of delete collection was successful, we will get the status as successful and nothing will be returned as collection.
-
-* Containing a simplified error messages in the retrieved collection job object when a job has failed.
+* If the delete-collection job succeeds, the status is set to “successful” and the `collection_key` contains the ID of the collection that has been deleted.
diff --git a/backend/app/api/docs/collections/list.md b/backend/app/api/docs/collections/list.md
@@ -1,2 +1,6 @@
 List _active_ collections -- collections that have been created but
 not deleted
+
+If a vector store was created, `llm_service_name` and `llm_service_id` in the response denote the name of the vector store (e.g. 'openai vector store') and its ID.
+
+[To be deprecated] If an assistant was created, `llm_service_name` and `llm_service_id` in the response denote the name of the model used in the assistant (e.g. 'gpt-4o') and the assistant ID.
diff --git a/backend/app/api/routes/collection_job.py b/backend/app/api/routes/collection_job.py
@@ -10,7 +10,12 @@
     CollectionCrud,
     CollectionJobCrud,
 )
-from app.models import CollectionJobStatus, CollectionJobPublic, CollectionActionType
+from app.models import (
+    CollectionJobStatus,
+    CollectionIDPublic,
+    CollectionActionType,
+    CollectionJobPublic,
+)
 from app.models.collection import CollectionPublic
 from app.utils import APIResponse, load_description
 from app.services.collections.helpers import extract_error_message
@@ -21,7 +26,7 @@
 
 
 @router.get(
-    "/info/jobs/{job_id}",
+    "/jobs/{job_id}",
     description=load_description("collections/job_info.md"),
     response_model=APIResponse[CollectionJobPublic],
 )
@@ -35,16 +40,21 @@ def collection_job_info(
 
     job_out = CollectionJobPublic.model_validate(collection_job)
 
-    if (
-        collection_job.status == CollectionJobStatus.SUCCESSFUL
-        and collection_job.action_type == CollectionActionType.CREATE
-        and collection_job.collection_id
-    ):
-        collection_crud = CollectionCrud(session, current_user.project_id)
-        collection = collection_crud.read_one(collection_job.collection_id)
-        job_out.collection = CollectionPublic.model_validate(collection)
-
-    if collection_job.status == CollectionJobStatus.FAILED and job_out.error_message:
-        job_out.error_message = extract_error_message(job_out.error_message)
+    if collection_job.collection_id:
+        if (
+            collection_job.action_type == CollectionActionType.CREATE
+            and collection_job.status == CollectionJobStatus.SUCCESSFUL
+        ):
+            collection_crud = CollectionCrud(session, current_user.project_id)
+            collection = collection_crud.read_one(collection_job.collection_id)
+            job_out.collection = CollectionPublic.model_validate(collection)
+
+        elif collection_job.action_type == CollectionActionType.DELETE:
+            job_out.collection = CollectionIDPublic(id=collection_job.collection_id)
+
+    if collection_job.status == CollectionJobStatus.FAILED:
+        raw_error = getattr(collection_job, "error_message", None)
+        error_message = extract_error_message(raw_error)
+        job_out.error_message = error_message
 
     return APIResponse.success_response(data=job_out)
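
Reviewer note: the new branching in `collection_job_info` can be summarised as a small decision function. This is a sketch for discussion only; the `Action` and `Status` enums and the returned labels are stand-ins for `CollectionActionType`, `CollectionJobStatus`, and the `CollectionPublic` / `CollectionIDPublic` models, and their string values are assumptions.

```python
from enum import Enum
from typing import Optional


class Action(str, Enum):  # stand-in for CollectionActionType
    CREATE = "CREATE"
    DELETE = "DELETE"


class Status(str, Enum):  # stand-in for CollectionJobStatus
    PENDING = "PENDING"
    SUCCESSFUL = "SUCCESSFUL"
    FAILED = "FAILED"


def collection_payload_kind(
    action: Action, status: Status, collection_id: Optional[str]
) -> Optional[str]:
    """Return which collection payload the job info response would carry.

    "full"    -> the full collection record (successful CREATE job)
    "id_only" -> only the collection ID (any DELETE job with a collection ID)
    None      -> no collection attached to the job output
    """
    if not collection_id:
        return None
    if action is Action.CREATE and status is Status.SUCCESSFUL:
        return "full"
    if action is Action.DELETE:
        return "id_only"
    return None
```

Note that, as written in the diff, the DELETE branch does not check the job status, so a pending or failed delete job also returns the collection ID.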
diff --git a/backend/app/api/routes/collections.py b/backend/app/api/routes/collections.py
@@ -1,12 +1,10 @@
-import inspect
 import logging
 from uuid import UUID
 from typing import List
 
-from fastapi import APIRouter, Query
+from fastapi import APIRouter, Query, Body
 from fastapi import Path as FastPath
 
-
 from app.api.deps import SessionDep, CurrentUserOrgProject
 from app.crud import (
     CollectionCrud,
@@ -18,28 +16,65 @@
     CollectionJobStatus,
     CollectionActionType,
     CollectionJobCreate,
+    CollectionJobPublic,
+    CollectionJobImmediatePublic,
+    CollectionWithDocsPublic,
 )
 from app.models.collection import (
-    ResponsePayload,
     CreationRequest,
+    CallbackRequest,
     DeletionRequest,
     CollectionPublic,
 )
 from app.utils import APIResponse, load_description
-from app.services.collections.helpers import extract_error_message
 from app.services.collections import (
     create_collection as create_service,
     delete_collection as delete_service,
 )
 
 
 logger = logging.getLogger(__name__)
+
 router = APIRouter(prefix="/collections", tags=["collections"])
+collection_callback_router = APIRouter()
+
+
+@collection_callback_router.post(
+    "{$callback_url}",
+    name="collection_callback",
+)
+def collection_callback_notification(body: APIResponse[CollectionJobPublic]):
+    """
+    Callback endpoint specification for collection creation/deletion.
+
+    The callback will receive:
+    - On success: APIResponse with success=True and data containing CollectionJobPublic
+    - On failure: APIResponse with success=False and error message
+    - metadata field will always be included if provided in the request
+    """
+    ...
+
+
+@router.get(
+    "/",
+    description=load_description("collections/list.md"),
+    response_model=APIResponse[List[CollectionPublic]],
+)
+def list_collections(
+    session: SessionDep,
+    current_user: CurrentUserOrgProject,
+):
+    collection_crud = CollectionCrud(session, current_user.project_id)
+    rows = collection_crud.read_all()
+
+    return APIResponse.success_response(rows)
 
 
 @router.post(
-    "/create",
+    "/",
     description=load_description("collections/create.md"),
+    response_model=APIResponse[CollectionJobImmediatePublic],
+    callbacks=collection_callback_router.routes,
 )
 def create_collection(
     session: SessionDep,
@@ -55,110 +90,102 @@ def create_collection(
         )
     )
 
-    this = inspect.currentframe()
-    route = router.url_path_for(this.f_code.co_name)
-    payload = ResponsePayload(
-        status="processing", route=route, key=str(collection_job.id)
+    # True iff both model and instructions were provided in the request body
+    with_assistant = bool(
+        getattr(request, "model", None) and getattr(request, "instructions", None)
     )
 
     create_service.start_job(
         db=session,
         request=request,
-        payload=payload,
         collection_job_id=collection_job.id,
         project_id=current_user.project_id,
         organization_id=current_user.organization_id,
+        with_assistant=with_assistant,
     )
 
-    return APIResponse.success_response(collection_job)
+    metadata = None
+    if not with_assistant:
+        metadata = {
+            "note": (
+                "This job will create a vector store only (no Assistant). "
+                "Assistant creation happens when both 'model' and 'instructions' are included."
+            )
+        }
+
+    return APIResponse.success_response(
+        CollectionJobImmediatePublic.model_validate(collection_job), metadata=metadata
+    )
 
 
-@router.post(
-    "/delete",
+@router.delete(
+    "/{collection_id}",
     description=load_description("collections/delete.md"),
+    response_model=APIResponse[CollectionJobImmediatePublic],
+    callbacks=collection_callback_router.routes,
 )
 def delete_collection(
     session: SessionDep,
     current_user: CurrentUserOrgProject,
-    request: DeletionRequest,
+    collection_id: UUID = FastPath(description="Collection to delete"),
+    request: CallbackRequest | None = Body(default=None),
 ):
-    collection_crud = CollectionCrud(session, current_user.project_id)
-    collection = collection_crud.read_one(request.collection_id)
+    _ = CollectionCrud(session, current_user.project_id).read_one(collection_id)
+
+    deletion_request = DeletionRequest(
+        collection_id=collection_id,
+        callback_url=request.callback_url if request else None,
+    )
 
     collection_job_crud = CollectionJobCrud(session, current_user.project_id)
     collection_job = collection_job_crud.create(
         CollectionJobCreate(
             action_type=CollectionActionType.DELETE,
             project_id=current_user.project_id,
             status=CollectionJobStatus.PENDING,
-            collection_id=collection.id,
+            collection_id=collection_id,
         )
     )
 
-    this = inspect.currentframe()
-    route = router.url_path_for(this.f_code.co_name)
-    payload = ResponsePayload(
-        status="processing", route=route, key=str(collection_job.id)
-    )
-
     delete_service.start_job(
         db=session,
-        request=request,
-        payload=payload,
-        collection=collection,
+        request=deletion_request,
         collection_job_id=collection_job.id,
         project_id=current_user.project_id,
         organization_id=current_user.organization_id,
     )
 
-    return APIResponse.success_response(collection_job)
+    return APIResponse.success_response(
+        CollectionJobImmediatePublic.model_validate(collection_job)
+    )
 
 
 @router.get(
-    "/info/{collection_id}",
+    "/{collection_id}",
     description=load_description("collections/info.md"),
-    response_model=APIResponse[CollectionPublic],
+    response_model=APIResponse[CollectionWithDocsPublic],
 )
 def collection_info(
     session: SessionDep,
     current_user: CurrentUserOrgProject,
     collection_id: UUID = FastPath(description="Collection to retrieve"),
+    include_docs: bool = Query(
+        True,
+        description="If true, include documents linked to this collection",
+    ),
+    skip: int = Query(0, ge=0),
+    limit: int = Query(100, gt=0, le=100),
 ):
     collection_crud = CollectionCrud(session, current_user.project_id)
     collection = collection_crud.read_one(collection_id)
 
-    return APIResponse.success_response(collection)
-
+    collection_with_docs = CollectionWithDocsPublic.model_validate(collection)
 
-@router.get(
-    "/list",
-    description=load_description("collections/list.md"),
-    response_model=APIResponse[List[CollectionPublic]],
-)
-def list_collections(
-    session: SessionDep,
-    current_user: CurrentUserOrgProject,
-):
-    collection_crud = CollectionCrud(session, current_user.project_id)
-    rows = collection_crud.read_all()
+    if include_docs:
+        document_collection_crud = DocumentCollectionCrud(session)
+        docs = document_collection_crud.read(collection, skip, limit)
+        collection_with_docs.documents = [
+            DocumentPublic.model_validate(doc) for doc in docs
+        ]
 
-    return APIResponse.success_response(rows)
-
-
-@router.post(
-    "/docs/{collection_id}",
-    description=load_description("collections/docs.md"),
-    response_model=APIResponse[List[DocumentPublic]],
-)
-def collection_documents(
-    session: SessionDep,
-    current_user: CurrentUserOrgProject,
-    collection_id: UUID = FastPath(description="Collection to retrieve"),
-    skip: int = Query(0, ge=0),
-    limit: int = Query(100, gt=0, le=100),
-):
-    collection_crud = CollectionCrud(session, current_user.project_id)
-    document_collection_crud = DocumentCollectionCrud(session)
-    collection = collection_crud.read_one(collection_id)
-    data = document_collection_crud.read(collection, skip, limit)
-    return APIResponse.success_response(data)
+    return APIResponse.success_response(collection_with_docs)
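
Reviewer note: the `with_assistant` check in `create_collection` treats missing and empty values the same way, which may be worth a unit test. The sketch below isolates that logic under stated assumptions: `wants_assistant` and `immediate_metadata` are hypothetical helper names, and `SimpleNamespace` stands in for the real `CreationRequest` model.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def wants_assistant(request: Any) -> bool:
    """True only when both 'model' and 'instructions' are present and non-empty."""
    return bool(
        getattr(request, "model", None) and getattr(request, "instructions", None)
    )


def immediate_metadata(request: Any) -> Optional[Dict[str, str]]:
    """Build the metadata note returned when only a vector store will be created."""
    if wants_assistant(request):
        return None
    return {
        "note": (
            "This job will create a vector store only (no Assistant). "
            "Assistant creation happens when both 'model' and 'instructions' are included."
        )
    }


# Usage with stand-in request objects:
full = SimpleNamespace(model="gpt-4o", instructions="Answer briefly.")
partial = SimpleNamespace(model="gpt-4o", instructions="")
```

Because the check relies on truthiness, an empty string for either field results in a vector-store-only job, the same as omitting the field.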