Collection: Standardized OpenAI Creds Fetching by nishika26 · Pull Request #258 · ProjectTech4DevAI/kaapi-backend

nishika26 · 2025-07-01T10:55:49Z

Summary

Target issue is #226
Explain the motivation for making this change. What existing problem does the pull request solve?
This PR introduces a standardized approach to fetching OpenAI credentials using the organization_id and project_id derived from the API key, and applies it to the following:

Changes Made
Collections Endpoint
Updated the collection endpoint to fetch OpenAI credentials dynamically using the current user's organization_id and project_id, instead of relying on environment variables.

Document Deletion Endpoint

The permanent delete route as well as the soft delete router for documents also requires deletion of associated OpenAI assistants and vector stores.
Previously, it fetched the OpenAI API key from environment variables.
This has now been refactored to use the same standardized credentials fetching approach.

Test Case Adjustments

Test cases for documents and most of collections endpoints were previously authenticated using username and password.
However, organization_id and project_id are not available in this form of authentication.
Updated test cases to authenticate using API keys, which ensures the necessary org/project context is available to fetch OpenAI credentials.

Additionally
Removing the dependency on fetching OpenAI credentials from environment variables was essential, as we plan to eliminate the use of the OpenAI API key from env configuration entirely. Since both the router logic and test cases were tightly coupled with this approach, this PR includes changes across multiple files to decouple that logic and ensure credentials are consistently derived from the organization and project context instead.

Summary by CodeRabbit

Bug Fixes
- Improved environment variable formatting in configuration examples for better compatibility.
Refactor
- Streamlined how OpenAI clients are managed and injected throughout the application, enhancing consistency and maintainability.
- Updated internal logic to require explicit OpenAI client configuration, removing fallback defaults.
Tests
- Enhanced test reliability by introducing explicit OpenAI client mocking and removing dependencies on global credentials.
- Cleaned up test imports and removed unused fixtures.
- Added a new utility for generating mock OpenAI clients with vector store capabilities.

codecov · 2025-07-01T11:24:01Z

Codecov Report

Attention: Patch coverage is 95.83333% with 4 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
backend/app/api/routes/collections.py	77.77%	2 Missing ⚠️
backend/app/crud/rag/open_ai.py	60.00%	2 Missing ⚠️

📢 Thoughts on this report? Let us know!

backend/app/api/routes/collections.py

backend/app/api/routes/documents.py

backend/app/crud/rag/open_ai.py

backend/app/api/routes/collections.py

nishika26 · 2025-07-07T07:02:49Z

Putting this on hold because first our priority is it fix testcases

backend/app/tests/utils/utils.py

backend/app/tests/utils/collections_openai_mock.py

backend/app/tests/crud/collections/test_crud_collection_read_one.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py (1)

53-57: Move remaining test logic inside OpenAI mock context.

The OpenAI mock context manager should encompass all test logic to ensure the mock applies throughout the entire test execution, not just during client setup.

     openai_mock = OpenAIMock()
     with openai_mock.router:
         client = OpenAI(api_key="sk-test-key")
         mock_get_openai_client.return_value = client

-    # Setup AWS
-    aws = AmazonCloudStorageClient()
-    aws.create()
+        # Setup AWS
+        aws = AmazonCloudStorageClient()
+        aws.create()
+
+        # Setup document in DB and S3
+        store = DocumentStore(db)
+        document = store.put()
+        s3_key = Path(urlparse(document.object_store_url).path).relative_to("/")
+        aws.client.put_object(
+            Bucket=settings.AWS_S3_BUCKET, Key=str(s3_key), Body=b"test"
+        )
+
+        # Delete document
+        response = crawler.delete(route.append(document, suffix="permanent"))
+        assert response.is_success
+
+        db.refresh(document)
+
+        stmt = select(Document).where(Document.id == document.id)
+        doc_in_db = db.exec(stmt).first()
+        assert doc_in_db is not None
+        assert doc_in_db.deleted_at is not None
+
+        with pytest.raises(ClientError) as exc_info:
+            aws.client.head_object(
+                Bucket=settings.AWS_S3_BUCKET,
+                Key=str(s3_key),
+            )
+        assert exc_info.value.response["Error"]["Code"] == "404"

🧹 Nitpick comments (1)

backend/app/api/routes/collections.py (1)
8-8: Remove unused import as flagged by static analysis.

The HTTPException import is not used anywhere in the file and should be removed to clean up the imports.
-from openai import OpenAIError, OpenAI
+from openai import OpenAIError, OpenAI
-from fastapi import APIRouter, HTTPException, BackgroundTasks, Query
+from fastapi import APIRouter, BackgroundTasks, Query

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d87c09a and e30dcff.

📒 Files selected for processing (8)

backend/app/api/routes/collections.py (8 hunks)
backend/app/api/routes/documents.py (3 hunks)
backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py (2 hunks)
backend/app/tests/api/routes/documents/test_route_document_remove.py (2 hunks)
backend/app/tests/crud/collections/test_crud_collection_delete.py (4 hunks)
backend/app/tests/crud/collections/test_crud_collection_read_all.py (2 hunks)
backend/app/tests/crud/collections/test_crud_collection_read_one.py (1 hunks)
backend/app/tests/utils/utils.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

backend/app/tests/crud/collections/test_crud_collection_read_all.py
backend/app/tests/utils/utils.py
backend/app/tests/crud/collections/test_crud_collection_delete.py
backend/app/api/routes/documents.py
backend/app/tests/crud/collections/test_crud_collection_read_one.py

🧰 Additional context used

🧬 Code Graph Analysis (2)

backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py (7)

backend/app/tests/conftest.py (2)

db (17-34)

client (45-48)

backend/app/tests/api/routes/documents/test_route_document_remove.py (1)

route (19-20)

backend/app/tests/api/routes/documents/test_route_document_info.py (1)

route (16-17)

backend/app/tests/api/routes/documents/test_route_document_upload.py (2)

route (43-44)

aws_credentials (53-58)

backend/app/tests/utils/document.py (3)

Route (83-110)

crawler (161-162)

WebCrawler (114-128)

backend/app/core/cloud/storage.py (1)

client (26-38)

backend/app/tests/api/routes/collections/test_create_collections.py (1)

mock_s3 (20-41)

backend/app/tests/api/routes/documents/test_route_document_remove.py (1)

backend/app/tests/utils/document.py (9)

Route (83-110)

crawler (161-162)

WebCrawler (114-128)

DocumentStore (51-80)

get (118-122)

append (105-110)

put (66-73)

clear (53-55)

DocumentMaker (30-48)

🪛 Ruff (0.12.2)

backend/app/api/routes/collections.py

9-9: fastapi.HTTPException imported but unused

Remove unused import: fastapi.HTTPException

(F401)

backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py

51-51: Redefinition of unused crawler from line 23

(F811)

backend/app/tests/api/routes/documents/test_route_document_remove.py

31-31: Redefinition of unused crawler from line 14

(F811)

50-50: Redefinition of unused crawler from line 14

(F811)

74-74: Redefinition of unused crawler from line 14

(F811)

🔇 Additional comments (11)

backend/app/api/routes/collections.py (5)

183-189: LGTM! Proper refactoring to accept OpenAI client parameter.

The function signature correctly updated to accept an explicit OpenAI client parameter instead of creating it internally. This aligns with the standardized credential fetching approach and improves testability.

267-270: Excellent implementation of standardized OpenAI client fetching.

The route handler now properly obtains the OpenAI client using get_openai_client with the current user's organization and project context, replacing the previous environment variable dependency. This is exactly what the PR objectives called for.

288-288: Correct parameter passing to background task.

The background task call properly includes the client parameter, ensuring the standardized OpenAI client is passed through to the background processing function.

298-304: LGTM! Consistent pattern for deletion function.

The do_delete_collection function follows the same pattern as do_create_collection, accepting the OpenAI client as a parameter and using CurrentUserOrgProject for proper organization/project context.

343-346: Consistent implementation across deletion endpoint.

The delete endpoint mirrors the create endpoint's approach, properly fetching the OpenAI client with organization and project context before passing it to the background task.

backend/app/tests/api/routes/documents/test_route_document_remove.py (4)

3-6: LGTM! Proper imports for OpenAI client mocking.

The new imports support the updated testing approach that mocks the centralized get_openai_client function instead of relying on global fixtures.

24-42: Excellent standardization of OpenAI client mocking.

The test correctly patches get_openai_client and provides a mocked OpenAI client within the context manager. This approach:

Aligns with the production code changes

Ensures proper isolation of OpenAI API calls

Maintains the same test logic while adapting to the new client fetching pattern

43-66: Consistent mocking pattern applied correctly.

The second test method follows the same mocking pattern as the first, ensuring consistency across the test suite. The test logic for verifying soft deletion remains intact while properly mocking the OpenAI client.

67-86: Complete coverage with consistent mocking approach.

The third test method completes the migration to the new mocking pattern. All three test methods now consistently use the centralized OpenAI client mocking approach, ensuring the test suite properly reflects the production code changes.

backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py (2)

6-6: LGTM! Proper imports for standardized OpenAI mocking.

The added imports support the new mocking pattern that patches get_openai_client instead of using global fixtures, aligning with the broader refactoring effort.

Also applies to: 11-13

44-57: OpenAI client mocking correctly implemented.

The test properly patches get_openai_client and sets up the mocked OpenAI client within the context manager. This approach ensures the test aligns with the production code's standardized credential fetching.

AkhileshNegi · 2025-07-25T08:33:30Z

backend/app/tests/crud/collections/test_crud_collection_read_all.py

    openai_mock = OpenAIMock()
    with openai_mock.router:
-        client = OpenAI(api_key=settings.OPENAI_API_KEY)
+        client = OpenAI(api_key="sk-test-key")


eventually we want this to be pulled from env.test

@AkhileshNegi @nishika26 That is the case rn, it is being pulled from .env right now( I know the fixture sets this but if set OPENAI_API_KEY in .env, we dont need to use fixture).
here it is getting change to hardcoded string than again we will move this to use settings.OPENAI_API_KEY

But wouldn’t that go against the goal of mirroring production behavior in testing? Since we don’t retrieve OpenAI keys from the environment in production, it doesn’t make sense to do it here either.

avirajsingh7 · 2025-07-25T08:52:24Z

backend/app/tests/utils/collections_openai_mock.py

+from unittest.mock import MagicMock
+
+
+def get_mock_openai_client():


No need to create a file specially for this mock
keep this in utils/openai.py

Also please rename this function to mock_openai_client_with_vector_store.

Also I do feel this highly specific to a particular test case or endpoint and we dont need this is utils but still your call on to keep this or not.

I'll move it to utils/openai, but I'd prefer to keep this utility function for now, as removing it would require significantly more time to implement something that works the same way.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

backend/app/tests/utils/openai.py (1)

43-43: Consider clarifying the function name scope.

The function name suggests it's specifically for vector store operations, but it also provides comprehensive mocking for assistants and file operations. Consider whether the name accurately reflects its full scope or if it should be more generic like get_mock_openai_client.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b45d6a and 016aa16.

📒 Files selected for processing (2)

backend/app/tests/api/routes/collections/test_create_collections.py (5 hunks)
backend/app/tests/utils/openai.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

backend/app/tests/api/routes/collections/test_create_collections.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.11.7, 6)

🔇 Additional comments (2)

backend/app/tests/utils/openai.py (2)

4-4: LGTM: Import addition is appropriate.

The MagicMock import is correctly added to support the new mock client functionality.

43-70: Well-structured mock client with comprehensive OpenAI API coverage.

The mock client implementation covers the essential OpenAI operations needed for testing:

Vector store creation and file operations

Assistant creation and management

File batch upload and polling

The mock return values appear realistic and consistent with OpenAI API responses. The nested structure for file counts and proper attribute assignments demonstrate good understanding of the API structure.

nishika26 and others added 3 commits July 1, 2025 16:15

configure openai creds

7732dd1

Merge branch 'main' into enhancement/openai_creds_fetch

fe1a547

test cases

613b7cc

nishika26 added 2 commits July 1, 2025 16:55

cleaning up a little

49f428a

additional test cases changes

f64fa46

nishika26 marked this pull request as ready for review July 1, 2025 11:44

nishika26 added 4 commits July 2, 2025 12:14

big changes .1

d8d67dc

big change .2

487166f

removing spaces

765e156

mock aws

7575920

nishika26 changed the title ~~configure openai creds~~ Collection: Standardized OpenAI Creds Fetching Jul 2, 2025

nishika26 self-assigned this Jul 2, 2025

nishika26 added the enhancement New feature or request label Jul 2, 2025

nishika26 added this to Kaapi-dev Jul 2, 2025

nishika26 moved this to In Progress in Kaapi-dev Jul 2, 2025

nishika26 linked an issue Jul 2, 2025 that may be closed by this pull request

Credentials : Standardized fetching for /create-collection endpoint #226

Closed

nishika26 requested review from Ishankoradia, avirajsingh7 and vijay-T4D July 2, 2025 08:00

nishika26 added the ready-for-review label Jul 2, 2025

avirajsingh7 suggested changes Jul 3, 2025

View reviewed changes

backend/app/api/routes/collections.py Outdated Show resolved Hide resolved

backend/app/api/routes/collections.py Outdated Show resolved Hide resolved

backend/app/api/routes/documents.py Outdated Show resolved Hide resolved

backend/app/crud/rag/open_ai.py Show resolved Hide resolved

logging

f0990fb

avirajsingh7 suggested changes Jul 4, 2025

View reviewed changes

backend/app/api/routes/collections.py Outdated Show resolved Hide resolved

backend/app/api/routes/collections.py Outdated Show resolved Hide resolved

backend/app/api/routes/collections.py Outdated Show resolved Hide resolved

nishika26 added on hold and removed enhancement New feature or request ready-for-review labels Jul 7, 2025

nishika26 moved this from In Progress to Closed in Kaapi-dev Jul 14, 2025

nishika26 removed the status in Kaapi-dev Jul 14, 2025

avirajsingh7 suggested changes Jul 24, 2025

View reviewed changes

backend/app/tests/utils/utils.py Outdated Show resolved Hide resolved

backend/app/tests/utils/collections_openai_mock.py Outdated Show resolved Hide resolved

backend/app/tests/crud/collections/test_crud_collection_read_one.py Outdated Show resolved Hide resolved

second review fix

e30dcff

coderabbitai bot reviewed Jul 24, 2025

View reviewed changes

nishika26 and others added 5 commits July 24, 2025 14:28

create collection crud file

f633a5d

logging

3454a00

Merge branch 'main' into enhancement/openai_creds_fetch

b8ed73c

formatting

4b5d8d8

normal user to user

5b45d6a

AkhileshNegi approved these changes Jul 25, 2025

View reviewed changes

avirajsingh7 suggested changes Jul 25, 2025

View reviewed changes

nishika26 added 2 commits July 25, 2025 14:38

normal user to user

d4afb58

moving the util function

016aa16

coderabbitai bot reviewed Jul 25, 2025

View reviewed changes

avirajsingh7 approved these changes Jul 25, 2025

View reviewed changes

nishika26 merged commit 6640296 into main Jul 25, 2025
3 checks passed

nishika26 deleted the enhancement/openai_creds_fetch branch July 25, 2025 11:26

coderabbitai bot mentioned this pull request Aug 31, 2025

Classification Feature #320

Merged

This was referenced Sep 24, 2025

Move response API to Celery-backed job processing #381

Merged

Celery : moving collections module to celery #385

Merged

This was referenced Oct 6, 2025

Authentication: Replace encryption with hashing and add permission-based authorization #396

Merged

Credentials: Using Hard Delete & Updating is_active logic #360

Merged

This was referenced Oct 20, 2025

Collection : with assistant flag and additional enhancements #412

Merged

Implement Unified LLM API v1 to Simplify LLM Integrations and Enable Multi-Provider Extensibility #413

Merged

coderabbitai bot mentioned this pull request Nov 5, 2025

Evaluation #405

Merged

2 tasks

coderabbitai bot mentioned this pull request Nov 18, 2025

Document : moving transform job to celery and adding callback url #437

Merged

2 tasks

coderabbitai bot mentioned this pull request Dec 1, 2025

Onboarding : take multiple credentials from endpoint #453

Merged

2 tasks

		from unittest.mock import MagicMock


		def get_mock_openai_client():

Conversation

nishika26 commented Jul 1, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Summary by CodeRabbit

Uh oh!

codecov bot commented Jul 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nishika26 commented Jul 7, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

AkhileshNegi Jul 25, 2025

Choose a reason for hiding this comment

Uh oh!

avirajsingh7 Jul 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nishika26 Jul 25, 2025

Choose a reason for hiding this comment

Uh oh!

avirajsingh7 Jul 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nishika26 Jul 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nishika26 commented Jul 1, 2025 •

edited by coderabbitai bot

Loading

codecov bot commented Jul 1, 2025 •

edited

Loading

avirajsingh7 Jul 25, 2025 •

edited

Loading

avirajsingh7 Jul 25, 2025 •

edited

Loading