
Classification Feature#320

Merged
AkhileshNegi merged 32 commits into main from feature/classification
Sep 4, 2025

Conversation

@nishika26
Collaborator

@nishika26 nishika26 commented Aug 5, 2025

Summary

Target issue is #299

The documentation for this feature is here - https://docs.google.com/document/d/1y2WLRUm2dZaQMetvb31XpDH5HqteRe4UNpziG8pq608/edit?usp=sharing

Summary by CodeRabbit

  • New Features

    • Start fine-tuning jobs from uploaded CSV datasets with configurable split ratios, optional system prompt, background processing, status refresh, and document-scoped job listing.
    • Run queued model evaluations producing prediction artifacts and MCC scores, with top-model lookup per document.
  • Documentation

    • Added API docs for initiating fine-tunes and refreshing job status.
  • Tests

    • Added comprehensive API and CRUD tests for fine-tuning and evaluation flows.
  • Chores

    • DB migration, seed documents, improved storage error logging, CSV format recognition, added pandas and scikit-learn.

@coderabbitai

coderabbitai bot commented Aug 5, 2025

Walkthrough

Adds end-to-end fine-tuning and model-evaluation features: new API routers, docs, SQLModel schemas and Alembic migration, CRUD functions, preprocessing and evaluation core classes, background workers calling OpenAI and S3, seed data for documents, tests, and new runtime deps (pandas, scikit-learn).

Changes

Cohort / File(s) Summary
API docs
backend/app/api/docs/fine_tuning/create.md, backend/app/api/docs/fine_tuning/retrieve.md
New docs describing fine-tuning creation (multi-split, system_prompt) and status refresh flows.
API routing & endpoints
backend/app/api/main.py, backend/app/api/routes/fine_tuning.py, backend/app/api/routes/model_evaluation.py
Registers new routers and implements endpoints for initiating fine-tuning, refreshing job status, listing jobs by document, kicking off model evaluations, and retrieval endpoints; includes background workers and OpenAI integration.
Core finetune logic
backend/app/core/finetune/preprocessing.py, backend/app/core/finetune/evaluation.py
Adds DataPreprocessor for CSV validation/splitting/JSONL creation and ModelEvaluator to run predictions against fine-tuned models, normalize outputs, compute MCC, and upload prediction artifacts.
CRUD layer
backend/app/crud/fine_tuning.py, backend/app/crud/model_evaluation.py, backend/app/crud/__init__.py
New CRUD operations for fine-tuning jobs and model evaluations; package-level re-exports added.
Models & schema exports
backend/app/models/fine_tuning.py, backend/app/models/model_evaluation.py, backend/app/models/__init__.py, backend/app/models/project.py, backend/app/models/organization.py
Adds Fine_Tuning and ModelEvaluation SQLModel schemas and enums, public model exports, Project→Fine_Tuning relationship; removed unused import in organization.
DB migration
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py
New Alembic revision creating enums and tables for fine_tuning and model_evaluation with CASCADE FKs and upgrade/downgrade.
Seeding
backend/app/seed_data/seed_data.json, backend/app/seed_data/seed_data.py
Adds document entries to seed JSON and document seeding logic to create Document records during seeding.
Tests & test utils
backend/app/tests/api/routes/test_fine_tuning.py, backend/app/tests/api/routes/test_model_evaluation.py, backend/app/tests/crud/test_fine_tuning.py, backend/app/tests/crud/test_model_evaluation.py, backend/app/tests/utils/test_data.py, backend/app/tests/utils/utils.py
Adds unit/integration tests for fine-tuning and model-evaluation endpoints and CRUD; adds test helpers for creating fine-tuning jobs, evaluations, and fetching documents.
Infra / deps
backend/app/core/cloud/storage.py, backend/pyproject.toml
get_cloud_storage wrapped with try/except logging; adds runtime deps: pandas>=2.3.2, scikit-learn>=1.7.1.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant API as API /fine_tuning/fine_tune
  participant CRUD as CRUD (Fine_Tuning)
  participant BG as Background Task
  participant DP as DataPreprocessor
  participant S3 as Cloud Storage
  participant OA as OpenAI API

  User->>API: POST fine_tune (document_id, base_model, split_ratio[], system_prompt)
  API->>OA: Validate API key / client init
  loop for each split_ratio
    API->>CRUD: create_fine_tuning_job(pending)
    API->>BG: enqueue process_fine_tuning_job(job_id, ratio)
  end
  API-->>User: 200 {jobs[], message}

  par process job
    BG->>CRUD: fetch_by_id(job_id)
    BG->>DP: process(document, split_ratio, system_prompt)
    DP->>S3: upload train/test CSVs
    DP-->>BG: {train_jsonl_temp, train/test s3 objects}
    BG->>OA: files.create(training_jsonl)
    OA-->>BG: training_file_id
    BG->>OA: fine_tuning.jobs.create(base_model, training_file_id)
    OA-->>BG: provider_job_id
    BG->>CRUD: update_finetune_job(running, ids, s3 objs, ratio)
  and on error
    BG->>CRUD: update_finetune_job(failed, error_message)
  end
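The preprocessing path in the diagram above (CSV rows → ratio split → training JSONL) can be sketched roughly as follows. This is not the actual DataPreprocessor API; the `text`/`label` column names, the helper name, and the fixed shuffle seed are assumptions for illustration. What it does show accurately is the OpenAI chat fine-tuning JSONL shape: one `{"messages": [...]}` object per line.

```python
import json
import random

def build_finetune_jsonl(rows, split_ratio, system_prompt=None, seed=42):
    """Shuffle rows, split by ratio, and render the training split as
    OpenAI chat-format JSONL lines (one JSON object per line)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * split_ratio)
    train, test = shuffled[:cut], shuffled[cut:]
    train_lines = []
    for row in train:
        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": row["text"]})
        messages.append({"role": "assistant", "content": row["label"]})
        train_lines.append(json.dumps({"messages": messages}))
    return train_lines, test

rows = [{"text": f"sample {i}", "label": "positive"} for i in range(10)]
train_lines, test_rows = build_finetune_jsonl(rows, split_ratio=0.8)
```

In the real flow the train split is uploaded to OpenAI via `files.create` and the test split is persisted to S3 for later evaluation, as the diagram shows.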
sequenceDiagram
  autonumber
  actor User
  participant API as API /fine_tuning/{id}/refresh
  participant CRUD as CRUD (Fine_Tuning)
  participant OA as OpenAI API

  User->>API: GET refresh job status
  API->>CRUD: fetch_by_id(id)
  alt has provider_job_id
    API->>OA: fine_tuning.jobs.retrieve(provider_job_id)
    OA-->>API: {status, fine_tuned_model?, error?}
    API->>CRUD: update_finetune_job(status mapped, model/error)
  else no provider id
    note right of API: return current job state
  end
  API-->>User: 200 job state
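The "status mapped" step in the refresh flow above translates OpenAI's job statuses (`validating_files`, `queued`, `running`, `succeeded`, `failed`, `cancelled`) into the PR's internal enum. A minimal sketch; the internal names mirror the FineTuningStatus values this PR introduces, but the exact mapping shown here is an assumption:

```python
# Map OpenAI fine-tuning job statuses onto internal FineTuningStatus
# values (internal names assumed for illustration).
OPENAI_TO_INTERNAL = {
    "validating_files": "running",
    "queued": "running",
    "running": "running",
    "succeeded": "completed",
    "failed": "failed",
    "cancelled": "failed",
}

def map_provider_status(provider_status: str) -> str:
    # Fall back to "pending" for any provider status we don't recognize.
    return OPENAI_TO_INTERNAL.get(provider_status, "pending")
```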
sequenceDiagram
  autonumber
  actor User
  participant API as API /model_evaluation/evaluate_models
  participant CRUD as CRUD (ModelEvaluation)
  participant BG as Background Eval
  participant ME as ModelEvaluator
  participant S3 as Cloud Storage
  participant OA as OpenAI API

  User->>API: POST evaluate_models {fine_tuning_ids[]}
  API->>CRUD: create_model_evaluation(pending) x N
  API->>BG: enqueue run_model_evaluation(eval_id) x N
  API-->>User: 200 {evaluations[], message}

  par each eval
    BG->>CRUD: fetch_by_eval_id
    BG->>CRUD: update_model_eval(running)
    BG->>ME: run(fine_tuned_model, test_data, system_prompt)
    ME->>S3: stream test CSV
    loop prompts
      ME->>OA: chat.completions.create(model, messages)
      OA-->>ME: output
    end
    ME->>S3: upload predictions CSV
    ME-->>BG: {score(MCC), prediction_s3_obj}
    BG->>CRUD: update_model_eval(completed, score, prediction_s3_obj)
  and on error
    BG->>CRUD: update_model_eval(failed, error_message)
  end
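The score produced at the end of this flow is the Matthews correlation coefficient; the PR computes it with scikit-learn (one of the new runtime deps). For intuition, a pure-Python equivalent for binary 0/1 labels, built directly from the confusion-matrix counts:

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Conventionally 0 when any confusion-matrix margin is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

score = mcc([1, 1, 0, 0], [1, 0, 0, 0])  # one false negative
```

`sklearn.metrics.matthews_corrcoef` returns the same value for this case and additionally handles multiclass labels, which is presumably why the evaluator uses it.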

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • kartpop
  • AkhileshNegi
  • avirajsingh7

Poem

I nibble rows and split with care,
Upload to clouds and spin the air.
Jobs take flight, metrics hum and play,
MCC hops in to light the way.
Rabbit cheers—fine-tune hooray! 🐇✨


@nishika26 nishika26 changed the base branch from main to dev/classification-transformer-aws August 25, 2025 04:58
* alembic file for adding and removing columns

* train and test s3 url column
@nishika26 nishika26 changed the base branch from dev/classification-transformer-aws to main August 28, 2025 08:50
@AkhileshNegi AkhileshNegi added the enhancement (New feature or request) label Aug 29, 2025
@nishika26 nishika26 marked this pull request as ready for review August 31, 2025 07:57
@nishika26 nishika26 requested review from AkhileshNegi, avirajsingh7 and kartpop and removed request for AkhileshNegi and avirajsingh7 August 31, 2025 08:01

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
backend/app/models/project.py (1)

58-60: Fix Ruff F821: add TYPE_CHECKING import for Fine_Tuning forward reference

Ruff reports F821 for the forward reference. Add a conditional import to satisfy static analysis without runtime cycles. Also ensure Fine_Tuning has project = Relationship(back_populates="fine_tuning").

Apply this outside the selected lines (near other imports):

 from app.core.util import now
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from .fine_tuning import Fine_Tuning
🧹 Nitpick comments (1)
backend/app/models/project.py (1)

58-60: Naming consistency: prefer plural for collection relationship

All other collection relationships on Project are plural (users, creds, assistants, etc.). Consider renaming to fine_tunings for consistency and clearer API/serialization.

-    fine_tuning: list["Fine_Tuning"] = Relationship(
+    fine_tunings: list["Fine_Tuning"] = Relationship(
         back_populates="project", cascade_delete=True
     )

If you adopt this, also update the reciprocal back_populates in Fine_Tuning and any usages.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bcbaf20 and 46377cd.

📒 Files selected for processing (2)
  • backend/app/models/__init__.py (1 hunks)
  • backend/app/models/project.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/models/__init__.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/models/project.py (3)
backend/app/alembic/versions/60b6c511a485_add_project_id_in_api_key_table.py (1)
  • upgrade (20-47)
backend/app/models/project_user.py (3)
  • ProjectUserPublic (20-23)
  • ProjectUser (27-35)
  • ProjectUserBase (10-17)
backend/app/alembic/versions/1a31ce608336_add_cascade_delete_relationships.py (1)
  • upgrade (20-27)
🪛 Ruff (0.12.2)
backend/app/models/project.py

58-58: Undefined name Fine_Tuning

(F821)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (1)

127-130: Drop custom ENUM types on downgrade to avoid schema residue.

Currently downgrade leaves finetuningstatus and modelevaluationstatus behind.

 def downgrade():
     op.drop_table("model_evaluation")
     op.drop_table("fine_tuning")
+    bind = op.get_bind()
+    modelevaluation_status_enum.drop(bind, checkfirst=True)
+    finetuning_status_enum.drop(bind, checkfirst=True)
🧹 Nitpick comments (8)
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (8)

55-56: Make enum server_default explicit to avoid implicit casts.

Using a text default is safer across PostgreSQL versions.

-            server_default="pending",
+            server_default=sa.text("'pending'"),

Also applies to: 103-104


69-72: Provide defaults for NOT NULL audit flags/timestamps.

Without defaults, inserts must always set these fields; earlier migrations often default inserted_at to now.

-        sa.Column("is_deleted", sa.Boolean(), nullable=False),
-        sa.Column("inserted_at", sa.DateTime(), nullable=False),
-        sa.Column("updated_at", sa.DateTime(), nullable=False),
+        sa.Column("is_deleted", sa.Boolean(), nullable=False, server_default=sa.text("false")),
+        sa.Column("inserted_at", sa.DateTime(), nullable=False, server_default=sa.func.now()),
+        sa.Column("updated_at", sa.DateTime(), nullable=False, server_default=sa.func.now()),

(Apply in both tables.)

Also applies to: 108-111


43-44: Constrain split_ratio to a sane range.

Add a CHECK to prevent invalid ratios.

     op.create_table(
         "fine_tuning",
         sa.Column("base_model", sqlmodel.sql.sqltypes.AutoString(), nullable=False),
         sa.Column("split_ratio", sa.Float(), nullable=False),
@@
-        sa.PrimaryKeyConstraint("id"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.CheckConstraint("split_ratio > 0 AND split_ratio < 1", name="ck_finetune_split_ratio"),
     )
@@
     op.create_table(
         "model_evaluation",
@@
-        sa.PrimaryKeyConstraint("id"),
+        sa.PrimaryKeyConstraint("id"),
+        sa.CheckConstraint("split_ratio > 0 AND split_ratio < 1", name="ck_modeleval_split_ratio"),
     )

Also applies to: 93-94


82-85: Add indices on common filters/joins.

Likely access paths: by document, project, organization, and status; also foreign key joins.

     op.create_table(
         "fine_tuning",
         ...
     )
+    op.create_index(op.f("ix_fine_tuning_document_id"), "fine_tuning", ["document_id"])
+    op.create_index(op.f("ix_fine_tuning_project_id"), "fine_tuning", ["project_id"])
+    op.create_index(op.f("ix_fine_tuning_organization_id"), "fine_tuning", ["organization_id"])
+    op.create_index(op.f("ix_fine_tuning_status"), "fine_tuning", ["status"])

     op.create_table(
         "model_evaluation",
         ...
     )
+    op.create_index(op.f("ix_model_evaluation_fine_tuning_id"), "model_evaluation", ["fine_tuning_id"])
+    op.create_index(op.f("ix_model_evaluation_document_id"), "model_evaluation", ["document_id"])
+    op.create_index(op.f("ix_model_evaluation_project_id"), "model_evaluation", ["project_id"])
+    op.create_index(op.f("ix_model_evaluation_organization_id"), "model_evaluation", ["organization_id"])
+    op.create_index(op.f("ix_model_evaluation_status"), "model_evaluation", ["status"])

(Note: dropping a table drops its indexes, so no downgrade changes needed.)

Also applies to: 124-127


95-95: Consider JSONB for score payloads.

If you’ll query into score fields, JSONB is preferable.

-        sa.Column("score", postgresql.JSON(astext_type=sa.Text()), nullable=True),
+        sa.Column("score", postgresql.JSONB(astext_type=sa.Text()), nullable=True),

61-65: Naming nit: prefer snake_case for acronyms.

Columns like train_data_s3_URI are non-idiomatic in SQL. Consider train_data_s3_uri, test_data_s3_uri, prediction_data_s3_uri for consistency with existing tables.

Also applies to: 97-98


20-36: Alternative fix (optional): allow auto-creation of ENUMs.

If you prefer implicit creation during table DDL, set create_type=True (or omit the arg) and still drop the types on downgrade.

-    name="finetuningstatus",
-    create_type=False,
+    name="finetuningstatus",
+    create_type=True,
@@
-    name="modelevaluationstatus",
-    create_type=False,
+    name="modelevaluationstatus",
+    create_type=True,

(Keep the explicit drops in downgrade.)


50-52: Optional: cap lengths where appropriate and add uniqueness if needed.

For provider_job_id, fine_tuned_model, model_name, consider length limits and (if applicable) uniqueness on non-null provider_job_id.

-        sa.Column("provider_job_id", sqlmodel.sql.sqltypes.AutoString(), nullable=True),
+        sa.Column("provider_job_id", sqlmodel.sql.sqltypes.AutoString(length=255), nullable=True),
...
-        sa.Column("model_name", sqlmodel.sql.sqltypes.AutoString(), nullable=False),
+        sa.Column("model_name", sqlmodel.sql.sqltypes.AutoString(length=255), nullable=False),

And optionally:

op.create_index(
    "uq_fine_tuning_provider_job_id",
    "fine_tuning",
    ["provider_job_id"],
    unique=True,
    postgresql_where=sa.text("provider_job_id IS NOT NULL"),
)

Also applies to: 88-92

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 46377cd and 634f591.

📒 Files selected for processing (1)
  • backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (2)
backend/app/alembic/versions/c43313eca57d_add_document_tables.py (1)
  • upgrade (20-36)
backend/app/alembic/versions/99f4fc325617_add_organization_project_setup.py (1)
  • upgrade (20-72)
🪛 GitHub Actions: AI Platform CI
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py

[error] 40-40: alembic upgrade head failed: UndefinedObject - type 'finetuningstatus' does not exist while creating table 'fine_tuning' (migration 6ed6ed401847_add_fine_tuning_and_model_evaluation_.py).

🔇 Additional comments (1)
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (1)

73-76: Confirm desired delete behavior for document FK.

No ondelete is set for document_id. Should fine-tunes/evals cascade when a document is removed?

If cascade is desired:

-        sa.ForeignKeyConstraint(
-            ["document_id"],
-            ["document.id"],
-        ),
+        sa.ForeignKeyConstraint(["document_id"], ["document.id"], ondelete="CASCADE"),

(Apply in both tables.)

Also applies to: 112-115


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
backend/app/seed_data/seed_data.py (1)

324-333: Remove unused users[1] selection; it can IndexError and trips linters.
This block doesn’t influence document creation and will raise if fewer than 2 users exist. It also introduces an unused variable.

Apply:

-        users = session.exec(
-            select(User)
-            .join(APIKey, APIKey.user_id == User.id)
-            .where(APIKey.organization_id == organization.id)
-        ).all()
-
-        user = users[1]
-        if not user:
-            raise ValueError(f"No user found in organization '{organization.name}'")
+        # No user association needed for Document creation; skip user lookup.
🧹 Nitpick comments (2)
backend/app/seed_data/seed_data.py (2)

76-81: Tighten input validation for DocumentData (non-empty names, URL scheme).
Consider basic validation to prevent bad seed rows: non-empty names and allowed URL schemes (s3/http/https).

Example (add inside DocumentData):

from pydantic import field_validator

class DocumentData(BaseModel):
    fname: str
    object_store_url: str
    organization_name: str
    project_name: str

    @field_validator("fname", "organization_name", "project_name")
    @classmethod
    def _non_empty(cls, v: str) -> str:
        v = v.strip()
        if not v:
            raise ValueError("must be non-empty")
        return v

    @field_validator("object_store_url")
    @classmethod
    def _allowed_scheme(cls, v: str) -> str:
        from urllib.parse import urlsplit
        if urlsplit(v).scheme not in {"s3", "http", "https"}:
            raise ValueError("object_store_url must use s3/http/https")
        return v

432-437: Good integration of document seeding; also clear documents on reset.
Loop looks fine. Add Document to clear_database to avoid leftovers when projects aren’t fully purged.

Add in clear_database before deleting Project:

session.exec(delete(Document))
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between dc27edf and f9f7b4e.

📒 Files selected for processing (2)
  • backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (1 hunks)
  • backend/app/seed_data/seed_data.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/seed_data/seed_data.py (5)
backend/app/models/api_key.py (1)
  • APIKey (28-38)
backend/app/models/organization.py (1)
  • Organization (34-57)
backend/app/models/project.py (1)
  • Project (29-63)
backend/app/models/user.py (1)
  • User (48-57)
backend/app/models/document.py (2)
  • Document (19-34)
  • DocumentBase (9-16)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/seed_data/seed_data.py (1)

13-21: LGTM: Added Document import is correct and required for new seeding path.

* first commit covering all

* changing model name to fine tuned model in model eval

* error handling in get cloud storage and document not found error handling

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
backend/app/core/cloud/storage.py (2)

55-62: Do not cast AWS error code to int; codes can be non-numeric (e.g., 'NoSuchBucket')

int(err.response["Error"]["Code"]) will crash for non-numeric codes. Compare the code as a string and handle common codes.

-            response = int(err.response["Error"]["Code"])
-            if response != 404:
+            code = err.response.get("Error", {}).get("Code")
+            if str(code) not in {"404", "NoSuchBucket"}:
                 logger.error(
                     f"[AmazonCloudStorage.create] Unexpected AWS error | "
-                    f"{{'bucket': '{mask_string(settings.AWS_S3_BUCKET)}', 'error': '{str(err)}', 'code': {response}}}",
+                    f"{{'bucket': '{mask_string(settings.AWS_S3_BUCKET)}', 'error': '{str(err)}', 'code': '{code}'}}",
                     exc_info=True,
                 )
                 raise CloudStorageError(err) from err

96-104: Fix malformed s3 URL generation (missing leading slash in path)

urlunparse requires path to start with "/". Without it, str(SimpleStorageName) can become "s3://bucketkey".

-        kwargs = {
+        kwargs = {
             "scheme": "s3",
             "netloc": self.Bucket,
-            "path": self.Key,
+            "path": f"/{self.Key.lstrip('/')}",
         }
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (1)

133-136: Drop ENUM types in downgrade to avoid orphaned types

Enums are created in upgrade; they must be dropped in downgrade.

 def downgrade():
     op.drop_table("model_evaluation")
     op.drop_table("fine_tuning")
+    finetuning_status_enum.drop(op.get_bind(), checkfirst=True)
+    modelevaluation_status_enum.drop(op.get_bind(), checkfirst=True)
♻️ Duplicate comments (14)
backend/app/api/routes/model_evaluation.py (2)

135-139: Remove unused variable or convert to validation-only call

Avoid F841 and keep preflight validation behavior.

-    client = get_openai_client(
-        session, current_user.organization_id, current_user.project_id
-    )  # keeping this here for checking if the user's validated OpenAI key is present or not,
-    # even though the client will be initialized separately inside the background task
+    # Validate OpenAI credentials exist for this org/project (raises on failure)
+    get_openai_client(session, current_user.organization_id, current_user.project_id)

97-112: Don’t reuse a session after rollback; use a fresh session for the error update

Reusing db after rollback can leave it in a bad state. Also log with traceback.

-        except Exception as e:
-            error_msg = str(e)
-            logger.error(
-                f"[run_model_evaluation] Failed | eval_id={eval_id}, project_id={current_user.project_id}: {e}"
-            )
-            db.rollback()
-            update_model_eval(
-                session=db,
-                eval_id=eval_id,
-                project_id=current_user.project_id,
-                update=ModelEvaluationUpdate(
-                    status=ModelEvaluationStatus.failed,
-                    error_message="failed during background job processing:"
-                    + error_msg,
-                ),
-            )
+        except Exception as e:
+            error_msg = str(e)
+            logger.error(
+                f"[run_model_evaluation] Failed | eval_id={eval_id}, project_id={current_user.project_id}: {error_msg}",
+                exc_info=True,
+            )
+            db.rollback()
+            # Use a fresh session for the failure update
+            with Session(engine) as error_db:
+                update_model_eval(
+                    session=error_db,
+                    eval_id=eval_id,
+                    project_id=current_user.project_id,
+                    update=ModelEvaluationUpdate(
+                        status=ModelEvaluationStatus.failed,
+                        error_message=f"failed during background job processing: {error_msg[:500]}",
+                    ),
+                )
backend/app/core/finetune/evaluation.py (2)

113-116: Handle empty allowed_labels to avoid StopIteration

-        logger.warning(
-            f"[normalize_prediction] No close match found for '{t}'. Using default label '{next(iter(self.allowed_labels))}'."
-        )
-        return next(iter(self.allowed_labels))
+        if not self.allowed_labels:
+            logger.error("[normalize_prediction] No allowed labels available for normalization")
+            raise ValueError("No allowed labels available for normalization")
+        default_label = next(iter(self.allowed_labels))
+        logger.warning(
+            f"[normalize_prediction] No close match found for '{t}'. Using default label '{default_label}'."
+        )
+        return default_label
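The close-match normalization this comment fixes is typically built on difflib. A standalone sketch under assumed conditions: the label set, cutoff, and default are illustrative, not the actual ModelEvaluator configuration.

```python
from difflib import get_close_matches

ALLOWED_LABELS = ["positive", "negative"]  # illustrative label set

def normalize_prediction(raw: str, default: str = "negative") -> str:
    """Map a noisy model output onto the nearest allowed label,
    falling back to a default when nothing is close enough."""
    text = raw.strip().lower()
    if text in ALLOWED_LABELS:
        return text
    # get_close_matches returns the best match first, or [] below cutoff.
    matches = get_close_matches(text, ALLOWED_LABELS, n=1, cutoff=0.6)
    return matches[0] if matches else default

normalized = normalize_prediction("posative")  # near-miss typo
```

Raising (as the suggested fix does when `allowed_labels` is empty) is the safer behavior than returning an arbitrary default, since a silent default would skew the MCC score.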

141-147: Timeout branch never increments attempts; can loop forever

                     elapsed_time = time.time() - start_time
                     if elapsed_time > self.max_latency:
                         logger.warning(
                             f"[generate_predictions] Timeout exceeded for prompt {idx}/{total_prompts}. Retrying..."
                         )
-                        continue
+                        attempt += 1
+                        continue

Optional: configure a real request timeout.

# e.g., before the loop:
# self.client = self.client.with_options(timeout=self.max_latency)
backend/app/crud/model_evaluation.py (1)

29-34: Fix validation logic and error message (OR, not AND)

Creation must require both model and test data; condition is inverted, and message is inconsistent. This was flagged previously.

-    if fine_tuning_job.fine_tuned_model and fine_tuning_job.test_data_s3_object is None:
+    if not fine_tuning_job.fine_tuned_model or not fine_tuning_job.test_data_s3_object:
         logger.error(
             f"[create_model_evaluation] No fine tuned model or test data found for the given fine tuning ID | fine_tuning_id={request.fine_tuning_id}, project_id={project_id}"
         )
-        raise HTTPException(404, "Fine tuned model not found")
+        raise HTTPException(404, "Fine tuned model or test data not found")
backend/app/models/model_evaluation.py (2)

66-68: Fix tuple assignment causing status to be a tuple, not a Field

Trailing comma makes the RHS a tuple. Assign the Field directly.

-    status: ModelEvaluationStatus = (
-        Field(default=ModelEvaluationStatus.pending, description="Evaluation status"),
-    )
+    status: ModelEvaluationStatus = Field(
+        default=ModelEvaluationStatus.pending, description="Evaluation status"
+    )

1-11: Import forward refs under TYPE_CHECKING to resolve undefined names

Static analysis flags Project and Fine_Tuning. Guard imports with TYPE_CHECKING to avoid runtime cycles.

 from typing import Optional
+from typing import TYPE_CHECKING
 from uuid import UUID
 from enum import Enum
 from datetime import datetime
@@
 from app.core.util import now
+
+if TYPE_CHECKING:
+    from app.models.project import Project
+    from app.models.fine_tuning import Fine_Tuning

Also applies to: 84-85

backend/app/models/fine_tuning.py (4)

67-72: Typo: “ins S3” → “in S3”

Minor docstring polish.

-    train_data_s3_object: str | None = Field(
-        default=None, description="S3 URI of the training data stored ins S3"
-    )
+    train_data_s3_object: str | None = Field(
+        default=None, description="S3 URI of the training data stored in S3"
+    )
-    test_data_s3_object: str | None = Field(
-        default=None, description="S3 URI of the testing data stored ins S3"
-    )
+    test_data_s3_object: str | None = Field(
+        default=None, description="S3 URI of the testing data stored in S3"
+    )

61-63: Fix tuple assignment causing status to be a tuple, not a Field

Same trailing-comma issue as noted previously.

-    status: FineTuningStatus = (
-        Field(default=FineTuningStatus.pending, description="Fine tuning status"),
-    )
+    status: FineTuningStatus = Field(
+        default=FineTuningStatus.pending, description="Fine tuning status"
+    )

1-11: Add TYPE_CHECKING imports for forward-referenced relationship types

Resolves undefined-name warnings without introducing import cycles.

 from typing import Optional
+from typing import TYPE_CHECKING
 from uuid import UUID
 from enum import Enum
 from datetime import datetime
@@
 from app.core.util import now
+
+if TYPE_CHECKING:
+    from app.models.project import Project
+    from app.models.model_evaluation import ModelEvaluation

Also applies to: 88-89


99-100: Align update model status type with enum

Use FineTuningStatus for consistency and validation.

-    status: Optional[str] = None
+    status: Optional[FineTuningStatus] = None
backend/app/api/routes/fine_tuning.py (3)

63-64: Fix missing space in log message

Small readability fix.

-        f"[process_fine_tuning_job]Starting fine-tuning job processing | job_id={job_id}, project_id={project_id}|"
+        f"[process_fine_tuning_job] Starting fine-tuning job processing | job_id={job_id}, project_id={project_id}|"

320-321: Fix typo and spacing in log tag

“retrive” → “retrieve” and add missing space.

-            f"[retrive_job_by_document]No fine-tuning jobs found for document_id={document_id}, project_id={project_id}"
+            f"[retrieve_job_by_document] No fine-tuning jobs found for document_id={document_id}, project_id={project_id}"

189-194: Remove unused variable assignment

You only need side-effect validation; no need to bind to client.

-    client = get_openai_client(  # Used here only to validate the user's OpenAI key;
+    get_openai_client(  # Used here only to validate the user's OpenAI key;
         # the actual client is re-initialized separately inside the background task
         session,
         current_user.organization_id,
         current_user.project_id,
     )
🧹 Nitpick comments (22)
backend/app/core/cloud/storage.py (1)

157-164: Consider enabling server-side encryption on uploads

If datasets may contain sensitive data, set SSE (AES256 or KMS) via ExtraArgs.

             self.aws.client.upload_fileobj(
                 source.file,
                 ExtraArgs={
-                    "ContentType": source.content_type,
+                    "ContentType": source.content_type,
+                    "ServerSideEncryption": "AES256",
                 },
                 **kwargs,
             )
backend/app/tests/api/routes/test_model_evaluation.py (2)

43-43: Remove pointless f-string

No placeholders; drop the f prefix.

-    assert json_data["error"] == f"Job not found"
+    assert json_data["error"] == "Job not found"

26-28: Prefer assert_called_once_with for clearer intent

Asserting on positional tuple index is brittle. Use ANY for the user arg.

-from unittest.mock import patch
+from unittest.mock import patch, ANY
...
-    mock_run_eval.assert_called_once()
-    assert mock_run_eval.call_args[0][0] == evals[0]["id"]
+    mock_run_eval.assert_called_once_with(evals[0]["id"], ANY)
backend/app/tests/api/routes/test_fine_tuning.py (4)

10-22: Drop unused parameter from helper

file_type isn’t used. Removes Ruff ARG001 and simplifies.

-def create_file_mock(file_type):
+def create_file_mock():
     counter = {"train": 0, "test": 0}
 
     def _side_effect(file=None, purpose=None):
         if purpose == "fine-tune":
             if "train" in file.name:
                 counter["train"] += 1
                 return MagicMock(id=f"file_{counter['train']}")
             elif "test" in file.name:
                 counter["test"] += 1
                 return MagicMock(id=f"file_{counter['test']}")
 
     return _side_effect

52-56: Update caller to match helper signature change

-        mock_openai.files.create.side_effect = create_file_mock("fine-tune")
+        mock_openai.files.create.side_effect = create_file_mock()

29-36: Avoid hardcoding /tmp; use pytest tmp_path fixture for isolation and cleanup

Prevents cross-test interference and works on non-Unix platforms.

-    def test_finetune_from_csv_multiple_split_ratio(
+    def test_finetune_from_csv_multiple_split_ratio(
         self,
         mock_get_openai_client,
         mock_preprocessor_cls,
         client,
         db,
         user_api_key_header,
+        tmp_path,
     ):
...
-        for path in ["/tmp/train.jsonl", "/tmp/test.jsonl"]:
-            with open(path, "w") as f:
+        for name in ["train.jsonl", "test.jsonl"]:
+            with open(tmp_path / name, "w") as f:
                 f.write("{}")
 
         mock_preprocessor = MagicMock()
         mock_preprocessor.process.return_value = {
-            "train_jsonl_temp_filepath": "/tmp/train.jsonl",
+            "train_jsonl_temp_filepath": str(tmp_path / "train.jsonl"),

Also applies to: 39-43


76-81: Verify resource cleanup was invoked

Assert cleanup to ensure temp files are removed.

         assert json_data["metadata"] is None
 
         jobs = db.query(Fine_Tuning).all()
         assert len(jobs) == 3
+        mock_preprocessor.cleanup.assert_called_once()
backend/app/api/routes/model_evaluation.py (1)

214-218: Guard against invalid/missing S3 object before signing

If prediction_data_s3_object is an invalid URL, get_signed_url may raise; consider handling and leaving URL None.

-    top_model = attach_prediction_file_url(top_model, storage)
+    try:
+        top_model = attach_prediction_file_url(top_model, storage)
+    except Exception:
+        logger.warning("[get_top_model_by_doc_id] Failed to presign prediction file URL; returning without URL")
backend/app/core/finetune/evaluation.py (2)

4-4: Use built-in generics: set[str] instead of typing.Set

-from typing import Set

And at initialization:

-        self.allowed_labels: Set[str] = set()
+        self.allowed_labels: set[str] = set()

183-185: Sanitize filename (model names can include ':', '/'); ensure safe S3 key

-        filename = f"predictions_{self.fine_tuned_model}_{unique_id}.csv"
+        safe_model = "".join(ch if ch.isalnum() or ch in ("-", "_") else "_" for ch in self.fine_tuned_model)
+        filename = f"predictions_{safe_model}_{unique_id}.csv"
backend/app/crud/fine_tuning.py (2)

42-46: Remove unused variable assignment

Avoid the F841 warning; you only need the existence check.

-    document = document_crud.read_one(request.document_id)
+    document_crud.read_one(request.document_id)

Also applies to: 45-45


79-85: Consistent naming: provider vs OpenAI in error/log

Message uses OpenAI-specific wording while parameter is provider-agnostic.

-        logger.warning(
-            f"Fine-tune job not found for openai_job_id={provider_job_id}, project_id={project_id}"
-        )
-        raise HTTPException(status_code=404, detail="OpenAI fine tuning ID not found")
+        logger.warning(
+            f"[fetch_by_provider_job_id] Fine-tune job not found | provider_job_id={provider_job_id}, project_id={project_id}"
+        )
+        raise HTTPException(status_code=404, detail="Provider fine-tuning ID not found")

Also applies to: 81-83

backend/app/crud/model_evaluation.py (4)

55-57: Fix log tag to the correct function name

Small but improves traceability.

-        f"[Create_fine_tuning_job]Created new model evaluation from fine tuning job ID={fine_tuning_job.id}, project_id={project_id}"
+        f"[create_model_evaluation] Created new model evaluation from fine tuning job ID={fine_tuning_job.id}, project_id={project_id}"

61-79: Correct logger prefixes in fetch_by_eval_id

Use the actual function name.

-        logger.error(
-            f"[fetch_by_id]Model evaluation not found for eval_id={eval_id}, project_id={project_id}"
-        )
+        logger.error(
+            f"[fetch_by_eval_id] Model evaluation not found | eval_id={eval_id}, project_id={project_id}"
+        )
@@
-    logger.info(
-        f"[fetch_by_id]Fetched model evaluation for eval ID={model_eval.id}, project_id={project_id}"
-    )
+    logger.info(
+        f"[fetch_by_eval_id] Fetched model evaluation | eval_id={model_eval.id}, project_id={project_id}"
+    )

Also applies to: 71-78


7-7: Disambiguate import to avoid name collisions

fetch_by_id is generic; alias it to make intent obvious.

-from app.crud import fetch_by_id
+from app.crud import fetch_by_id as fetch_ft_by_id
@@
-    fine_tuning_job = fetch_by_id(session, request.fine_tuning_id, project_id)
+    fine_tuning_job = fetch_ft_by_id(session, request.fine_tuning_id, project_id)

Also applies to: 27-27


96-103: Prefer empty list over 404 for “list” endpoint semantics

Returning [] is more RESTful and simplifies clients.

-    if not model_evals:
-        logger.error(
-            f"[fetch_eval_by_doc_id]Model evaluation not found for document_id={document_id}, project_id={project_id}"
-        )
-        raise HTTPException(status_code=404, detail="Model evaluation not found")
+    if not model_evals:
+        logger.info(
+            f"[fetch_eval_by_doc_id]No model evaluations found for document_id={document_id}, project_id={project_id}"
+        )
+        return []
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (2)

85-131: Add indexes to match hot queries

Both CRUD layers filter by document_id/project_id/status/is_deleted and sort by inserted_at/updated_at. Add indexes now to prevent future full scans.

     op.create_table(
         "model_evaluation",
@@
     )
+    # Fine-grained indexes for query patterns
+    op.create_index(
+        "ix_finetune_doc_proj_model_split_status_del",
+        "fine_tuning",
+        ["document_id", "project_id", "base_model", "split_ratio", "status", "is_deleted"],
+        unique=False,
+    )
+    op.create_index(
+        "ix_finetune_proj_inserted_at",
+        "fine_tuning",
+        ["project_id", "inserted_at"],
+        unique=False,
+    )
+    op.create_index(
+        "ix_modeleval_doc_proj_status_del",
+        "model_evaluation",
+        ["document_id", "project_id", "status", "is_deleted"],
+        unique=False,
+    )
+    op.create_index(
+        "ix_modeleval_ftid_inserted_at",
+        "model_evaluation",
+        ["fine_tuning_id", "inserted_at"],
+        unique=False,
+    )

71-74: Consider DB-level defaults for timestamps and is_deleted

Optional, but server defaults increase robustness if inserts omit fields.

-        sa.Column("is_deleted", sa.Boolean(), nullable=False),
-        sa.Column("inserted_at", sa.DateTime(), nullable=False),
-        sa.Column("updated_at", sa.DateTime(), nullable=False),
+        sa.Column("is_deleted", sa.Boolean(), nullable=False, server_default=sa.text("false")),
+        sa.Column("inserted_at", sa.DateTime(), nullable=False, server_default=sa.func.now()),
+        sa.Column("updated_at", sa.DateTime(), nullable=False, server_default=sa.func.now()),
@@
-        sa.Column("is_deleted", sa.Boolean(), nullable=False),
-        sa.Column("inserted_at", sa.DateTime(), nullable=False),
-        sa.Column("updated_at", sa.DateTime(), nullable=False),
+        sa.Column("is_deleted", sa.Boolean(), nullable=False, server_default=sa.text("false")),
+        sa.Column("inserted_at", sa.DateTime(), nullable=False, server_default=sa.func.now()),
+        sa.Column("updated_at", sa.DateTime(), nullable=False, server_default=sa.func.now()),

Also applies to: 115-117

backend/app/models/model_evaluation.py (1)

45-57: Consider indexing hot query fields

Likely query keys (document_id, project_id, organization_id, status) will benefit from DB indexes.

-    document_id: UUID = Field(
-        foreign_key="document.id",
-        nullable=False,
-    )
+    document_id: UUID = Field(
+        foreign_key="document.id",
+        nullable=False,
+        index=True,
+    )
@@
     status: ModelEvaluationStatus = Field(
-        default=ModelEvaluationStatus.pending, description="Evaluation status"
-    )
+        default=ModelEvaluationStatus.pending,
+        description="Evaluation status",
+        index=True,
+    )
@@
-    project_id: int = Field(
-        foreign_key="project.id", nullable=False, ondelete="CASCADE"
-    )
+    project_id: int = Field(
+        foreign_key="project.id", nullable=False, ondelete="CASCADE", index=True
+    )
-    organization_id: int = Field(
-        foreign_key="organization.id", nullable=False, ondelete="CASCADE"
-    )
+    organization_id: int = Field(
+        foreign_key="organization.id", nullable=False, ondelete="CASCADE", index=True
+    )

Also applies to: 66-77

backend/app/models/fine_tuning.py (1)

106-114: Optional: Strongly type public status as enum

If your API layer serializes enums to strings, Pydantic v2 will still emit strings. This keeps types consistent end-to-end.

-    status: str
+    status: FineTuningStatus
backend/app/api/routes/fine_tuning.py (2)

215-216: Fix spacing in log message

Add a space after the tag for readability.

-            f"[fine_tune_from_CSV]All fine-tuning job creations failed for document_id={request.document_id}, project_id={current_user.project_id}"
+            f"[fine_tune_from_CSV] All fine-tuning job creations failed for document_id={request.document_id}, project_id={current_user.project_id}"

271-284: Type consistency: mapped_status is an enum, not str

OPENAI_TO_INTERNAL_STATUS maps to FineTuningStatus. Adjust local typing and ensure FineTuningUpdate.status is the enum (see model fix).

-        mapped_status: Optional[str] = OPENAI_TO_INTERNAL_STATUS.get(
+        mapped_status: Optional[FineTuningStatus] = OPENAI_TO_INTERNAL_STATUS.get(
             getattr(openai_job, "status", None)
         )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f9f7b4e and 68954f2.

📒 Files selected for processing (15)
  • backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (1 hunks)
  • backend/app/api/routes/fine_tuning.py (1 hunks)
  • backend/app/api/routes/model_evaluation.py (1 hunks)
  • backend/app/core/cloud/storage.py (1 hunks)
  • backend/app/core/finetune/evaluation.py (1 hunks)
  • backend/app/core/finetune/preprocessing.py (1 hunks)
  • backend/app/crud/fine_tuning.py (1 hunks)
  • backend/app/crud/model_evaluation.py (1 hunks)
  • backend/app/models/fine_tuning.py (1 hunks)
  • backend/app/models/model_evaluation.py (1 hunks)
  • backend/app/tests/api/routes/test_fine_tuning.py (1 hunks)
  • backend/app/tests/api/routes/test_model_evaluation.py (1 hunks)
  • backend/app/tests/crud/test_fine_tuning.py (1 hunks)
  • backend/app/tests/crud/test_model_evaluation.py (1 hunks)
  • backend/app/tests/utils/test_data.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • backend/app/core/finetune/preprocessing.py
  • backend/app/tests/crud/test_model_evaluation.py
  • backend/app/tests/crud/test_fine_tuning.py
  • backend/app/tests/utils/test_data.py
🧰 Additional context used
🧬 Code graph analysis (11)
backend/app/core/cloud/storage.py (1)
backend/app/initial_storage.py (2)
  • init (10-15)
  • main (18-21)
backend/app/core/finetune/evaluation.py (3)
backend/app/core/cloud/storage.py (4)
  • AmazonCloudStorage (145-259)
  • client (30-42)
  • stream (125-127)
  • stream (179-195)
backend/app/api/routes/fine_tuning.py (1)
  • handle_openai_error (45-49)
backend/app/core/finetune/preprocessing.py (2)
  • DataPreprocessor (17-157)
  • upload_csv_to_s3 (31-60)
backend/app/crud/model_evaluation.py (2)
backend/app/crud/fine_tuning.py (1)
  • fetch_by_id (88-104)
backend/app/models/model_evaluation.py (4)
  • ModelEvaluation (38-85)
  • ModelEvaluationStatus (14-18)
  • ModelEvaluationBase (21-26)
  • ModelEvaluationUpdate (88-92)
backend/app/crud/fine_tuning.py (2)
backend/app/models/fine_tuning.py (4)
  • Fine_Tuning (54-89)
  • FineTuningJobCreate (28-51)
  • FineTuningUpdate (92-100)
  • FineTuningStatus (13-17)
backend/app/crud/document.py (1)
  • DocumentCrud (14-135)
backend/app/tests/api/routes/test_model_evaluation.py (2)
backend/app/tests/utils/test_data.py (2)
  • create_test_finetuning_job_with_extra_fields (163-174)
  • create_test_model_evaluation (177-202)
backend/app/tests/conftest.py (2)
  • db (24-41)
  • user_api_key_header (77-79)
backend/app/models/model_evaluation.py (3)
backend/app/models/project.py (2)
  • Project (29-60)
  • ProjectBase (10-13)
backend/app/models/organization.py (1)
  • Organization (35-58)
backend/app/models/document.py (1)
  • Document (19-34)
backend/app/models/fine_tuning.py (3)
backend/app/models/project.py (2)
  • Project (29-60)
  • ProjectBase (10-13)
backend/app/models/document.py (2)
  • Document (19-34)
  • DocumentBase (9-16)
backend/app/models/collection.py (1)
  • Collection (21-54)
backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (3)
backend/app/alembic/versions/3389c67fdcb4_add_alter_columns_in_collections_table.py (1)
  • upgrade (29-61)
backend/app/alembic/versions/c43313eca57d_add_document_tables.py (1)
  • upgrade (20-36)
backend/app/alembic/versions/99f4fc325617_add_organization_project_setup.py (1)
  • upgrade (20-72)
backend/app/api/routes/fine_tuning.py (5)
backend/app/models/fine_tuning.py (4)
  • FineTuningJobCreate (28-51)
  • FineTuningJobPublic (103-119)
  • FineTuningUpdate (92-100)
  • FineTuningStatus (13-17)
backend/app/core/cloud/storage.py (5)
  • get_cloud_storage (262-279)
  • client (30-42)
  • create (44-84)
  • get_signed_url (135-137)
  • get_signed_url (217-242)
backend/app/crud/document.py (1)
  • DocumentCrud (14-135)
backend/app/utils.py (5)
  • get_openai_client (170-200)
  • APIResponse (27-48)
  • mask_string (158-167)
  • load_description (204-209)
  • success_response (34-37)
backend/app/crud/fine_tuning.py (4)
  • create_fine_tuning_job (20-66)
  • fetch_by_id (88-104)
  • update_finetune_job (165-182)
  • fetch_by_document_id (107-127)
backend/app/api/routes/model_evaluation.py (5)
backend/app/crud/fine_tuning.py (1)
  • fetch_by_id (88-104)
backend/app/crud/model_evaluation.py (5)
  • create_model_evaluation (20-58)
  • fetch_active_model_evals (150-170)
  • fetch_eval_by_doc_id (82-109)
  • update_model_eval (173-196)
  • fetch_top_model_by_doc_id (112-147)
backend/app/core/cloud/storage.py (4)
  • get_cloud_storage (262-279)
  • get_signed_url (135-137)
  • get_signed_url (217-242)
  • client (30-42)
backend/app/core/finetune/evaluation.py (2)
  • ModelEvaluator (21-241)
  • run (228-241)
backend/app/utils.py (3)
  • get_openai_client (170-200)
  • APIResponse (27-48)
  • success_response (34-37)
backend/app/tests/api/routes/test_fine_tuning.py (4)
backend/app/tests/utils/test_data.py (1)
  • create_test_fine_tuning_jobs (133-160)
backend/app/tests/utils/utils.py (1)
  • get_document (116-137)
backend/app/tests/conftest.py (2)
  • db (24-41)
  • user_api_key_header (77-79)
backend/app/core/finetune/preprocessing.py (2)
  • process (118-157)
  • cleanup (72-76)
🪛 Ruff (0.12.2)
backend/app/core/finetune/evaluation.py

4-4: typing.Set is deprecated, use set instead

(UP035)

backend/app/crud/fine_tuning.py

45-45: Local variable document is assigned to but never used

Remove assignment to unused variable document

(F841)

backend/app/tests/api/routes/test_model_evaluation.py

43-43: f-string without any placeholders

Remove extraneous f prefix

(F541)

backend/app/models/model_evaluation.py

84-84: Undefined name Project

(F821)


85-85: Undefined name Fine_Tuning

(F821)

backend/app/models/fine_tuning.py

88-88: Undefined name Project

(F821)


89-89: Undefined name ModelEvaluation

(F821)

backend/app/api/routes/fine_tuning.py

189-189: Local variable client is assigned to but never used

Remove assignment to unused variable client

(F841)


255-255: Redefinition of unused storage from line 17

(F811)


315-315: Redefinition of unused storage from line 17

(F811)

backend/app/api/routes/model_evaluation.py

135-135: Local variable client is assigned to but never used

Remove assignment to unused variable client

(F841)

backend/app/tests/api/routes/test_fine_tuning.py

10-10: Unused function argument: file_type

(ARG001)

🔇 Additional comments (7)
backend/app/crud/model_evaluation.py (1)

112-148: Use consistent metric key (‘mcc’) with fallback; avoid silent misses

The model schema example uses “mcc”, but code reads “mcc_score”.

-    for model_eval in model_evals:
-        if model_eval.score is not None:
-            mcc = model_eval.score.get("mcc_score", None)
+    for model_eval in model_evals:
+        if model_eval.score is not None:
+            mcc = model_eval.score.get("mcc")
+            if mcc is None:
+                mcc = model_eval.score.get("mcc_score")
             if mcc is not None and mcc > highest_mcc:
                 highest_mcc = mcc
                 top_model = model_eval

Also applies to: 129-135

backend/app/models/model_evaluation.py (2)

29-36: Nice validator: stable, order-preserving de-duplication

Using dict.fromkeys preserves order and removes dupes. Good.
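The pattern the validator relies on can be sketched in isolation (the function name here is illustrative, not from the codebase):

```python
def dedupe_preserve_order(ids):
    # dict keys are insertion-ordered (Python 3.7+), so this drops
    # duplicates while keeping the first occurrence of each id
    return list(dict.fromkeys(ids))


print(dedupe_preserve_order([3, 1, 3, 2, 1]))  # -> [3, 1, 2]
```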


21-27: Confirm API design: Create model accepts many IDs, DB model stores one

Create uses fine_tuning_ids: list[int] while the DB model stores fine_tuning_id: int. Ensure the CRUD/API creates multiple ModelEvaluation rows (one per ID) and validation/error reporting are clear when partial failures occur.

Also applies to: 29-36
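One way to satisfy this is a fan-out loop that creates one row per ID and reports partial failures explicitly. This is a minimal sketch under stated assumptions — create_evaluations and create_row are hypothetical names, not functions from this PR:

```python
def create_evaluations(fine_tuning_ids, create_row):
    # Deduplicate first (mirroring the validator), then create one
    # evaluation row per fine-tuning ID, collecting per-ID failures
    # instead of aborting the whole batch.
    created, failures = [], {}
    for ft_id in dict.fromkeys(fine_tuning_ids):
        try:
            created.append(create_row(ft_id))
        except Exception as exc:
            failures[ft_id] = str(exc)
    return created, failures
```

The caller can then return a 207-style response or raise only when every ID failed, which keeps partial-failure reporting unambiguous.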

backend/app/models/fine_tuning.py (1)

88-89: Back_populates linkage verified—no action needed
Project model defines

fine_tuning: list["Fine_Tuning"] = Relationship(
    back_populates="project", …
)

which matches the project: Relationship(back_populates="fine_tuning") in Fine_Tuning.

backend/app/api/routes/fine_tuning.py (3)

80-88: Verify preprocessor artifact keys

You open a JSONL temp file but store CSV S3 paths. Confirm DataPreprocessor returns keys: train_jsonl_temp_filepath, train_csv_s3_object, test_csv_s3_object; otherwise wire-up may break.

Also applies to: 81-83
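A cheap way to lock this contract down is a key check at the wire-up point. The key names below are taken from the comment above; validate_artifacts is a hypothetical helper, not code from this PR:

```python
REQUIRED_ARTIFACT_KEYS = {
    "train_jsonl_temp_filepath",
    "train_csv_s3_object",
    "test_csv_s3_object",
}


def validate_artifacts(artifacts):
    # Fail loudly where the preprocessor result is consumed, instead of
    # deep inside the background task when a key is missing or renamed.
    missing = REQUIRED_ARTIFACT_KEYS - artifacts.keys()
    if missing:
        raise KeyError(f"preprocessor result missing keys: {sorted(missing)}")
    return artifacts
```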


176-176: Overall: pipeline structure reads cleanly

Background task orchestration, masking, and signed URL generation look good.

Also applies to: 241-241, 304-304, 347-347


94-99: Catch specific OpenAI SDK exceptions
Import and handle openai.APIError and openai.RateLimitError (the v1.x SDK exposes these at the top level; the openai.error module is v0.x only) in their own except blocks before the generic OpenAIError. Confirm your installed SDK v1.x exception hierarchy and adjust the caught exceptions accordingly.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
backend/app/crud/__init__.py (2)

80-87: All CRUD symbols verified; optionally lock down exports

  • Verified that create_model_evaluation, fetch_active_model_evals, fetch_by_eval_id, fetch_eval_by_doc_id, fetch_top_model_by_doc_id, and update_model_eval are all defined in backend/app/crud/model_evaluation.py and no circular imports were found.
  • Optional: add an __all__ list in backend/app/crud/__init__.py to explicitly declare the package’s public API.

80-87: backend/app/crud/__init__.py: alias generic export names to fine-tuning specific names

Consider updating the public exports to more descriptive aliases to prevent future collisions:

-from .fine_tuning import (
-    create_fine_tuning_job,
-    fetch_by_id,
-    fetch_by_provider_job_id,
-    fetch_by_document_id,
-    update_finetune_job,
-    fetch_active_jobs_by_document_id,
-)
+from .fine_tuning import (
+    create_fine_tuning_job,
+    fetch_by_id as fetch_finetune_by_id,
+    fetch_by_provider_job_id as fetch_finetune_by_provider_job_id,
+    fetch_by_document_id as fetch_finetunes_by_document_id,
+    update_finetune_job,
+    fetch_active_jobs_by_document_id as fetch_active_finetune_jobs_by_document_id,
+)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 68954f2 and 1932825.

⛔ Files ignored due to path filters (1)
  • backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py (1 hunks)
  • backend/app/api/main.py (2 hunks)
  • backend/app/crud/__init__.py (1 hunks)
  • backend/app/models/__init__.py (1 hunks)
  • backend/pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • backend/app/api/main.py
  • backend/app/models/__init__.py
  • backend/app/alembic/versions/6ed6ed401847_add_fine_tuning_and_model_evaluation_.py
  • backend/pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/crud/__init__.py (2)
backend/app/crud/fine_tuning.py (6)
  • create_fine_tuning_job (20-66)
  • fetch_by_id (88-104)
  • fetch_by_provider_job_id (69-85)
  • fetch_by_document_id (107-127)
  • update_finetune_job (165-182)
  • fetch_active_jobs_by_document_id (130-162)
backend/app/crud/model_evaluation.py (5)
  • create_model_evaluation (20-58)
  • fetch_by_eval_id (61-79)
  • fetch_eval_by_doc_id (82-109)
  • fetch_top_model_by_doc_id (112-147)
  • update_model_eval (173-196)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (10)
backend/app/crud/model_evaluation.py (2)

160-168: Use Enum in filters; clarify “active” semantics

Use the Enum instead of a raw string; optionally limit “active” to pending/running.

Minimal type-safe fix:

-            ModelEvaluation.status != "failed",
+            ModelEvaluation.status != ModelEvaluationStatus.failed,

If “active” should exclude completed:

-            ModelEvaluation.status != "failed",
+            ModelEvaluation.status.in_([ModelEvaluationStatus.pending, ModelEvaluationStatus.running]),

29-34: Fix OR validation and unify 404 detail; current AND contradicts the message

Detect either missing fine-tuned model or test data; also use status_code/detail for consistency.

-    if fine_tuning_job.fine_tuned_model and fine_tuning_job.test_data_s3_object is None:
+    if not fine_tuning_job.fine_tuned_model or not fine_tuning_job.test_data_s3_object:
         logger.error(
             f"[create_model_evaluation] No fine tuned model or test data found for the given fine tuning ID | fine_tuning_id={request.fine_tuning_id}, project_id={project_id}"
         )
-        raise HTTPException(404, "Fine tuned model not found")
+        raise HTTPException(status_code=404, detail="Fine tuned model or test data not found")
backend/app/crud/fine_tuning.py (4)

107-113: Annotate split_ratio as Optional[float].

Parameter is optional but typed as float.

 def fetch_by_document_id(
     session: Session,
     document_id: UUID,
     project_id: int,
-    split_ratio: float = None,
+    split_ratio: Optional[float] = None,
     base_model: Optional[str] = None,
 ) -> list[Fine_Tuning]:

20-27: Require non-null project_id and organization_id (guard at entry).

These IDs are mandatory for ownership checks. Add a quick guard before any queries.

 def create_fine_tuning_job(
     session: Session,
     request: FineTuningJobCreate,
     split_ratio: float,
     status: FineTuningStatus = FineTuningStatus.pending,
-    project_id: int = None,
-    organization_id: int = None,
+    project_id: int = None,
+    organization_id: int = None,
 ) -> tuple[Fine_Tuning, bool]:
+    if project_id is None or organization_id is None:
+        raise HTTPException(
+            status_code=400,
+            detail="project_id and organization_id are required",
+        )
     active_jobs = fetch_active_jobs_by_document_id(

Also applies to: 28-35


130-151: “Active” must exclude completed jobs.

Completed jobs block new creations today. Filter to pending/running and fix docstring.

 def fetch_active_jobs_by_document_id(
@@
-    """
-    Return all ACTIVE jobs for the given document & project.
-    Active = status != failed AND is_deleted is false.
-    """
+    """
+    Return all ACTIVE jobs for the given document & project.
+    Active = status IN ('pending','running') AND is_deleted is false.
+    """
@@
-            Fine_Tuning.is_deleted.is_(False),
-            Fine_Tuning.status != FineTuningStatus.failed,
+            Fine_Tuning.is_deleted.is_(False),
+            Fine_Tuning.status.in_([FineTuningStatus.pending, FineTuningStatus.running]),

Also applies to: 147-149


170-172: Coerce status to Enum on updates.

Prevents stray strings in a typed Enum column.

-    for key, value in update.model_dump(exclude_unset=True).items():
-        setattr(job, key, value)
+    for key, value in update.model_dump(exclude_unset=True).items():
+        if key == "status" and isinstance(value, str):
+            value = FineTuningStatus(value)
+        setattr(job, key, value)
backend/app/api/routes/fine_tuning.py (4)

188-193: Remove unused variable assignment to satisfy Ruff F841.

-    client = get_openai_client(  # Used here only to validate the user's OpenAI key;
+    get_openai_client(  # Used here only to validate the user's OpenAI key;
         # the actual client is re-initialized separately inside the background task
         session,
         current_user.organization_id,
         current_user.project_id,
     )

71-78: Rename local “storage” → “cloud_storage” for clarity and to avoid shadowing.

Consistent naming improves readability and prevents confusion with module names.

-            storage = get_cloud_storage(
+            cloud_storage = get_cloud_storage(
                 session=session, project_id=current_user.project_id
             )
@@
-            preprocessor = DataPreprocessor(
-                document, storage, ratio, request.system_prompt
+            preprocessor = DataPreprocessor(
+                document, cloud_storage, ratio, request.system_prompt
             )
-    storage = get_cloud_storage(session=session, project_id=current_user.project_id)
+    cloud_storage = get_cloud_storage(session=session, project_id=current_user.project_id)
-    job = job.model_copy(
+    job = job.model_copy(
         update={
-            "train_data_file_url": storage.get_signed_url(job.train_data_s3_object)
+            "train_data_file_url": cloud_storage.get_signed_url(job.train_data_s3_object)
             if job.train_data_s3_object
             else None,
-            "test_data_file_url": storage.get_signed_url(job.test_data_s3_object)
+            "test_data_file_url": cloud_storage.get_signed_url(job.test_data_s3_object)
             if job.test_data_s3_object
             else None,
         }
     )
-    storage = get_cloud_storage(session=session, project_id=current_user.project_id)
+    cloud_storage = get_cloud_storage(session=session, project_id=current_user.project_id)
@@
-        train_url = (
-            storage.get_signed_url(job.train_data_s3_object)
+        train_url = (
+            cloud_storage.get_signed_url(job.train_data_s3_object)
             if job.train_data_s3_object
             else None
         )
-        test_url = (
-            storage.get_signed_url(job.test_data_s3_object)
+        test_url = (
+            cloud_storage.get_signed_url(job.test_data_s3_object)
             if job.test_data_s3_object
             else None
         )

Also applies to: 254-255, 292-301, 314-316, 327-336


318-320: Fix typo and add spacing in log tag.

-            f"[retrive_job_by_document]No fine-tuning jobs found for document_id={document_id}, project_id={project_id}"
+            f"[retrieve_job_by_document] No fine-tuning jobs found for document_id={document_id}, project_id={project_id}"

160-174: Guard update in except when job lookup fails.

If fetch_by_id raises, fine_tune remains None; update_finetune_job(None, ...) will crash.

         except Exception as e:
             error_msg = str(e)
             logger.error(
                 f"[process_fine_tuning_job] Background job failure: {e} | "
                 f"job_id={job_id}, project_id={project_id}|"
             )
-            update_finetune_job(
-                session=session,
-                job=fine_tune,
-                update=FineTuningUpdate(
-                    status=FineTuningStatus.failed,
-                    error_message="Error while processing the background job : "
-                    + error_msg,
-                ),
-            )
+            if fine_tune:
+                update_finetune_job(
+                    session=session,
+                    job=fine_tune,
+                    update=FineTuningUpdate(
+                        status=FineTuningStatus.failed,
+                        error_message="Error while processing the background job : "
+                        + error_msg,
+                    ),
+                )
🧹 Nitpick comments (11)
backend/app/crud/model_evaluation.py (6)

55-57: Correct log tag to match function

-        f"[Create_fine_tuning_job]Created new model evaluation from fine tuning job ID={fine_tuning_job.id}, project_id={project_id}"
+        f"[create_model_evaluation]Created new model evaluation from fine tuning job ID={fine_tuning_job.id}, project_id={project_id}"

70-79: Fix log tag names to the correct function

-            f"[fetch_by_id]Model evaluation not found for eval_id={eval_id}, project_id={project_id}"
+            f"[fetch_by_eval_id]Model evaluation not found for eval_id={eval_id}, project_id={project_id}"
@@
-        f"[fetch_by_id]Fetched model evaluation for eval ID={model_eval.id}, project_id={project_id}"
+        f"[fetch_by_eval_id]Fetched model evaluation for eval ID={model_eval.id}, project_id={project_id}"

87-95: Exclude soft-deleted records in list queries

     query = (
         select(ModelEvaluation)
         .where(
             ModelEvaluation.document_id == document_id,
             ModelEvaluation.project_id == project_id,
+            ModelEvaluation.is_deleted.is_(False),
         )
         .order_by(ModelEvaluation.updated_at.desc())
     )

DB: consider a composite index on (project_id, document_id, is_deleted, updated_at DESC) to speed this and similar queries.
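A sketch of that composite index as an Alembic migration; the table name, index name, and column names are assumptions taken from the query above, not from the actual schema:

```python
import sqlalchemy as sa
from alembic import op


def upgrade() -> None:
    # Matches the list-query filter (project_id, document_id, is_deleted)
    # and the ORDER BY updated_at DESC, so Postgres can satisfy the whole
    # query from the index.
    op.create_index(
        "ix_model_eval_project_document_active",  # hypothetical name
        "model_evaluation",  # assumed table name
        ["project_id", "document_id", "is_deleted", sa.text("updated_at DESC")],
        unique=False,
    )


def downgrade() -> None:
    op.drop_index(
        "ix_model_eval_project_document_active", table_name="model_evaluation"
    )
```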


98-103: Consider returning an empty list instead of 404 (contract check)

If the API treats “no evaluations yet” as non-exceptional, return []:

-        raise HTTPException(status_code=404, detail="Model evaluation not found")
+        logger.info(f"[fetch_eval_by_doc_id]No model evaluations found for document_id={document_id}, project_id={project_id}")
+        return []

Confirm expected behavior in the route/tests before changing.


115-123: Filter to non-deleted and completed evaluations for top-model selection

     query = (
         select(ModelEvaluation)
         .where(
             ModelEvaluation.document_id == document_id,
             ModelEvaluation.project_id == project_id,
+            ModelEvaluation.is_deleted.is_(False),
+            ModelEvaluation.status == ModelEvaluationStatus.completed,
         )
         .order_by(ModelEvaluation.updated_at.desc())
     )

155-158: Docstring mismatch: function filters by fine_tuning_id, not document

-    Return all ACTIVE model evaluations for the given document & project.
+    Return all ACTIVE model evaluations for the given fine_tuning_id & project.
backend/app/crud/fine_tuning.py (3)

42-46: Remove unused local “document”.

Validation side-effect is enough; drop the assignment.

-    document_crud = DocumentCrud(
-        session, project_id
-    )  # to check if the given document is present in the document table or not
-    document = document_crud.read_one(request.document_id)
+    document_crud = DocumentCrud(session, project_id)  # validate presence/ownership
+    document_crud.read_one(request.document_id)

137-137: Tighten return annotation (drop quotes).

No forward-ref needed since Fine_Tuning is imported.

-) -> list["Fine_Tuning"]:
+) -> list[Fine_Tuning]:

28-35: Close the race between “active check” and insert (unique constraint).

Two concurrent requests can create duplicates. Add a partial unique index at DB level for (document_id, base_model, split_ratio, project_id) when is_deleted=false and status in (pending,running).

Example (SQLModel/SQLAlchemy index in your model or Alembic migration):

# In Alembic migration (Postgres)
op.create_index(
    "uq_active_ft_job",
    "fine_tuning",
    ["document_id", "base_model", "split_ratio", "project_id"],
    unique=True,
    postgresql_where=sa.text("NOT is_deleted AND status IN ('pending','running')")
)

Also consider rechecking inside a transaction or handling IntegrityError to return the existing job.

Also applies to: 59-66

backend/app/api/routes/fine_tuning.py (2)

292-301: Harden signed-URL generation (don’t fail the whole response).

Wrap presign in try/except and fall back to None on errors so a single S3 issue doesn’t 500 the endpoint.

-    job = job.model_copy(
+    job = job.model_copy(
         update={
-            "train_data_file_url": cloud_storage.get_signed_url(job.train_data_s3_object)
+            "train_data_file_url": _maybe_signed_url(cloud_storage, job.train_data_s3_object)
             if job.train_data_s3_object
             else None,
-            "test_data_file_url": cloud_storage.get_signed_url(job.test_data_s3_object)
+            "test_data_file_url": _maybe_signed_url(cloud_storage, job.test_data_s3_object)
             if job.test_data_s3_object
             else None,
         }
     )
-        train_url = (
-            cloud_storage.get_signed_url(job.train_data_s3_object)
+        train_url = (
+            _maybe_signed_url(cloud_storage, job.train_data_s3_object)
             if job.train_data_s3_object
             else None
         )
-        test_url = (
-            cloud_storage.get_signed_url(job.test_data_s3_object)
+        test_url = (
+            _maybe_signed_url(cloud_storage, job.test_data_s3_object)
             if job.test_data_s3_object
             else None
         )

Helper to add near the top of this module:

from app.core.cloud.storage import CloudStorageError

def _maybe_signed_url(storage, key):
    try:
        return storage.get_signed_url(key) if key else None
    except CloudStorageError as e:
        logger.warning(f"[signed_url] Failed to presign key={key}: {e}")
        return None

Also applies to: 327-336


68-89: Add client timeouts/retries for external calls.

Use a client with sane timeouts to avoid hanging background tasks.

Example (in get_openai_client):

  • return OpenAI(...).with_options(timeout=60.0, max_retries=3)

Also applies to: 114-137
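The OpenAI 1.x client accepts timeout/max_retries directly; for external calls without built-in retries (e.g., S3 presigning), a small generic wrapper could be used instead. A minimal sketch — the helper name, defaults, and retryable exception set are assumptions, not existing project code:

```python
import random
import time


def call_with_retries(fn, *, max_retries=3, base_delay=0.5,
                      retry_on=(TimeoutError, ConnectionError)):
    """Call fn(), retrying transient errors with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            # 0.5s, 1s, 2s, ... plus up to 100% jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Usage would be e.g. `call_with_retries(lambda: storage.get_signed_url(key))` inside the background task.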

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1932825 and 58cd080.

⛔ Files ignored due to path filters (1)
  • backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • backend/app/api/routes/fine_tuning.py (1 hunks)
  • backend/app/crud/fine_tuning.py (1 hunks)
  • backend/app/crud/model_evaluation.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
backend/app/crud/fine_tuning.py (2)
backend/app/models/fine_tuning.py (4)
  • Fine_Tuning (54-89)
  • FineTuningJobCreate (28-51)
  • FineTuningUpdate (92-100)
  • FineTuningStatus (13-17)
backend/app/crud/document.py (1)
  • DocumentCrud (14-135)
backend/app/crud/model_evaluation.py (2)
backend/app/crud/fine_tuning.py (1)
  • fetch_by_id (88-104)
backend/app/models/model_evaluation.py (4)
  • ModelEvaluation (38-85)
  • ModelEvaluationStatus (14-18)
  • ModelEvaluationBase (21-26)
  • ModelEvaluationUpdate (88-92)
backend/app/api/routes/fine_tuning.py (6)
backend/app/models/fine_tuning.py (4)
  • FineTuningJobCreate (28-51)
  • FineTuningJobPublic (103-119)
  • FineTuningUpdate (92-100)
  • FineTuningStatus (13-17)
backend/app/core/cloud/storage.py (5)
  • get_cloud_storage (262-279)
  • client (30-42)
  • create (44-84)
  • get_signed_url (135-137)
  • get_signed_url (217-242)
backend/app/crud/document.py (1)
  • DocumentCrud (14-135)
backend/app/utils.py (5)
  • get_openai_client (170-200)
  • APIResponse (27-48)
  • mask_string (158-167)
  • load_description (204-209)
  • success_response (34-37)
backend/app/crud/fine_tuning.py (4)
  • create_fine_tuning_job (20-66)
  • fetch_by_id (88-104)
  • update_finetune_job (165-182)
  • fetch_by_document_id (107-127)
backend/app/core/finetune/preprocessing.py (3)
  • DataPreprocessor (17-157)
  • process (118-157)
  • cleanup (72-76)
🪛 Ruff (0.12.2)
backend/app/crud/fine_tuning.py

45-45: Local variable document is assigned to but never used

Remove assignment to unused variable document

(F841)

backend/app/api/routes/fine_tuning.py

188-188: Local variable client is assigned to but never used

Remove assignment to unused variable client

(F841)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (2)
backend/app/crud/model_evaluation.py (2)

182-183: Nice: Pydantic v2 model_dump usage

Using model_dump(exclude_unset=True) is correct and keeps partial updates clean.


7-7: fetch_by_id is correctly re-exported from app/crud/__init__.py; no import changes needed.

Comment on lines +35 to +41
OPENAI_TO_INTERNAL_STATUS = {
"validating_files": FineTuningStatus.running,
"queued": FineTuningStatus.running,
"running": FineTuningStatus.running,
"succeeded": FineTuningStatus.completed,
"failed": FineTuningStatus.failed,
}

💡 Verification agent

🧩 Analysis chain

Verify OpenAI status mapping completeness.

Consider handling statuses like “cancelled” explicitly.


🌐 Web query:

What are the possible status values for OpenAI fine-tuning jobs in the Python SDK (1.x) and REST API as of September 2025?

💡 Result:

Both use the same set of statuses: validating_files, queued, running, succeeded, failed, and cancelled. [1][2]

Sources:
[1] OpenAI fine-tuning job schema (Python SDK 1.x reference).
[2] OpenAI REST API fine-tuning job docs (job status values).


Add ‘cancelled’ to OPENAI_TO_INTERNAL_STATUS
The OpenAI fine-tuning API returns a “cancelled” status that isn’t mapped. Extend OPENAI_TO_INTERNAL_STATUS in backend/app/api/routes/fine_tuning.py (lines 35–41) to include "cancelled": FineTuningStatus.<appropriate>.

🤖 Prompt for AI Agents
In backend/app/api/routes/fine_tuning.py around lines 35–41, the
OPENAI_TO_INTERNAL_STATUS mapping misses the OpenAI "cancelled" state; add an
entry "cancelled": FineTuningStatus.cancelled if that enum value exists,
otherwise map it to FineTuningStatus.failed to ensure cancelled jobs are
handled; update and run tests/lint to confirm no missing imports or references.
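A minimal sketch of the extended mapping, using a stand-in enum (the real FineTuningStatus lives in app/models/fine_tuning.py and may not have a cancelled member — if so, map "cancelled" to failed as the prompt suggests):

```python
from enum import Enum


class FineTuningStatus(str, Enum):
    # Stand-in for app.models.fine_tuning.FineTuningStatus; members assumed.
    pending = "pending"
    running = "running"
    completed = "completed"
    failed = "failed"
    cancelled = "cancelled"


OPENAI_TO_INTERNAL_STATUS = {
    "validating_files": FineTuningStatus.running,
    "queued": FineTuningStatus.running,
    "running": FineTuningStatus.running,
    "succeeded": FineTuningStatus.completed,
    "failed": FineTuningStatus.failed,
    "cancelled": FineTuningStatus.cancelled,  # previously unmapped
}


def map_openai_status(status: str) -> FineTuningStatus:
    # Default unknown statuses to failed so jobs never get stuck in running.
    return OPENAI_TO_INTERNAL_STATUS.get(status, FineTuningStatus.failed)
```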

Comment on lines +129 to +135
    for model_eval in model_evals:
        if model_eval.score is not None:
            mcc = model_eval.score.get("mcc_score", None)
            if mcc is not None and mcc > highest_mcc:
                highest_mcc = mcc
                top_model = model_eval


⚠️ Potential issue

MCC key mismatch can yield no top model; support canonical key

Model doc uses {"mcc": ...}; code reads "mcc_score". Read mcc first, then fallback.

-        if model_eval.score is not None:
-            mcc = model_eval.score.get("mcc_score", None)
+        if model_eval.score is not None:
+            mcc = model_eval.score.get("mcc")
+            if mcc is None:
+                mcc = model_eval.score.get("mcc_score")
             if mcc is not None and mcc > highest_mcc:
                 highest_mcc = mcc
                 top_model = model_eval

Optionally, centralize metric keys as constants to avoid drift.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    for model_eval in model_evals:
-        if model_eval.score is not None:
-            mcc = model_eval.score.get("mcc_score", None)
-            if mcc is not None and mcc > highest_mcc:
-                highest_mcc = mcc
-                top_model = model_eval
+    for model_eval in model_evals:
+        if model_eval.score is not None:
+            mcc = model_eval.score.get("mcc")
+            if mcc is None:
+                mcc = model_eval.score.get("mcc_score")
+            if mcc is not None and mcc > highest_mcc:
+                highest_mcc = mcc
+                top_model = model_eval
🤖 Prompt for AI Agents
In backend/app/crud/model_evaluation.py around lines 129 to 135, the loop reads
score.get("mcc_score") but model documents use "mcc", so replace the single-key
lookup with a fallback that reads score.get("mcc") first then
score.get("mcc_score") if absent (or use a helper constant/utility for metric
keys), assign that value to mcc, and use it for the highest_mcc comparison so
the correct top_model is selected.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/app/core/doctransform/registry.py (1)

49-49: Canonical CSV reverse mapping added; add a lightweight consistency test.

Good to have a canonical extension. Since multiple extensions can map to one format (e.g., .htm/.html → html), add a small test to prevent accidental regressions:

# tests/test_doctransform_registry.py
from app.core.doctransform.registry import get_file_format, FORMAT_TO_EXTENSION
def test_csv_mapping():
    assert get_file_format("data.csv") == "csv"
    assert FORMAT_TO_EXTENSION["csv"] == ".csv"

If TSV support is expected by the classification pipeline, confirm whether we also need .tsv → tsv (or csv) mappings and plan corresponding transforms later.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 58cd080 and bfe272a.

⛔ Files ignored due to path filters (1)
  • backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • backend/app/core/doctransform/registry.py (2 hunks)
  • backend/pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.11.7, 6)
🔇 Additional comments (1)
backend/app/core/doctransform/registry.py (1)

38-38: CSV extension recognized; add focused tests
Mapping includes .csv (handled via .lower()); no calls to resolve_transformer("csv", …) or convert_document with CSV were found. Add a unit test for get_file_format("*.csv").

Collaborator

@kartpop kartpop left a comment

Approving, as the feature branch PRs have been reviewed

@AkhileshNegi AkhileshNegi merged commit 01f66ee into main Sep 4, 2025
2 of 3 checks passed
@AkhileshNegi AkhileshNegi deleted the feature/classification branch September 4, 2025 16:47
@coderabbitai coderabbitai bot mentioned this pull request Oct 6, 2025
2 tasks

Labels

enhancement New feature or request ready-for-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants