Conversation
Caution: Review failed. Pull request was closed or merged during review.

📝 Walkthrough

This PR introduces comprehensive LLM chain orchestration with sequential block execution and callbacks, extends LLM providers with multimodal input support (images, PDFs), adds a SarvamAI STT/TTS provider integration, implements STT evaluation metrics computation (WER, CER, WIP), and enhances STT evaluations with signed URL support for audio files. The changes include a database schema migration, extensive API routes, service layer implementations, and comprehensive test coverage.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant API as API Endpoint
    participant JobService as Job Service
    participant CeleryTask as Celery Task
    participant ChainExec as Chain Executor
    participant LLMChain as LLM Chain
    participant Provider as LLM Provider
    participant Callback as Callback Webhook
    User->>API: POST /llm/chain (with request)
    API->>JobService: start_chain_job()
    JobService->>CeleryTask: Schedule execute_chain_job
    JobService-->>API: Return success message
    API-->>User: Job accepted
    Note over CeleryTask: Async execution
    CeleryTask->>ChainExec: Create ChainExecutor
    ChainExec->>LLMChain: Initialize with blocks
    ChainExec->>ChainExec: Setup (mark job PROCESSING)
    loop For each block
        ChainExec->>LLMChain: execute(query)
        LLMChain->>LLMChain: Get current block
        LLMChain->>Provider: execute_llm_call()
        Provider-->>LLMChain: BlockResult
        LLMChain->>ChainExec: on_block_completed callback
        ChainExec->>ChainExec: Aggregate usage, update DB
        alt Intermediate callback enabled
            ChainExec->>Callback: Send intermediate response
        end
        alt Block failed
            LLMChain-->>ChainExec: Error result, break loop
        else Block succeeded
            LLMChain->>LLMChain: Convert output to next query
        end
    end
    alt Execution succeeded
        ChainExec->>ChainExec: Teardown (build final response)
        ChainExec->>Callback: Send final callback
        ChainExec->>ChainExec: Update chain COMPLETED
    else Execution failed
        ChainExec->>ChainExec: Handle error
        ChainExec->>Callback: Send error callback
        ChainExec->>ChainExec: Update chain FAILED
    end
```
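The block loop described above can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the project's code: `run_chain`, `BlockResult`, and the `on_block_completed` parameter are hypothetical stand-ins for the real `ChainExecutor`/`LLMChain` machinery.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass
class BlockResult:
    output: str
    success: bool
    usage_tokens: int = 0


def run_chain(
    blocks: list[Callable[[str], BlockResult]],
    query: str,
    on_block_completed: Callable[[int, BlockResult], None],
) -> list[BlockResult]:
    """Execute blocks sequentially; each block's output feeds the next query."""
    results: list[BlockResult] = []
    for i, block in enumerate(blocks):
        result = block(query)          # execute_llm_call() in the real flow
        results.append(result)
        on_block_completed(i, result)  # usage aggregation / intermediate callback
        if not result.success:         # an error result breaks the loop
            break
        query = result.output          # output becomes the next block's query
    return results
```

The key design point the diagram conveys is that chaining is purely sequential: there is no fan-out, and a single failed block short-circuits the remaining blocks.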
```mermaid
sequenceDiagram
    actor User
    participant API as API Endpoint
    participant JobService as Job Service
    participant Utils as resolve_input()
    participant Provider as LLM Provider
    participant BaseProvider as BaseProvider
    User->>API: POST with multimodal input
    API->>JobService: execute_llm_call(query_input)
    JobService->>Utils: resolve_input(query_input)
    alt Input is TextInput
        Utils->>Utils: Extract text content
        Utils-->>JobService: (text_string, None)
    else Input is ImageInput
        Utils->>Utils: resolve_image_content()
        Utils-->>JobService: (list[ImageContent], None)
    else Input is PDFInput
        Utils->>Utils: resolve_pdf_content()
        Utils-->>JobService: (list[PDFContent], None)
    else Input is list (multimodal)
        Utils->>Utils: Aggregate parts (Text/Image/PDF)
        Utils->>Utils: Build MultiModalInput
        Utils-->>JobService: (MultiModalInput, None)
    end
    JobService->>Provider: execute(resolved_input)
    alt Provider is OpenAI
        Provider->>Provider: format_parts(contents)
        Provider->>Provider: Build user message with parts
    else Provider is GoogleAI
        Provider->>Provider: format_parts(contents)
        Provider->>Provider: Build Gemini contents
    end
    Provider->>BaseProvider: Send formatted input
    BaseProvider-->>Provider: Response
    Provider-->>JobService: LLMCallResponse
```
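The type-dispatch step above amounts to a small resolver. The sketch below is a simplified, hypothetical version: the dataclasses and the dict-based part payloads stand in for the real `TextInput`/`ImageInput`/`PDFInput` models and `MultiModalInput` type, and only the `(resolved, error)` tuple shape is taken from the diagram.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the request models named in the diagram.
@dataclass
class TextInput:
    text: str


@dataclass
class ImageInput:
    url: str


@dataclass
class PDFInput:
    url: str


def resolve_input(query_input):
    """Dispatch on the input type; a list of parts becomes one multimodal payload.

    Returns a (resolved, error) tuple, mirroring the diagram's return shapes.
    """
    if isinstance(query_input, TextInput):
        return query_input.text, None
    if isinstance(query_input, ImageInput):
        return [{"type": "image", "url": query_input.url}], None
    if isinstance(query_input, PDFInput):
        return [{"type": "pdf", "url": query_input.url}], None
    if isinstance(query_input, list):
        parts = []
        for part in query_input:
            resolved, _ = resolve_input(part)
            if isinstance(resolved, str):
                parts.append({"type": "text", "text": resolved})
            else:
                parts.extend(resolved)
        return {"multimodal": parts}, None
    return None, f"unsupported input type: {type(query_input).__name__}"
```

Keeping resolution provider-agnostic like this lets each provider's `format_parts` worry only about its own wire format (OpenAI message parts vs. Gemini contents).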
```mermaid
sequenceDiagram
    participant Cron as STT Cron Job
    participant CeleryTask as Celery Task
    participant MetricJob as execute_metric_computation
    participant DB as Database
    participant MetricsUtil as Metrics Utilities
    Cron->>Cron: Check completed STT runs
    Cron->>CeleryTask: Schedule metric computation
    Note over MetricJob: Async metric calculation
    MetricJob->>DB: Fetch all results for run
    MetricJob->>DB: Fetch all samples for run
    loop For each result with transcription
        MetricJob->>DB: Get matching sample
        alt Sample found with ground_truth
            MetricJob->>MetricsUtil: calculate_stt_metrics(hypothesis, reference, language)
            MetricsUtil->>MetricsUtil: normalize_text() per language
            MetricsUtil->>MetricsUtil: Compute WER, CER, lenient_WER, WIP
            MetricsUtil-->>MetricJob: metric_dict
            MetricJob->>MetricJob: Store per-result scores
        else Missing sample or ground_truth
            MetricJob->>MetricJob: Count as skipped
        end
    end
    MetricJob->>DB: Bulk update STTResult scores
    MetricJob->>MetricsUtil: compute_run_aggregate_scores()
    MetricsUtil->>MetricsUtil: Calculate avg, std per metric
    MetricsUtil-->>MetricJob: aggregate_scores
    MetricJob->>DB: Update STTEvaluationRun aggregates
    MetricJob-->>Cron: Return summary (scored, skipped, failed)
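For reference, WER and CER are both normalized edit distances: WER is Levenshtein distance over word tokens divided by the reference word count, CER the same over characters. The sketch below shows only these two metrics; the real `calculate_stt_metrics` also normalizes text per language and computes lenient WER and WIP, which are omitted here.

```python
def levenshtein(ref: list, hyp: list) -> int:
    """Edit distance between two token sequences (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (0 if tokens match)
            ))
        prev = curr
    return prev[-1]


def calculate_stt_metrics(hypothesis: str, reference: str) -> dict[str, float]:
    """WER over words, CER over characters, both relative to the reference."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    wer = levenshtein(ref_words, hyp_words) / max(len(ref_words), 1)
    cer = levenshtein(list(reference), list(hypothesis)) / max(len(reference), 1)
    return {"wer": wer, "cer": cer}
```

Note that WER can exceed 1.0 when the hypothesis is much longer than the reference, which is why run-level aggregates (avg, std) are usually reported alongside per-result scores.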
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs
Suggested labels
Suggested reviewers
Actionable comments posted: 5
🧹 Nitpick comments (2)
backend/app/services/stt_evaluations/helpers.py (2)
12-13: Strengthen `storage` typing and use explicit `None` checks.

`object | None` hides the required interface (`get_signed_url`) and weakens static guarantees. Use a protocol and `storage is not None`.

♻️ Proposed change

```diff
+from typing import Protocol
+
+class SignedUrlStorage(Protocol):
+    def get_signed_url(self, object_store_url: str) -> str: ...
+
 def build_file_schema(
     *,
     file: File,
     include_url: bool,
-    storage: object | None,
+    storage: SignedUrlStorage | None,
 ) -> FilePublic:
@@
-    if include_url and storage:
+    if include_url and storage is not None:
         schema.signed_url = storage.get_signed_url(file.object_store_url)
@@
 def build_file_schemas(
@@
-    storage: object | None,
+    storage: SignedUrlStorage | None,
 ) -> list[FilePublic]:
@@
-    if include_url and storage:
+    if include_url and storage is not None:
         schema.signed_url = storage.get_signed_url(file.object_store_url)
```

Also applies to: 25-26, 34-35, 49-50
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/stt_evaluations/helpers.py` around lines 12 - 13, Replace the loose storage: object | None parameter with a typed Protocol that declares the get_signed_url(...) method (e.g., class StorageProtocol: def get_signed_url(self, ...): ...) and use that protocol as the parameter type (StorageProtocol | None); then change all truthy checks of storage to explicit comparisons using storage is not None in the helper functions that return FilePublic (and the other occurrences noted at lines 25-26, 34-35, 49-50) so callers and static checkers know the required interface and None is handled explicitly.
3-3: Prefer `collections.abc.Iterable` for Python 3.11+ style typing.

This aligns with modern stdlib typing style and matches Ruff's UP035 guidance.

♻️ Proposed change

```diff
-from typing import Iterable
+from collections.abc import Iterable
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/stt_evaluations/helpers.py` at line 3, Replace usage of typing.Iterable with collections.abc.Iterable in this module: update the import line to import Iterable from collections.abc and update any type annotations that reference Iterable (e.g., function signatures or variable annotations in helpers.py) to use the imported Iterable from collections.abc so the code follows Python 3.11+ typing style and Ruff UP035 guidance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/routes/stt_evaluations/files.py`:
- Around line 62-66: The docstring for list_audio is outdated: remove references
to the removed file_ids request body and update it to describe the current GET
signature that only accepts the include_url query parameter; specifically, in
the list_audio function's docstring mention that the endpoint returns all audio
files for the project and that include_url (boolean query param) controls
whether each returned file includes a presigned URL, and remove any language
about accepting file_ids in the request body.
- Around line 79-83: The audio endpoints currently call list_files (and the
single-file fetch) using only tenant/project scope which allows non-audio files
through; update the calls to pass file_type=FileType.AUDIO to list_files (the
invocation with session=_session, organization_id=..., project_id=...) and, for
the single-file fetch handler, verify the returned file's type (e.g., file.type
or file.file_type) equals FileType.AUDIO and return a 404/raise if not; ensure
FileType is imported/accessible in the module and adjust both places (the
list_files call around the list handler and the single-file fetch around the
get-by-id handler) so only audio files are returned.
- Around line 49-54: The GET handlers that currently set summary="List audio
files" (and the other GET at lines ~94-99) need external markdown descriptions:
add description=load_description("files/list.md") to the router.get(...) for the
"List audio files" endpoint and likewise add
description=load_description("files/<appropriate-action>.md") to the other GET
handler; ensure load_description is imported and use the same relative markdown
naming convention used elsewhere (e.g., domain/action.md) so Swagger pulls
descriptions from external files instead of inline text.
In `@backend/app/tests/api/routes/stt_evaluations/test_files.py`:
- Around line 77-80: Tests are calling the obsolete POST
/stt-evaluations/files/list contract; update the tests to hit the current GET
/stt-evaluations/files endpoint by replacing client.post(...) calls with
client.get(...) and move any JSON body payload (e.g., "file_ids") into query
params or omit as appropriate, ensuring calls match the route implemented in
backend/app/api/routes/stt_evaluations/files.py (GET /stt-evaluations/files) and
test helpers that reference the listing behavior (the client invocation sites
around the client.post lines in test_files.py).
- Around line 18-60: The three fixtures audio_file_1, audio_file_2, and
audio_file_3 duplicate create_file calls; replace them with a single factory
fixture (e.g., audio_file_factory) that wraps create_file and accepts parameters
like object_store_url, filename, size_bytes, content_type, file_type,
organization_id, and project_id, then update existing fixtures or tests to call
audio_file_factory(...) (or keep audio_file_1/2/3 as thin wrappers that call
audio_file_factory with the original arguments) so all file creation logic is
centralized in one place for consistency and easier extension.
---
Nitpick comments:
In `@backend/app/services/stt_evaluations/helpers.py`:
- Around line 12-13: Replace the loose storage: object | None parameter with a
typed Protocol that declares the get_signed_url(...) method (e.g., class
StorageProtocol: def get_signed_url(self, ...): ...) and use that protocol as
the parameter type (StorageProtocol | None); then change all truthy checks of
storage to explicit comparisons using storage is not None in the helper
functions that return FilePublic (and the other occurrences noted at lines
25-26, 34-35, 49-50) so callers and static checkers know the required interface
and None is handled explicitly.
- Line 3: Replace usage of typing.Iterable with collections.abc.Iterable in this
module: update the import line to import Iterable from collections.abc and
update any type annotations that reference Iterable (e.g., function signatures
or variable annotations in helpers.py) to use the imported Iterable from
collections.abc so the code follows Python 3.11+ typing style and Ruff UP035
guidance.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- backend/app/api/routes/stt_evaluations/dataset.py
- backend/app/api/routes/stt_evaluations/evaluation.py
- backend/app/api/routes/stt_evaluations/files.py
- backend/app/api/routes/stt_evaluations/result.py
- backend/app/crud/file.py
- backend/app/models/__init__.py
- backend/app/models/file.py
- backend/app/models/stt_evaluation.py
- backend/app/services/stt_evaluations/audio.py
- backend/app/services/stt_evaluations/helpers.py
- backend/app/tests/api/routes/stt_evaluations/test_files.py
- backend/app/tests/api/routes/stt_evaluations/test_stt_evaluation.py
Codecov Report

❌ Patch coverage is
♻️ Duplicate comments (1)
backend/app/api/routes/stt_evaluations/files.py (1)
77-81: ⚠️ Potential issue | 🟠 Major

Enforce `FileType.AUDIO` in both audio GET handlers.

Line 77 and Line 114 currently scope by org/project only, so non-audio rows can leak through these audio-specific endpoints. Filter list results to audio and return 404 for non-audio in `get_audio`.

🐛 Proposed fix

```diff
     files = list_files(
         session=_session,
         organization_id=auth_context.organization_.id,
         project_id=auth_context.project_.id,
     )
+    files = [f for f in files if f.file_type == FileType.AUDIO.value]
@@
-    if not file:
+    if not file or file.file_type != FileType.AUDIO.value:
         raise HTTPException(status_code=404, detail=f"File with ID {file_id} not found")
```

Also applies to: 114-122
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/routes/stt_evaluations/files.py` around lines 77 - 81, The audio endpoints currently call list_files without restricting by type, so non-audio rows can be returned; update the list_files calls used in the audio GET handlers to include file_type=FileType.AUDIO (referencing the existing list_files call and FileType enum) and, in get_audio (the handler that fetches a single file), add a post-fetch guard that returns a 404 if the retrieved file.file_type is not FileType.AUDIO; this ensures both the listing and single-file retrieval endpoints only expose audio files.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@backend/app/api/routes/stt_evaluations/files.py`:
- Around line 77-81: The audio endpoints currently call list_files without
restricting by type, so non-audio rows can be returned; update the list_files
calls used in the audio GET handlers to include file_type=FileType.AUDIO
(referencing the existing list_files call and FileType enum) and, in get_audio
(the handler that fetches a single file), add a post-fetch guard that returns a
404 if the retrieved file.file_type is not FileType.AUDIO; this ensures both the
listing and single-file retrieval endpoints only expose audio files.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- backend/app/api/docs/stt_evaluation/get_audio.md
- backend/app/api/docs/stt_evaluation/list_audios.md
- backend/app/api/routes/stt_evaluations/files.py
✅ Files skipped from review due to trivial changes (1)
- backend/app/api/docs/stt_evaluation/list_audios.md
merge conflicts solve
Co-authored-by: Prajna Prayas <gituprajna20@gmail.com>
fixing merge conflicts :wq merg_Coflict_2
merge conflict 3
closing this PR because merge conflicts made it too much of a mess
Summary
Target issue is
Checklist
Before submitting a pull request, please ensure that you mark these tasks.

`fastapi run --reload app/main.py` or `docker compose up` in the repository root and test.

Notes
New Features
Tests
Summary by CodeRabbit
Release Notes
New Features
Improvements