audio files: list endpoint #630

Closed
nishika26 wants to merge 15 commits into main from enhancement/get_files_api

@nishika26
Collaborator

@nishika26 nishika26 commented Feb 27, 2026

Summary

Target issue is

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.

Notes

  • New Features

    • Added file listing endpoint to retrieve all audio files with optional cloud storage signed URLs
    • Added file retrieval endpoint to get individual audio files with optional signed URLs for secure access
    • Improved API route organization for cleaner endpoint structures
  • Tests

    • Added comprehensive test suite for audio file listing and retrieval functionality, including signed URL handling and authentication validation

Summary by CodeRabbit

Release Notes

  • New Features

    • Added audio file management APIs with optional pre-signed URLs for direct access
    • Added sequential LLM chain execution with intermediate callbacks
    • Extended LLM support for multimodal inputs (images, PDFs)
    • Added SarvamAI as a new LLM provider
    • Automatic STT metrics computation for evaluation runs
    • Added signed URLs to dataset and evaluation result APIs
  • Improvements

    • Enhanced GitHub issue automation with OpenAI formatting

@coderabbitai

coderabbitai bot commented Feb 27, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

This PR introduces comprehensive LLM chain orchestration with sequential block execution and callbacks, extends LLM providers with multimodal input support (images, PDFs), adds SarvamAI STT/TTS provider integration, implements STT evaluation metrics computation (WER, CER, WIP), and enhances STT evaluations with signed URL support for audio files. Includes database schema migration, extensive API routes, service layer implementations, and comprehensive test coverage.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| STT Evaluations - Signed URLs | backend/app/api/routes/stt_evaluations/dataset.py, backend/app/api/routes/stt_evaluations/evaluation.py, backend/app/crud/stt_evaluations/result.py | Added include_signed_url query parameter to dataset and run endpoints; integrated cloud storage to fetch signed URLs for audio samples with error handling and logging. |
| STT Audio Files - New Endpoints | backend/app/api/routes/stt_evaluations/files.py, backend/app/crud/file.py, backend/app/services/stt_evaluations/helpers.py | New endpoints list_audio and get_audio with optional signed URL generation; added list_files CRUD function and helper functions build_file_schema/build_file_schemas for constructing FilePublic responses with signed URLs. |
| LLM Chain Orchestration | backend/app/api/routes/llm_chain.py, backend/app/crud/llm_chain.py, backend/app/services/llm/chain/chain.py, backend/app/services/llm/chain/executor.py, backend/app/services/llm/chain/types.py | New LLM chain API endpoint and complete chain execution system with ChainContext, ChainBlock, LLMChain, ChainExecutor classes; supports sequential block execution, per-block callbacks, error handling, and status tracking. |
| LLM Job Services - Chain Support | backend/app/services/llm/jobs.py | Added start_chain_job and execute_chain_job for async chain execution; refactored guardrails into reusable helpers apply_input_guardrails and apply_output_guardrails; introduced modular execute_llm_call for single block execution. |
| Multimodal LLM Support | backend/app/models/llm/request.py, backend/app/models/llm/response.py, backend/app/utils.py | Extended LLM request models with ImageContent, PDFContent, ImageInput, PDFInput; added PromptTemplate and multimodal list support; new response models LLMChainResponse and IntermediateChainResponse; updated input resolution to handle image/PDF/multimodal inputs. |
| Provider Implementations - Multimodal | backend/app/services/llm/providers/base.py, backend/app/services/llm/providers/gai.py, backend/app/services/llm/providers/oai.py | Introduced ContentPart type alias and MultiModalInput class; added format_parts helpers in Google AI and OpenAI providers; updated execute signatures to accept multimodal inputs with proper content formatting. |
| SarvamAI Provider | backend/app/services/llm/providers/sai.py, backend/app/services/llm/mappers.py, backend/app/services/llm/providers/registry.py | New SarvamAIProvider class supporting STT and TTS with parameter mapping; added map_kaapi_to_sarvam_params mapper; extended registry with SARVAMAI_NATIVE provider support; integrated into provider discovery. |
| STT Metrics System | backend/app/services/stt_evaluations/metrics.py, backend/app/services/stt_evaluations/metric_job.py, backend/app/crud/stt_evaluations/cron.py | Implemented WER, CER, lenient WER, and WIP metric calculation with language-aware normalization; added execute_metric_computation Celery task for batch metric computation; integrated metric scheduling into STT run completion workflow. |
| Database Schema | backend/app/alembic/versions/048_create_llm_chain_table.py, backend/app/models/job.py | Created llm_chain table with cascade relationships to job/project/organization; added chain_id FK to llm_call; added LLM_CHAIN job type enum; includes indices for efficient querying. |
| Data Models - Core Updates | backend/app/models/file.py, backend/app/models/stt_evaluation.py, backend/app/models/__init__.py, backend/app/models/llm/__init__.py | Added AudioUploadResponse and FileIDList to file models; extended FilePublic with signed_url field; moved AudioUploadResponse from stt_evaluation to file module; added STTSamplePublic.signed_url and STTEvaluationRunWithResults.results_total; expanded llm model exports for chain types. |
| Evaluation Type Filtering | backend/app/crud/evaluations/core.py, backend/app/crud/evaluations/dataset.py | Added EvaluationType filtering to evaluation and dataset CRUD operations; datasets/runs now set and filter by TEXT type to distinguish from STT evaluations. |
| Documentation | backend/app/api/docs/stt_evaluation/get_dataset.md, backend/app/api/docs/stt_evaluation/get_run.md, backend/app/api/docs/stt_evaluation/get_audio.md, backend/app/api/docs/stt_evaluation/list_audios.md, backend/app/api/docs/llm/llm_call.md, backend/app/api/docs/llm/llm_chain.md | Added documentation for new audio file endpoints and signed URL behavior; expanded LLM call docs for multimodal inputs and ad-hoc configuration; added comprehensive LLM chain API documentation. |
| API Integration | backend/app/api/main.py, backend/app/api/routes/stt_evaluations/result.py, backend/app/services/stt_evaluations/audio.py | Registered llm_chain router in main API; consolidated imports for cleaner code organization. |
| Test Coverage | backend/app/tests/api/routes/test_stt_evaluation_.py, backend/app/tests/crud/.../*.py, backend/app/tests/services/llm/.../*.py, backend/app/tests/services/stt_evaluations/.../*.py | Comprehensive test suites for STT evaluation CRUD/API (1500+ lines), LLM chain CRUD/execution (750+ lines), SarvamAI provider (531 lines), multimodal input handling (534 lines), metrics computation (564 lines), and mapper transformations. |
| GitHub Actions & Dependencies | .github/issue-formatter.yml, .github/workflows/benchmark.yml, backend/pyproject.toml | Added OpenAI-based issue formatter workflow; upgraded artifact upload to v7; added runtime dependencies: sarvamai, jiwer, indic-nlp-library, whisper-normalizer. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant API as API Endpoint
    participant JobService as Job Service
    participant CeleryTask as Celery Task
    participant ChainExec as Chain Executor
    participant LLMChain as LLM Chain
    participant Provider as LLM Provider
    participant Callback as Callback Webhook

    User->>API: POST /llm/chain (with request)
    API->>JobService: start_chain_job()
    JobService->>CeleryTask: Schedule execute_chain_job
    JobService-->>API: Return success message
    API-->>User: Job accepted
    
    Note over CeleryTask: Async execution
    CeleryTask->>ChainExec: Create ChainExecutor
    ChainExec->>LLMChain: Initialize with blocks
    ChainExec->>ChainExec: Setup (mark job PROCESSING)
    
    loop For each block
        ChainExec->>LLMChain: execute(query)
        LLMChain->>LLMChain: Get current block
        LLMChain->>Provider: execute_llm_call()
        Provider-->>LLMChain: BlockResult
        LLMChain->>ChainExec: on_block_completed callback
        ChainExec->>ChainExec: Aggregate usage, update DB
        alt Intermediate callback enabled
            ChainExec->>Callback: Send intermediate response
        end
        alt Block failed
            LLMChain-->>ChainExec: Error result, break loop
        else Block succeeded
            LLMChain->>LLMChain: Convert output to next query
        end
    end
    
    alt Execution succeeded
        ChainExec->>ChainExec: Teardown (build final response)
        ChainExec->>Callback: Send final callback
        ChainExec->>ChainExec: Update chain COMPLETED
    else Execution failed
        ChainExec->>ChainExec: Handle error
        ChainExec->>Callback: Send error callback
        ChainExec->>ChainExec: Update chain FAILED
    end
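
The sequential flow in the diagram above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PR's ChainExecutor: `BlockResult`, `Block`, `run_chain`, and the callback signature are hypothetical names, and each provider call is replaced by a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BlockResult:
    """Outcome of one block (hypothetical shape, not the PR's model)."""
    output: Optional[str] = None
    error: Optional[str] = None
    tokens: int = 0


# A "block" here is any callable that maps the current query to a result.
Block = Callable[[str], BlockResult]


def run_chain(
    blocks: list[Block],
    query: str,
    on_block_completed: Callable[[int, BlockResult], None],
) -> dict:
    """Run blocks in order, feeding each block's output to the next one."""
    total_tokens = 0
    for index, block in enumerate(blocks):
        result = block(query)
        total_tokens += result.tokens          # aggregate usage across blocks
        on_block_completed(index, result)      # hook for intermediate callbacks
        if result.error is not None:
            # Stop at the first failing block and report the failure.
            return {"status": "FAILED", "error": result.error, "tokens": total_tokens}
        query = result.output or ""            # output becomes the next query
    return {"status": "COMPLETED", "output": query, "tokens": total_tokens}


# Usage: two toy blocks standing in for LLM calls.
seen: list[int] = []
final = run_chain(
    [
        lambda q: BlockResult(output=q.upper(), tokens=3),
        lambda q: BlockResult(output=q + "!", tokens=2),
    ],
    "hello",
    lambda index, result: seen.append(index),
)
```

Breaking out of the loop on the first error mirrors the "Block failed" branch of the diagram; whether the real executor still fires the intermediate callback for a failed block is not stated in the walkthrough.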
sequenceDiagram
    actor User
    participant API as API Endpoint
    participant JobService as Job Service
    participant Utils as resolve_input()
    participant Provider as LLM Provider
    participant BaseProvider as BaseProvider

    User->>API: POST with multimodal input
    API->>JobService: execute_llm_call(query_input)
    JobService->>Utils: resolve_input(query_input)
    
    alt Input is TextInput
        Utils->>Utils: Extract text content
        Utils-->>JobService: (text_string, None)
    else Input is ImageInput
        Utils->>Utils: resolve_image_content()
        Utils-->>JobService: (list[ImageContent], None)
    else Input is PDFInput
        Utils->>Utils: resolve_pdf_content()
        Utils-->>JobService: (list[PDFContent], None)
    else Input is list (multimodal)
        Utils->>Utils: Aggregate parts (Text/Image/PDF)
        Utils->>Utils: Build MultiModalInput
        Utils-->>JobService: (MultiModalInput, None)
    end
    
    JobService->>Provider: execute(resolved_input)
    alt Provider is OpenAI
        Provider->>Provider: format_parts(contents)
        Provider->>Provider: Build user message with parts
    else Provider is GoogleAI
        Provider->>Provider: format_parts(contents)
        Provider->>Provider: Build Gemini contents
    end
    Provider->>BaseProvider: Send formatted input
    BaseProvider-->>Provider: Response
    Provider-->>JobService: LLMCallResponse
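
The input-resolution branch of the diagram above could look something like the sketch below. The dataclasses are simplified stand-ins (plain URL lists instead of the PR's ImageContent/PDFContent models), and the `(resolved, error)` return shape is inferred from the `(…, None)` tuples in the diagram, so treat every field name here as an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class TextInput:
    text: str


@dataclass
class ImageInput:
    urls: list[str]


@dataclass
class PDFInput:
    urls: list[str]


@dataclass
class MultiModalInput:
    text: list[str] = field(default_factory=list)
    images: list[str] = field(default_factory=list)
    pdfs: list[str] = field(default_factory=list)


QueryInput = Union[TextInput, ImageInput, PDFInput, list]


def resolve_input(query_input: QueryInput) -> tuple[object, Optional[str]]:
    """Return (resolved_input, error); on success the error slot is None."""
    if isinstance(query_input, TextInput):
        return query_input.text, None
    if isinstance(query_input, (ImageInput, PDFInput)):
        return list(query_input.urls), None
    if isinstance(query_input, list):
        # Multimodal: aggregate text, image, and PDF parts into one object.
        merged = MultiModalInput()
        for part in query_input:
            if isinstance(part, TextInput):
                merged.text.append(part.text)
            elif isinstance(part, ImageInput):
                merged.images.extend(part.urls)
            elif isinstance(part, PDFInput):
                merged.pdfs.extend(part.urls)
            else:
                return None, f"unsupported part: {type(part).__name__}"
        return merged, None
    return None, f"unsupported input: {type(query_input).__name__}"
```

Returning an error string instead of raising keeps the caller's control flow flat, which fits a job runner that has to report failures through a callback.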
sequenceDiagram
    participant Cron as STT Cron Job
    participant CeleryTask as Celery Task
    participant MetricJob as execute_metric_computation
    participant DB as Database
    participant MetricsUtil as Metrics Utilities

    Cron->>Cron: Check completed STT runs
    Cron->>CeleryTask: Schedule metric computation
    
    Note over MetricJob: Async metric calculation
    MetricJob->>DB: Fetch all results for run
    MetricJob->>DB: Fetch all samples for run
    
    loop For each result with transcription
        MetricJob->>DB: Get matching sample
        alt Sample found with ground_truth
            MetricJob->>MetricsUtil: calculate_stt_metrics(hypothesis, reference, language)
            MetricsUtil->>MetricsUtil: normalize_text() per language
            MetricsUtil->>MetricsUtil: Compute WER, CER, lenient_WER, WIP
            MetricsUtil-->>MetricJob: metric_dict
            MetricJob->>MetricJob: Store per-result scores
        else Missing sample or ground_truth
            MetricJob->>MetricJob: Count as skipped
        end
    end
    
    MetricJob->>DB: Bulk update STTResult scores
    MetricJob->>MetricsUtil: compute_run_aggregate_scores()
    MetricsUtil->>MetricsUtil: Calculate avg, std per metric
    MetricsUtil-->>MetricJob: aggregate_scores
    MetricJob->>DB: Update STTEvaluationRun aggregates
    MetricJob-->>Cron: Return summary (scored, skipped, failed)
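
For readers unfamiliar with the metrics named in the diagram above, here is a standard-library-only sketch of the usual WER and CER definitions (edit distance divided by reference length) plus a per-run aggregate. It is not the PR's code: the PR lists jiwer as a dependency, and its language-aware normalization, lenient WER, and WIP logic are not reproduced here. The function names and the use of population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def edit_distance(ref: list, hyp: list) -> int:
    """Levenshtein distance between two token sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def aggregate(scores: list[float]) -> dict:
    """Average and (population) standard deviation for one metric."""
    return {"avg": mean(scores), "std": pstdev(scores)}
```

Samples without a ground truth have no reference to divide by, which is consistent with the diagram counting them as skipped rather than scoring them.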

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs

Suggested labels

enhancement, feature, llm-chain, multimodal, stt-evaluations

Suggested reviewers

  • AkhileshNegi
  • kartpop

🐰 A chain of blocks now flows,
With images, PDFs that glow,
Metrics compute with careful care,
Sarvam speaks everywhere! ✨
URLs signed, the audio's clear—
What a feature-filled frontier! 🚀

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (2)
backend/app/services/stt_evaluations/helpers.py (2)

12-13: Strengthen storage typing and use explicit None checks.

object | None hides the required interface (get_signed_url) and weakens static guarantees. Use a protocol and storage is not None.

♻️ Proposed change
+from typing import Protocol
+
+class SignedUrlStorage(Protocol):
+    def get_signed_url(self, object_store_url: str) -> str: ...
+
 def build_file_schema(
     *,
     file: File,
     include_url: bool,
-    storage: object | None,
+    storage: SignedUrlStorage | None,
 ) -> FilePublic:
@@
-    if include_url and storage:
+    if include_url and storage is not None:
         schema.signed_url = storage.get_signed_url(file.object_store_url)
@@
 def build_file_schemas(
@@
-    storage: object | None,
+    storage: SignedUrlStorage | None,
 ) -> list[FilePublic]:
@@
-        if include_url and storage:
+        if include_url and storage is not None:
             schema.signed_url = storage.get_signed_url(file.object_store_url)

Also applies to: 25-26, 34-35, 49-50

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/stt_evaluations/helpers.py` around lines 12 - 13,
Replace the loose storage: object | None parameter with a typed Protocol that
declares the get_signed_url(...) method (e.g., class StorageProtocol: def
get_signed_url(self, ...): ...) and use that protocol as the parameter type
(StorageProtocol | None); then change all truthy checks of storage to explicit
comparisons using storage is not None in the helper functions that return
FilePublic (and the other occurrences noted at lines 25-26, 34-35, 49-50) so
callers and static checkers know the required interface and None is handled
explicitly.
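
A self-contained sketch of the suggestion above, for illustration only. `SignedUrlStorage` is the name used in the proposed diff; `FileRecord`, `FilePublicSketch`, and `FakeStorage` are hypothetical stand-ins for the project's File model, FilePublic schema, and cloud storage client.

```python
from collections.abc import Iterable
from dataclasses import dataclass
from typing import Optional, Protocol


class SignedUrlStorage(Protocol):
    """Structural type: anything with a matching get_signed_url qualifies."""

    def get_signed_url(self, object_store_url: str) -> str: ...


@dataclass
class FileRecord:  # stand-in for the File model
    id: int
    object_store_url: str


@dataclass
class FilePublicSketch:  # stand-in for the FilePublic schema
    id: int
    signed_url: Optional[str] = None


def build_file_schemas(
    *,
    files: Iterable[FileRecord],
    include_url: bool,
    storage: Optional[SignedUrlStorage],
) -> list[FilePublicSketch]:
    schemas = []
    for file in files:
        schema = FilePublicSketch(id=file.id)
        # Explicit None check: a storage object that happens to be falsy
        # (for example one defining __len__) is still used.
        if include_url and storage is not None:
            schema.signed_url = storage.get_signed_url(file.object_store_url)
        schemas.append(schema)
    return schemas


class FakeStorage:
    """Test double; satisfies the protocol without inheriting from it."""

    def get_signed_url(self, object_store_url: str) -> str:
        return object_store_url + "?signature=fake"


files = [FileRecord(id=1, object_store_url="s3://bucket/a.wav")]
with_urls = build_file_schemas(files=files, include_url=True, storage=FakeStorage())
without_urls = build_file_schemas(files=files, include_url=True, storage=None)
```

Because Protocol matching is structural, the existing storage class would not need to change as long as its get_signed_url signature is compatible.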

3-3: Prefer collections.abc.Iterable for Python 3.11+ style typing.

This aligns with modern stdlib typing style and matches Ruff’s UP035 guidance.

♻️ Proposed change
-from typing import Iterable
+from collections.abc import Iterable
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/stt_evaluations/helpers.py` at line 3, Replace usage of
typing.Iterable with collections.abc.Iterable in this module: update the import
line to import Iterable from collections.abc and update any type annotations
that reference Iterable (e.g., function signatures or variable annotations in
helpers.py) to use the imported Iterable from collections.abc so the code
follows Python 3.11+ typing style and Ruff UP035 guidance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/api/routes/stt_evaluations/files.py`:
- Around line 62-66: The docstring for list_audio is outdated: remove references
to the removed file_ids request body and update it to describe the current GET
signature that only accepts the include_url query parameter; specifically, in
the list_audio function's docstring mention that the endpoint returns all audio
files for the project and that include_url (boolean query param) controls
whether each returned file includes a presigned URL, and remove any language
about accepting file_ids in the request body.
- Around line 79-83: The audio endpoints currently call list_files (and the
single-file fetch) using only tenant/project scope which allows non-audio files
through; update the calls to pass file_type=FileType.AUDIO to list_files (the
invocation with session=_session, organization_id=..., project_id=...) and, for
the single-file fetch handler, verify the returned file's type (e.g., file.type
or file.file_type) equals FileType.AUDIO and return a 404/raise if not; ensure
FileType is imported/accessible in the module and adjust both places (the
list_files call around the list handler and the single-file fetch around the
get-by-id handler) so only audio files are returned.
- Around line 49-54: The GET handlers that currently set summary="List audio
files" (and the other GET at lines ~94-99) need external markdown descriptions:
add description=load_description("files/list.md") to the router.get(...) for the
"List audio files" endpoint and likewise add
description=load_description("files/<appropriate-action>.md") to the other GET
handler; ensure load_description is imported and use the same relative markdown
naming convention used elsewhere (e.g., domain/action.md) so Swagger pulls
descriptions from external files instead of inline text.

In `@backend/app/tests/api/routes/stt_evaluations/test_files.py`:
- Around line 77-80: Tests are calling the obsolete POST
/stt-evaluations/files/list contract; update the tests to hit the current GET
/stt-evaluations/files endpoint by replacing client.post(...) calls with
client.get(...) and move any JSON body payload (e.g., "file_ids") into query
params or omit as appropriate, ensuring calls match the route implemented in
backend/app/api/routes/stt_evaluations/files.py (GET /stt-evaluations/files) and
test helpers that reference the listing behavior (the client invocation sites
around the client.post lines in test_files.py).
- Around line 18-60: The three fixtures audio_file_1, audio_file_2, and
audio_file_3 duplicate create_file calls; replace them with a single factory
fixture (e.g., audio_file_factory) that wraps create_file and accepts parameters
like object_store_url, filename, size_bytes, content_type, file_type,
organization_id, and project_id, then update existing fixtures or tests to call
audio_file_factory(...) (or keep audio_file_1/2/3 as thin wrappers that call
audio_file_factory with the original arguments) so all file creation logic is
centralized in one place for consistency and easier extension.
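
The factory idea in the last comment can be illustrated without pytest. In the sketch below, `make_audio_file_factory` is a hypothetical helper and the default values are invented for the example; only the parameter names come from the comment above. In the test module the returned callable would presumably be exposed through a fixture that closes over the real create_file.

```python
from typing import Any, Callable


def make_audio_file_factory(create_file: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap create_file with audio defaults; keyword overrides take priority."""
    defaults = {
        # Illustrative defaults only, not taken from the PR's fixtures.
        "filename": "sample.wav",
        "size_bytes": 1024,
        "content_type": "audio/wav",
        "file_type": "audio",
    }

    def factory(**overrides: Any) -> Any:
        # Later keys win, so callers can change any single field.
        return create_file(**{**defaults, **overrides})

    return factory


# Usage with a fake create_file that simply echoes its keyword arguments.
audio_file_factory = make_audio_file_factory(lambda **kwargs: kwargs)
first = audio_file_factory(object_store_url="s3://bucket/one.wav")
second = audio_file_factory(filename="two.mp3", content_type="audio/mpeg")
```

Thin wrappers such as audio_file_1 can then call the factory with their original arguments, so existing tests keep working while new ones create files inline.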

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ecbae15 and 8b16363.

📒 Files selected for processing (12)
  • backend/app/api/routes/stt_evaluations/dataset.py
  • backend/app/api/routes/stt_evaluations/evaluation.py
  • backend/app/api/routes/stt_evaluations/files.py
  • backend/app/api/routes/stt_evaluations/result.py
  • backend/app/crud/file.py
  • backend/app/models/__init__.py
  • backend/app/models/file.py
  • backend/app/models/stt_evaluation.py
  • backend/app/services/stt_evaluations/audio.py
  • backend/app/services/stt_evaluations/helpers.py
  • backend/app/tests/api/routes/stt_evaluations/test_files.py
  • backend/app/tests/api/routes/stt_evaluations/test_stt_evaluation.py

@nishika26 nishika26 self-assigned this Feb 27, 2026
@nishika26 nishika26 added the enhancement New feature or request label Feb 27, 2026
@nishika26 nishika26 linked an issue Feb 27, 2026 that may be closed by this pull request
@codecov

codecov bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 60.81081% with 29 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| backend/app/api/routes/stt_evaluations/files.py | 38.46% | 16 Missing ⚠️ |
| backend/app/services/stt_evaluations/helpers.py | 26.66% | 11 Missing ⚠️ |
| backend/app/crud/file.py | 60.00% | 2 Missing ⚠️ |

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
backend/app/api/routes/stt_evaluations/files.py (1)

77-81: ⚠️ Potential issue | 🟠 Major

Enforce FileType.AUDIO in both audio GET handlers.

Line 77 and Line 114 currently scope by org/project only, so non-audio rows can leak through these audio-specific endpoints. Filter list results to audio and return 404 for non-audio in get_audio.

🐛 Proposed fix
     files = list_files(
         session=_session,
         organization_id=auth_context.organization_.id,
         project_id=auth_context.project_.id,
     )
+    files = [f for f in files if f.file_type == FileType.AUDIO.value]
@@
-    if not file:
+    if not file or file.file_type != FileType.AUDIO.value:
         raise HTTPException(status_code=404, detail=f"File with ID {file_id} not found")

Also applies to: 114-122

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/stt_evaluations/files.py` around lines 77 - 81, The
audio endpoints currently call list_files without restricting by type, so
non-audio rows can be returned; update the list_files calls used in the audio
GET handlers to include file_type=FileType.AUDIO (referencing the existing
list_files call and FileType enum) and, in get_audio (the handler that fetches a
single file), add a post-fetch guard that returns a 404 if the retrieved
file.file_type is not FileType.AUDIO; this ensures both the listing and
single-file retrieval endpoints only expose audio files.
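
The two guards described above, reduced to a runnable sketch. The `FileType` values, the `FileRow` dataclass, and the `NotFound` exception (standing in for FastAPI's HTTPException with status 404) are assumptions made so the example runs on its own; the comparison against `FileType.AUDIO.value` follows the proposed fix.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FileType(str, Enum):
    # Member values are assumed for this sketch.
    AUDIO = "audio"
    DOCUMENT = "document"


@dataclass
class FileRow:  # stand-in for the File model
    id: int
    file_type: str


class NotFound(Exception):
    """Stand-in for HTTPException(status_code=404, ...)."""


def only_audio(files: list[FileRow]) -> list[FileRow]:
    """List guard: drop every row that is not an audio file."""
    return [f for f in files if f.file_type == FileType.AUDIO.value]


def get_audio_or_404(file: Optional[FileRow], file_id: int) -> FileRow:
    """Single-file guard: a missing row and a non-audio row look the same."""
    if file is None or file.file_type != FileType.AUDIO.value:
        raise NotFound(f"File with ID {file_id} not found")
    return file


rows = [FileRow(1, "audio"), FileRow(2, "document")]
audio_rows = only_audio(rows)
```

Returning the same 404 for non-audio rows avoids revealing that a file with that ID exists under another type. Passing file_type into list_files, as the prompt suggests, would push the filter into the query instead of filtering in Python.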

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b16363 and 5fe8e5c.

📒 Files selected for processing (3)
  • backend/app/api/docs/stt_evaluation/get_audio.md
  • backend/app/api/docs/stt_evaluation/list_audios.md
  • backend/app/api/routes/stt_evaluations/files.py
✅ Files skipped from review due to trivial changes (1)
  • backend/app/api/docs/stt_evaluation/list_audios.md

@nishika26 nishika26 changed the title audio files: get endpoint and signed url audio files: list endpoint Feb 28, 2026
@nishika26 nishika26 added on hold and removed enhancement New feature or request labels Feb 28, 2026
@nishika26 nishika26 removed the on hold label Mar 9, 2026
@nishika26
Collaborator Author

Closing this PR because merge conflicts made it too much of a mess.

@nishika26 nishika26 closed this Mar 10, 2026
@Ayush8923 Ayush8923 deleted the enhancement/get_files_api branch April 1, 2026 08:56

4 participants