
feat: adjustable meditation duration with dynamic HLS fade #9

Merged
HatmanStack merged 2 commits into main from timer-update on Dec 30, 2025

Conversation

@HatmanStack (Owner) commented Dec 30, 2025

Backend:

  • Add duration_minutes parameter (3, 5, 10, 15, 20 min options)
  • Dynamic fade detection: stream without fade, then re-process final segments with proper fade timing based on actual voice duration
  • Work around ffprobe missing from the Lambda layer by using ffmpeg for duration detection
  • Update word targets (~150 wpm) with emphasis on pauses/breathing
  • Update to gpt-4o-mini-tts with sage voice and meditation instructions
  • Fix duration serialization in MeditationRequest.to_dict()

Frontend:

  • Add duration selector pills (right-justified)
  • Add "New" button to dismiss current meditation
  • Fix React hooks called conditionally error
  • Use useRef for duration to avoid state race condition

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Users can choose meditation duration (3, 5, 10, 15, 20 minutes) before generating.
  • Improvements

    • Clearer voice with a slight boost and reduced background music for better balance.
    • Improved meditation voice generation and pacing tuned to selected duration.
    • Refined playback controls and layout for a smoother experience.
  • Tests

    • Button label updated to "Generate" and generation now starts with the default duration.


coderabbitai bot commented Dec 30, 2025

📝 Walkthrough

This PR adds selectable meditation durations (3, 5, 10, 15, and 20 minutes) end-to-end: UI, request model, Lambda handling, AI/TTS generation, and HLS audio processing. It also updates the TTS model, voice, and audio constants, and introduces HLS fade post-processing with duration-estimation propagation.

Changes

Cohort / File(s): Summary

  • Configuration (backend/src/config/constants.py): Updated DEFAULT_MUSIC_VOLUME_REDUCTION from -5 to -15; added DEFAULT_VOICE_BOOST = 3.
  • Request Models (backend/src/models/requests.py): Added duration_minutes: int = 5 to MeditationRequest, with validation against the allowed values [3, 5, 10, 15, 20]; parsing and serialization updated (see the sketch after this list).
  • Service Abstractions (backend/src/services/ai_service.py): AIService.generate_meditation signature extended to accept duration_minutes: int = 5.
  • AI Implementation (backend/src/services/gemini_service.py): generate_meditation now accepts duration_minutes, uses a duration-targets mapping, switches the model to gemini-2.0-flash, and builds prompts using the duration targets.
  • TTS Provider (backend/src/providers/openai_tts.py): Switched to gpt-4o-mini-tts with a calmer meditation voice and added meditation instructions to the TTS instructions payload.
  • Audio Processing (backend/src/services/ffmpeg_audio_service.py): process_stream_to_hls gained estimated_voice_duration: float = 60.0; added _get_audio_duration_from_file and _apply_fade_to_segments; two-phase streaming plus post-process fade and playlist finalization (see the fade-timing sketch after the sequence diagram); TTS duration estimation updated (~80 wpm, 90 s buffer).
  • Lambda Handler (backend/src/handlers/lambda_handler.py): Propagates duration_minutes through request handling, AI calls, TTS/streaming progress callbacks, and logs; emits estimated_voice_duration in progress metadata.
  • Frontend Hook (frontend/app/(tabs)/explore.tsx): Added handleMeditationCall(duration: number = 5); stores the duration in a ref and passes it to the backend call.
  • Backend Call Wrapper (frontend/components/BackendMeditationCall.tsx): Added BackendMeditationCallStreaming(..., durationMinutes: number = 5); the request payload includes duration_minutes.
  • Meditation Controls UI (frontend/components/ScreenComponents/MeditationControls.tsx): Added DURATION_OPTIONS (3, 5, 10, 15, 20), a duration selector UI, the handleMeditationCall(durationMinutes) prop change, new optional callbacks (onStreamComplete, onStreamError, onPlaybackEnd), and UI/layout adjustments.
  • Tests (tests/frontend/unit/MeditationControls-test.tsx): Updated expectations: button label changed to "Generate"; assert that handleMeditationCall is called with the default duration of 5.
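To make the request parsing concrete, here is a minimal sketch of the coercion and validation described for MeditationRequest (the helper name parse_duration_minutes and the constant names are hypothetical, not the PR's actual code):

ALLOWED_DURATIONS = {3, 5, 10, 15, 20}
DEFAULT_DURATION_MINUTES = 5

def parse_duration_minutes(body: dict) -> int:
    """Coerce and validate duration_minutes, falling back to the default."""
    raw = body.get("duration_minutes", DEFAULT_DURATION_MINUTES)
    try:
        value = int(raw)  # accepts "10" (string) as well as 10 (int)
    except (TypeError, ValueError):
        return DEFAULT_DURATION_MINUTES
    return value if value in ALLOWED_DURATIONS else DEFAULT_DURATION_MINUTES

Any unsupported or malformed value quietly degrades to the 5-minute default, matching the fallback behavior the review notes below confirm.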

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor User
    participant FrontendUI as Frontend UI
    participant Hook as useMeditation Hook
    participant BackendCall as BackendMeditationCall
    participant Lambda as Lambda Handler
    participant AIService as AI Service
    participant TTSProvider as TTS Provider
    participant FFmpegService as FFmpeg Service
    participant S3 as S3/HLS

    Note over FrontendUI,Hook: User selects duration and triggers generation
    User->>FrontendUI: Select duration (e.g., 10)
    FrontendUI->>Hook: handleMeditationCall(10)
    Hook->>BackendCall: BackendMeditationCallStreaming(..., durationMinutes: 10)
    BackendCall->>Lambda: POST payload (duration_minutes: 10)

    Note over Lambda,AIService: Meditation text generation includes duration
    Lambda->>AIService: generate_meditation(input_data, duration_minutes: 10)
    AIService-->>Lambda: meditation_text (targeted to duration)

    Note over Lambda,TTSProvider: TTS streaming with estimated duration
    Lambda->>TTSProvider: stream_speech(meditation_text)
    TTSProvider-->>Lambda: audio stream (chunks), reports estimated_voice_duration

    Note over Lambda,FFmpegService: HLS processing with two-phase fade
    Lambda->>FFmpegService: process_stream_to_hls(voice_stream, music, ..., estimated_voice_duration)
    FFmpegService->>FFmpegService: save voice temp file -> extract actual duration
    FFmpegService->>FFmpegService: initial segments uploaded
    FFmpegService->>FFmpegService: apply fade to tail segments (post-process)
    FFmpegService->>S3: upload faded segments and finalize playlist
    FFmpegService-->>Lambda: HLS metadata + segment durations

    Lambda-->>BackendCall: streaming response (HLS URL, metadata)
    BackendCall-->>Hook: update state with HLS URL and durations
    Hook-->>FrontendUI: start playback
    FrontendUI->>User: Play meditation
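To illustrate the two-phase fade from the walkthrough, here is a rough sketch of how the fade start time and the first segment to re-render could be derived once the actual voice duration is known (all names and constants here are assumptions, not the PR's implementation):

HLS_SEGMENT_DURATION = 10.0  # assumed seconds per HLS segment
SILENCE_LEAD_IN = 5.0        # assumed silence prepended before the voice
FADE_SECONDS = 8.0           # assumed fade-out length

def plan_fade(voice_duration, segment_durations):
    """Pick the fade start time and the first uploaded segment to re-render.

    Phase 1 streams segments with no fade for fast delivery; once the real
    voice duration is measured, only the segments overlapping the fade
    window are re-processed and re-uploaded in place.
    """
    total = SILENCE_LEAD_IN + voice_duration
    fade_start = max(0.0, total - FADE_SECONDS)
    elapsed = 0.0
    for index, seg in enumerate(segment_durations):
        if elapsed + seg > fade_start:
            return index, fade_start  # this segment contains the fade start
        elapsed += seg
    return max(len(segment_durations) - 1, 0), fade_start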

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hop through code with a trusty cue,
Minutes chosen — three, five, ten, or two?
Voices softened, music fades slow,
I stitch the seconds where calm can grow.
✨🎧

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 50.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title clearly summarizes the main changes, covering both the primary feature (adjustable duration) and the key technical implementation (dynamic HLS fade); it is specific and descriptive.
✨ Finishing touches
  • 📝 Generate docstrings


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
backend/src/services/ffmpeg_audio_service.py (3)

493-512: Consider consolidating with existing get_audio_duration method.

The _get_audio_duration_from_file method duplicates the logic in get_audio_duration (lines 70-87). Both parse ffmpeg stderr for duration. The only differences are the timeout parameter and error handling.

🔎 Proposed consolidation
-    def _get_audio_duration_from_file(self, audio_path: str) -> float:
-        """Get audio duration using ffmpeg (ffprobe not available in Lambda layer)."""
-        try:
-            # Use ffmpeg to probe the file - same approach as get_audio_duration
-            result = subprocess.run(
-                [self.ffmpeg_executable, "-i", audio_path, "-f", "null", "-"],
-                stderr=subprocess.PIPE,
-                text=True,
-                timeout=30,
-            )
-            # Parse duration from stderr output
-            for line in result.stderr.split("\n"):
-                if "Duration" in line:
-                    duration_str = line.split(",")[0].split("Duration:")[1].strip()
-                    h, m, s = map(float, duration_str.split(":"))
-                    return h * 3600 + m * 60 + s
-            return 0.0
-        except Exception as e:
-            logger.warning(f"Could not get audio duration: {e}")
-            return 0.0
+    def _get_audio_duration_from_file(self, audio_path: str) -> float:
+        """Get audio duration using ffmpeg (ffprobe not available in Lambda layer)."""
+        return self.get_audio_duration(audio_path)

Then optionally add timeout support to the existing get_audio_duration method if needed.


556-557: Use round() instead of int() for delay calculation.

Converting float seconds to int milliseconds using int() truncates the fractional part, which may cause slight timing drift. Using round() provides more accurate timing.

🔎 Proposed fix
             f"[0:a]volume={DEFAULT_VOICE_BOOST}dB,"
-            f"adelay={int(max(0, DEFAULT_SILENCE_DURATION - redo_start_time) * 1000)}|{int(max(0, DEFAULT_SILENCE_DURATION - redo_start_time) * 1000)},"
+            f"adelay={round(max(0, DEFAULT_SILENCE_DURATION - redo_start_time) * 1000)}|{round(max(0, DEFAULT_SILENCE_DURATION - redo_start_time) * 1000)},"
             f"apad=pad_dur={trailing_music_seconds}[voice_padded];"

604-624: Clarify estimated_voice_duration parameter usage in docstring.

The docstring mentions "apply fade to final segments" but doesn't explain how estimated_voice_duration is used. This parameter drives the fade timing calculation in _apply_fade_to_segments.

🔎 Proposed enhancement
         estimated_voice_duration: float = 60.0,
     ) -> tuple[int, List[float]]:
         """
         Stream TTS to HLS segments, then apply fade to final segments.

         Strategy:
         1. Stream without fade for fast delivery
         2. Save TTS audio to temp file while streaming
         3. After TTS completes, re-process last few segments with proper fade

         Args:
             voice_generator: Iterator yielding audio chunks from TTS
             music_path: Path to downloaded background music file
             user_id: User identifier
             job_id: Job identifier
             progress_callback: Optional callback(segments_completed, segments_total)
-            estimated_voice_duration: Fallback if duration detection fails
+            estimated_voice_duration: Expected voice duration in seconds (used for fade timing if actual duration cannot be detected)

         Returns:
             Tuple of (total_segments, segment_durations)
         """
backend/src/models/requests.py (1)

108-115: Consider reducing debug logging verbosity.

Lines 109 and 115 use print() for detailed debug output during duration_minutes parsing. While useful during development, logging at this level for every request is excessive in production. Consider switching to logger.debug() or removing the statements once the feature stabilizes.

🔎 Proposed refinement
         duration_minutes = body.get("duration_minutes", 5)
-        print(f"[MEDITATION_REQUEST] Raw duration_minutes from body: {body.get('duration_minutes')} -> parsed: {duration_minutes}")
+        logger.debug(f"Parsing duration_minutes: {body.get('duration_minutes')} -> {duration_minutes}")
         if isinstance(duration_minutes, str):
             try:
                 duration_minutes = int(duration_minutes)
             except ValueError:
                 duration_minutes = 5
-        print(f"[MEDITATION_REQUEST] Final duration_minutes: {duration_minutes}")
+        logger.debug(f"Final duration_minutes: {duration_minutes}")
frontend/components/ScreenComponents/MeditationControls.tsx (1)

325-348: Consider adding accessibility labels for screen reader support.

The duration selector works well visually. For better accessibility, consider adding accessibilityLabel and accessibilityRole to the duration option Pressables.

🔎 Suggested accessibility improvement
             <Pressable
               key={option.value}
               onPress={() => setSelectedDuration(option.value)}
               style={[
                 localStyles.durationOption,
                 selectedDuration === option.value && localStyles.durationOptionSelected,
               ]}
               testID={`duration-${option.value}`}
+              accessibilityRole="button"
+              accessibilityLabel={`${option.value} minute meditation`}
+              accessibilityState={{ selected: selectedDuration === option.value }}
             >
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 39c7c04 and 9af1d31.

📒 Files selected for processing (11)
  • backend/src/config/constants.py
  • backend/src/handlers/lambda_handler.py
  • backend/src/models/requests.py
  • backend/src/providers/openai_tts.py
  • backend/src/services/ai_service.py
  • backend/src/services/ffmpeg_audio_service.py
  • backend/src/services/gemini_service.py
  • frontend/app/(tabs)/explore.tsx
  • frontend/components/BackendMeditationCall.tsx
  • frontend/components/ScreenComponents/MeditationControls.tsx
  • tests/frontend/unit/MeditationControls-test.tsx
🧰 Additional context used
🧬 Code graph analysis (8)
frontend/components/BackendMeditationCall.tsx (1)
frontend/types/api.ts (1)
  • JobStatusResponse (36-47)
frontend/components/ScreenComponents/MeditationControls.tsx (3)
frontend/constants/Colors.ts (1)
  • Colors (4-3888)
frontend/components/ThemedText.tsx (1)
  • ThemedText (27-116)
frontend/components/ThemedView.tsx (1)
  • ThemedView (17-26)
backend/src/providers/openai_tts.py (2)
backend/src/services/tts_service.py (2)
  • TTSService (5-14)
  • stream_speech (11-11)
backend/src/providers/gemini_tts.py (1)
  • stream_speech (18-45)
backend/src/services/gemini_service.py (1)
backend/src/utils/logging_utils.py (1)
  • format (60-82)
backend/src/models/requests.py (1)
backend/src/config/constants.py (1)
  • InferenceType (4-6)
backend/src/handlers/lambda_handler.py (2)
backend/src/services/ai_service.py (1)
  • generate_meditation (13-13)
backend/src/services/gemini_service.py (1)
  • generate_meditation (213-245)
backend/src/services/ai_service.py (1)
backend/src/services/gemini_service.py (1)
  • generate_meditation (213-245)
backend/src/services/ffmpeg_audio_service.py (3)
backend/src/services/audio_service.py (1)
  • get_audio_duration (8-8)
backend/src/services/hls_service.py (2)
  • upload_segment_from_file (90-112)
  • finalize_playlist (183-194)
backend/src/handlers/lambda_handler.py (2)
  • progress_callback (285-291)
  • progress_callback (341-356)
🔇 Additional comments (16)
tests/frontend/unit/MeditationControls-test.tsx (1)

49-96: LGTM!

Test updates correctly reflect the new duration-aware generation flow. The test now verifies that the default duration of 5 minutes is passed to handleMeditationCall, and the button label change from "Generate Meditation" to "Generate" is properly validated.

backend/src/config/constants.py (1)

28-29: Verify audio balance with 10dB music reduction.

The music volume reduction changed from -5 dB to -15 dB, a 10 dB drop that cuts the music's amplitude to roughly a third of its previous level (10 dB ≈ 3.16x in amplitude). Combined with the new 3 dB voice boost, this creates a much more prominent voice-to-music ratio. Ensure this balance aligns with the desired meditation experience and doesn't make the music too subtle.

Consider testing the audio mix with a few sample meditations to confirm the new balance feels natural.
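As a quick sanity check on those numbers, converting decibel gains to linear amplitude with 10 ** (dB / 20):

old_music = 10 ** (-5 / 20)   # ≈ 0.562
new_music = 10 ** (-15 / 20)  # ≈ 0.178, roughly a third of full scale
voice = 10 ** (3 / 20)        # ≈ 1.413
# voice-to-music amplitude ratio rises from ~1.8x to ~7.9x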

backend/src/providers/openai_tts.py (2)

12-17: LGTM! Meditation-specific voice instructions enhance TTS quality.

The new MEDITATION_VOICE_INSTRUCTIONS constant provides clear guidance for pacing, tone, and pause handling, which should significantly improve the meditation audio quality.


30-36: No action needed. The model gpt-4o-mini-tts-2025-12-15 and the voice sage are both valid and available in the OpenAI Text-to-Speech API as documented: gpt-4o-mini-tts-2025-12-15 is an actively maintained snapshot with documented improvements, and sage is a standard built-in voice for the Speech endpoint.
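For orientation, a minimal sketch of what the provider's streaming call might look like with the openai Python SDK; the instruction text is invented and stream_speech is a stand-in for the provider's actual method, so treat the details as assumptions:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical wording; the PR's actual MEDITATION_VOICE_INSTRUCTIONS differ.
MEDITATION_VOICE_INSTRUCTIONS = (
    "Speak slowly and calmly. Leave long, unhurried pauses between sentences."
)

def stream_speech(text: str):
    """Yield audio chunks from the TTS endpoint as they arrive."""
    with client.audio.speech.with_streaming_response.create(
        model="gpt-4o-mini-tts",
        voice="sage",
        input=text,
        instructions=MEDITATION_VOICE_INSTRUCTIONS,
        response_format="mp3",
    ) as response:
        for chunk in response.iter_bytes():
            yield chunk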

backend/src/models/requests.py (1)

37-44: LGTM! Duration validation and defaults are well-implemented.

The duration_minutes field with validation against allowed values [3, 5, 10, 15, 20] and fallback to 5 provides good guardrails for the meditation duration feature.

backend/src/services/ai_service.py (1)

12-13: LGTM! Abstract method signature updated correctly.

The addition of the duration_minutes parameter with a default value of 5 is consistent with the PR objectives and aligns with the implementation in GeminiAIService.

frontend/app/(tabs)/explore.tsx (1)

69-69: LGTM! useRef correctly avoids state race condition.

Using useRef for durationRef and setting it synchronously before triggering setIsCalling(true) (line 177) is the right approach to ensure the duration value is captured correctly before the effect runs. This avoids the race condition that could occur with state updates.

Also applies to: 167-179

backend/src/handlers/lambda_handler.py (2)

116-116: LGTM! Duration parameter properly propagated.

The duration_minutes parameter is correctly passed through the meditation generation flow and included in logging for both base64 and HLS streaming paths. This aligns with the PR objectives.

Also applies to: 198-207, 305-316


318-322: Verify WPM reduction from ~150 to ~80 words/minute.

The TTS duration estimation was updated from ~150 wpm to ~80 wpm, and the buffer increased from 30s to 90s. This change reflects a much slower, more deliberate speaking pace appropriate for meditation. Confirm this aligns with the actual speaking rate of the new "sage" voice with meditation instructions.

You may want to measure the actual speaking rate of a few generated meditations to calibrate this estimate more accurately. An incorrect estimate could result in fade timing being off.
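The estimate presumably reduces to something like the following (a sketch; the function name is hypothetical):

WORDS_PER_MINUTE = 80  # slow, deliberate meditation pace
BUFFER_SECONDS = 90    # headroom for pauses and TTS variability

def estimate_voice_duration(meditation_text: str) -> float:
    """Fallback duration estimate when the actual audio length is unknown."""
    word_count = len(meditation_text.split())
    return word_count / WORDS_PER_MINUTE * 60 + BUFFER_SECONDS

If this estimate feeds fade timing, overestimating errs toward starting the fade late rather than clipping the voice early.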

backend/src/services/gemini_service.py (2)

108-152: LGTM! Well-structured duration-aware prompt template.

The template correctly uses double braces {{ and }} to escape literal JSON braces within the Python format string. The duration targets (~150 wpm) are reasonable for spoken meditation with pauses.
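A tiny illustration of the brace escaping (the template text is hypothetical):

prompt_template = (
    "Write a {duration_minutes}-minute meditation script. "
    'Return JSON shaped like {{"title": "...", "script": "..."}}.'
)
print(prompt_template.format(duration_minutes=10))
# Write a 10-minute meditation script. Return JSON shaped like {"title": "...", "script": "..."}.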


213-242: Implementation looks correct with good observability.

The fallback to 5-minute targets when an unsupported duration is requested is a safe defensive choice. The logging of generated word/char counts against targets will help tune the prompt over time. The model name gemini-2.0-flash is the correct auto-updated alias for Google's Gemini 2.0 Flash API.
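The duration-targets mapping with its defensive fallback might look roughly like this (a sketch derived from the stated ~150 wpm; the names are hypothetical):

DURATION_TARGETS = {
    minutes: {"target_words": minutes * 150}  # ~150 wpm including pauses
    for minutes in (3, 5, 10, 15, 20)
}

def targets_for(duration_minutes: int) -> dict:
    """Fall back to the 5-minute targets for unsupported durations."""
    return DURATION_TARGETS.get(duration_minutes, DURATION_TARGETS[5])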

frontend/components/BackendMeditationCall.tsx (1)

309-322: LGTM! Clean integration of duration parameter.

The parameter follows JavaScript camelCase convention while the payload correctly uses snake_case to match the backend API contract. Default of 5 minutes aligns with the backend default.

frontend/components/ScreenComponents/MeditationControls.tsx (4)

10-19: Good placement of constant outside component.

Defining DURATION_OPTIONS at module scope avoids unnecessary array recreation on each render.


131-132: Correct hook placement to satisfy React rules.

Moving this state declaration above the early returns prevents the "React hooks called conditionally" error mentioned in the PR objectives.


210-265: Well-structured streaming mode UI with proper state management.

The layout correctly implements the "New" button requirement from PR objectives. The error state with retry and the play/pause toggle are handled cleanly.


353-416: Clean, well-organized styles.

Using StyleSheet.create ensures styles are validated and optimized. The spacer pattern with flex: 1 effectively centers the main button while right-aligning the duration selector.

Only replace existing segments when applying fade - don't create new
segments beyond original total_segments. Log warning if fade processing
generates more segments than expected.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
backend/src/services/ffmpeg_audio_service.py (2)

493-512: Eliminate code duplication with get_audio_duration.

This method duplicates the logic from get_audio_duration (lines 70-87). The only differences are the timeout parameter and error handling approach. Consider refactoring to call the existing method or extract the common parsing logic.

🔎 Proposed refactor
 def _get_audio_duration_from_file(self, audio_path: str) -> float:
-    """Get audio duration using ffmpeg (ffprobe not available in Lambda layer)."""
-    try:
-        # Use ffmpeg to probe the file - same approach as get_audio_duration
-        result = subprocess.run(
-            [self.ffmpeg_executable, "-i", audio_path, "-f", "null", "-"],
-            stderr=subprocess.PIPE,
-            text=True,
-            timeout=30,
-        )
-        # Parse duration from stderr output
-        for line in result.stderr.split("\n"):
-            if "Duration" in line:
-                duration_str = line.split(",")[0].split("Duration:")[1].strip()
-                h, m, s = map(float, duration_str.split(":"))
-                return h * 3600 + m * 60 + s
-        return 0.0
-    except Exception as e:
-        logger.warning(f"Could not get audio duration: {e}")
-        return 0.0
+    """Get audio duration using ffmpeg (ffprobe not available in Lambda layer)."""
+    try:
+        duration = self.get_audio_duration(audio_path)
+        return duration if duration > 0 else 0.0
+    except Exception as e:
+        logger.warning(f"Could not get audio duration: {e}")
+        return 0.0

Note: You may want to add a timeout parameter to get_audio_duration as well for consistency.


551-556: Simplify complex expressions for readability.

The nested ternary expressions for the -ss parameter (line 551) and adelay values (line 556) are difficult to read and verify. Consider extracting these calculations into named variables with comments explaining the logic.

🔎 Proposed refactor
+        # Calculate voice start offset (account for silence if needed)
+        voice_start_offset = max(0, redo_start_time - DEFAULT_SILENCE_DURATION)
+        
+        # Calculate delay to apply to voice (if starting before silence ends)
+        voice_delay_ms = int(max(0, DEFAULT_SILENCE_DURATION - redo_start_time) * 1000)
+        
         ffmpeg_fade_cmd = [
             self.ffmpeg_executable,
-            "-ss", str(redo_start_time - DEFAULT_SILENCE_DURATION) if redo_start_time > DEFAULT_SILENCE_DURATION else "0",
+            "-ss", str(voice_start_offset),
             "-i", voice_temp_path,
             "-stream_loop", "-1", "-ss", str(redo_start_time), "-i", music_path,
             "-filter_complex",
             f"[0:a]volume={DEFAULT_VOICE_BOOST}dB,"
-            f"adelay={int(max(0, DEFAULT_SILENCE_DURATION - redo_start_time) * 1000)}|{int(max(0, DEFAULT_SILENCE_DURATION - redo_start_time) * 1000)},"
+            f"adelay={voice_delay_ms}|{voice_delay_ms},"
             f"apad=pad_dur={trailing_music_seconds}[voice_padded];"
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9af1d31 and 8c20354.

📒 Files selected for processing (1)
  • backend/src/services/ffmpeg_audio_service.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/src/services/ffmpeg_audio_service.py (3)
backend/src/services/audio_service.py (1)
  • get_audio_duration (8-8)
backend/src/services/hls_service.py (2)
  • upload_segment_from_file (90-112)
  • finalize_playlist (183-194)
backend/src/handlers/lambda_handler.py (2)
  • progress_callback (285-291)
  • progress_callback (341-356)
🔇 Additional comments (3)
backend/src/services/ffmpeg_audio_service.py (3)

13-17: LGTM!

The addition of DEFAULT_VOICE_BOOST to the imports is appropriate and aligns with its usage in the audio processing filters.


743-752: Temp file writing on FFmpeg early exit: no change needed.

If FFmpeg exits early (detected at line 746), the loop raises an exception before the chunk is written to voice_file at line 752. Both the stdin write and the temp-file write happen only after the liveness check, so the current order already avoids unnecessary I/O.

🔎 Relevant code
             # Stream voice data to FFmpeg stdin AND save to temp file
             with open(voice_temp_path, "wb") as voice_file:
                 for chunk in voice_generator:
                     # Check if FFmpeg is still running before writing
                     if process.poll() is not None:
                         stderr = process.stderr.read().decode()
                         logger.error(f"FFmpeg exited early: {stderr}")
                         raise Exception(f"FFmpeg exited unexpectedly: {stderr}")
                     process.stdin.write(chunk)
                     process.stdin.flush()
                     voice_file.write(chunk)  # Save for fade processing


777-795: Fade processing and finalization workflow is well-structured.

The two-phase approach is sound:

  1. Stream without fade for fast initial delivery
  2. Re-process final segments with proper fade based on actual voice duration
  3. Finalize playlist with updated segment durations

This ensures both responsiveness and proper audio fade-out.

Comment on lines +579 to +592
                if segment_index >= total_segments:
                    logger.warning(f"Skipping fade segment {i}: index {segment_index} >= total_segments {total_segments}")
                    continue

                seg_duration = self.get_audio_duration(fade_segment)
                if seg_duration == 0:
                    seg_duration = float(HLS_SEGMENT_DURATION)

                self.hls_service.upload_segment_from_file(user_id, job_id, segment_index, fade_segment)
                logger.info(f"Re-uploaded faded segment {segment_index}")

                # Update duration (only for existing indices)
                if segment_index < len(segment_durations):
                    segment_durations[segment_index] = seg_duration

⚠️ Potential issue | 🟡 Minor

Past issue addressed, but verify duration list consistency.

The boundary check at line 579 correctly addresses the previous review concern by skipping segments beyond total_segments. However, there's a subtle issue at line 591: if segment_index < total_segments but segment_index >= len(segment_durations), the segment will be uploaded without its duration being recorded. While segment_durations and total_segments should typically be in sync, defensive coding suggests handling this edge case.

🔎 Proposed fix
             for i, fade_segment in enumerate(fade_segments):
                 segment_index = first_segment_to_redo + i
                 if segment_index >= total_segments:
                     logger.warning(f"Skipping fade segment {i}: index {segment_index} >= total_segments {total_segments}")
                     continue

                 seg_duration = self.get_audio_duration(fade_segment)
                 if seg_duration == 0:
                     seg_duration = float(HLS_SEGMENT_DURATION)

                 self.hls_service.upload_segment_from_file(user_id, job_id, segment_index, fade_segment)
                 logger.info(f"Re-uploaded faded segment {segment_index}")

-                # Update duration (only for existing indices)
-                if segment_index < len(segment_durations):
-                    segment_durations[segment_index] = seg_duration
+                # Update duration (extend list if needed, but only within total_segments)
+                while len(segment_durations) <= segment_index:
+                    segment_durations.append(float(HLS_SEGMENT_DURATION))
+                segment_durations[segment_index] = seg_duration
🤖 Prompt for AI Agents
In backend/src/services/ffmpeg_audio_service.py around lines 579 to 592, the
code uploads a faded segment but only writes its duration if segment_index <
len(segment_durations); if segment_durations is shorter than total_segments this
leaves the uploaded segment without a recorded duration. Before assigning
segment_durations[segment_index] = seg_duration, ensure the list is long enough
by extending it (e.g., append default durations or repeat HLS_SEGMENT_DURATION)
up to segment_index, then set the duration; do this defensively so any uploaded
segment always has a corresponding duration entry.

        user_id: str,
        job_id: str,
        progress_callback: Optional[Callable[[int, Optional[int]], None]] = None,
        estimated_voice_duration: float = 60.0,

⚠️ Potential issue | 🟡 Minor

Unused parameter should serve as fallback.

The estimated_voice_duration parameter is never used. If _get_audio_duration_from_file returns 0 (line 778), no fade is applied. Consider using estimated_voice_duration as a fallback when duration detection fails to ensure fade processing always occurs.

🔎 Proposed fix
         # Get actual voice duration and apply fade to final segments
         actual_voice_duration = self._get_audio_duration_from_file(voice_temp_path)
+        if actual_voice_duration == 0:
+            logger.warning(f"Duration detection failed, using estimate: {estimated_voice_duration}s")
+            actual_voice_duration = estimated_voice_duration
+        
         if actual_voice_duration > 0:
             logger.info(f"Voice duration: {actual_voice_duration:.1f}s, applying fade to final segments")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In backend/src/services/ffmpeg_audio_service.py around line 606, the parameter
estimated_voice_duration is declared but never used; update the logic where you
call _get_audio_duration_from_file (around line ~778) so that if the detected
duration is 0 or falsy you fall back to estimated_voice_duration (validate it's
> 0), then use that resulting duration for fade calculations and processing;
ensure you clamp/validate the fallback to a sensible minimum and replace the
earlier branch that skipped fades when duration == 0 so fades always apply using
the detected-or-estimated duration.

@HatmanStack merged commit a3df4db into main on Dec 30, 2025 (10 checks passed)
@HatmanStack deleted the timer-update branch on December 30, 2025 at 20:23