try to fix transcript stuff #43
Pull request overview
This PR refactors the transcript processing logic in the get-clip handler to improve how speaker labels are handled in SRT transcript files. The change addresses issues with inline speaker prefixes by parsing them out and grouping consecutive dialogue by speaker into clean labeled blocks.
Changes:
- Import and use the `detectSpeaker` utility function for parsing speaker information from SRT entries
- Replace simple text concatenation with speaker-aware grouping logic that consolidates consecutive entries by the same speaker
- Move speaker label formatting to the fallback code path where it's actually needed
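The grouping described in the list above might look roughly like the following. This is a minimal sketch, not the PR's actual code: `detectSpeakerSketch` is a hypothetical stand-in for the real `detectSpeaker` utility (whose implementation is not shown here), and the `{ text }` entry shape and the `groupBySpeaker` name are assumptions for illustration.

```javascript
// Hypothetical stand-in for the PR's detectSpeaker utility. The real
// implementation is not shown in this review; this version only assumes
// it returns { speaker, dialogue } with speaker === null when no
// "Name: text" prefix is found.
function detectSpeakerSketch(text) {
  const match = /^\s*([^:\n]{1,40}):\s+(\S[\s\S]*)$/.exec(text);
  if (!match) return { speaker: null, dialogue: text };
  return { speaker: match[1].trim(), dialogue: match[2].trim() };
}

// Consolidate consecutive entries by the same speaker into labeled blocks.
function groupBySpeaker(entries) {
  const blocks = [];
  let currentSpeaker = null;
  let currentLines = [];

  // Emit the accumulated lines as one block, labeled if a speaker is known.
  const flush = () => {
    if (currentLines.length === 0) return;
    const text = currentLines.join(' ');
    blocks.push(currentSpeaker ? `${currentSpeaker}: ${text}` : text);
    currentLines = [];
  };

  for (const entry of entries) {
    const { speaker, dialogue } = detectSpeakerSketch(entry.text);
    // An entry without a prefix continues the current speaker's block.
    const entrySpeaker = speaker ?? currentSpeaker;
    if (entrySpeaker !== currentSpeaker) flush();
    currentSpeaker = entrySpeaker;
    if (dialogue.trim()) currentLines.push(dialogue.trim());
  }
  flush();
  return blocks.join('\n\n');
}

const transcript = groupBySpeaker([
  { text: 'Alice: Hello there.' },
  { text: 'How are you?' },
  { text: 'Bob: Fine, thanks.' },
  { text: 'Alice: Good.' },
]);
console.log(transcript);
```

With this input, the two leading entries collapse into a single "Alice:" block, followed by one block for Bob and a new block when Alice speaks again.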
| } |
| currentSpeaker = entrySpeaker; |
| if (dialogue.trim()) currentLines.push(dialogue); |
When an SRT entry contains only a speaker label with no dialogue (e.g., "Speaker:" with just whitespace after), the detectSpeaker function returns the entire text as dialogue with speaker set to null. This causes the speaker label itself to be included as dialogue text in the transcript output. Consider checking if the dialogue contains only the speaker pattern before adding it, or handle this edge case explicitly.
Suggested change:
| - if (dialogue.trim()) currentLines.push(dialogue); |
| + const trimmedDialogue = (dialogue || '').trim(); |
| + // Skip empty dialogue lines. |
| + if (!trimmedDialogue) { |
| + continue; |
| + } |
| + // If detectSpeaker did not find a speaker, but the dialogue itself |
| + // looks like just a "Speaker:" label with no content, skip it so |
| + // that the label is not included as transcript text. |
| + const speakerLabelOnlyPattern = /^\s*[^:]+:\s*$/; |
| + if (!speaker && speakerLabelOnlyPattern.test(trimmedDialogue)) { |
| + continue; |
| + } |
| + currentLines.push(trimmedDialogue); |
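To see what the label-only check accepts and rejects, it can be tried in isolation. The sketch below reuses the regular expression from the suggestion unchanged. The `isSpeakerLabelOnly` helper and its optional `knownSpeakers` set are hypothetical additions for illustration and are not part of the PR. Note that the pattern alone also matches ordinary dialogue that happens to end in a colon, such as "He said:", so a stricter check may be worth considering.

```javascript
// Pattern copied from the suggested change: text that consists only of
// "<something without a colon>:" plus optional surrounding whitespace.
const speakerLabelOnlyPattern = /^\s*[^:]+:\s*$/;

// Hypothetical helper (not in the PR). When a set of known speaker names
// is supplied, the text only counts as a bare label if the name is in it.
function isSpeakerLabelOnly(text, knownSpeakers = null) {
  const trimmed = (text || '').trim();
  if (!speakerLabelOnlyPattern.test(trimmed)) return false;
  if (!knownSpeakers) return true;
  // trimmed ends with ':' here, so drop it to recover the label.
  const label = trimmed.slice(0, -1).trim();
  return knownSpeakers.has(label);
}

console.log(isSpeakerLabelOnly('Speaker:'));        // true
console.log(isSpeakerLabelOnly('Speaker: hello'));  // false
console.log(isSpeakerLabelOnly('He said:'));        // true (false positive)
console.log(isSpeakerLabelOnly('He said:', new Set(['Speaker']))); // false
```

Whether a known-speaker list is available at this point in the get-clip handler is an assumption. If it is not, capping the label length would be a cheaper way to reduce false positives.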