feat(platform): paste-link video ingestion in chat#1724
Conversation
- Add videoLinkJobs table tracking pasted video URL ingestion lifecycle (queued → fetching_metadata → fetching_captions → extracting_audio → transcribing_handoff → indexing → completed | failed | skipped) - Extend fileMetadata schema with optional video provenance fields (sourceUrl / sourcePlatform / videoTitle / videoUploader / videoDurationSec) for transcripts that originate from yt-dlp - Extract joinSegmentsWithParagraphs from transcribe_audio.ts to a shared paragraphize.ts module with Whisper and Caption profiles plus an `addTimestamps` option that prepends [HH:MM:SS] paragraph prefixes - Install yt-dlp + Deno in services/convex/Dockerfile with SHA256 verification (multi-arch); drop the pinned version in favour of a weekly CI rebuild so YouTube extractor regressions stay current - Add orphan-tmp sweep for /app/data/convex/tmp/vlink-* in the entrypoint
Pure-function browser-safe helpers shared between the chat-input paste
handler and the Convex ingest mutation:
- extractVideoUrls(text, {maxUrls:3}) — strips fenced/inline code blocks
first (users quoting URLs in backticks should not auto-ingest), trims
trailing punctuation/markdown emphasis, dedups post-normalize, caps
at 3 URLs per paste to bound paste-bomb abuse
- isSafeVideoUrl — advisory frontend screen rejecting non-https, bare
IP literals (incl. decimal/hex/IPv6/octal), credentialed URLs, and
localhost. NOT load-bearing — the canonical SSRF gate lives server-
side in convex/video_links/url_safety.ts (DNS-pinned)
- isPlaylistUrl — rejects standalone /playlist?list=, accepts
/watch?v=X&list=Y (yt-dlp gets --no-playlist anyway)
- normalizeUrlForHash — drops fragment + tracking params, lowercases
host, sorts query; preserves t= timestamp anchors. Drives the
(orgId, sourceUrlHash) dedup key
- detectPlatform — coarse host classification (telemetry + chip glyph
only, never gates processing)
39 tests covering the full SSRF rejection matrix.
- ytdlp.ts ('use node'): hardened subprocess wrapper mirroring
audio_preprocess.ts:runFfmpeg shape. Per-job sandbox directory
(/tmp/vlink-<uuid>/ 0700, finally rm), denylist flags
(--no-config / --no-exec / --no-update / --ignore-config / --downloader native),
env stripped to PATH+HOME+LANG only, output-extension whitelist on
realpath, classified error codes (bot_detection / rate_limited /
private_or_age_gated / geoblocked / live_stream / member_only /
js_runtime_missing / binary_not_installed). ENOENT detection
surfaces a NEVER_RETRY 'binary_not_installed' so a stale container
rebuild fails fast with a clear chip error instead of burning
retry budget on opaque transient errors.
- captions_parser.ts: VTT-only parser (yt-dlp --convert-subs vtt
normalizes JSON3/SRV3/TTML uniformly via ffmpeg). Handles BOM,
CRLF, NOTE/STYLE blocks, <v Speaker> voice tags, inline timing
tags, HTML entities. rollingWindowDedup collapses YouTube auto-gen
rolling-window cues (load-bearing — without dedup the paragraphizer
2-3x's every token).
- url_safety.ts ('use node'): assertSafeUrl(url) — re-runs the
spa-side advisory checks then DNS-resolves via Node's dns.lookup
with all-records, walks every A/AAAA address through the shared
isPrivateIp predicate from convex/lib/http/safe_fetch.ts. Closes
the DNS rebinding gap the existing TODO calls out.
22 captions-parser tests covering rolling-window dedup, voice tags,
HTML entities, malformed timestamps, CRLF/BOM edge cases.
- mutations.ts:
- ingestVideoUrl: auth + cross-org thread gate, advisory
isSafeVideoUrl + isPlaylistUrl checks, file:upload rate-limit
bucket, MAX_IN_FLIGHT_PER_ORG=3 concurrency cap, URL-hash
dedup (returns existing jobId idempotently within 24h),
optional threadId (welcome-page pastes have no thread yet)
- cancelVideoLink: uploader-only auth, flip status='skipped',
propagate to fileMetadata.transcriptionStatus='skipped' when
Whisper-in-flight so the existing transcribe_audio early-exit
fires, schedule cleanup
- retryVideoLink: cleanup stale referents then reset to 'queued'
- bindCompletedJobsToMessage: atomic on send — collects both
in-thread completed jobs AND pre-thread (threadId=undefined,
by org+user) ones, patches threadId on the pre-thread set when
a new thread is created, returns FileAttachment payloads + the
pastedToken so the client can strip URLs from message text.
Identifies Whisper-handoff completion via fileMetadata.
transcriptionStatus join (the underlying job.status stays at
'transcribing_handoff' by design — R12 reactive-join decision)
- queries.ts: listForThread + listForUserUnboundChat (welcome page),
shared projectJob helper that derives displayStatus from job.status
with reactive join to fileMetadata.transcriptionStatus for the
Whisper handoff phase. Surfaces 'retrying' state with the
underlying error reason when status='queued' && attempts>0
- internal_mutations.ts: updateJob (auto-touches statusChangedAt),
insertSyntheticFileMetadata (captions-branch finalizer; prepends
Source/Platform/Uploader/Fetched/Method provenance header to the
transcript blob), cleanupCancelledVideoLink, heartbeatJobByStorageId
(generic by-storageId lookup keeps transcribe_audio decoupled),
recoverStuckVideoLinkJobs (per-status windows)
- ingest_video_link.ts: 'use node' orchestrator. Top-level
try/finally guarantees /tmp sandbox cleanup. Phase boundaries
re-read job for cancel detection. Fidelity-first caption language
precedence (original-manual → UI-locale-manual → English manual →
any manual → original ASR → UI-locale auto-translated). Captions
branch writes transcript blob + uploadFileToRag inline; captions
miss falls through to audio extract + saveFileMetadata internal
variant (passes uploadedBy explicitly — scheduled actions have
no user context). NEVER_RETRY error codes fail fast.
- file_metadata/internal_mutations.ts:
- saveFileMetadata (internal variant): accept optional threadId +
five video provenance fields (sourceUrl, sourcePlatform, videoTitle,
videoUploader, videoDurationSec) so the orchestrator can hand off
audio to transcribeAudio with full context. Backward-compat —
every new field is optional, existing callers untouched
- recoverStuckTranscriptions: piggyback the video-link watchdog
(recoverStuckVideoLinkJobs) on the same cron tick. Per-status
windows live there; this keeps a single cron entry
- file_metadata/transcribe_audio.ts:
- Import joinSegmentsWithParagraphs from paragraphize.ts; pass
addTimestamps=true so the persisted transcript carries
[HH:MM:SS] paragraph prefixes for agent citations
- whisperSegmentsToParagraphSegments adapter offsets each chunk's
Whisper-local timestamps by the running chunkStartSec so the
[HH:MM:SS] prefixes reflect absolute positions in the original
audio (Whisper segments are chunk-local because we splice with
-reset_timestamps 1)
- Per-chunk heartbeat: after each chunk completes, fire
internal.video_links.internal_mutations.heartbeatJobByStorageId
so the videoLinkJob's statusChangedAt + progress advance. Long
Whisper jobs no longer get killed by the per-status watchdog
window. The mutation is generic (by storageId) so transcribe_audio
stays decoupled from video_links-specific shape
- threads/cascade_helpers.ts: extend cascadeDeleteThreadChildren to
drop videoLinkJobs by_organizationId_and_threadId in pages of 200,
also clean up any residual _storage blob (skip if already deleted
in the fileMetadata pass)
- lib/agent_chat/start_agent_chat.ts: video-specific attachment hint
in buildMessageWithAttachments. When fileMetadata.sourceUrl is set,
emit '🎬 [title] (video from <platform>, <duration>, uploader: …) —
transcript stored as a document; paragraphs prefixed [HH:MM:SS] —
cite timestamps when summarizing. For long videos (>1h) prefer
rag_search; call document_retrieve with fileId=…' This gives the
agent enough context to use the existing document_retrieve path
instead of falling back to the generic web tool
- use-chat-video-links.ts: reactive subscription hook. Dual-mode —
by threadId when in an active chat, by (org, user) when on the
welcome page (no thread yet). Convex reactivity rebinds queries
on first send so chips migrate seamlessly. Filters out 'skipped'
jobs immediately so cancellation flips the UI instantly while the
cleanup action runs async
- video-link-chip.tsx: standalone chip component (not a refactor of
the audio chip — narrower blast radius for v1). Status text
priority: failed error message → completed duration → retrying
with underlying error reason → heartbeat progress (e.g.
'transcribing chunk 2 of 4') → localized i18n label. The progress
override prevents the 'stuck on Transcribing…' anti-pattern.
Tap targets ≥44pt, always-visible on touch (not hover-only) per
the mobile review
- chat-input.tsx: paste handler extended with IME-composition guard
+ text/plain + text/html fallback (rich clipboard sources like
Notion/Slack ship only HTML) + extractVideoUrls(maxUrls:3). URL
stays in textarea (no preventDefault) — strip happens at send
time via the literal pastedToken. Send button disabled predicate
includes isProcessingVideo with a tooltip explaining how to
bypass. Chip cancel handler strips the matching URL from textarea
via literal String.replace and tidies whitespace
- chat-interface.tsx: wire useChatVideoLinks (passes both threadId
and organizationId for the dual-mode subscription) and thread
props down to ChatInput
- use-send-message.ts: TWO bind passes —
(1) Early bind at start when threadId is already known
(in-thread paste case)
(2) Late bind after createThread returns the new threadId
(welcome-page first-send case)
Both call bindCompletedJobsToMessage; second pass dedups by
fileId. Strips the bound pastedTokens from messageToSend so the
LLM doesn't see both the raw URL and the transcript attachment
- chat.videoLink.* keys in en/de/fr:
- statuses (queued, retrying, fetching_metadata, fetching_captions,
extracting_audio, transcribing_handoff, indexing, completed,
failed, skipped)
- errors covering the orchestrator's classified yt-dlp reason
codes (privateOrAgeGated, unavailable, geoblocked, unsupported,
transient, videoTooLong, liveStream, premiere, memberOnly,
botDetection, rateLimited, playlistsNotSupported, urlNotSafe,
audioExtractFailed, whisperFailed, binary_not_installed, generic)
- actions (removeLink, retry)
- chip metadata (fallbackTitle, ariaLabel, inProgressTooltip)
- terms-of-service.md (en): one-paragraph clause under the existing
'Third-Party Links and Resources' section affirming that when the
user provides a URL for video transcription/analysis, they warrant
rights/fair-use; Tale acts at user direction and stores the
transcript in their workspace. Per the round-2 legal review:
honour the user's captions+Whisper architecture decision while
surfacing the responsibility split, deferring DMCA workflow +
attestation modal to a prod-hardening PR
…, i18n, ToS) Two-round multi-agent review surfaced 13 ship-blocker / critical / major issues plus dead code and architecture drift. All resolved in-branch; lint + tsc + 83/83 vitest tests pass (pre-existing seo errors aside). Authz: every public mutation + query under video_links/ now goes through getOrganizationMember + assertThreadAccess; previously any authenticated user could pass an arbitrary organizationId and write to / read from another org's rows. SSRF: --force-ipv4 + --max-redirects added to yt-dlp COMMON_FLAGS; CGNAT 100.64.0.0/10 added to isPrivateIpv4; assertSafeUrl now re-runs at every phase boundary against fresh DNS to defend against TTL=0 rebind between Phase A / B / C. Prompt-injection: UNTRUSTED_CONTENT_SYSTEM_PROMPT now joins every agent's system prompt via buildSystemPrompt (previously defined but loaded by zero agents); start_agent_chat wraps video metadata with wrapUntrusted + sanitizeUntrustedField (strips newlines / zero-width chars, clamps length); captions parser decodes entities BEFORE stripping tags so HTML-encoded injection tags can't bypass, and adds strip patterns for <|im_*|> / [INST] / <<SYS>>; synthetic fileMetadata rows now carry source='video_link' instead of source='user'. i18n: ytdlp.ts YtDlpErrorReason + url_safety.ts UrlSafetyError.kind unions migrated to camelCase to match en/de/fr JSON keys (previously ~13 codes silently fell through to the generic fallback because backend emitted snake_case while JSON used camelCase). errorReasonCode validator tightened from v.string() to a literal union so unknown codes fail at write time. Added the 7 url-safety kinds + inFlightCap + toast title across all three locales; fixed FR `collez` → `colle` imperative casing and DE `Upcoming-Video` half-translation. Resource lifecycle: Whisper-path storageId is now written to the videoLinkJobs row BEFORE saveFileMetadata so cleanup can always find orphan blobs; assertNotCancelled flipped from `!fresh ||` to `!!fresh &&` so a cascade-deleted row is treated as cancellation (previously the action kept running and crashed on the next mutation); arena first-send now calls bindCompletedJobsToMessage so welcome-page pastes that then switch to arena no longer silently drop the transcript attachment. Schema: replaced the dead lifecycleStatus==='bound' dedup guard with a new messageBoundAt: v.optional(v.number()) field (bind mutation previously wrote 'active' while the guard checked 'bound' — guard never fired and double-bind was possible). ToS: §6 expanded from "Third-Party Links" to "Third-Party Links, User-Submitted URLs, and Video Analysis" across en/de/fr with proper sub-sections 6.1-6.5 covering rights warranty (jurisdiction-neutral language, no US fair-use term in a Swiss-law contract), service discretion, notice-and-action procedure (DSA Article 16-compatible), and best-effort disclaimer. Last updated bumped to 18.05.2026 across all three locales. Resource budgets: --max-filesize 500M → 100M (paired with Whisper's ~25MB input limit; the 500M cap with in-memory readFile→Blob double- buffer could push peak RSS over 1GB); yt-dlp kill replaced with SIGTERM + 5s grace → SIGKILL so the ffmpeg child can flush cleanly. UX: toast surface for ingest errors (ConvexError codes map to videoLink.errors.* i18n keys); drag-drop URL support via new onTextDrop prop on FileUpload.DropZone; tooltip defaultValue removed so the real i18n key surfaces in every locale. Architecture: stop copying 5 video-provenance fields (sourceUrl, sourcePlatform, videoTitle, videoUploader, videoDurationSec) to fileMetadata at ingest time; start_agent_chat now JOINs videoLinkJobs by storageId as the canonical source, falling back to the legacy fileMetadata fields (kept @deprecated for older rows). updateJob internal mutation gains an optional expectedStatus compare-and-swap arg so two concurrent action instances can't regress status. Added a lazy GC pass to recoverStuckVideoLinkJobs that sweeps unbound completed rows older than 7 days, capped at 50/tick. Dead code: removed unused isNonTerminal function, by_organizationId_ and_threadId index (callers all use by_threadId), 'cached' transcriptSource literal (never written by any code path), and un-exported sanitizeStderr / classifyYtDlpStderr / UrlSafetyErrorKind / AssertSafeUrlOptions / VideoLinkJobView. Fixed misleading module comment in lib/shared/video-url.ts. Eliminated 13 lint errors (one each of no-non-null-assertion / no-unsafe-type-assertion / no-unused- vars / prefer-const) — bun run check now clean for this branch.
… + scrub VTT speaker Push <untrusted_source> wrapping down to document_retrieve and rag_search tool boundaries so the agent's TRUST RULES apply to the transcript body (previously only the short reference line in start_agent_chat was wrapped, while the retrieved chunks arrived raw). Reverse-lookup via new lookupVideoLinkSources + by_storageId index identifies video-link-sourced fileMetadata rows and forwards their sourceUrl into the wrap attributes. VTT speaker labels (`<v Speaker>`) now go through the same scrubber as the cue body — control tokens, zero-width chars, generic tags, length cap — plus orphaned-ChatML-opener handling for the case where the speaker regex eats the closing `>` of a smuggled `<|im_start|>`. Adds REGION block skip and case-insensitive NOTE/STYLE match; introduces a 5MB input cap + 50k segment cap so a hostile VTT can't OOM the action. Also: broaden escapeContent close-tag regex, neutralize literal open-tag, drop captions_parser's unnecessary `'use node'`, and mention video transcripts/captions in UNTRUSTED_CONTENT_SYSTEM_PROMPT. Tests cover speaker injection (`<v [INST]EvilSpeaker[/INST]<|im_start|>`), entity-decoded openers, REGION/lower-case skip, and oversize input.
…g cron Org-delete and GDPR right-to-be-forgotten previously skipped videoLinkJobs entirely, leaking every row + _storage blob the subject ever ingested (the by_org_user index existed for this purpose but had no caller). Adds: - eraseSubjectVideoLinks internalMutation + perCategory wiring + RAG fan-out for transcript chunks indexed by storageId - videoLinks branch in cascadeOnOrgDeleted (paged, db.delete- before-storage.delete, mirrors the TTS pattern) - recoverStuckTranscriptions also flips the owning videoLinkJob to 'failed' by storageId so transcribing_handoff can't stick forever (the watchdog deliberately excludes that status and delegated to the fileMetadata sweep — now actually delegates) - updateFileTranscription mirrors terminal Whisper state onto the linked videoLinkJob so lazy GC reaches Whisper-branch rows (they used to live at 'transcribing_handoff' forever) - thread cascade lifts the videoLinkJobs sweep out of the `if (organizationId)` guard so legacy thread rows missing orgId still get cleaned via by_threadId - direct crons.ts entry for recoverStuckVideoLinkJobs (no longer piggy-backed off recoverStuckTranscriptions — a throw in the fileMetadata loop previously killed both watchdogs)
…ore-then-patch) The retry path computed `attempts = (job.attempts ?? 0) + 1` locally but never persisted it — updateJob's validator didn't accept the field, so MAX_ATTEMPTS=4 was never reached and transient yt-dlp failures retried forever (bounded only by NEVER_RETRY). updateJob now accepts `attempts`, handleYtDlpError writes it, and also schedules cleanupCancelledVideoLink before re-running so the prior attempt's storage/fileMetadata IDs are cleared (avoids dangling refs). Plumbs expectedStatus CAS through the four clean phase boundaries (queued→fetching_metadata, fetching_metadata→fetching_captions, fetching_captions→indexing, extracting_audio→transcribing_handoff) so a watchdog-failed → manual-retry overlap no-ops the loser instead of stomping writes. Captions branch now records storageId on the job row immediately after ctx.storage.store and before insertSyntheticFileMetadata, mirroring the audio branch — without this, an insert failure (validator, OOM) orphans the blob beyond reach of cleanupCancelledVideoLink. Sanitizes uploader-controlled chapter titles + metadata.title at the trust boundary so they can't smuggle `\n[SYSTEM] ...` payloads into the transcript body the agent later retrieves. formatHms moved to paragraphize.ts as the single source of truth (ingest_video_link previously carried a byte-identical local copy). Paragraphize now treats any speaker transition — including to/from undefined — as a paragraph boundary, fixing the Alice → undefined → Bob case that previously glued the unlabeled middle cue under Alice's prefix.
runYtdlp:
- settled-flag pattern around timeout/close/error so the
Promise resolves exactly once and the caller's `.catch`
no longer races jobDir cleanup against still-writing
ffmpeg children (timeout used to reject before stdio drain,
then close would reject a second time)
- byte-cap (8MB) now SIGTERMs the child instead of silently
dropping further data; a hostile output can no longer waste
the full wall-clock budget
- argv `--` separator before url in all three spawn paths
(ytdlpJson / ytdlpWriteSubs / ytdlpExtractAudio) so a
url-starting-with-dash can't be reinterpreted as a flag
- `--max-filesize 5M` on ytdlpWriteSubs caps caption downloads
at the source (paired with the parser-side 5MB input cap)
- LANG_TOKEN_RE allow-list on `selection.lang` before it
enters --sub-langs as a comma+regex-parsed value
url_safety:
- dns.lookup → dns.resolve4 + dns.resolve6 so the OS resolver
/ /etc/hosts / nsswitch can't poison the answer set
- UrlSafetyError messages strip raw url/hostname — schema
promises `errorMessage` is sanitized; the chip surfaces the
reason via i18n keyed on `kind`
isPrivateIp:
- expandIpv6 normalizes every candidate to 8 hextets before
prefix-matching, so shortened / fully-expanded / zone-
identifier forms get the same treatment
- new private cases: `::` unspecified, fully-expanded
::ffff:H:L mapped v4, 6to4 (2002::/16) decodes its embedded
v4 and recurses, NAT64 (64:ff9b::/96 + 64:ff9b:1::/48)
same, multicast ff00::/8, discard 100::/64
- fc/fd prefix check tightened to /^f[cd][0-9a-f]{2}:/i so
hostnames like fc2.com no longer false-positive
…nt sweep retryVideoLink now reuses ingestVideoUrl's rate-limit (file:upload bucket) and in-flight cap (MAX_IN_FLIGHT_PER_ORG) gates — without this, looping the mutation over N pre-existing failed jobs spawned unlimited concurrent yt-dlp/Whisper work. Also adds a 15-min cooldown when the prior failure was botDetection / rateLimited so hammer-retries don't dig the per-IP block deeper. New unbindJobsFromMessage mutation reverses the bind on send-failure (used by frontend in the next commit). Server-side normalize: ingestVideoUrl derives sourceUrlHash and sourcePlatform via the shared video-url helpers rather than trusting client-supplied `normalizedUrl` / `sourcePlatform` (prevents collision-hashing across the dedup window). Dedup- hit path also refreshes `pastedToken` to the most recent paste so the send-time strip catches the latest substring. Dockerfile: ARG YTDLP_VERSION / DENO_VERSION default to empty, which resolves `releases/latest`'s redirect to a concrete tag ONCE and reuses it for both asset + checksum URLs — closes the TOCTOU window where binary and SHA2-256SUMS could come from two different upstream releases if one was cut between curls. /etc/yt-dlp-version and /etc/deno-version now record the resolved tag, not "unknown". docker-entrypoint monitor_convex tmp-sweep narrowed from the bare `rm -rf /app/data/convex/tmp/*` to a `find -name 'vlink-*' -mmin +5` pattern matching the boot-time sweep — health-check blips no longer wipe arbitrary Node-action scratch. queries.ts logs caught org-membership errors instead of empty- catching (the soft-fail to [] is intentional for reactive subscriptions but configuration bugs were silently swallowed).
…hip gate
bindCompletedJobsToMessage now returns each bound jobId so
use-send-message can call unbindJobsFromMessage in its catch
block when chatWithAgent throws. Without this, the chips
disappeared from the composer permanently (the hook filters
out `messageBoundAt !== undefined`) so users lost both the
typed text AND every transcript attachment on a failed send.
paste handler bails on IME composition (onCompositionStart/End
ref + e.nativeEvent.isComposing) so a CJK user pasting
mid-commit no longer corrupts the buffer. The chip-cancel
strip path now uses setRangeText('preserve') instead of
onChange(stripped) so the caret position, selection, and
undo stack survive.
Adds pasteIngestPending state — set on paste, cleared in the
ingest mutation's `.finally` — so a paste-then-Enter race
can't ship the message before the chip query reflects the
new row. hasFailedJobs derived in the hook and threaded into
both the keyboard send-gate and the Send button's disabled
predicate; a chip in `failed` no longer ships silently with
the agent answering the raw URL.
Precheck text is now `messageToSend` (URL-stripped), not the
raw paste — so `?si=` / `?utm_*` tokens stop tripping PII
heuristics.
Chip a11y: motion-safe spinner, focus-visible ring on retry
and remove buttons, title becomes an external link to
sourceUrl with rel="noopener noreferrer" (without this, the
user has no way back to the source video once pastedToken is
stripped from the textarea). aria-live downgrades to "off"
while processing so per-chunk progress doesn't spam screen
readers.
VideoLinkJob type now carries sourceUrl (already projected
server-side via projectJob) so the chip can render the link.
…ngth cap isSafeVideoUrl now rejects: - `localhost.` trailing-dot (resolves to 127.0.0.1 on every libc resolver; the strict equality with `'localhost'` missed this form — server-side DNS pre-resolve still catches it but failing fast on the advisory check gives instant UX and shrinks the surface for any future caller that uses this helper without the server check) - IDN punycode-encoded hostnames (`xn--…`). The chip would otherwise render `youtubе.com` (Cyrillic `е`) indistinguishably from `youtube.com`; reject the punycode form outright so impersonation never reaches the user - URLs longer than 2048 bytes — defense against a hostile multi-MB paste that would otherwise flow through the mutation, the dedup hash, and the per-org rate-limit bookkeeping
…ive-works clause
i18n (en/de/fr):
- binaryNotInstalled: "rebuild the Convex container" /
"Convex-Container neu erstellen" / "reconstruire le conteneur
Convex" was leaking deployment internals to end users who
can't act on it. Replaced with the same shape as
jsRuntimeMissing: "contact your administrator"
- botDetection / rateLimited / forbidden: "Video host /
Anbieter / hôte vidéo" → "Video platform / Plattform /
plateforme vidéo" — aligns with the ToS clause's
"Quellplattform / plateforme source" terminology
- new retryCooldown error key for the 15-min cooldown after
bot/rate-limit failures
- new chip.failedSendBlockedTooltip for the send-gate tooltip
when a failed chip is still present
- de.json einheitliches ß ("einschliesslich" → "einschließlich",
"verstossen" → "verstoßen")
Terms of Service (en/de/fr):
- new paragraph stating that transcripts/captions/summaries
are derivative works of the source content; all IP rights
remain with the original rightsholder, the Service grants
no new rights beyond the user's existing licence in the
source
- processor/controller language clarified: Tale acts as
processor for submitted-content operations (fetch,
transcribe, store) and as controller for operational
metadata (rate-limit counters, error classification,
telemetry). Avoids self-contradiction with §6.3's
service-discretion clause
The retry path emitted only `reason` + `attempts` + `delayMs`, making `transient`-classified yt-dlp failures impossible to diagnose without inspecting the DB row. classifyYtDlpStderr returns 'transient' for any pattern it doesn't recognize — typically when yt-dlp's upstream error shape changes and the fixed-string match list goes stale. Adds a truncated `sanitizedStderr` + `message` to the structured log so operators can see what yt-dlp actually wrote without DB spelunking. `fail()` already logs the message on terminal failure; this closes the equivalent gap on the retry path.
`--max-redirects` was added in yt-dlp 2024.05.27. Hosts running an
older system-installed yt-dlp (the previous COMMON_FLAGS made this
unconditional) reject it with `exit 2: no such option`, which the
classifier then bucketed as `transient` and the orchestrator silently
looped on for every paste.
Splits COMMON_FLAGS into:
- BASE_FLAGS — supported by every release we care about (≥ 2024.04)
- OPTIONAL_FLAGS — version-gated entries with their introducing
yt-dlp version + an explanatory log line when missing
`resolveSupportedFlags()` runs `yt-dlp --help` once on first
invocation, scans for each optional flag's help token, and caches the
resulting argv. Subsequent runYtdlp calls reuse the cache.
Production (Dockerfile-pinned latest yt-dlp) always has every flag;
this only matters for dev hosts running stale binaries. When a flag
gets dropped, the operator sees a one-time WARN naming the minimum
version so the upgrade path is obvious.
The previous SSRF-hardening commit added `--max-redirects 5` to
COMMON_FLAGS, but yt-dlp has never shipped this option — HTTP
redirect handling is internal to its `urllib`-based downloader and
has no CLI knob. yt-dlp consequently exits 2 ('no such option') on
EVERY invocation; the stderr classifier doesn't recognize the
usage error so the orchestrator buckets it as `transient` and
silently retries forever.
The flag-probe machinery from the previous commit stays in place
(future genuinely version-gated flags can register against it),
but OPTIONAL_FLAGS is emptied of the bogus entry, and url_safety's
docstring stops promising a defense the subprocess can't deliver.
Actual SSRF defenses in place:
- DNS pre-resolve in `assertSafeUrl` walks every A/AAAA and
rejects private IPs (load-bearing)
- `--force-ipv4` narrows the v6 rebind surface yt-dlp can hit
after its own re-resolve
- In-process safe_fetch.ts uses `isPrivateIp` (with full IPv6
coverage from the round-2 expansion) for all callers that
don't go through yt-dlp
The chip previously had no max-width and the title used `truncate`, which gave it no edge to truncate against — long titles like "The electric vehicle boom in Thailand and Vietnam - The Climate Question podcast, BBC World Service" pushed the chip past the composer's right boundary. Constrains the chip to `max-w-full sm:max-w-md` (~448px on desktop, full-width on mobile), switches `truncate` → `line-clamp-2 break-words` on the title so the second line wraps with ellipsis instead of overflowing, switches the chip shape from `rounded-full` (pill — looks wrong at 2-line height) to `rounded-2xl`, and adds `shrink-0` to the leading emoji and trailing retry/remove buttons so they hold their 44pt tap targets when the title takes both lines. The status row keeps its own `truncate` for the case where the i18n status string is long enough to overflow on its own.
… body `buildMessageWithAttachments` appended a `wrapUntrusted(...)` block around the video-link reference markdown when composing the persisted user-message body. The optimistic bubble (built from `messageToSend`, no wrap) and the persisted bubble (with the wrap) diverged at handoff, surfacing literal `<untrusted_source tool="video_link" url="..."> </untrusted_source>` XML in the user's chat bubble — visible as raw tags because message-bubble.tsx renders content verbatim via `whitespace-pre-wrap`. The wrap was load-bearing in the early review iteration but became redundant after commit 59c7ca3 (Group 1) pushed `<untrusted_source>` down to the tool-response boundary in `agent_tools/documents/helpers/retrieve_document.ts` + `agent_tools/rag/rag_search_tool.ts`. The LLM still receives the trust signal when it actually reads the transcript via `document_retrieve` / `rag_search`; the user-message body just references the fileId, which doesn't need wrapping. Video branch now pushes `inner` unwrapped, matching the audio branch immediately below. Removes the stale `wrapUntrusted` import. Bubble residual jump (one `📎`-prefixed markdown line) is identical to the audio-upload case and accepted per user preference.
…ounce
The persisted user-message body's video-link reference was ~5 lines
of prose:
🎬 [title] (video from Platform, duration, uploader: Name) —
transcript stored as a document; paragraphs prefixed [HH:MM:SS]
timestamps — cite them when summarizing. Call document_retrieve
with fileId=… to read the full transcript.
*(fileId: … | fileName: … | fileType: … | fileSize: …)*
The optimistic message has none of this — just the user's typed
text — so the optimistic→persisted swap grew the bubble by ~200px,
which ResizeObserver in `chat-interface.tsx:560-567` registered as
new scrollable content and re-fired `scrollTo(scrollHeight)`. The
user perceived this as a bubble first landing near the viewport top
and then snapping down — a "bounce" plain text / image / document
sends don't exhibit.
Image, document, spreadsheet, and text-file branches above all use
a single-line reference (e.g. `📎 [filename](url) (type, size)\n
*(fileId: …)*`), and the user confirmed those feel natural. This
shrinks the video-link template to the same ~2-line shape:
🎬 [title] (video, duration, uploader)
*(fileId: … | fileName: … | fileType: … | fileSize: …)*
The instruction prose was redundant reinforcement — the LLM already
gets the same guidance from `document_retrieve`'s tool description
and from `UNTRUSTED_CONTENT_SYSTEM_PROMPT`. The fileId stays in the
footer so the agent can still call `document_retrieve(fileId=…)`.
Untouched (functional invariants):
- "View Transcript" button in FileAttachmentDisplay (drives off
fileMetadata via useQuery — independent from this markdown)
- Transcript blob in _storage (synthetic fileMetadata + Whisper
paths unchanged)
- Group 1 `<untrusted_source>` wrap at the document_retrieve /
rag_search tool-response boundary
- Audio / image / document / spreadsheet / text branches
…poser
`ingestVideoUrl` previously deduped any non-failed videoLinkJobs row
within the 24h window for the same `(orgId, sourceUrlHash)`. When
the matching row was ALREADY bound to a past message (its
`messageBoundAt` set, the user had already sent that message), the
mutation returned the existing jobId — but `useChatVideoLinks`
filters bound rows out of the composer chip area, so the user's
second paste appeared to do nothing.
Tightens the dedup predicate so it only short-circuits for rows
that genuinely belong to the current composer session:
- `messageBoundAt === undefined` (chip is still in the draft area,
not attached to a past message)
- `threadId === args.threadId` (same conversation, including the
welcome-page `undefined === undefined` case)
- `uploadedBy === userId` (safety against returning another
member's row in shared threads)
Effect:
- Spam-paste while drafting still collapses to one chip
- Re-paste the same URL after sending the prior message now
produces a fresh chip + fresh ingest (the user can attach the
same video to a follow-up question)
- Re-paste in a different thread now produces a fresh chip
Re-ingest cost is acceptable for demo: captions branch is ~5-10s
and `findCachedTranscript` already dedups Whisper by content hash.
A future content-level cache (reuse storageId/fileMetadata across
videoLinkJobs rows) can short-circuit the action itself.
…ngMessage `handleSendMessage` set `scrollingToBottomBehaviorRef.current` to `'smooth'` BEFORE awaiting `sendMessage(...)`. In the video-link path, the hook then awaits `bindCompletedJobsToMessage` for 50-200 ms before `setPendingMessage` fires. During that window, unrelated Resize/MutationObserver fires run `onContentChange`, which downgrades the ref to `'instant'` (line 549-552) or — when `programmaticScrollRef` isn't set and `currentTop < prevTop` — clears it to `null` (line 546). By the time the optimistic bubble actually mounts, the intent is gone, so auto-scroll doesn't fire and the "↓" floating button shows. The user has to scroll manually. Plain text and image attachments don't go through the bind round- trip — they hit `setPendingMessage` ~30-50 ms after the ref is set, the ref is still `'smooth'`, scroll fires. The bug was video-link- specific because video-link was the only path with a server round- trip BEFORE the optimistic message lands. Fix: move the scroll-intent assignment out of `chat-interface.tsx`'s pre-await call sites and into `use-send-message.ts` adjacent to each `setPendingMessage(...)` call. The intent and the DOM update are now back-to-back synchronous operations; observer fires after the bubble mounts see the fresh ref and scroll correctly. - `useSendMessage` gains an optional `scrollIntentRef` param and a small `markScrollIntent()` closure. Every setPendingMessage site (3 in the entry optimistic block + 1 arena new-thread + 2 standard new-thread) calls `markScrollIntent()` immediately before. - `chat-interface.tsx` removes the 3 pre-await sets at the three paths that go through `sendMessage` (`handleSendMessage`, `handleSendMessageDirect`, the regenerate handler). The edit-and- branch `editAndBranchAction` set at line 792 stays (different action, different flow). The behaviour for plain text / image / document / audio is unchanged — in those paths setPendingMessage fires almost immediately after the intent set, just as before. Video-link now matches.
Implements the medium-eager-balloon plan after a 50-agent two-round
review of feat/video-analysis-via-link. Closes correctness,
performance, security, accessibility, and i18n gaps. Plan file:
~/.claude/plans/medium-eager-balloon.md.
A (server correctness)
- A1 captions_parser ReDoS — bounded the inline-tag regex to
`[^>\n]{0,1024}` and added a 64KB per-cue cap. A 200KB unbalanced-`<`
payload that previously pegged CPU for ~16 min now completes in
<500ms (regression test added).
- A2 GDPR erasure race — added `assertNotCancelled` after every
`ctx.storage.store` plus a structural job-existence guard in
`insertSyntheticFileMetadata`. Orchestrator no longer resurrects
fileMetadata rows after subject erasure.
- A3 state-machine CAS — broadened initial early-return to any
non-`queued` state; added CAS at Phase C entry against both legal
pre-states; moved `assertNotCancelled` before the CAS-less patch;
CAS on audio-store patch. `updateJob` now returns
`'ok'|'cas_miss'|'not_found'` so the orchestrator can bail when
a watchdog flip races mid-extract.
- A4 watchdog cleanup — `recoverStuckVideoLinkJobs` now schedules
`cleanupCancelledVideoLink` on every `failed` flip; the cleanup
helper now early-returns on `messageBoundAt !== undefined` so the
inline comment finally matches behaviour.
- A5 watchdog `.collect()` capped at `.take(200)` per status so a
yt-dlp outage backlog can't blow per-mutation memory.
B (hot-path performance)
- B1 start_agent_chat — `withIndex('by_threadId')` with no `.eq()` +
`.filter(storageId)` was a full-table scan per attachment per
send. Now `by_storageId.eq(...)`.
- B2 `heartbeatJobByStorageId` — same fix (was called per Whisper
chunk on every audio transcription).
- B3 `.collect()` → `for await` in `bindCompletedJobsToMessage`,
`listForThread`, `listForUserUnboundChat`, and both in-flight
cap loops.
C (security)
- C1 prompt-injection wrap completeness — added `<untrusted_source>`
wraps in workflow `document.retrieve` and `rag` actions, batch +
bound integration tool outputs, and dropped the duplicate `raw:
result` field. Workflow LLM nodes now prepend
`UNTRUSTED_CONTENT_SYSTEM_PROMPT` so the trust rules actually
apply.
- C2 SSRF egress firewall — docker-entrypoint installs iptables
REJECT rules for IMDS, link-local, and RFC1918 before the gosu
drop. Closes the DNS-rebinding TOCTOU `url_safety.ts` and
`safe_fetch.ts` document but cannot close in-process. compose.yml
grants `NET_ADMIN`.
D (tests + docs)
- D1 url_safety SSRF/rebind suite (14 cases including partial-rebind
defense), video-link-chip a11y test (8 cases with
`checkAccessibility`), captions_parser ReDoS regression.
- D2 docs/{en,de,fr}/platform/chat/attachments.md "Video links"
section + README.md/.de.md/.fr.md feature line.
E (data + abuse gates)
- E1 budget gate in `ingestVideoUrl` and `retryVideoLink` —
prospective 4h × whisper rate (~144 cents) so over-budget orgs
can't pre-spend behind chat-token checks.
- E2 audit logs for `video_link.ingest|cancel|retry`.
- E3 `addTimestamps` gated to `source === 'video_link'` — microphone
uploads keep prior format.
- E4 welcome-page (`threadId=undefined`) dedup disabled — two
composers of the same user pasting the same URL both get a chip
instead of one silently no-op'ing.
- E5 `unbindJobsFromMessage` re-asserts org membership.
- E6 stderr sanitizer extends to JWT, `Set-Cookie`,
`po_token|visitor_data|cpn|sig` patterns.
F (deployment)
- F1 compose.yml convex: `mem_limit: 12g`, `pids_limit: 4096`,
`nofile: 65536`.
- F2 yt-dlp `spawn({detached: true})` + process-group kill so
ffmpeg grandchildren can't orphan on SIGKILL.
- F3 Dockerfile pins `YTDLP_VERSION` + `DENO_VERSION` defaults so
two builds produce byte-identical binaries.
G (frontend / a11y / UX)
- G1 Enter handler IME guard (Pinyin/Kotoeri Enter-to-commit no
longer triggers send).
- G2 `onTextDrop` mirrors the paste-ingest gate.
- G3 disabled Send button wrapped in focusable span so the Tooltip's
pointer listener can attach.
- G4 chip: aria-live always polite, `aria-busy={isProcessing}`,
`AlertCircle` icon on failed state (color-independent signal),
ring focus on title link, width aligned with message bubble at
`lg:max-w-md`.
- G5 DropZone visible `focus-visible:ring-2`.
- G6 chat-interface restores draft text on send failure (symmetric
with the existing chip unbind rollback).
- G7 `'video-link bind failed'` literal → t(`videoLink.toast.
bindFailedDescription`) in en/de/fr.
- G8 server emits structured `__VL_ATTEMPT__N` token; chip resolves
via `videoLink.statuses.attemptNumber` ICU key.
- G9 en/de `jsRuntimeMissing` + `binaryNotInstalled` admin phrasing
aligned.
H (legal + code quality)
- H1 ToS §6 derivative-works clause adds "as between you and the
Service, you own the derived artifacts" in en/de/fr.
- H2 four `as` cast cleanup — two dropped (narrowed-redundant in
use-send-message and use-chat-video-links), two documented as
framework gaps (NodeJS.ErrnoException, ClipboardEvent.isComposing).
Baseline: tsc clean (pre-existing @tale/seo errors unrelated). Lint
clean modulo 4 pre-existing errors that also fail on main. Tests:
13 pre-existing failures unchanged; +3 net new passing (24 new test
cases land across url_safety, video-link-chip, captions_parser).
…er/bubble lag User reported two papercuts on `feat/video-analysis-via-link`: the composer didn't clear quickly after Send, and the chat bubble first appeared as a plain link before switching to the styled video card. Both symptoms had the same root cause — the optimistic `setPendingMessage` waited for `bindCompletedJobsToMessage` (~50-200 ms) before mounting, and the chip stayed visible until the Convex subscription re-fired with `messageBoundAt !== undefined`. On the persisted swap the server's `buildMessageWithAttachments` then appended ~2 lines of markdown that the user-role bubble renders as plain pre-wrap text, so the bubble visibly grew. Three coordinated changes collapse the gap: 1. `useChatVideoLinks` exposes `markJobsSent` / `unmarkJobsSent` backed by a local `hideJobIds` set. Chips disappear from the composer in the same React commit as `clearInputValue()`. A `useEffect` prunes the set once the subscription catches up. 2. `useSendMessage` accepts a `videoLinkSnapshot` and builds `mutationAttachments`, `pastedTokensToStrip`, and the appended markdown synchronously from it. `setPendingMessage` no longer awaits bind — bind moves to background, still atomically writing `messageBoundAt` server-side. Bind / precheck-block / `chatWithAgent` failures restore the chip via `unmarkJobsSent`. 3. A new shared formatter at `lib/shared/video-link-markdown.ts` produces the `🎬 [...]\n*(fileId: ...)*` block; both server (`start_agent_chat.ts`) and client now emit byte-identical output, so the optimistic and persisted bubble bodies match and the optimistic→persisted swap is invisible. Server-side `video_links/queries.ts` also surfaces `fileSize` (from `fileMetadata.size`) so the client snapshot can stamp the same size the bind result will. `sanitizeUntrustedField` moves to `lib/shared/` to avoid a `lib/shared/ → convex/` import direction; the original convex module re-exports it for back-compat. Tests: 9 golden-string tests on the formatter and 2 new `useSendMessage` tests covering the snapshot path and chip restore on bind failure. Existing 528 chat-feature tests still pass.
The bubble-shrink in b2c288c stripped the LLM-facing instruction that the video has been transcribed and is readable via document_retrieve. Agents then replied "rag_search has no video-to-text capability" when asked to summarize a pasted video link, even though the transcript was already indexed in RAG. Re-append the hint inline on the first line so the bubble keeps its compact height (still two lines total).
| @@ -0,0 +1,745 @@ | |||
| 'use node'; | |||
| @@ -0,0 +1,164 @@ | |||
| 'use node'; | |||
| @@ -0,0 +1,736 @@ | |||
| 'use node'; | |||
Five CI checks failed on the video-link PR; all five had local root causes that can't wait on rebuild: - ytdlp.ts: drop unnecessary `\-` escape in LANG_TOKEN_RE; restructure the kill timer so `killer` is `const` (oxlint prefer-const flagged the single-assignment let). - video-link-markdown.ts / video-url.ts: knip flagged VideoLinkAttachmentMarkdownInput and ExtractedVideoUrl as unused exports — both are only consumed locally, so drop the `export`. - docs/de/platform/chat/attachments.md: terminology tests required "Header" to be translated as "Kopfzeile" (per shipped UI label and loanword policy); adjust article to match feminine compound noun. - services/convex/Dockerfile: YTDLP_VERSION=2026.05.10 is not a real yt-dlp release (latest is 2026.03.17), so the convex build hit a 404 on the asset URL — repin to the actual latest release.
📝 WalkthroughWalkthroughAdds a complete “video links” feature: frontend extracts and ingests pasted URLs, renders chips, and gates send; backend defines videoLinkJobs with ingest action (yt-dlp captions/audio, transcription, indexing), mutations/queries, cron recovery, and cleanup/erasure. Integrates file metadata, paragraphization, and heartbeat. Introduces SSRF/DNS safety, hardened safe_fetch, and container firewall. Expands untrusted-content trust rules and wraps RAG/doc/integration outputs. Updates docs, i18n, README, and Terms. Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
Four CI failures surfaced once the docs terminology fix let the Unit job run to completion; all are downstream of changes already on the branch but had no test update. - host_policy.test.ts: 100.100.100.200 is CGNAT (RFC 6598), which isPrivateIpv4 now classifies as private — drop it from the "accepts as public" list. The IMDS-block case for the same IP at line 67 was already passing. - document_retrieve_tool.test.ts: retrieveDocument now does a 3rd ctx.runQuery to lookupVideoLinkSources after fetch succeeds. Add the mock identifier and a `mockResolvedValueOnce([])` to every runQuery chain that reaches the success path (default + 3 overrides). - personalization_cascade.test.ts: cascadeOnOrgDeleted now sweeps videoLinkJobs via `by_organizationId_and_status`. Expand the expected-indexes set to include it and bump the length from 3 to 4. - services/convex/Dockerfile: the yt-dlp sha256 verify ran `sed "s| yt-dlp_linux$| yt-dlp|"` against a SHA2-256SUMS line whose format is `<hash><sp><sp>yt-dlp_linux`. The 1-space pattern left one extra space in the output (`<hash> yt-dlp`), so sha256sum parsed the filename as ` yt-dlp` and failed with "No such file or directory". Match the double-space prefix so the rewritten line stays in the canonical `<hash><sp><sp><name>` shape.
…anges Three test failures + the convex Docker build surfaced once the docs terminology fix let the Unit job run to completion. All four are downstream of changes already on the branch but lacked test or script updates. - host_policy.test.ts: 100.100.100.200 is CGNAT (RFC 6598), which the refreshed isPrivateIpv4 now correctly classifies as private. Drop it from the "accepts as public" list; the IMDS-block case for the same IP at line 67 was already passing. - document_retrieve_tool.test.ts: retrieveDocument now issues a 3rd runQuery to lookupVideoLinkSources after the RAG fetch succeeds, so every chain that reaches the success path needs a third `mockResolvedValueOnce([])` (default + 3 overrides). - personalization_cascade.test.ts: cascadeOnOrgDeleted now sweeps videoLinkJobs via `by_organizationId_and_status`. Expand the expected-indexes set + bump the length assertion from 3 to 4. - services/convex/Dockerfile: the yt-dlp sha256 step rewrote the SHA2-256SUMS line via `sed "s| yt-dlp_linux$| yt-dlp|"`. The one-space pattern only consumed one of the canonical two spaces, so the output landed as `<hash> yt-dlp` and sha256sum parsed the filename as ` yt-dlp` → "No such file or directory". Match the double-space prefix so the rewritten line stays in the `<hash><sp><sp><name>` shape sha256sum -c expects.
There was a problem hiding this comment.
Actionable comments posted: 18
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@services/convex/docker-entrypoint.sh`:
- Around line 59-83: The current iptables OUTPUT rejects for RFC1918 (the
iptables -A OUTPUT -d 10.0.0.0/8, -d 172.16.0.0/12, -d 192.168.0.0/16 lines)
will block legitimate Compose sibling service traffic; change the firewall to
explicitly allow the container's local Docker network(s) before the rejects.
Concretely, detect or accept ENV-configured local subnets (or the docker bridge
address range) and insert explicit ALLOW rules for those CIDRs or for the
container-facing interface prior to the REJECT lines (or make
TALE_SKIP_SSRF_FIREWALL cover this case), so that the iptables -A OUTPUT rejects
only target external RFC1918 addresses and not intra-compose service addresses.
In `@services/convex/Dockerfile`:
- Around line 88-94: The checksum verification fails because the downloaded
archive is saved as /tmp/deno.zip while the checksum file (used by sha256sum -c
deno.sums) expects the original filename ${DENO_ASSET}; change the download
target to /tmp/${DENO_ASSET} (and adjust subsequent references like the unzip
source and rm targets) so that the filename in the checksum matches the archive,
ensuring sha256sum -c deno.sums succeeds when run in /tmp with DENO_ASSET used
consistently.
In `@services/platform/app/features/chat/components/video-link-chip.tsx`:
- Around line 68-70: The completed-status branch can render an empty string
because formatDurationLabel(job.videoDurationSec) returns '' when duration is
missing; update the completed branch in the VideoLinkChip component to use the
formatted label if non-empty, otherwise render a localized fallback (e.g.,
t('status.ready') or a passed-in i18n string) instead of ''. Concretely, change
the expression that uses formatDurationLabel(job.videoDurationSec) (and the
identical logic later in the file) to something like: let label =
formatDurationLabel(job.videoDurationSec); render label && label.length ? label
: localizedReadyFallback; and apply this fix for both occurrences referencing
job.displayStatus === 'completed' so the status row never appears blank.
In `@services/platform/app/features/chat/hooks/use-send-message.ts`:
- Around line 473-494: The catch blocks that handle errors from
bindCompletedJobsToMessage (seen around use-send-message.ts) currently swallow
the error, restore UI state (unmarkJobsSent with snapshotJobIds) and show a
failure toast, then allow the send flow to continue; change this so
bindCompletedJobsToMessage failures do not let the send proceed: either re-throw
the caught error to let the outer rollback handle it (remove the toast/unmark
side-effects in these inner catches), or convert bindCompletedJobsToMessage into
a non-fatal operation by explicitly treating its error as non-blocking (i.e.,
log the error but do NOT call toast or unmarkJobsSent and ensure the send flow
continues only when safe). Update the catch blocks that reference
bindCompletedJobsToMessage, unmarkJobsSent, snapshotJobIds, and toast (including
the other similar blocks noted) to follow one consistent approach.
In `@services/platform/convex/agent_tools/documents/helpers/retrieve_document.ts`:
- Around line 104-112: The call to
ctx.runQuery(internal.file_metadata.internal_queries.lookupVideoLinkSources,
...) can return undefined, but the code assumes an array by accessing
videoSources.length; update the code that assigns videoSources (and the second
occurrence) to default to an empty array when undefined (e.g., const
videoSources = (await ctx.runQuery(...)) || []), then proceed to check
videoSources.length and use videoSources[0].sourceUrl before calling
wrapUntrusted(result.content, meta); reference lookupVideoLinkSources,
videoSources, ctx.runQuery, wrapUntrusted, and result.content when making the
change.
In `@services/platform/convex/agent_tools/integrations/integration_tool.ts`:
- Around line 215-219: Replace the existing comment in integration_tool.ts with
a concise rationale (near the block that deals with `content`, `fileReferences`,
and `citations`) that explains why duplicating raw text was a prompt-injection
risk and that structured `fileReferences`/`citations` satisfy the legitimate
structured-output needs; do not mention or narrate removed fields or prior
implementation details—just state the security rationale and current approach.
In `@services/platform/convex/file_metadata/paragraphize.ts`:
- Around line 112-118: The flush function is concatenating cue texts with
current.join('') which can merge words across cues (e.g., "will" + "see" →
"willsee"); update flush (and the equivalent logic around lines 147-151) to join
cues safely by inserting a single space only when the boundary doesn't already
have whitespace — e.g., build body by iterating current and appending each
trimmed cue, inserting one space if the previous char is non-whitespace and the
next cue's first char is non-whitespace; keep existing paragraphSpeaker,
addTimestamps, paragraphStart and paragraphs handling the same and ensure
parseVtt-normalized cue text is respected.
In `@services/platform/convex/file_metadata/transcribe_audio.ts`:
- Around line 449-455: Timestamps are being applied from a compressed timeline
(chunkStartSec) when preCheck?.source === 'video_link' so the [HH:MM:SS]
prefixes drift; either build and use an original→compressed time map from
compressAudio() and convert chunkStartSec back to wall-clock times before
calling joinSegmentsWithParagraphs/whisperSegmentsToParagraphSegments, or
conservatively disable timestamping for the video_link path by forcing
addTimestamps: false when preCheck?.source === 'video_link' (update both call
sites that pass addTimestamps, e.g., the joinSegmentsWithParagraphs invocation
and the later similar call at 460-462) so video deep-links and agent citations
use correct wall-clock offsets.
In `@services/platform/convex/governance/erasure.ts`:
- Around line 1818-1844: The loop that deletes RAG documents uses
perCategory.videoLinks.ragPurgeStorageIds and can attempt duplicate DELETEs
(double-counting ragDocumentsRemoved); before iterating, deduplicate the storage
IDs (e.g., build a Set from perCategory.videoLinks.ragPurgeStorageIds) so each
storageId is processed once, then iterate that unique list when calling
ragFetch/DELETE; update the code around perCategory.videoLinks,
ragPurgeStorageIds, ragFetch, and ragDocumentsRemoved to use the deduplicated
IDs and preserve the existing 404 handling.
In `@services/platform/convex/lib/cascades/personalization_cascade.ts`:
- Around line 137-160: The videoLinkJobs sweep can add ~6k deletes on top of the
earlier TTS sweep and exceed the mutation write budget; fix this by introducing
a shared delete budget counter (e.g., deleteBudget or remainingDeletes)
initialized to the max allowed deletes (or max minus what the TTS loop consumes)
and decrement it each time you call ctx.db.delete (and optionally for
ctx.storage.delete if that counts against budget). In the videoLinkJobs loop
(the for loop iterating pages from query('videoLinkJobs')
withIndex('by_organizationId_and_status') and using PAGE_SIZE), check the shared
budget and break out when it reaches zero, and ensure the TTS loop uses the same
shared counter so both sweeps cannot together exceed the budget.
In `@services/platform/convex/lib/untrusted_content.ts`:
- Around line 40-44: The re-export of sanitizeUntrustedField is located
mid-file; move it to the module's export section at the bottom of
services/platform/convex/lib/untrusted_content.ts and replace the named export
with a wildcard re-export: use export * from
'../../lib/shared/sanitize-untrusted-field'; so the file follows the "imports at
top, exports at bottom" guideline and aligns with repo conventions; ensure you
remove the existing export { sanitizeUntrustedField } line and add the export *
line in the file's final export block.
In `@services/platform/convex/threads/cascade_helpers.ts`:
- Around line 382-405: The loop that processes videoLinkJobs (videoLinksPage /
job) must also remove any associated fileMetadata rows and purge related RAG
documents when organizationId is missing; update the cascade_helpers.ts loop
that currently only deletes job._id and optionally job.storageId so that for
each job with job.fileMetadataId you (1) delete the corresponding fileMetadata
row via ctx.db.delete(String(job.fileMetadataId)) (or the appropriate db delete
API), (2) purge any RAG documents linked to that fileMetadata by calling your
existing purge function (e.g. purgeRagDocumentsForFileMetadata(fileMetadataId)
or implement one if absent), and (3) ensure these deletions run inside the same
try/catch pattern used for storage deletion to log and continue on errors; keep
the existing storage.delete and ctx.db.delete(job._id) behavior.
In `@services/platform/convex/video_links/ingest_video_link.ts`:
- Around line 680-724: handleYtDlpError currently hardcodes userLocale:
undefined when rescheduling retries, causing caption locale loss; modify
handleYtDlpError signature to accept a userLocale?: string (or the appropriate
Locale type) and pass that value through to the ctx.scheduler.runAfter call that
enqueues internal.video_links.ingest_video_link.ingestVideoLink so it uses {
jobId, userLocale } instead of undefined; ensure callers that invoke
handleYtDlpError (where retries are initiated) are updated to forward the
current locale, and keep all existing behavior (attempts/error updates)
unchanged.
In `@services/platform/convex/video_links/internal_mutations.ts`:
- Around line 408-409: The watchdog marks 'queued' rows as stuck after 5 minutes
which is shorter than the 15-minute retry cooldown; update the watchdog logic by
either increasing the timeout for the 'queued' status (the array entry
['queued', 5 * 60_000] in
services/platform/convex/video_links/internal_mutations.ts) to at least 15 *
60_000 (or the configured max backoff), or change the recovery filter to
explicitly ignore rows in 'queued' or 'retrying' states so scheduled retries
aren’t flipped to 'failed'; apply the same reasoning to any similar entry like
['fetching_metadata', 5 * 60_000] if it represents intentional backoff.
- Around line 485-514: In the GC loop over candidates (variables: candidates,
row, gcCount) add a call to remove the related RAG document before deleting the
job row: when row.storageId is present, call the existing deleteFromRagBatch (or
equivalent RAG deletion API) for that storageId (in a try/catch and log on
error) prior to deleting fileMetadata and the job row, so the transcript is
removed from the RAG index before its source artifacts are deleted; place this
call inside the same conditional branch where you handle row.storageId and
ensure it runs before ctx.db.delete(row._id).
In `@services/platform/convex/video_links/mutations.ts`:
- Around line 196-227: The current dedup uses a single .first() lookup on the
videoLinkJobs index by_organizationId_and_sourceUrlHash which can return a newer
unrelated job and miss a valid same-thread/same-uploader/unbound candidate;
update the lookup in the mutation that computes existing to iterate the index
results (querying videoLinkJobs with
withIndex('by_organizationId_and_sourceUrlHash', ...) and scanning rows) until
you find a row where messageBoundAt is undefined, threadId === args.threadId,
uploadedBy === userId and status not in ['failed','skipped'] and now -
_creationTime < DEDUP_WINDOW_MS, then use that row (refresh pastedToken if
different) or fall back to inserting a new job—alternatively add/replace with a
narrower index that includes threadId and uploadedBy to avoid scanning.
- Around line 107-163: After the playlist check, run the shared string-level
safety gate (the project's central URL safety/validation helper) on the raw URL
(args.url or serverNormalized) immediately before calling
checkOrganizationRateLimit so unsafe URLs never consume rate-limit/budget; if
the gate fails throw a ConvexError (e.g. code 'unsafeUrl') and abort. Locate
this insertion point near isPlaylistUrl, normalizeUrlForHash, and
checkOrganizationRateLimit in the mutation and ensure the check rejects
credential-bearing/localhost/IP-literal URLs so none get persisted to
videoLinkJobs.sourceUrl or queued for the async action.
In `@services/platform/convex/video_links/queries.ts`:
- Around line 131-132: Wrap the call to authComponent.getAuthUser(ctx) in a
try/catch in the Convex query handlers in queries.ts (the places using
authComponent.getAuthUser(ctx) around the shown block and the second occurrence
at the 188-189 region); on any thrown error, silently treat it as an
unauthenticated user and return [] (i.e., swallow the error without logging and
return the same fallback as when getAuthUser returns falsy) so the query
preserves the soft-fail contract.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: bbdc2e8b-ca4c-45f1-9f30-f8ce8651b9e4
⛔ Files ignored due to path filters (1)
services/platform/convex/_generated/api.d.tsis excluded by!**/_generated/**
📒 Files selected for processing (62)
README.de.mdREADME.fr.mdREADME.mdcompose.ymldocs/de/platform/chat/attachments.mddocs/en/platform/chat/attachments.mddocs/fr/platform/chat/attachments.mdservices/convex/Dockerfileservices/convex/docker-entrypoint.shservices/platform/app/components/ui/forms/file-upload.tsxservices/platform/app/features/chat/components/chat-input.tsxservices/platform/app/features/chat/components/chat-interface.tsxservices/platform/app/features/chat/components/video-link-chip.test.tsxservices/platform/app/features/chat/components/video-link-chip.tsxservices/platform/app/features/chat/hooks/use-chat-video-links.tsservices/platform/app/features/chat/hooks/use-send-message.test.tsservices/platform/app/features/chat/hooks/use-send-message.tsservices/platform/convex/agent_tools/documents/helpers/retrieve_document.tsservices/platform/convex/agent_tools/integrations/create_bound_integration_tool.tsservices/platform/convex/agent_tools/integrations/integration_batch_tool.tsservices/platform/convex/agent_tools/integrations/integration_tool.tsservices/platform/convex/agent_tools/rag/rag_search_tool.tsservices/platform/convex/crons.tsservices/platform/convex/file_metadata/internal_mutations.tsservices/platform/convex/file_metadata/internal_queries.tsservices/platform/convex/file_metadata/paragraphize.test.tsservices/platform/convex/file_metadata/paragraphize.tsservices/platform/convex/file_metadata/schema.tsservices/platform/convex/file_metadata/transcribe_audio.tsservices/platform/convex/governance/erasure.tsservices/platform/convex/lib/agent_chat/start_agent_chat.tsservices/platform/convex/lib/agent_response/build_system_prompt.tsservices/platform/convex/lib/cascades/personalization_cascade.tsservices/platform/convex/lib/http/safe_fetch.tsservices/platform/convex/lib/untrusted_content.tsservices/platform/convex/schema.tsservices/platform/convex/threads/cascade_helpers.tsservices/platform/convex/video_links/captions_parser.test.tsservices/platform/convex/video_links/captions_parser.tsservices/platform/convex/video_links/ingest_video_link.tsservices/platform/convex/video_links/internal_mutations.tsservices/platform/convex/video_links/internal_queries.tsservices/platform/convex/video_links/mutations.tsservices/platform/convex/video_links/queries.tsservices/platform/convex/video_links/schema.tsservices/platform/convex/video_links/url_safety.test.tsservices/platform/convex/video_links/url_safety.tsservices/platform/convex/video_links/ytdlp.tsservices/platform/convex/workflow_engine/action_defs/document/document_action.tsservices/platform/convex/workflow_engine/action_defs/rag/rag_action.tsservices/platform/convex/workflow_engine/helpers/nodes/llm/execute_agent_with_tools.tsservices/platform/lib/shared/sanitize-untrusted-field.tsservices/platform/lib/shared/video-link-markdown.test.tsservices/platform/lib/shared/video-link-markdown.tsservices/platform/lib/shared/video-url.test.tsservices/platform/lib/shared/video-url.tsservices/platform/messages/de.jsonservices/platform/messages/en.jsonservices/platform/messages/fr.jsonservices/web/app/content/legal/de/terms-of-service.mdservices/web/app/content/legal/en/terms-of-service.mdservices/web/app/content/legal/fr/terms-of-service.md
| if [ "${TALE_SKIP_SSRF_FIREWALL:-0}" != "1" ] && command -v iptables >/dev/null 2>&1; then | ||
| if iptables -L OUTPUT >/dev/null 2>&1; then | ||
| log_info "Installing SSRF egress firewall (REJECT IMDS + link-local + RFC1918)" | ||
| # Cloud instance metadata service (AWS/GCP/Azure IMDSv1 footprint). | ||
| iptables -A OUTPUT -d 169.254.169.254/32 -j REJECT --reject-with icmp-net-prohibited 2>/dev/null || \ | ||
| log_warn "iptables: failed to reject 169.254.169.254/32 (continuing without IMDS guard)" | ||
| # All link-local — covers Azure 168.63.129.16 and other variants. | ||
| iptables -A OUTPUT -d 169.254.0.0/16 -j REJECT --reject-with icmp-net-prohibited 2>/dev/null || true | ||
| # RFC1918 — Docker bridge subnets in the same compose network are | ||
| # exempt via the `-o lo` and the docker0 default acceptance; only | ||
| # external private ranges (corp VPN, cloud VPC peers) get blocked. | ||
| # If the convex container itself shares a docker network with | ||
| # platform/rag and they're on 172.16/12, this still works because | ||
| # those flows leave via the bridge driver, not OUTPUT to the host | ||
| # netns. If the operator runs in a non-default docker-network mode | ||
| # set TALE_SKIP_SSRF_FIREWALL=1 to bypass. | ||
| iptables -A OUTPUT -d 10.0.0.0/8 -j REJECT --reject-with icmp-net-prohibited 2>/dev/null || true | ||
| iptables -A OUTPUT -d 172.16.0.0/12 -j REJECT --reject-with icmp-net-prohibited 2>/dev/null || true | ||
| iptables -A OUTPUT -d 192.168.0.0/16 -j REJECT --reject-with icmp-net-prohibited 2>/dev/null || true | ||
| else | ||
| log_warn "iptables present but no NET_ADMIN capability — SSRF firewall NOT installed (set cap_add: [NET_ADMIN] in compose.yml)" | ||
| fi | ||
| else | ||
| log_warn "iptables unavailable or TALE_SKIP_SSRF_FIREWALL=1 — SSRF firewall NOT installed (dev mode)" | ||
| fi |
There was a problem hiding this comment.
These OUTPUT rejects will break normal Compose networking.
Inside the container namespace, connections to PostgreSQL, Caddy, and other sibling services on Docker bridge addresses still hit OUTPUT. Rejecting all of 10/8, 172.16/12, and 192.168/16 here will block legitimate service-to-service traffic on default Docker networks before the app finishes booting. This needs explicit allow rules for the local service network(s) or a narrower SSRF defense than blanket RFC1918 rejects.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/convex/docker-entrypoint.sh` around lines 59 - 83, The current
iptables OUTPUT rejects for RFC1918 (the iptables -A OUTPUT -d 10.0.0.0/8, -d
172.16.0.0/12, -d 192.168.0.0/16 lines) will block legitimate Compose sibling
service traffic; change the firewall to explicitly allow the container's local
Docker network(s) before the rejects. Concretely, detect or accept
ENV-configured local subnets (or the docker bridge address range) and insert
explicit ALLOW rules for those CIDRs or for the container-facing interface prior
to the REJECT lines (or make TALE_SKIP_SSRF_FIREWALL cover this case), so that
the iptables -A OUTPUT rejects only target external RFC1918 addresses and not
intra-compose service addresses.
| && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}" -o /tmp/deno.zip \ | ||
| && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}.sha256sum" -o /tmp/deno.sums \ | ||
| && (cd /tmp && sha256sum -c deno.sums) \ | ||
| && unzip -q /tmp/deno.zip -d /usr/local/bin \ | ||
| && chmod 0755 /usr/local/bin/deno \ | ||
| && echo "$DENO_TAG" > /etc/deno-version \ | ||
| && rm -f /tmp/deno.zip /tmp/deno.sums \ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
ver="v2.5.4"
asset="deno-x86_64-unknown-linux-gnu.zip"
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
curl -fsSL "https://github.com/denoland/deno/releases/download/${ver}/${asset}.sha256sum" -o "$tmp/deno.sums"
echo "Checksum file contents:"
cat "$tmp/deno.sums"
echo
touch "$tmp/deno.zip"
echo "Verifying with the current Dockerfile's local filename:"
if (cd "$tmp" && sha256sum -c deno.sums); then
echo "unexpected success"
else
echo "expected failure: checksum references ${asset}, not deno.zip"
fiRepository: tale-project/tale
Length of output: 395
🏁 Script executed:
cat -n services/convex/Dockerfile | sed -n '85,100p'Repository: tale-project/tale
Length of output: 1008
🏁 Script executed:
cat -n services/convex/Dockerfile | sed -n '70,87p'Repository: tale-project/tale
Length of output: 1278
Deno checksum verification will fail due to filename mismatch.
The archive is downloaded as /tmp/deno.zip (line 88), but the checksum file references the original asset name ${DENO_ASSET} (e.g., deno-x86_64-unknown-linux-gnu.zip). When sha256sum -c deno.sums runs at line 90, it will look for the filename embedded in the checksum file and fail. Save the archive as /tmp/${DENO_ASSET} instead.
Suggested fix
- && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}" -o /tmp/deno.zip \
+ && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}" -o "/tmp/${DENO_ASSET}" \
&& curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}.sha256sum" -o /tmp/deno.sums \
&& (cd /tmp && sha256sum -c deno.sums) \
- && unzip -q /tmp/deno.zip -d /usr/local/bin \
+ && unzip -q "/tmp/${DENO_ASSET}" -d /usr/local/bin \
&& chmod 0755 /usr/local/bin/deno \
&& echo "$DENO_TAG" > /etc/deno-version \
- && rm -f /tmp/deno.zip /tmp/deno.sums \
+ && rm -f "/tmp/${DENO_ASSET}" /tmp/deno.sums \📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}" -o /tmp/deno.zip \ | |
| && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}.sha256sum" -o /tmp/deno.sums \ | |
| && (cd /tmp && sha256sum -c deno.sums) \ | |
| && unzip -q /tmp/deno.zip -d /usr/local/bin \ | |
| && chmod 0755 /usr/local/bin/deno \ | |
| && echo "$DENO_TAG" > /etc/deno-version \ | |
| && rm -f /tmp/deno.zip /tmp/deno.sums \ | |
| && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}" -o "/tmp/${DENO_ASSET}" \ | |
| && curl -fsSL "https://github.com/denoland/deno/releases/download/${DENO_TAG}/${DENO_ASSET}.sha256sum" -o /tmp/deno.sums \ | |
| && (cd /tmp && sha256sum -c deno.sums) \ | |
| && unzip -q "/tmp/${DENO_ASSET}" -d /usr/local/bin \ | |
| && chmod 0755 /usr/local/bin/deno \ | |
| && echo "$DENO_TAG" > /etc/deno-version \ | |
| && rm -f "/tmp/${DENO_ASSET}" /tmp/deno.sums \ |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/convex/Dockerfile` around lines 88 - 94, The checksum verification
fails because the downloaded archive is saved as /tmp/deno.zip while the
checksum file (used by sha256sum -c deno.sums) expects the original filename
${DENO_ASSET}; change the download target to /tmp/${DENO_ASSET} (and adjust
subsequent references like the unzip source and rm targets) so that the filename
in the checksum matches the archive, ensuring sha256sum -c deno.sums succeeds
when run in /tmp with DENO_ASSET used consistently.
| : isCompleted | ||
| ? formatDurationLabel(job.videoDurationSec) | ||
| : job.displayStatus === 'retrying' && job.errorReasonCode |
There was a problem hiding this comment.
Avoid empty completed status text when duration is missing.
When job.displayStatus === 'completed' but videoDurationSec is absent/0, formatDurationLabel(...) returns '', so the status row becomes blank. Add a localized fallback (for example, “Ready”) in the completed branch.
Proposed fix
+ const completedStatus =
+ formatDurationLabel(job.videoDurationSec) ||
+ tChat('videoLink.statuses.completed');
+
const statusText = isFailed
? tChat(`videoLink.errors.${job.errorReasonCode ?? 'generic'}`, {
defaultValue: tChat('videoLink.errors.generic'),
})
- : isCompleted
- ? formatDurationLabel(job.videoDurationSec)
+ : isCompleted
+ ? completedStatus
: job.displayStatus === 'retrying' && job.errorReasonCode
? `${tChat('videoLink.statuses.retrying')} (${tChat(`videoLink.errors.${job.errorReasonCode}`, { defaultValue: tChat('videoLink.errors.generic') })})`
: isProcessing && renderedProgress
? renderedProgress
: tChat(`videoLink.statuses.${job.displayStatus}`, {Also applies to: 181-183
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/app/features/chat/components/video-link-chip.tsx` around
lines 68 - 70, The completed-status branch can render an empty string because
formatDurationLabel(job.videoDurationSec) returns '' when duration is missing;
update the completed branch in the VideoLinkChip component to use the formatted
label if non-empty, otherwise render a localized fallback (e.g.,
t('status.ready') or a passed-in i18n string) instead of ''. Concretely, change
the expression that uses formatDurationLabel(job.videoDurationSec) (and the
identical logic later in the file) to something like: let label =
formatDurationLabel(job.videoDurationSec); render label && label.length ? label
: localizedReadyFallback; and apply this fix for both occurrences referencing
job.displayStatus === 'completed' so the status row never appears blank.
| } catch (err) { | ||
| console.error( | ||
| '[use-send-message] background video-link bind failed:', | ||
| err instanceof Error ? err.message : err, | ||
| ); | ||
| // Restore the chips that were hidden synchronously on click — | ||
| // without this the user sees their text + attachments vanish | ||
| // and has no way to retry without re-pasting (round-2 V10 / | ||
| // HIGH #17 spirit, adapted for the new sync-hide path). | ||
| if (unmarkJobsSent && snapshotJobIds.length > 0) { | ||
| unmarkJobsSent(snapshotJobIds); | ||
| } | ||
| const description = | ||
| err instanceof Error && err.message | ||
| ? err.message | ||
| : t('videoLink.toast.bindFailedDescription'); | ||
| toast({ | ||
| title: t('toast.sendFailed'), | ||
| description, | ||
| variant: 'destructive', | ||
| }); | ||
| } |
There was a problem hiding this comment.
Don't keep sending after a video-link bind failure.
All three catch blocks swallow bindCompletedJobsToMessage errors and then continue into the actual send. That leaves the client and server disagreeing about whether the jobs were attached: the existing-thread branch can show a destructive "send failed" toast even when the message succeeds, while the arena/new-thread branches can send the attachment payload without ever stamping messageBoundAt, so those jobs can resurface and be rebound later. Re-throw here and let the outer rollback path handle the failure, or make bind fully non-fatal and remove the failure toast/rollback side effects.
Also applies to: 589-616, 715-745
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/app/features/chat/hooks/use-send-message.ts` around lines
473 - 494, The catch blocks that handle errors from bindCompletedJobsToMessage
(seen around use-send-message.ts) currently swallow the error, restore UI state
(unmarkJobsSent with snapshotJobIds) and show a failure toast, then allow the
send flow to continue; change this so bindCompletedJobsToMessage failures do not
let the send proceed: either re-throw the caught error to let the outer rollback
handle it (remove the toast/unmark side-effects in these inner catches), or
convert bindCompletedJobsToMessage into a non-fatal operation by explicitly
treating its error as non-blocking (i.e., log the error but do NOT call toast or
unmarkJobsSent and ensure the send flow continues only when safe). Update the
catch blocks that reference bindCompletedJobsToMessage, unmarkJobsSent,
snapshotJobIds, and toast (including the other similar blocks noted) to follow
one consistent approach.
| const videoSources = await ctx.runQuery( | ||
| internal.file_metadata.internal_queries.lookupVideoLinkSources, | ||
| { storageIds: [toId<'_storage'>(args.fileId)] }, | ||
| ); | ||
| if (videoSources.length > 0) { | ||
| const meta: { tool: string; url?: string } = { tool: 'document_retrieve' }; | ||
| if (videoSources[0].sourceUrl) meta.url = videoSources[0].sourceUrl; | ||
| result.content = wrapUntrusted(result.content, meta); | ||
| } |
There was a problem hiding this comment.
Guard lookupVideoLinkSources before reading .length.
ctx.runQuery(...) can resolve to undefined in current tests, and this now throws at Line 108/Line 119. Default to an empty array before using .length so retrieval doesn’t crash.
Suggested fix
- const videoSources = await ctx.runQuery(
+ const videoSources =
+ (await ctx.runQuery(
internal.file_metadata.internal_queries.lookupVideoLinkSources,
{ storageIds: [toId<'_storage'>(args.fileId)] },
- );
- if (videoSources.length > 0) {
+ )) ?? [];
+ const hasVideoSources = videoSources.length > 0;
+ if (hasVideoSources) {
const meta: { tool: string; url?: string } = { tool: 'document_retrieve' };
if (videoSources[0].sourceUrl) meta.url = videoSources[0].sourceUrl;
result.content = wrapUntrusted(result.content, meta);
}
@@
- wrappedAsUntrusted: videoSources.length > 0,
+ wrappedAsUntrusted: hasVideoSources,Also applies to: 119-119
🧰 Tools
🪛 GitHub Check: Unit
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > does not call getAccessibleDocumentIds on the fileMetadata fallback path
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:299:5
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > falls back to fileMetadata + RAG for chat-uploaded files not in documents hub
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:217:20
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > does not truncate content at exactly 50K chars
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:162:20
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > truncates content exceeding 50K chars
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:147:20
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > uses file storage ID in RAG URL, not document ID
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:132:5
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > omits query params when chunkStart and chunkEnd not provided
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:120:5
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > forwards chunkStart and chunkEnd as query params
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:105:5
[failure] 108-108: [server] convex/agent_tools/documents/document_retrieve_tool.test.ts > retrieveDocument helper > returns correct result shape on happy path
TypeError: Cannot read properties of undefined (reading 'length')
❯ Module.retrieveDocument convex/agent_tools/documents/helpers/retrieve_document.ts:108:20
❯ convex/agent_tools/documents/document_retrieve_tool.test.ts:87:20
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/convex/agent_tools/documents/helpers/retrieve_document.ts`
around lines 104 - 112, The call to
ctx.runQuery(internal.file_metadata.internal_queries.lookupVideoLinkSources,
...) can return undefined, but the code assumes an array by accessing
videoSources.length; update the code that assigns videoSources (and the second
occurrence) to default to an empty array when undefined (e.g., const
videoSources = (await ctx.runQuery(...)) || []), then proceed to check
videoSources.length and use videoSources[0].sourceUrl before calling
wrapUntrusted(result.content, meta); reference lookupVideoLinkSources,
videoSources, ctx.runQuery, wrapUntrusted, and result.content when making the
change.
| ['queued', 5 * 60_000], | ||
| ['fetching_metadata', 5 * 60_000], |
There was a problem hiding this comment.
The queued watchdog window is shorter than the retry cooldown.
Retried jobs surface as queued/retrying during the backoff window, and the user-facing contract here is a 15-minute cooldown for throttled sources. With queued treated as stuck after 5 minutes, the watchdog will flip legitimate cooldown rows to failed before their scheduled retry fires. This window needs to be at least the longest intentional backoff, or queued retry rows need to be excluded from recovery.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/convex/video_links/internal_mutations.ts` around lines 408
- 409, The watchdog marks 'queued' rows as stuck after 5 minutes which is
shorter than the 15-minute retry cooldown; update the watchdog logic by either
increasing the timeout for the 'queued' status (the array entry ['queued', 5 *
60_000] in services/platform/convex/video_links/internal_mutations.ts) to at
least 15 * 60_000 (or the configured max backoff), or change the recovery filter
to explicitly ignore rows in 'queued' or 'retrying' states so scheduled retries
aren’t flipped to 'failed'; apply the same reasoning to any similar entry like
['fetching_metadata', 5 * 60_000] if it represents intentional backoff.
| const candidates = await ctx.db | ||
| .query('videoLinkJobs') | ||
| .withIndex('by_status', (q) => q.eq('status', status)) | ||
| .take(UNBOUND_GC_BATCH * 4); | ||
| for (const row of candidates) { | ||
| if (gcCount >= UNBOUND_GC_BATCH) break outer; | ||
| if (row.messageBoundAt !== undefined) continue; | ||
| if (row._creationTime > gcCutoff) continue; | ||
| if (row.storageId) { | ||
| try { | ||
| await ctx.storage.delete(row.storageId); | ||
| } catch (err) { | ||
| console.warn( | ||
| `[video_links] GC storage.delete failed for ${row._id}:`, | ||
| err instanceof Error ? err.message : err, | ||
| ); | ||
| } | ||
| } | ||
| if (row.fileMetadataId) { | ||
| try { | ||
| await ctx.db.delete(row.fileMetadataId); | ||
| } catch (err) { | ||
| console.warn( | ||
| `[video_links] GC fileMetadata.delete failed for ${row._id}:`, | ||
| err instanceof Error ? err.message : err, | ||
| ); | ||
| } | ||
| } | ||
| await ctx.db.delete(row._id); | ||
| gcCount += 1; |
There was a problem hiding this comment.
GC needs to purge the matching RAG document before deleting the row.
These unbound draft transcripts are indexed into RAG under storageId, but this sweep only deletes _storage, fileMetadata, and the job row. Without a deleteFromRagBatch call, the transcript stays searchable after its source artifacts are gone.
Suggested fix
outer: for (const status of gcStatuses) {
const candidates = await ctx.db
.query('videoLinkJobs')
.withIndex('by_status', (q) => q.eq('status', status))
.take(UNBOUND_GC_BATCH * 4);
+ const ragPurgeStorageIds: string[] = [];
for (const row of candidates) {
if (gcCount >= UNBOUND_GC_BATCH) break outer;
if (row.messageBoundAt !== undefined) continue;
if (row._creationTime > gcCutoff) continue;
+ if (row.storageId) {
+ ragPurgeStorageIds.push(String(row.storageId));
+ }
if (row.storageId) {
try {
await ctx.storage.delete(row.storageId);
@@
}
await ctx.db.delete(row._id);
gcCount += 1;
}
+ if (ragPurgeStorageIds.length > 0) {
+ await ctx.scheduler.runAfter(
+ 0,
+ internal.workflow_engine.action_defs.rag.helpers.delete_document
+ .deleteFromRagBatch,
+ { fileIds: ragPurgeStorageIds },
+ );
+ }
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/convex/video_links/internal_mutations.ts` around lines 485
- 514, In the GC loop over candidates (variables: candidates, row, gcCount) add
a call to remove the related RAG document before deleting the job row: when
row.storageId is present, call the existing deleteFromRagBatch (or equivalent
RAG deletion API) for that storageId (in a try/catch and log on error) prior to
deleting fileMetadata and the job row, so the transcript is removed from the RAG
index before its source artifacts are deleted; place this call inside the same
conditional branch where you handle row.storageId and ensure it runs before
ctx.db.delete(row._id).
| // Refuse standalone playlist URLs synchronously — surfaced as a toast | ||
| // before any chip is rendered. ConvexError code matches the i18n key | ||
| // under `videoLink.errors.*` so the frontend can localize without | ||
| // string-matching the message. | ||
| if (isPlaylistUrl(args.url)) { | ||
| throw new ConvexError({ | ||
| code: 'playlist', | ||
| message: | ||
| 'Playlist URLs are not supported — paste a single video link instead', | ||
| }); | ||
| } | ||
|
|
||
| // Rate-limit (reuses file:upload bucket: 50/min/org) | ||
| await checkOrganizationRateLimit(ctx, 'file:upload', args.organizationId); | ||
|
|
||
| // Budget gate. Use the worst-case prospective cost (4h × the default | ||
| // whisper rate ≈ 144 cents) so an org over its cap can't queue | ||
| // multi-dollar ingests behind a permissive chat-token check. Real | ||
| // post-completion charges land in the ledger via | ||
| // `recordTranscriptionUsage`; this is a pre-spend reservation only. | ||
| // Captions-branch ingests have $0 actual Whisper cost but still pay | ||
| // disk + bandwidth — accepting the worst-case overestimate here keeps | ||
| // the gate conservative and the code simple. | ||
| const PROSPECTIVE_VIDEO_LINK_COST_CENTS = Math.ceil( | ||
| (CHAT_AUDIO_MAX_DURATION_SEC / 60) * 0.6, | ||
| ); | ||
| const { userTeamIds, userRole } = await resolveBudgetContext( | ||
| ctx, | ||
| args.organizationId, | ||
| userId, | ||
| ); | ||
| const budgetResult = await checkBudget( | ||
| ctx, | ||
| args.organizationId, | ||
| userId, | ||
| userTeamIds, | ||
| userRole, | ||
| PROSPECTIVE_VIDEO_LINK_COST_CENTS, | ||
| 1, | ||
| ); | ||
| if (!budgetResult.allowed) { | ||
| throw new ConvexError({ | ||
| code: 'budgetExceeded', | ||
| message: | ||
| budgetResult.reason ?? | ||
| 'Usage limit reached — contact your administrator.', | ||
| }); | ||
| } | ||
|
|
||
| // SERVER-SIDE derive the dedup key + platform tag — do NOT trust | ||
| // client-supplied `normalizedUrl` / `sourcePlatform`. A hostile client | ||
| // could otherwise (a) collide-hash a different URL into an existing | ||
| // job's dedup slot to short-circuit fresh ingest, or (b) lie about | ||
| // platform to confuse downstream chip iconography / telemetry. | ||
| const serverNormalized = normalizeUrlForHash(args.url); | ||
| const serverPlatform = detectPlatform(args.url); | ||
| const sourceUrlHash = hashUrlForDedup(serverNormalized); |
There was a problem hiding this comment.
Reject obviously unsafe URLs before creating the job row.
This mutation only blocks playlist URLs up front. A direct client can still enqueue http://..., credential-bearing, localhost, or IP-literal URLs, which means those raw values get persisted in videoLinkJobs.sourceUrl and consume the org's rate-limit/in-flight budget until the async action rejects them. Please run the shared string-level safety gate here before insert/schedule, not just in the action.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/convex/video_links/mutations.ts` around lines 107 - 163,
After the playlist check, run the shared string-level safety gate (the project's
central URL safety/validation helper) on the raw URL (args.url or
serverNormalized) immediately before calling checkOrganizationRateLimit so
unsafe URLs never consume rate-limit/budget; if the gate fails throw a
ConvexError (e.g. code 'unsafeUrl') and abort. Locate this insertion point near
isPlaylistUrl, normalizeUrlForHash, and checkOrganizationRateLimit in the
mutation and ensure the check rejects credential-bearing/localhost/IP-literal
URLs so none get persisted to videoLinkJobs.sourceUrl or queued for the async
action.
| const existing = | ||
| args.threadId === undefined | ||
| ? null | ||
| : await ctx.db | ||
| .query('videoLinkJobs') | ||
| .withIndex('by_organizationId_and_sourceUrlHash', (q) => | ||
| q | ||
| .eq('organizationId', args.organizationId) | ||
| .eq('sourceUrlHash', sourceUrlHash), | ||
| ) | ||
| .order('desc') | ||
| .first(); | ||
| if ( | ||
| existing && | ||
| existing.messageBoundAt === undefined && | ||
| existing.threadId === args.threadId && | ||
| existing.uploadedBy === userId && | ||
| existing.status !== 'failed' && | ||
| existing.status !== 'skipped' && | ||
| now - existing._creationTime < DEDUP_WINDOW_MS | ||
| ) { | ||
| // Refresh `pastedToken` to the most recent paste's substring. The | ||
| // bind path uses this verbatim to strip the URL from the outgoing | ||
| // message text via `String.prototype.replace` — a stale token from | ||
| // the first paste won't match the second paste's punctuation, and | ||
| // the raw URL leaks through to the LLM input alongside the | ||
| // transcript attachment. | ||
| if (args.pastedToken !== existing.pastedToken) { | ||
| await ctx.db.patch(existing._id, { pastedToken: args.pastedToken }); | ||
| } | ||
| return existing._id; | ||
| } |
There was a problem hiding this comment.
The dedup lookup only checks the newest hash match.
Using .first() on by_organizationId_and_sourceUrlHash means a newer row from another thread or another uploader can hide the actual matching in-composer job for this user/thread. In that case this path falls through and inserts a duplicate job, even though the draft already has an unbound chip for the same URL. Iterate the hash bucket until you find a same-thread/same-uploader/unbound candidate, or add a narrower index that matches the dedup scope.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/convex/video_links/mutations.ts` around lines 196 - 227,
The current dedup uses a single .first() lookup on the videoLinkJobs index
by_organizationId_and_sourceUrlHash which can return a newer unrelated job and
miss a valid same-thread/same-uploader/unbound candidate; update the lookup in
the mutation that computes existing to iterate the index results (querying
videoLinkJobs with withIndex('by_organizationId_and_sourceUrlHash', ...) and
scanning rows) until you find a row where messageBoundAt is undefined, threadId
=== args.threadId, uploadedBy === userId and status not in ['failed','skipped']
and now - _creationTime < DEDUP_WINDOW_MS, then use that row (refresh
pastedToken if different) or fall back to inserting a new job—alternatively
add/replace with a narrower index that includes threadId and uploadedBy to avoid
scanning.
| const authUser = await authComponent.getAuthUser(ctx); | ||
| if (!authUser) return []; |
There was a problem hiding this comment.
Preserve the query soft-fail contract when auth lookup throws.
These handlers only return [] when getAuthUser() returns a falsy user. If Better Auth throws, the query errors and the chip subscription drops instead of degrading to “unauthenticated.” Catch that lookup failure here and return [], same as the other Convex query paths.
Based on learnings: In Convex query handlers, if authComponent.getAuthUser(ctx) fails, swallow the error and treat as unauthenticated without logging warnings/errors.
Also applies to: 188-189
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@services/platform/convex/video_links/queries.ts` around lines 131 - 132, Wrap
the call to authComponent.getAuthUser(ctx) in a try/catch in the Convex query
handlers in queries.ts (the places using authComponent.getAuthUser(ctx) around
the shown block and the second occurrence at the 188-189 region); on any thrown
error, silently treat it as an unauthenticated user and return [] (i.e., swallow
the error without logging and return the same fallback as when getAuthUser
returns falsy) so the query preserves the soft-fail contract.
The optimistic user bubble was being built with the full `🎬 [title] (video from bilibili, …) — transcript indexed; call document_retrieve(fileId)…` markdown appended to `content`, on the premise that this kept it byte-identical to the persisted bubble. That premise was wrong: the persisted side runs through `stripInternalFileReferences` (use-message-processing.ts) on read, which removes that whole block. So the optimistic bubble briefly rendered the verbose text-and-fileId footer (user content uses `whitespace-pre-wrap`, so the brackets/asterisks were shown literally) and then collapsed to the clean card on the server swap — exactly the flicker the user reported. Now optimistic content carries only the typed text; video-link metadata rides on `attachments[]`, same pattern as PDF/image/audio uploads. The bubble's file-displays renders the card from that array. The drift-reconcile path also only patches `attachments[]` now (content stays stable). Server `buildMessageWithAttachments` still emits the markdown for the agent prompt — that path is unchanged.
The upstream `.sha256sum` file for each Deno release contains a single
line of the form `<hash> <asset>` with the original asset filename
(e.g. `deno-x86_64-unknown-linux-gnu.zip`). The Dockerfile downloaded
the zip to `/tmp/deno.zip`, so `sha256sum -c /tmp/deno.sums` resolved
the filename in the sums file and tried to open the original asset
name — which did not exist — failing with `FAILED open or read`.
Download to `/tmp/${DENO_ASSET}` instead so the on-disk filename
matches what the sums file declares; subsequent `unzip` and cleanup
use the same variable.
`media-src 'none'` blocked `<audio>.src = /http_api/api/tts-audio?...`, surfacing as decode errors and `NotSupportedError` in the voice output player. Same-origin TTS is the only media source, so widen to `'self'`.
Summary
Adds support for pasting a video URL into the chat composer. Tale ingests the video into a regular attachment — caption-first when the source has uploader-provided or auto-generated captions, audio extraction + Whisper fallback otherwise — and feeds the resulting transcript through the same cascade as a file upload.
Server
videoLinks/module — schema, ingest orchestrator,yt-dlpwrapper, VTT/SRT captions parser, URL-safety helpers, internal mutations/queries. Orchestrator runs as a state machine (attempts / CAS guard / store-then-patch) with a direct watchdog cron for stuck jobs.yt-dlpsubprocess hardening — version-gated flag probe, IPv6 SSRF coverage, stderr surfaced in retry logs, no fictional CLI flags. Dockerfile pins the binary.lib/http/safe_fetch.tsextended for DNS-pinned SSRF protection.videoLinkJobstable with cascade + GDPR erasure coverage, plus RLS rules and access-control helpers.<untrusted_source>markers before the agent sees them; the URL is preserved on the transcript header for provenance. RAG /document_retrieveoutput gets the same wrapper.document_retrievehint to the agent.Client
video-link-chipanduse-chat-video-linkshook. Pasting a URL into the composer binds a chip with live status (Reading info → Fetching captions → Extracting audio → Transcribing → Ready), or Failed on error.chat-inputpaste handler — IME-guarded, unbinds on failure, blocks Send while a chip is in the failed state. Dedup scoped to unbound rows in the same composer.setPendingMessage.i18n + Legal + Docs
messages/{en,de,fr}.json— all new strings present in all three locales.docs/{en,de,fr}/platform/chat/attachments.mddocuments the new flow, limits, and the untrusted-source handling.Test plan
video_link.retryinglog.localhost./ IDN / overlong URL → blocked bysafe_fetchSSRF + URL-safety helpers.<untrusted_source>and gets thedocument_retrievehint.videoLinkJobsrows cascade per GDPR erasure coverage.Summary by CodeRabbit
New Features
Documentation
Infrastructure