add --chat-max-width CSS variable#3

Merged
Israeltheminer merged 1 commit into main from update-ui on Dec 3, 2025

@Israeltheminer Israeltheminer commented Dec 3, 2025

Collaborator

…ents for consistent sizing

Summary by CodeRabbit

  • Style
    • Standardized maximum width constraints across the chat interface, including message containers, input areas, and rendered content elements for improved visual consistency throughout the conversation.

@Israeltheminer Israeltheminer merged commit 3c13944 into main Dec 3, 2025
1 of 2 checks passed
@Israeltheminer Israeltheminer deleted the update-ui branch December 3, 2025 09:58
coderabbitai Bot commented Dec 3, 2025

Contributor

Caution

Review failed

The pull request is closed.

Walkthrough

This PR introduces a CSS custom property to centralize the chat interface's maximum width constraint. The changes replace hardcoded width values (48rem in chat-interface.tsx and 46rem in message-bubble.tsx) with references to a new CSS variable --chat-max-width, which is defined in globals.css with a value of 48rem. The refactoring maintains layout structure and visual behavior while centralizing width configuration for easier future adjustments.
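As a rough illustration of the pattern the walkthrough describes, the sketch below shows a component style that defers to the variable instead of a literal width, plus a check for leftover hardcoded widths. The `--chat-max-width` name comes from this PR; `chatWidthStyle` and `hasHardcodedRemWidth` are hypothetical helpers and are not taken from the changed files.

```typescript
// Name of the custom property defined in globals.css (from this PR).
const CHAT_MAX_WIDTH_VAR = "--chat-max-width";

// Hypothetical helper: an inline style that references the CSS variable
// instead of repeating a literal such as "48rem" or "46rem".
function chatWidthStyle(): { maxWidth: string } {
  return { maxWidth: `var(${CHAT_MAX_WIDTH_VAR})` };
}

// Hypothetical reviewer aid: flags class or style strings that still carry
// a literal rem-based max width.
function hasHardcodedRemWidth(value: string): boolean {
  return /max-w-\[\d+(\.\d+)?rem\]|max-width:\s*\d+(\.\d+)?rem/.test(value);
}
```

With a single definition in globals.css, a later width change would touch one line rather than each component.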

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Verify that the CSS custom property --chat-max-width: 48rem is correctly placed in globals.css under the light theme root
  • Check that all hardcoded width references have been replaced with the CSS variable across both component files
  • Note: The original hardcoded values differ (48rem vs 46rem); verify this intentional consolidation to 48rem is desired, as it may affect the visual layout of message-bubble content compared to its previous state
  • Confirm no hardcoded width values remain in the affected files that should use the new variable

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between ebf4146 and a8201b0.

📒 Files selected for processing (3)
  • services/platform/app/(app)/dashboard/[id]/chat/components/chat-interface.tsx (2 hunks)
  • services/platform/app/(app)/dashboard/[id]/chat/components/message-bubble.tsx (4 hunks)
  • services/platform/app/(app)/globals.css (1 hunks)

larryro added a commit that referenced this pull request Dec 30, 2025
- Changed import from type-only to value import for JEXL_TRANSFORMS
- Replaced duplicated local jexlTransforms array with spread of shared constant
- Ensures single source of truth for JEXL transform definitions

Addresses CodeRabbit review comments #2 and #3.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
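
The single-source-of-truth change in the commit above can be sketched as follows. The contents of `JEXL_TRANSFORMS` are not shown in this thread, so the entries here are placeholders, and the shared constant is declared inline only to keep the sketch self-contained. In the real code it would be imported as a value, since a type-only import is erased at compile time and cannot be spread.

```typescript
// Stand-in for the shared constant; the real entries are not shown in this
// thread, so these names are placeholders.
const JEXL_TRANSFORMS = ["upper", "lower", "trim"] as const;

// Before (assumed shape of the old code): a hand-maintained copy that can
// drift from the shared list.
//   const jexlTransforms = ["upper", "lower"];

// After: derive the local array from the shared constant by spreading it.
const jexlTransforms: string[] = [...JEXL_TRANSFORMS];
```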
yannickmonney pushed a commit that referenced this pull request Apr 8, 2026
larryro added a commit that referenced this pull request May 3, 2026
…stic nonce

share_thread.ts: when the owner shares a thread, set
threadMetadata.disablePersonalization=true so subsequent owner turns
inside the shared thread do not inject the owner's memories or
customInstructions into replies that share-link viewers can read.
Mirror in unshareThread by clearing the field.

This is a small behavior change (owner sees no personalization in
a shared thread until they unshare or manually re-enable per-thread),
but matches the privacy notice's caveat #3 with a real defense
instead of a doc-only warning.

build_user_personalization.ts: nonce was Math.random() (4 hex / 16
bits), regenerated on every prompt build. Even though prompt caching
is not currently active upstream, randomized nonces would bust it as
soon as it is. Switch to fnv1aHash(memoryId + content) sliced to 4
hex — deterministic for unchanged memory sets, distinct across
memories. Structural injection defense remains escapeForXmlContent.
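
A minimal sketch of the deterministic nonce described above, assuming a 32-bit FNV-1a variant. The name `fnv1aHash` appears in the commit message, but its real implementation is not shown here; this version hashes UTF-16 code units rather than bytes for brevity, and `memoryNonce` is a hypothetical wrapper.

```typescript
// 32-bit FNV-1a over UTF-16 code units (a simplification; the byte-oriented
// definition would encode the string to UTF-8 first).
function fnv1aHash(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Hypothetical wrapper: 4 hex characters derived from the memory itself, so
// the same memory always yields the same nonce across prompt builds.
function memoryNonce(memoryId: string, content: string): string {
  return fnv1aHash(memoryId + content).toString(16).padStart(8, "0").slice(0, 4);
}
```

Because the nonce depends only on `memoryId` and `content`, an unchanged memory set produces byte-identical prompt text, which is what a prompt cache needs.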
larryro added a commit that referenced this pull request May 7, 2026
Round-2-confirmed critical bug. The docstring on `requestErasure`
advertised `LEGAL_HOLD_BLOCKS_ERASURE` as a refusal code; the body had
`const blocked: string[] = []` hard-coded as a "phase-8 placeholder"
with `loadActiveHolds` commented out. Result: an org admin (or anyone
with a compromised admin token) could spoliate litigation-hold data via
this mutation, and the audit row would record `threadsBlocked: []` —
actively misleading. Direct GDPR Art 17(3)(e) and FRCP 37(e) violation.

`loadActiveHolds` ships in the same branch (legal_hold.ts:65) and is
already consumed by retention_cleanup; only erasure ignored it.

This commit:
- Calls `loadActiveHolds` and partitions the subject's threads into
  held vs erasable.
- Fails the whole request closed when `holds.orgHeld === true` or any
  of the subject's threads is held. Emits a `gdpr_erasure_blocked_by_hold`
  audit row before throwing so the refusal itself is logged in the
  tamper-evident chain (compliance teams need this).
- Throws `ConvexError({ code: 'LEGAL_HOLD_BLOCKS_ERASURE', orgHeld,
  heldThreadIds })` matching the docstring contract.
- Tracks per-thread cascade completion explicitly. The previous loop
  ran cascadeDeleteThreadChildren up to 50 times then incremented
  `erased` regardless of outcome; a thread with >50 pages of children
  was reported as erased while leaving residue. Now only counts threads
  whose cascade actually returned `done: true`, and emits the audit row
  with `status: 'failure'` + an explicit error message when any thread
  was partial.
- Audit row records `heldThreadIds` (when present) and
  `threadsTargeted` (so reviewers can spot partial erasures).

Out-of-scope here (separate work-streams):
- Sub-thread cascade hold check (W1 #3 — cascade_helpers).
- Snapshot-race fix in deleteExpired* mutations (W1 #4).
- Subject scope expansion (W2: userMemories, BetterAuth, audit-PII
  scrub, RAG propagation, gdprErasureRequests state machine).
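
The fail-closed check and the per-thread completion tracking described in this commit can be sketched roughly as below. Only `orgHeld`, `heldThreadIds`, the `LEGAL_HOLD_BLOCKS_ERASURE` code, and `status: 'failure'` come from the commit message. The `threadIds` field, the `'success'` status, and both function names are assumptions. The real mutation throws a `ConvexError` with this payload, whereas the sketch returns it so that it stays self-contained.

```typescript
// Assumed shape: `orgHeld` is named in the commit; `threadIds` is a
// hypothetical stand-in for however held threads are tracked.
interface ActiveHolds {
  orgHeld: boolean;
  threadIds: Set<string>;
}

type ErasureCheck =
  | { ok: true; erasable: string[] }
  | { ok: false; code: "LEGAL_HOLD_BLOCKS_ERASURE"; orgHeld: boolean; heldThreadIds: string[] };

// Partition the subject's threads and refuse the whole request if the org
// or any single thread is under hold.
function checkErasure(subjectThreadIds: string[], holds: ActiveHolds): ErasureCheck {
  const heldThreadIds = subjectThreadIds.filter((id) => holds.threadIds.has(id));
  const erasable = subjectThreadIds.filter((id) => !holds.threadIds.has(id));
  if (holds.orgHeld || heldThreadIds.length > 0) {
    return { ok: false, code: "LEGAL_HOLD_BLOCKS_ERASURE", orgHeld: holds.orgHeld, heldThreadIds };
  }
  return { ok: true, erasable };
}

// Count a thread as erased only when its cascade reported completion.
function summarizeCascades(results: Array<{ threadId: string; done: boolean }>) {
  const erased = results.filter((r) => r.done).length;
  const partial = results.filter((r) => !r.done).map((r) => r.threadId);
  return { erased, partial, status: partial.length === 0 ? "success" : "failure" };
}
```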
larryro added a commit that referenced this pull request May 7, 2026
…ascade

Round-1 reviewer-confirmed: 12 of 14 retention cleanup categories did
not consult holds.orgHeld, and cascadeDeleteThreadChildren recursed
into sub-threads with no hold check. Both let a litigation/preservation
hold be silently bypassed for the table that records *why* the hold
exists (auditLogs), every PII-bearing table, and any held sub-thread
when its parent ages out — direct US FRCP 37(e) / EU GDPR Art 21
spoliation risk.

retention_cleanup.ts (W1 #2)
- Added `holds: ActiveHolds` to every cleanup category that lacked it:
  cleanupTempFiles (user + agent), cleanupAuditLogs, cleanupWorkflowLogs,
  cleanupUsageLedger, cleanupChatFilterEvents, cleanupPromptTemplates,
  cleanupMessageFeedback, cleanupMemoryAudit, cleanupCustomers,
  cleanupVendors, cleanupExternalConversations, cleanupMessageMetadata.
- Each short-circuits with a clear info log when holds.orgHeld is true.
- cleanupWorkflowLogs additionally consults holds.executionIds for
  per-execution holds (`targetType: 'execution'`) — until now those
  rows were silently ignored by cleanup.
- The dispatcher's category list updated to thread `holds` to all 15
  category invocations.

cascade_helpers.ts (W1 #3 + partial W1 #4)
- cascadeDeleteThreadChildren accepts an optional `holds` snapshot. If
  omitted AND organizationId is set, the helper re-loads `legalHolds`
  itself — closing the snapshot-once race where a hold placed mid-run
  provided zero protection because the dispatcher's pre-fetched Set
  was already stale by the time per-thread cascade fired.
- Sub-thread recursion now passes the snapshot through, so the
  per-sub-thread hold check uses the same authoritative read.
- The helper returns `{ done: true, remaining: 0 }` (no-op) on a held
  thread; callers (retention / erasure) treat that as "skip and
  continue" rather than "delete completed".
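
As an illustration of the short-circuit and the per-execution check described for `cleanupWorkflowLogs`, a minimal sketch follows. `orgHeld` and `executionIds` are named in the commit message. The row shape, the cutoff parameter, and the function name `selectExpiredWorkflowLogs` are hypothetical.

```typescript
// Fields named in the commit message; the Set type is an assumption.
interface ActiveHolds {
  orgHeld: boolean;
  executionIds: Set<string>;
}

// Hypothetical row shape for a workflow log entry.
interface WorkflowLogRow {
  id: string;
  executionId: string;
  createdAt: number;
}

// Returns the rows that retention may delete. An org-wide hold skips the
// whole category; a per-execution hold protects only that execution's rows.
function selectExpiredWorkflowLogs(
  rows: WorkflowLogRow[],
  holds: ActiveHolds,
  cutoffMs: number,
): WorkflowLogRow[] {
  if (holds.orgHeld) return [];
  return rows.filter(
    (row) => row.createdAt < cutoffMs && !holds.executionIds.has(row.executionId),
  );
}
```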
larryro added a commit that referenced this pull request May 7, 2026
…eout

Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs +
/redoc unauthenticated, RAG container ran as root, default token baked
into image ENV, strict-mode env name diverged across the wire,
non-constant-time token compare, plus three SSRF-guard gaps.

services/rag/app/auth.py
- W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes
  the dead-code EXEMPT_PATHS frozenset.

services/rag/app/routers/health.py
- W7 #1: split into public_router (`/`, `/health`) and protected_router
  (`/config`). main.py mounts the protected one under
  Depends(verify_internal_token). Old `router` re-export stays for
  backwards compat.

services/rag/app/main.py
- W7 #2: docs_url / redoc_url / openapi_url are None outside debug.
- W7 #4: CORS allow_credentials flipped to False (bearer rides
  Authorization, never cookies).
- W7 #1 wiring: mount health-public + health-protected separately.

services/rag/app/config.py
- W7 #8: require_custom_internal_token accepts BOTH
  RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN
  via pydantic AliasChoices.

services/rag/Dockerfile + services/convex/Dockerfile
- W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`).
  RAG ingests untrusted PDFs/DOCX through native parsers; biggest
  blast radius in the stack, now hardened.
- W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from
  both runtime + scratch-squash stages and the matching bake in
  services/convex/Dockerfile. Operators MUST supply via env / compose
  / k8s secret.

services/platform/convex/lib/helpers/rag_config.ts
- W7 #9 F1: `redirect: 'manual'` on every ragFetch.
- W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2).
- W7 #9 F3: strip trailing `.` before hostname blocklist lookup.
- W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding +
  env rotation mitigation).
- W7 #9 F9: deleted path.startsWith('http') override branch (future-
  bypass foot-gun).

services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts
- W7 #10: pass timeoutMs=60_000 (default 10s was a regression).
- Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to
  defend against an adversarial RAG response.
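
Two of the SSRF-guard fixes listed for rag_config.ts (F2 and F3) can be sketched as below. This is an illustration under stated assumptions, not the repository's code: the function names and the blocklist entries are hypothetical, and the IPv6 check inspects only the first hextet of a textual address.

```typescript
// Hypothetical blocklist; the real entries are not shown in this thread.
const BLOCKED_HOSTNAMES = new Set(["localhost", "metadata.google.internal"]);

// F3: "localhost." and "localhost" name the same host, so strip one trailing
// dot (and lowercase) before the lookup.
function normalizeHostname(hostname: string): string {
  return hostname.toLowerCase().replace(/\.$/, "");
}

function isBlockedHostname(hostname: string): boolean {
  return BLOCKED_HOSTNAMES.has(normalizeHostname(hostname));
}

// F2: fc00::/7 (IPv6 unique local addresses) covers any address whose first
// seven bits are 1111110, i.e. a first hextet from fc00 through fdff.
function isIpv6UniqueLocal(address: string): boolean {
  const first = address.toLowerCase().replace(/^\[|\]$/g, "").split(":")[0] ?? "";
  if (!/^[0-9a-f]{1,4}$/.test(first)) return false;
  return (parseInt(first, 16) & 0xfe00) === 0xfc00;
}
```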
larryro added a commit that referenced this pull request May 8, 2026
larryro added a commit that referenced this pull request May 8, 2026
larryro added a commit that referenced this pull request May 8, 2026
larryro added a commit that referenced this pull request May 9, 2026
P1-A/B — drop dead `messageMetadata` + `workflowTriggerLog` registry
entries (round-2 V2). The two soft-delete registry rows had latent
correctness problems:
  - `messageMetadata` had no `organizationId`/`lifecycleStatus`/
    `statusChangedAt` columns at the schema layer, so the generic
    Trash flow short-circuited at `not_trashed` — but a future
    column-add would silently activate cross-org restores.
  - `workflowTriggerLog` mapped to `tableName: 'wfExecutions'`
    instead of `wfTriggerLogs`. Restore-by-id-as-trigger-log would
    flip an execution row and emit a misleading audit subtype.
Both already had bespoke retention paths
(`deleteExpiredMessageMetadata` / `deleteExpiredWorkflowTriggerLog`)
that didn't go through the generic Trash flow. Dropped from
`SOFT_DELETE_RESOURCE_TYPES`, the registry config, and the
queries.ts switch. Trash-page alias collapses to the full list.

P1-I — `cleanupTempFiles` mass-delete on `userTempRetentionHours = 0`
(round-1 #3). The Zod schema permits `0` as a legitimate config
value but the cleanup used `?? DEFAULT_TEMP_RETENTION_HOURS` which
doesn't catch `0`; the resulting `cutoffMs = Date.now()` would
delete every temp file in the org. Mirrors the `<= 0` guard the
day-based cleanups already use; treats 0 / negative as
feature-gated-off rather than delete-everything-now, with a loud
console.warn so operators see the misconfig.
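
The `??` pitfall described in P1-I can be shown in a few lines. The constant name comes from the commit message; its value here, the function name `tempFileCutoffMs`, and the `null` return convention are assumptions made for the sketch.

```typescript
// Placeholder value; the real default is not shown in this thread.
const DEFAULT_TEMP_RETENTION_HOURS = 24;

// `??` only replaces null and undefined, so a configured 0 passes straight
// through. Without the `<= 0` guard the cutoff would equal `nowMs`, and
// every temp file older than "now" would qualify for deletion.
function tempFileCutoffMs(retentionHours: number | undefined, nowMs: number): number | null {
  const hours = retentionHours ?? DEFAULT_TEMP_RETENTION_HOURS;
  if (hours <= 0) {
    // Treated as "cleanup disabled"; the commit also logs a warning here.
    return null;
  }
  return nowMs - hours * 60 * 60 * 1000;
}
```

Callers would skip the category when the result is `null` rather than deleting anything.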

P1-M — Trash pagination index narrowing + threadMetadata index
(round-2 V9). `fetchTrashSubpage`'s threadMetadata branch was the
worst-affected (`by_organizationId` index walks active rows first,
saturates the take budget, never reaches trashed/expired tail in
any org with >250 active threads). Added the missing composite
`by_organizationId_and_status` index to threadMetadata schema and
updated the thread branch to take two equality slices ('trashed' +
'expired'). The other 12 trashable categories already have the
composite `by_organizationId_and_lifecycleStatus` index — a future
follow-up extends the same two-slice pattern there.

P1-N — Retention drawer key stability (round-2 V9). The drawer
keyed `RetentionEditFormBody` on `JSON.stringify(savedConfig)`,
which meant ANY reactive Convex update to the saved config would
re-mount the form mid-edit and silently throw away the user's
pending input. Switched to `open`-gated rendering only; the form
unmounts on close and the next open binds a new instance — same
"fresh-snapshot per open" semantics, no parallel-save races.

P1-Z — RAG dbmate password leak (round-2 V9). On a dbmate failure,
the entrypoint streamed the full log to stderr; that log includes
`postgres://user:password@host` from the connection URL, which
ends up in container/host journalctl history. Added a sed redaction
pass that masks the password segment before the cat-to-stderr.
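
The commit uses a sed pass in the entrypoint; the TypeScript function below is only an analogous sketch of the same masking idea, not the shipped script. The name `redactDbUrlPasswords` and the `***` mask are assumptions, and the pattern does not handle passwords that contain an unencoded `/` or `@`.

```typescript
// Masks the password in any scheme://user:password@host URL found in a log.
// Group 1 keeps "scheme://user:", the password is replaced, group 2 keeps "@".
function redactDbUrlPasswords(log: string): string {
  return log.replace(/(\b[a-z][a-z0-9+.-]*:\/\/[^\s:\/@]+:)[^\s@\/]+(@)/gi, "$1***$2");
}
```

A URL with a port but no credentials, such as `http://host:8080/path`, is left unchanged because no `@` follows the port.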

Verified: typecheck clean; 615 tests pass across affected dirs.
larryro added a commit that referenced this pull request May 9, 2026
larryro added a commit that referenced this pull request May 9, 2026
larryro added a commit that referenced this pull request May 9, 2026
…eout

Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs +
/redoc unauthenticated, RAG container ran as root, default token baked
into image ENV, strict-mode env name diverged across the wire,
non-constant-time token compare, plus three SSRF-guard gaps.

services/rag/app/auth.py
- W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes
  the dead-code EXEMPT_PATHS frozenset.

services/rag/app/routers/health.py
- W7 #1: split into public_router (`/`, `/health`) and protected_router
  (`/config`). main.py mounts the protected one under
  Depends(verify_internal_token). Old `router` re-export stays for
  backwards compat.

services/rag/app/main.py
- W7 #2: docs_url / redoc_url / openapi_url are None outside debug.
- W7 #4: CORS allow_credentials flipped to False (bearer rides
  Authorization, never cookies).
- W7 #1 wiring: mount health-public + health-protected separately.

services/rag/app/config.py
- W7 #8: require_custom_internal_token accepts BOTH
  RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN
  via pydantic AliasChoices.

services/rag/Dockerfile + services/convex/Dockerfile
- W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`).
  RAG ingests untrusted PDFs/DOCX through native parsers; biggest
  blast radius in the stack, now hardened.
- W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from
  both runtime + scratch-squash stages and the matching bake in
  services/convex/Dockerfile. Operators MUST supply via env / compose
  / k8s secret.

services/platform/convex/lib/helpers/rag_config.ts
- W7 #9 F1: `redirect: 'manual'` on every ragFetch.
- W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2).
- W7 #9 F3: strip trailing `.` before hostname blocklist lookup.
- W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding +
  env rotation mitigation).
- W7 #9 F9: deleted path.startsWith('http') override branch (future-
  bypass foot-gun).
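
A minimal sketch of the F2 and F3 checks, for illustration only. The helper names and the contents of `BLOCKED_HOSTNAMES` are assumptions; the actual blocklist in `rag_config.ts` is not shown in this commit message.

```typescript
// Hypothetical hostname blocklist; the real entries are not shown in the commit.
const BLOCKED_HOSTNAMES = new Set<string>(["localhost"]);

/** Lower-case the hostname and strip trailing dots ("localhost." -> "localhost"). */
function normalizeHostname(hostname: string): string {
  return hostname.toLowerCase().replace(/\.+$/, "");
}

/** True when an IPv6 literal falls inside fc00::/7 (top seven bits are 1111110). */
function isIpv6UniqueLocal(host: string): boolean {
  // URL.hostname keeps the square brackets around IPv6 literals.
  const bare = host.replace(/^\[|\]$/g, "").toLowerCase();
  if (!bare.includes(":")) return false; // not an IPv6 literal
  const firstGroup = bare.split(":")[0] ?? "";
  if (!/^[0-9a-f]{1,4}$/.test(firstGroup)) return false;
  return (parseInt(firstGroup, 16) & 0xfe00) === 0xfc00;
}

/** Combined check: normalise first, then consult both blocklists. */
function isBlockedTarget(rawUrl: string): boolean {
  const host = normalizeHostname(new URL(rawUrl).hostname);
  return BLOCKED_HOSTNAMES.has(host) || isIpv6UniqueLocal(host);
}
```

Normalising before the lookup is what closes the trailing-dot gap: `localhost.` and `localhost` resolve to the same host, so they have to hit the same blocklist entry.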

services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts
- W7 #10: pass timeoutMs=60_000 (default 10s was a regression).
- Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to
  defend against an adversarial RAG response.
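
The two loop guards can be sketched as below. This is an illustration under assumptions, not the code in `fetch_document_chunks.ts`: the `ChunkPage` shape, `FetchPage` callback, and `fetchAllChunks` name are hypothetical, and only `MAX_ITERATIONS = 30` comes from the commit message.

```typescript
// Hypothetical page shape; the real RAG response type is not shown in the commit.
type ChunkPage = { chunks: string[]; nextCursor: string | null };
type FetchPage = (cursor: string | null) => Promise<ChunkPage>;

const MAX_ITERATIONS = 30; // cap named in the commit message

async function fetchAllChunks(fetchPage: FetchPage): Promise<string[]> {
  const all: string[] = [];
  let cursor: string | null = null;

  // Hard upper bound: a server that always reports "more" cannot loop forever.
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const page: ChunkPage = await fetchPage(cursor);
    all.push(...page.chunks);

    if (page.nextCursor === null) break; // normal end of results
    if (page.nextCursor === cursor) break; // cursor did not advance
    cursor = page.nextCursor;
  }
  return all;
}
```

The iteration cap bounds the total work, while the cursor comparison stops early on a response that keeps returning the same page.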
larryro added a commit that referenced this pull request May 9, 2026
P1-A/B — drop dead `messageMetadata` + `workflowTriggerLog` registry
entries (round-2 V2). The two soft-delete registry rows had latent
correctness problems:
  - `messageMetadata` had no `organizationId`/`lifecycleStatus`/
    `statusChangedAt` columns at the schema layer, so the generic
    Trash flow short-circuited at `not_trashed` — but a future
    column-add would silently activate cross-org restores.
  - `workflowTriggerLog` mapped to `tableName: 'wfExecutions'`
    instead of `wfTriggerLogs`. Restore-by-id-as-trigger-log would
    flip an execution row and emit a misleading audit subtype.
Both already had bespoke retention paths
(`deleteExpiredMessageMetadata` / `deleteExpiredWorkflowTriggerLog`)
that didn't go through the generic Trash flow. Dropped from
`SOFT_DELETE_RESOURCE_TYPES`, the registry config, and the
queries.ts switch. Trash-page alias collapses to the full list.

P1-I — `cleanupTempFiles` mass-delete on `userTempRetentionHours = 0`
(round-1 #3). The Zod schema permits `0` as a legitimate config
value but the cleanup used `?? DEFAULT_TEMP_RETENTION_HOURS` which
doesn't catch `0`; the resulting `cutoffMs = Date.now()` would
delete every temp file in the org. Mirrors the `<= 0` guard the
day-based cleanups already use; treats 0 / negative as
feature-gated-off rather than delete-everything-now, with a loud
console.warn so operators see the misconfig.
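
A minimal sketch of the guard, assuming a helper named `resolveTempCutoffMs` and a default of 24 hours (both hypothetical; the commit names only `DEFAULT_TEMP_RETENTION_HOURS` and the `<= 0` check). The point is that `??` substitutes the default only for `null` / `undefined`, so a configured `0` passes straight through unless it is rejected explicitly.

```typescript
const DEFAULT_TEMP_RETENTION_HOURS = 24; // assumed value, not taken from the commit

/**
 * Returns the deletion cutoff timestamp in ms, or null when cleanup
 * should be skipped for this org.
 */
function resolveTempCutoffMs(
  userTempRetentionHours: number | undefined,
  nowMs: number,
): number | null {
  // `??` only replaces null/undefined; a configured 0 survives this line.
  const hours = userTempRetentionHours ?? DEFAULT_TEMP_RETENTION_HOURS;

  // Treat 0 / negative as "feature off" instead of "delete everything now".
  if (hours <= 0) {
    console.warn(`temp-file cleanup skipped: userTempRetentionHours=${hours}`);
    return null;
  }
  return nowMs - hours * 60 * 60 * 1000;
}
```

Without the guard, `hours = 0` would yield a cutoff equal to `nowMs`, which matches every existing temp file.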

P1-M — Trash pagination index narrowing + threadMetadata index
(round-2 V9). `fetchTrashSubpage`'s threadMetadata branch was the
worst-affected (`by_organizationId` index walks active rows first,
saturates the take budget, never reaches trashed/expired tail in
any org with >250 active threads). Added the missing composite
`by_organizationId_and_status` index to threadMetadata schema and
updated the thread branch to take two equality slices ('trashed' +
'expired'). The other 12 trashable categories already have the
composite `by_organizationId_and_lifecycleStatus` index — a future
follow-up extends the same two-slice pattern there.

P1-N — Retention drawer key stability (round-2 V9). The drawer
keyed `RetentionEditFormBody` on `JSON.stringify(savedConfig)`,
which meant ANY reactive Convex update to the saved config would
re-mount the form mid-edit and silently throw away the user's
pending input. Switched to `open`-gated rendering only; the form
unmounts on close and the next open binds a new instance — same
"fresh-snapshot per open" semantics, no parallel-save races.

P1-Z — RAG dbmate password leak (round-2 V9). On a dbmate failure,
the entrypoint streamed the full log to stderr; that log includes
`postgres://user:password@host` from the connection URL, which
ends up in container/host journalctl history. Added a sed redaction
pass that masks the password segment before the cat-to-stderr.
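
The commit does this with `sed` in the shell entrypoint; the exact expression is not shown. As a rough illustration of the same masking idea, here is an equivalent written as a TypeScript regex. The function name and the `***` placeholder are assumptions.

```typescript
/**
 * Masks the password segment of any scheme://user:password@host URL
 * found in a log string. URLs without credentials are left untouched.
 */
function redactConnectionPasswords(log: string): string {
  // $1 captures "scheme://user"; the password runs up to the next "@".
  return log.replace(
    /(\b[a-z][a-z0-9+.-]*:\/\/[^\s:/@]+):[^\s@]+@/gi,
    "$1:***@",
  );
}
```

For example, `postgres://user:password@host:5432/db` becomes `postgres://user:***@host:5432/db`, while `postgres://host:5432/db` is returned unchanged because no `@` follows the port.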

Verified: typecheck clean; 615 tests pass across affected dirs.
larryro added a commit that referenced this pull request May 17, 2026
Closes #3, #19, #20, #21, #22, #23, #24, #25, #26, #29, #39 — frontend
audio UX + resolver tests.

- `message-bubble.tsx` renders a single stable `<VoiceOutputIndicator>`
  per assistant message instead of three separate mounts (inline-
  streaming + two toolbar copies). The previous shape unmounted the
  inline indicator at streaming-end → triggered `stop()` → mounted a
  fresh toolbar indicator with a `mountTimeRef` captured AFTER all
  chunks were created → auto-play short-circuited and the user heard
  silence at the stream-end boundary. The single mount keeps
  `mountTimeRef` stable across both phases. (#3)
- `use-voice-output.ts` tracks every retry `setTimeout` id in a `Set`
  ref and clears them on unmount + on message change. The prior code
  let the 1.5s backoff timer fire after unmount and re-invoke
  `synthesize` against a dead component. (#19)
- `use-voice-output.ts` caps the synthesis queue at
  `MAX_TTS_QUEUE_DEPTH = 50`. When full, drops the new task and
  surfaces `QUEUE_OVERFLOW` via the error sink so the user sees why
  playback paused. `MAX_IN_FLIGHT` previously throttled concurrent
  dispatch but did not bound queue depth. (#20)
- `use-voice-output.ts` catch branch now falls back to
  `'UNKNOWN_NETWORK'` when `extractConvexErrorCode` returns undefined
  (network drop, action timeout). Previously the only signal was
  `console.error`; the indicator stayed stuck with no actionable
  message. (#21)
- `use-voice-output-player.ts` re-calls `primeAudio(el)` at the start
  of every `play()` invocation and drops the `el.load()` in `stop()`.
  Together these stop iOS Safari from expiring the user-activation
  token between messages of a session. (#22)
- `voice-output-context.tsx` + `prime-audio.ts`: per-provider audio
  element ownership. Each `<VoiceOutputProvider>` constructs its own
  `<audio>` via `useMemo` and exposes it via `useVoiceAudioElement()`.
  The prior module-level singleton meant arena split-view's two
  providers stomped each other's `src` mid-playback. `primeAudio(el?)`
  now takes the element to pre-warm; callers without a provider scope
  (settings page) call it with `undefined` and only the AudioContext
  is banked. (#23)
- `voice-output-indicator.tsx` classifies error codes into
  `retryable | config | terminal`. Config codes (NO_PROVIDER,
  HOST_POLICY, forbidden) render a `<Link>` to Settings → AI
  providers; terminal codes (BUDGET_EXCEEDED, QUEUE_OVERFLOW, char-
  cap) render a non-interactive `<Badge>`. Only retryable codes keep
  the click-to-retry button. Stops the tap→fail→tap→fail loop on
  unrecoverable errors. (#24)
- `voice-output-announcer.tsx` now reads `{ state, errorCode }` from
  the announcer store and speaks the per-code reason on transitions
  into `'error'` (e.g. "Voice provider not configured"). Screen-
  reader users on touch devices — where the indicator's per-code
  tooltip is unreachable — now hear the actionable reason instead of
  the generic "Voice output failed". (#25)
- `personalization-settings.tsx` composes the `providerUnavailable`
  hint into the Switch's `description` prop (a ReactNode) when
  `providerAvailable === false`. The hint now lands in the same
  `aria-describedby` block as the base description, so SR focus on
  the Switch reads it. The duplicate sibling `<Text>` is removed. (#26)
- `voice-output-announcer.tsx` drains announcements through a small
  queue with a 1500ms hold per entry. Rapid transitions
  (playing → blocked → error in <1500ms) no longer clobber the
  previous text mid-utterance; each entry plays in order. (#39)
- `resolve_tts_model.test.ts` adds the missing call-contract assertions
  (tag=text-to-speech, orgSlug propagation, providerName propagation
  on a pinned-provider call) and three failure-path tests that pin
  the resolver's re-throw behaviour for UNKNOWN_MODEL,
  UNKNOWN_PROVIDER, and plain rejections. Without these, a regression
  that hard-coded `tag: 'chat'` or dropped `orgSlug` would have passed
  every prior test silently. (#29)
- i18n: `voiceOutputErrorConfig`, `voiceOutputErrorOpenSettings`,
  `voiceOutputErrorQueueOverflow`, `voiceOutputErrorNetwork` added to
  en/de/fr. The pre-existing orphan `voiceOutputErrorProvider` is
  removed (superseded by `voiceOutputErrorConfig`).
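
Two of the items above, the queue cap (#20) and the error classification (#24), can be sketched together. This is an illustration under assumptions rather than the shipped code: the class and function names are hypothetical, only the codes spelled out in the commit are listed (the exact codes behind "forbidden" and "char-cap" are not given), and treating every unlisted code as retryable is a guess.

```typescript
const MAX_TTS_QUEUE_DEPTH = 50; // value from the commit message

/** FIFO queue that rejects new work once it is full. */
class BoundedTaskQueue<T> {
  private readonly items: T[] = [];
  private readonly maxDepth: number;
  private readonly onError: (code: string) => void;

  constructor(maxDepth: number, onError: (code: string) => void) {
    this.maxDepth = maxDepth;
    this.onError = onError;
  }

  /** Returns false and reports QUEUE_OVERFLOW when the new task is dropped. */
  enqueue(item: T): boolean {
    if (this.items.length >= this.maxDepth) {
      this.onError("QUEUE_OVERFLOW");
      return false;
    }
    this.items.push(item);
    return true;
  }

  dequeue(): T | undefined {
    return this.items.shift();
  }
}

type ErrorClass = "retryable" | "config" | "terminal";

// Only codes named verbatim in the commit message are listed here.
const CONFIG_CODES = new Set<string>(["NO_PROVIDER", "HOST_POLICY"]);
const TERMINAL_CODES = new Set<string>(["BUDGET_EXCEEDED", "QUEUE_OVERFLOW"]);

function classifyErrorCode(code: string): ErrorClass {
  if (CONFIG_CODES.has(code)) return "config"; // link to settings
  if (TERMINAL_CODES.has(code)) return "terminal"; // non-interactive badge
  return "retryable"; // assumption: everything else keeps click-to-retry
}
```

Dropping the newest task rather than the oldest keeps already-queued audio in order, and routing the overflow through the same error sink means the indicator can classify it as terminal instead of offering a retry that would fail again.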
larryro added a commit that referenced this pull request May 20, 2026
)

Mocks _generated/server.internalMutation so the real handler is callable
with a fabricated ctx (matches the file_metadata/internal_mutations.test.ts
pattern). Covers:

- Empty in-flight → row inserted with status='queued', lifecycleStatus='active'.
- Cap reached (4 running) → throws ConvexError (atomic concurrency cap,
  closes the TOCTOU race R1.8/R1.10 flagged).
- Daily CPU budget pre-debit overflow (4 × 500s prior + 30s requested >
  1800s cap) → throws — pre-debit semantics verified, closes R1.10's
  post-debit overshoot.
- recoverStuckSandboxes — only the row whose heartbeatAt is older than
  2×max-timeout gets flipped to failed/SPAWNER_UNAVAILABLE.
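
The cap and pre-debit cases above can be sketched as a pure function. This is a simplified illustration, not the real mutation: the `Run` shape and `reserveSlot` name are hypothetical, plain `Error` stands in for `ConvexError`, the `CONCURRENCY_CAP` code is made up, and counting queued rows as in-flight is an assumption. The 4-run cap and 1800s budget are taken from the test descriptions.

```typescript
const MAX_CONCURRENT_RUNS = 4; // "Cap reached (4 running)"
const DAILY_CPU_BUDGET_SECONDS = 1800; // cap from the pre-debit test

// Hypothetical row shape for illustration.
type Run = { status: "queued" | "running" | "done"; cpuSeconds: number };

function reserveSlot(existing: Run[], requestedCpuSeconds: number): Run {
  // Concurrency check and insert happen in one step, so two callers
  // cannot both observe a free slot (the TOCTOU race).
  const inFlight = existing.filter((r) => r.status !== "done").length;
  if (inFlight >= MAX_CONCURRENT_RUNS) throw new Error("CONCURRENCY_CAP");

  // Pre-debit: charge the requested time before the run starts,
  // so the budget cannot be overshot after the fact.
  const spent = existing.reduce((sum, r) => sum + r.cpuSeconds, 0);
  if (spent + requestedCpuSeconds > DAILY_CPU_BUDGET_SECONDS) {
    throw new Error("QUOTA_EXCEEDED");
  }
  return { status: "queued", cpuSeconds: requestedCpuSeconds };
}
```

With four prior runs of 500s each, a 30s request gives 2000 + 30 = 2030s, which exceeds 1800s and is rejected before any work is scheduled.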

All 4 tests pass via vitest. Combined with the 9-test argv builder gate
shipped in M1, that's two of R1.22's five critical regression gates.
The remaining three (in-container privilege assertion, fileMetadata IDOR
via inputFiles, cancellation propagation) require either a running
docker daemon (privilege) or a Convex test harness (IDOR / cancellation);
both are integration-test scope and best added when wiring up CI for the
sandbox stack.
larryro added a commit that referenced this pull request May 21, 2026
… heartbeat

Three coupled fixes to the Convex side of the sandbox state machine that
together close the failure modes round-2 verification confirmed:

R2-B7 #1: `codeStorageId` was stored before `reserveSlotAndInsert` but
the rollback set was constructed AFTER reservation. A QUOTA_EXCEEDED
throw orphaned one `_storage` blob per rejected run. Catch the reserve
error and `ctx.storage.delete()` the blob before rethrowing.
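
A minimal sketch of that rollback shape, assuming a simplified `BlobStorage` interface and a `submitRun` wrapper (both hypothetical stand-ins for the Convex storage API and the real action). Only the order of operations is the point: store, try to reserve, delete on failure, rethrow.

```typescript
// Hypothetical stand-in for the storage surface used by the action.
interface BlobStorage {
  store(data: string): Promise<string>;
  delete(id: string): Promise<void>;
}

async function submitRun(
  storage: BlobStorage,
  code: string,
  reserve: (codeStorageId: string) => Promise<void>,
): Promise<string> {
  // The blob exists before the reservation is attempted.
  const codeStorageId = await storage.store(code);
  try {
    await reserve(codeStorageId);
  } catch (err) {
    // Reservation failed (e.g. QUOTA_EXCEEDED): remove the blob so it
    // is not orphaned, then surface the original error unchanged.
    await storage.delete(codeStorageId);
    throw err;
  }
  return codeStorageId;
}
```

Rethrowing the original error keeps the caller-visible failure identical to the pre-fix behaviour; only the leaked blob goes away.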

R2-B7 #2: the 90-day audit GC dropped audit rows without touching their
code/stdout/stderr storage blobs. Inline-delete those three blob types
before the row delete (mutation contexts CAN call `ctx.storage.delete`,
per `workflows/executions/delete_storage_blob.ts:20`). Watchdog reaps
the same way so a stuck row doesn't sit on its blobs for 90 days.
Output-file blobs are still owned by `fileMetadata` and not touched
here.

R2-B6 #1/#2/#3: `recoverStuckSandboxes` now caps each per-status scan
at 200 rows so the mutation can't blow its doc-read budget mid-sweep
(cron re-runs every 5 min and picks up the trailing rows). The
heartbeat `setInterval` callback wraps the mutation call in
try/catch+console.warn so a stalled heartbeat is visible rather than
silently aging into a watchdog reap. Explicit `await tickHeartbeat()`
between each `ctx.storage.store` keeps `heartbeatAt` fresh during
multi-MB upload tails. Watchdog cutoff is now `max_timeout + 600s` so
those upload tails fit inside the budget by construction.
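
The scan cap and the new cutoff can be sketched as below. This is an in-memory illustration under assumptions: the `SandboxRow` shape and `selectStuckRows` name are hypothetical, and the real mutation reads from Convex rather than an array. The 200-row cap and the 600s grace come from the commit message.

```typescript
const SCAN_LIMIT = 200; // per-status scan cap from the commit
const WATCHDOG_GRACE_MS = 600_000; // the "+ 600s" added to max_timeout

// Hypothetical row shape for illustration.
type SandboxRow = { id: string; heartbeatAt: number };

function selectStuckRows(
  rows: SandboxRow[],
  nowMs: number,
  maxTimeoutMs: number,
): SandboxRow[] {
  // A row is stuck only if its last heartbeat predates the cutoff.
  const cutoffMs = nowMs - (maxTimeoutMs + WATCHDOG_GRACE_MS);
  // Bound the read first; rows past the limit wait for the next cron run.
  return rows.slice(0, SCAN_LIMIT).filter((row) => row.heartbeatAt < cutoffMs);
}
```

Capping before filtering keeps the per-invocation read count fixed regardless of how many rows are stuck, which is what lets the 5-minute cron drain a backlog incrementally.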