docs: remove outdated documentation files#10
Merged
Merged
Conversation
Remove 33 documentation files that are no longer needed: - Admin setup and deployment guides - Workflow-related documentation (architecture, patterns, types) - Integration guides (OneDrive, email providers, Shopify/Circuly) - API documentation - Internal design documents These docs have become stale and are being consolidated or deprecated.
Contributor
📝 WalkthroughWalkthroughThis pull request removes 33 documentation files spanning multiple areas of the repository. Deletions include comprehensive guides for workflow architecture and patterns, email and OneDrive integration documentation, API references, deployment and setup guides, and design documentation for features like variables replacement and row-level security. No functional code is modified—only documentation is deleted. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Areas requiring attention:
Possibly related PRs
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro (Legacy) 📒 Files selected for processing (33)
💤 Files with no reviewable changes (33)
Comment |
yannickmonney
pushed a commit
that referenced
this pull request
Apr 8, 2026
larryro
added a commit
that referenced
this pull request
May 7, 2026
…OCTOU Adds assertSafeRetentionDelete to internal_mutations_retention.ts and threads the new cutoffMs through every retention dispatcher call site. The guard runs in-mutation (V8 transaction) and closes three round-2 findings simultaneously: - W1 #4 — snapshot-race vs legal hold. The dispatcher's loadActiveHoldsForOrg snapshot is up to 25 min stale; a hold placed AFTER snapshot but BEFORE the per-row mutation otherwise had zero protection. Re-reading inside each mutation makes the in-mutation read the only authoritative gate. - W6 #10 — cross-org corruption. Every deleteExpired* trusted args.organizationId blindly; a swapped id silently deleted org A's row while logging the audit under org B (and forking the per-org hash chain). Now we assert row.organizationId === args.organizationId. - W4 #13 — cutoff TOCTOU. Between the dispatcher's listExpired query and the per-row delete, a user can re-touch the row (chat thread.updatedAt bump, document patch). The mutation now re-evaluates cutoff against (updatedAt ?? _creationTime) and skips when the row is no longer eligible. Mutations updated (each accepts new optional cutoffMs): - deleteExpiredDocument (targetType: 'document') - deleteExpiredThread (targetType: 'thread') - deleteExpiredWorkflowExecution (targetType: 'execution') - deleteExpiredWorkflowTriggerLog (orgId + cutoff only) - deleteExpiredCustomer / Vendor / ExternalConversation / PromptTemplate / MessageFeedback / MemoryAuditRow / ChatFilterEvent / UsageLedgerRow (orgId + cutoff) Out-of-scope here: - deleteExpiredTempFile, deleteExpiredMessageMetadata, deleteExpiredTwoFactorAttempt, deleteExpiredLoginAttempt, deleteExpiredLoginBlockCounter — none of these carry organizationId on their args today (they're either cross-org-by-shape or have an indirect link). The orgHeld / cross-org checks are not applicable without an org backreference, which is itself a separate phase-10 follow-up. Their existing absence-of-row idempotence is preserved.
larryro
added a commit
that referenced
this pull request
May 7, 2026
Three round-2-confirmed gaps in the dual-control flow. placeLegalHold (W1 #11) - targetType: 'org' is the nuclear "halt all retention" hold; a single admin placing it freezes cleanup org-wide, defeating dual-control. Refuse by default; deployments running a single admin opt back in via the existing TALE_LEGAL_HOLD_SINGLE_ADMIN_OK env. A future maker- checker placement flow (legalHoldPlaceRequests + approveLegalHoldPlacement) supersedes this; today's commit restores the contract. approveLegalHoldRelease (W1 #12) - 5-min minimum delay between request and approval (constant RELEASE_APPROVAL_MIN_DELAY_MS). Defeats the chained-call attack where one admin requests + a second admin instantly approves via the same automation flow. Skipped when self-approval is in effect (the escape hatch already opted out of dual-control entirely). - Re-check that the original requester is still an org admin at approval time. A demoted/removed requester whose request lingered loses the ability to retroactively gate a destructive change. closeLegalMatter (W1 #10 + W4 #6) - Implements the docstring promise (round-2 v08 bonus): closing a matter now fans out PENDING release requests for every active hold linked via matterRef, leveraging the new by_organizationId_matterRef index. Approval still requires a second admin via approveLegalHoldRelease — matter-close does NOT auto-release, dual-control survives. - Returns releaseRequestsFiled count so the UI can confirm the cascade. Audit row records linkedHolds + releaseRequestsFiled. - Skips holds with an existing pending or approved request (idempotent against re-running close on a half-closed matter).
larryro
added a commit
that referenced
this pull request
May 7, 2026
…eout Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs + /redoc unauthenticated, RAG container ran as root, default token baked into image ENV, strict-mode env name diverged across the wire, non-constant-time token compare, plus three SSRF-guard gaps. services/rag/app/auth.py - W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes the dead-code EXEMPT_PATHS frozenset. services/rag/app/routers/health.py - W7 #1: split into public_router (`/`, `/health`) and protected_router (`/config`). main.py mounts the protected one under Depends(verify_internal_token). Old `router` re-export stays for backwards compat. services/rag/app/main.py - W7 #2: docs_url / redoc_url / openapi_url are None outside debug. - W7 #4: CORS allow_credentials flipped to False (bearer rides Authorization, never cookies). - W7 #1 wiring: mount health-public + health-protected separately. services/rag/app/config.py - W7 #8: require_custom_internal_token accepts BOTH RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN via pydantic AliasChoices. services/rag/Dockerfile + services/convex/Dockerfile - W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`). RAG ingests untrusted PDFs/DOCX through native parsers; biggest blast radius in the stack, now hardened. - W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from both runtime + scratch-squash stages and the matching bake in services/convex/Dockerfile. Operators MUST supply via env / compose / k8s secret. services/platform/convex/lib/helpers/rag_config.ts - W7 #9 F1: `redirect: 'manual'` on every ragFetch. - W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2). - W7 #9 F3: strip trailing `.` before hostname blocklist lookup. - W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding + env rotation mitigation). - W7 #9 F9: deleted path.startsWith('http') override branch (future- bypass foot-gun). services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts - W7 #10: pass timeoutMs=60_000 (default 10s was a regression). - Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to defend against an adversarial RAG response.
larryro
added a commit
that referenced
this pull request
May 7, 2026
Round-2 W8 batch (six independent UI / docs items, all small). i18n keys (W8 #1) - Add governance.retentionPolicy.group.deletionBehavior.{title,description} to en/de/fr — previously missing in all three locales, retention editor fell back to the inline English string in every locale. - Add governance.retentionPolicy.chatFilterEvents.{title,description, placeholder,helper} to en/de/fr — previously the row title rendered the literal id "chatFilterEvents". useTranslation namespace (W8 #2) - use-data-classification-notice.ts now binds useTranslation('dataNotice') so the dataNotice.default fallback string is reachable; the prior default-namespace bind meant the DE/FR fallbacks were unreachable. UI delete copy (W8 #7) - en/de/fr.json deletePermanentMessage now matches the actual user-visible behavior (Trash + grace + retention) instead of saying "permanently deleted" while the code soft-trashes. Stale test (W8 #10) - delete_chat_thread.test.ts was asserting the legacy `{ status: 'deleted' }` patch shape; the helper now writes `{ status: 'trashed', statusChangedAt: <ms> }` in user-trash mode. Docs cron + maker-checker (W8 #8) - docs/{en,de,fr}/self-hosted/configuration/retention.md: cron 03:00 UTC → 04:00 UTC (matches crons.ts), plus a sibling note about the new 01:00 UTC effectReleasesOnly cron. - en docs: replace the non-existent `releaseLegalHold` reference with the correct dual-control flow names plus the new RELEASE_APPROVAL_MIN_DELAY_MS, ORG_HOLD_REQUIRES_DUAL_CONTROL, and closeLegalMatter fan-out behavior. - retention_cleanup.ts stale "03:00 UTC" comment fixed.
larryro
added a commit
that referenced
this pull request
May 8, 2026
…OCTOU Adds assertSafeRetentionDelete to internal_mutations_retention.ts and threads the new cutoffMs through every retention dispatcher call site. The guard runs in-mutation (V8 transaction) and closes three round-2 findings simultaneously: - W1 #4 — snapshot-race vs legal hold. The dispatcher's loadActiveHoldsForOrg snapshot is up to 25 min stale; a hold placed AFTER snapshot but BEFORE the per-row mutation otherwise had zero protection. Re-reading inside each mutation makes the in-mutation read the only authoritative gate. - W6 #10 — cross-org corruption. Every deleteExpired* trusted args.organizationId blindly; a swapped id silently deleted org A's row while logging the audit under org B (and forking the per-org hash chain). Now we assert row.organizationId === args.organizationId. - W4 #13 — cutoff TOCTOU. Between the dispatcher's listExpired query and the per-row delete, a user can re-touch the row (chat thread.updatedAt bump, document patch). The mutation now re-evaluates cutoff against (updatedAt ?? _creationTime) and skips when the row is no longer eligible. Mutations updated (each accepts new optional cutoffMs): - deleteExpiredDocument (targetType: 'document') - deleteExpiredThread (targetType: 'thread') - deleteExpiredWorkflowExecution (targetType: 'execution') - deleteExpiredWorkflowTriggerLog (orgId + cutoff only) - deleteExpiredCustomer / Vendor / ExternalConversation / PromptTemplate / MessageFeedback / MemoryAuditRow / ChatFilterEvent / UsageLedgerRow (orgId + cutoff) Out-of-scope here: - deleteExpiredTempFile, deleteExpiredMessageMetadata, deleteExpiredTwoFactorAttempt, deleteExpiredLoginAttempt, deleteExpiredLoginBlockCounter — none of these carry organizationId on their args today (they're either cross-org-by-shape or have an indirect link). The orgHeld / cross-org checks are not applicable without an org backreference, which is itself a separate phase-10 follow-up. Their existing absence-of-row idempotence is preserved.
larryro
added a commit
that referenced
this pull request
May 8, 2026
Three round-2-confirmed gaps in the dual-control flow. placeLegalHold (W1 #11) - targetType: 'org' is the nuclear "halt all retention" hold; a single admin placing it freezes cleanup org-wide, defeating dual-control. Refuse by default; deployments running a single admin opt back in via the existing TALE_LEGAL_HOLD_SINGLE_ADMIN_OK env. A future maker- checker placement flow (legalHoldPlaceRequests + approveLegalHoldPlacement) supersedes this; today's commit restores the contract. approveLegalHoldRelease (W1 #12) - 5-min minimum delay between request and approval (constant RELEASE_APPROVAL_MIN_DELAY_MS). Defeats the chained-call attack where one admin requests + a second admin instantly approves via the same automation flow. Skipped when self-approval is in effect (the escape hatch already opted out of dual-control entirely). - Re-check that the original requester is still an org admin at approval time. A demoted/removed requester whose request lingered loses the ability to retroactively gate a destructive change. closeLegalMatter (W1 #10 + W4 #6) - Implements the docstring promise (round-2 v08 bonus): closing a matter now fans out PENDING release requests for every active hold linked via matterRef, leveraging the new by_organizationId_matterRef index. Approval still requires a second admin via approveLegalHoldRelease — matter-close does NOT auto-release, dual-control survives. - Returns releaseRequestsFiled count so the UI can confirm the cascade. Audit row records linkedHolds + releaseRequestsFiled. - Skips holds with an existing pending or approved request (idempotent against re-running close on a half-closed matter).
larryro
added a commit
that referenced
this pull request
May 8, 2026
…eout Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs + /redoc unauthenticated, RAG container ran as root, default token baked into image ENV, strict-mode env name diverged across the wire, non-constant-time token compare, plus three SSRF-guard gaps. services/rag/app/auth.py - W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes the dead-code EXEMPT_PATHS frozenset. services/rag/app/routers/health.py - W7 #1: split into public_router (`/`, `/health`) and protected_router (`/config`). main.py mounts the protected one under Depends(verify_internal_token). Old `router` re-export stays for backwards compat. services/rag/app/main.py - W7 #2: docs_url / redoc_url / openapi_url are None outside debug. - W7 #4: CORS allow_credentials flipped to False (bearer rides Authorization, never cookies). - W7 #1 wiring: mount health-public + health-protected separately. services/rag/app/config.py - W7 #8: require_custom_internal_token accepts BOTH RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN via pydantic AliasChoices. services/rag/Dockerfile + services/convex/Dockerfile - W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`). RAG ingests untrusted PDFs/DOCX through native parsers; biggest blast radius in the stack, now hardened. - W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from both runtime + scratch-squash stages and the matching bake in services/convex/Dockerfile. Operators MUST supply via env / compose / k8s secret. services/platform/convex/lib/helpers/rag_config.ts - W7 #9 F1: `redirect: 'manual'` on every ragFetch. - W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2). - W7 #9 F3: strip trailing `.` before hostname blocklist lookup. - W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding + env rotation mitigation). - W7 #9 F9: deleted path.startsWith('http') override branch (future- bypass foot-gun). services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts - W7 #10: pass timeoutMs=60_000 (default 10s was a regression). - Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to defend against an adversarial RAG response.
larryro
added a commit
that referenced
this pull request
May 8, 2026
Round-2 W8 batch (six independent UI / docs items, all small). i18n keys (W8 #1) - Add governance.retentionPolicy.group.deletionBehavior.{title,description} to en/de/fr — previously missing in all three locales, retention editor fell back to the inline English string in every locale. - Add governance.retentionPolicy.chatFilterEvents.{title,description, placeholder,helper} to en/de/fr — previously the row title rendered the literal id "chatFilterEvents". useTranslation namespace (W8 #2) - use-data-classification-notice.ts now binds useTranslation('dataNotice') so the dataNotice.default fallback string is reachable; the prior default-namespace bind meant the DE/FR fallbacks were unreachable. UI delete copy (W8 #7) - en/de/fr.json deletePermanentMessage now matches the actual user-visible behavior (Trash + grace + retention) instead of saying "permanently deleted" while the code soft-trashes. Stale test (W8 #10) - delete_chat_thread.test.ts was asserting the legacy `{ status: 'deleted' }` patch shape; the helper now writes `{ status: 'trashed', statusChangedAt: <ms> }` in user-trash mode. Docs cron + maker-checker (W8 #8) - docs/{en,de,fr}/self-hosted/configuration/retention.md: cron 03:00 UTC → 04:00 UTC (matches crons.ts), plus a sibling note about the new 01:00 UTC effectReleasesOnly cron. - en docs: replace the non-existent `releaseLegalHold` reference with the correct dual-control flow names plus the new RELEASE_APPROVAL_MIN_DELAY_MS, ORG_HOLD_REQUIRES_DUAL_CONTROL, and closeLegalMatter fan-out behavior. - retention_cleanup.ts stale "03:00 UTC" comment fixed.
larryro
added a commit
that referenced
this pull request
May 8, 2026
…OCTOU Adds assertSafeRetentionDelete to internal_mutations_retention.ts and threads the new cutoffMs through every retention dispatcher call site. The guard runs in-mutation (V8 transaction) and closes three round-2 findings simultaneously: - W1 #4 — snapshot-race vs legal hold. The dispatcher's loadActiveHoldsForOrg snapshot is up to 25 min stale; a hold placed AFTER snapshot but BEFORE the per-row mutation otherwise had zero protection. Re-reading inside each mutation makes the in-mutation read the only authoritative gate. - W6 #10 — cross-org corruption. Every deleteExpired* trusted args.organizationId blindly; a swapped id silently deleted org A's row while logging the audit under org B (and forking the per-org hash chain). Now we assert row.organizationId === args.organizationId. - W4 #13 — cutoff TOCTOU. Between the dispatcher's listExpired query and the per-row delete, a user can re-touch the row (chat thread.updatedAt bump, document patch). The mutation now re-evaluates cutoff against (updatedAt ?? _creationTime) and skips when the row is no longer eligible. Mutations updated (each accepts new optional cutoffMs): - deleteExpiredDocument (targetType: 'document') - deleteExpiredThread (targetType: 'thread') - deleteExpiredWorkflowExecution (targetType: 'execution') - deleteExpiredWorkflowTriggerLog (orgId + cutoff only) - deleteExpiredCustomer / Vendor / ExternalConversation / PromptTemplate / MessageFeedback / MemoryAuditRow / ChatFilterEvent / UsageLedgerRow (orgId + cutoff) Out-of-scope here: - deleteExpiredTempFile, deleteExpiredMessageMetadata, deleteExpiredTwoFactorAttempt, deleteExpiredLoginAttempt, deleteExpiredLoginBlockCounter — none of these carry organizationId on their args today (they're either cross-org-by-shape or have an indirect link). The orgHeld / cross-org checks are not applicable without an org backreference, which is itself a separate phase-10 follow-up. Their existing absence-of-row idempotence is preserved.
larryro
added a commit
that referenced
this pull request
May 8, 2026
Three round-2-confirmed gaps in the dual-control flow. placeLegalHold (W1 #11) - targetType: 'org' is the nuclear "halt all retention" hold; a single admin placing it freezes cleanup org-wide, defeating dual-control. Refuse by default; deployments running a single admin opt back in via the existing TALE_LEGAL_HOLD_SINGLE_ADMIN_OK env. A future maker- checker placement flow (legalHoldPlaceRequests + approveLegalHoldPlacement) supersedes this; today's commit restores the contract. approveLegalHoldRelease (W1 #12) - 5-min minimum delay between request and approval (constant RELEASE_APPROVAL_MIN_DELAY_MS). Defeats the chained-call attack where one admin requests + a second admin instantly approves via the same automation flow. Skipped when self-approval is in effect (the escape hatch already opted out of dual-control entirely). - Re-check that the original requester is still an org admin at approval time. A demoted/removed requester whose request lingered loses the ability to retroactively gate a destructive change. closeLegalMatter (W1 #10 + W4 #6) - Implements the docstring promise (round-2 v08 bonus): closing a matter now fans out PENDING release requests for every active hold linked via matterRef, leveraging the new by_organizationId_matterRef index. Approval still requires a second admin via approveLegalHoldRelease — matter-close does NOT auto-release, dual-control survives. - Returns releaseRequestsFiled count so the UI can confirm the cascade. Audit row records linkedHolds + releaseRequestsFiled. - Skips holds with an existing pending or approved request (idempotent against re-running close on a half-closed matter).
larryro
added a commit
that referenced
this pull request
May 8, 2026
…eout Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs + /redoc unauthenticated, RAG container ran as root, default token baked into image ENV, strict-mode env name diverged across the wire, non-constant-time token compare, plus three SSRF-guard gaps. services/rag/app/auth.py - W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes the dead-code EXEMPT_PATHS frozenset. services/rag/app/routers/health.py - W7 #1: split into public_router (`/`, `/health`) and protected_router (`/config`). main.py mounts the protected one under Depends(verify_internal_token). Old `router` re-export stays for backwards compat. services/rag/app/main.py - W7 #2: docs_url / redoc_url / openapi_url are None outside debug. - W7 #4: CORS allow_credentials flipped to False (bearer rides Authorization, never cookies). - W7 #1 wiring: mount health-public + health-protected separately. services/rag/app/config.py - W7 #8: require_custom_internal_token accepts BOTH RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN via pydantic AliasChoices. services/rag/Dockerfile + services/convex/Dockerfile - W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`). RAG ingests untrusted PDFs/DOCX through native parsers; biggest blast radius in the stack, now hardened. - W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from both runtime + scratch-squash stages and the matching bake in services/convex/Dockerfile. Operators MUST supply via env / compose / k8s secret. services/platform/convex/lib/helpers/rag_config.ts - W7 #9 F1: `redirect: 'manual'` on every ragFetch. - W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2). - W7 #9 F3: strip trailing `.` before hostname blocklist lookup. - W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding + env rotation mitigation). - W7 #9 F9: deleted path.startsWith('http') override branch (future- bypass foot-gun). services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts - W7 #10: pass timeoutMs=60_000 (default 10s was a regression). - Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to defend against an adversarial RAG response.
larryro
added a commit
that referenced
this pull request
May 8, 2026
Round-2 W8 batch (six independent UI / docs items, all small). i18n keys (W8 #1) - Add governance.retentionPolicy.group.deletionBehavior.{title,description} to en/de/fr — previously missing in all three locales, retention editor fell back to the inline English string in every locale. - Add governance.retentionPolicy.chatFilterEvents.{title,description, placeholder,helper} to en/de/fr — previously the row title rendered the literal id "chatFilterEvents". useTranslation namespace (W8 #2) - use-data-classification-notice.ts now binds useTranslation('dataNotice') so the dataNotice.default fallback string is reachable; the prior default-namespace bind meant the DE/FR fallbacks were unreachable. UI delete copy (W8 #7) - en/de/fr.json deletePermanentMessage now matches the actual user-visible behavior (Trash + grace + retention) instead of saying "permanently deleted" while the code soft-trashes. Stale test (W8 #10) - delete_chat_thread.test.ts was asserting the legacy `{ status: 'deleted' }` patch shape; the helper now writes `{ status: 'trashed', statusChangedAt: <ms> }` in user-trash mode. Docs cron + maker-checker (W8 #8) - docs/{en,de,fr}/self-hosted/configuration/retention.md: cron 03:00 UTC → 04:00 UTC (matches crons.ts), plus a sibling note about the new 01:00 UTC effectReleasesOnly cron. - en docs: replace the non-existent `releaseLegalHold` reference with the correct dual-control flow names plus the new RELEASE_APPROVAL_MIN_DELAY_MS, ORG_HOLD_REQUIRES_DUAL_CONTROL, and closeLegalMatter fan-out behavior. - retention_cleanup.ts stale "03:00 UTC" comment fixed.
larryro
added a commit
that referenced
this pull request
May 9, 2026
…stodian cascade Round-1 #10/#12/#13 + round-2 V3 confirmed three custodian-cascade gaps (P0-6 retention guard, P0-7 deleteDocumentById, P0-8 cleanupMessageMetadata + cleanupChatFilterEvents). The user's observation made the deeper truth clear: in production, only `org` and `userMembership` hold target types are operator-facing — `thread` / `document` / `execution` were modeled in the original fine-grained hold design but never wired into the place-hold UI (`PICKER_TARGET_TYPES = ['userMembership', 'org']`), and `usePlaceLegalHold` is called from a single dialog. The corresponding `holds.threadIds` / `documentIds` / `executionIds` Sets in `ActiveHolds` therefore stayed empty in every deployment, while ~80 LOC of dead per-row checks proliferated through retention, erasure, restore, folders, and the cascade helper. This commit collapses the model to its actual shape. Schema-level narrowing (legal_hold.ts): - HOLD_TARGET_TYPES = ['userMembership', 'org'] (write-side) - ActiveHolds = { orgHeld, userMembershipIds } - loadActiveHolds switch reduces to 2 cases; legacy thread/document/ execution rows in the table (if any) are intentionally ignored — they're surfaced in admin UI as `(legacy)` via the read-side validator in legal_hold_queries.ts. - resolveAndAssertTarget keeps only `org` + `userMembership` cases. - isHeld() simplified to org + custodian. Guard simplification (legal_hold_guard.ts): - assertNotHeld drops the per-row branches; `targetType`/`targetId` are now error-message context only. The user-custodian cascade via `authorUserId` is the sole remaining gate beyond org-wide. assertSafeRetentionDelete (internal_mutations_retention.ts): - Drops `targetType?` / `targetId?` args; adds `authorUserId?`. - Internal handler is now: cross-org check → TOCTOU check → org-wide hold → user-custodian cascade. Single, clear contract. P0-6 cascade wiring — every retention mutation passes the row's author user id (~15 call sites in internal_mutations_retention.ts): - documents → doc.createdBy - threads → thread.userId - workflowExecution → execution.userId - workflowTriggerLog → exec.userId via wfExecutionId join - messageFeedback → row.userId - memoryAudit → row.subjectUserId (with thread-fallback) - chatFilterEvent → parent thread's userId via index lookup - usageLedger → row.userId - promptTemplate → row.createdBy - messageMetadata → parent thread's userId (closes P0-8 directly) - tempFile, customer, vendor, externalConversation: org-only (no author concept on the row) P0-8 cascade fix — cleanupMessageMetadata + cleanupChatFilterEvents no longer rely on the now-empty `holds.threadIds` Set; both look up the parent thread's `userId` for the user-membership cascade (messageMetadata cleanup mutation was already P0-8 fixed in commit 1). Other dead-check removals: - retention_cleanup.ts: ~10 dead `holds.threadIds.has` / `documentIds.has` / `executionIds.has` checks removed; per-row custodian cascade either replaces them or the mutation re-checks. - governance/restore.ts: drops dead per-row branches; user-custodian cascade in restoreSoftDeletedRow (commit 1) covers the rest. - threads/cascade_helpers.ts: per-thread hold check replaced with a thread-metadata lookup → `userMembershipIds.has(thread.userId)`. - threads/delete_chat_thread.ts + restore_chat_thread.ts: drop `holds.threadIds.has(threadId)`; ownerHeld + orgHeld cover all refusal cases. - folders/mutations.ts assertNoHeldDescendantDocs: per-document hold check replaced with `userMembershipIds.has(doc.createdBy)`. - governance/erasure.ts requestErasure: drops heldThreadIds / heldDocumentIds collection — refusal is now `orgHeld || userCustodianHeld`. eraseSubjectDocuments drops the per-row held-document check (subject is by definition the only author in the iteration; user-custodian gate applied at request time). - legal_hold_internal.ts: loadActiveHoldsForOrg returns the narrowed ActiveHolds shape across the action/query boundary. Tests rewritten — `governance/__tests__/retention_legal_hold_gaps.test.ts`: - Drops thread/document/execution per-row hold cases (behaviour deleted on purpose). - Adds 6 cases covering the simplified contract: authorUserId cascade, TOCTOU + cross-org pre-checks, missing authorUserId grandfather, org-wide hold blanket refusal. - Source-grep regressions: every retention mutation passes the right authorUserId; cleanup actions pre-filter via the right `holds.userMembershipIds.has(...)`; per-row Sets stay deleted from the cleanup / erasure / guard sources. Note: this is a one-way migration. Any deployment with active legacy thread/document/execution holds loses enforcement of those holds going forward; the read-side validator in legal_hold_queries keeps showing them as `(legacy)`. Acceptable per project memory (demo stage, no prod use). Pre-prod check: convert to user/org holds first if any deployment has them. Verified: typecheck clean; 851 tests pass across affected dirs; full vitest run shows 5641/5642 pass (1 pre-existing flake in canvas-pane.test.tsx, unrelated).
larryro
added a commit
that referenced
this pull request
May 9, 2026
- auditLogChainGenesis: per-org sentinel patched on every createAuditLog to force concurrent first-writers to conflict on a real document. Closes round-2 review CRITICAL #10 (genesis fork: two concurrent initial audit writes both observe lastEntry=null and commit with previousHash='', forking the chain permanently). - activeLegalHoldClaims: per-target lock row for placeLegalHold to serialize concurrent placements via OCC. Closes CRITICAL #21 (TOCTOU duplicate-active-hold: two concurrent placeLegalHold both observe no active hold and both insert; row-level OCC does not detect range phantoms and Convex has no unique-constraint primitive). - threads.by_org_user index: enables GDPR requestErasure to enumerate a single user's threads via paged scheduler-driven continuation (Phase D.4.1) instead of .collect()-on-org which silently truncates past the 16K per-transaction read limit on large orgs. Schema-only; no migration needed (additive). All three close concrete spoliation/IDOR/integrity risks identified by the review.
larryro
added a commit
that referenced
this pull request
May 9, 2026
…heck - audit_hash: add lifecycleStatus + statusChangedAt to EXCLUDED_FIELDS so retention soft-delete (markRowExpiredGeneric) patching audit log rows doesn't poison the chain hash recompute. Pre-fix, ANY soft-deleted audit row caused verifyIntegrity to fail valid=false from that row forward. Round-2 review CRITICAL #8. - audit_logs/validators: declare lifecycleStatus + statusChangedAt on auditLogItemValidator so query-return validation accepts soft-deleted rows. Defense-in-depth alongside the EXCLUDED_FIELDS fix. - verify_integrity: anchor candidate filter accepts subtype === undefined (legacy retention checkpoints written before subtype field existed). Strict equality dropped them, breaking verifyIntegrity for any deployment that ran retention pre-upgrade. Match canonicalCheckpointPayload's `?? 'retention'` fallback. Round-2 review CRITICAL #9. - verify_integrity: add fromTimestamp arg for paged resumption + suppress isFirstEntry head-anchor when paging mid-chain. Pre-fix, the response promised "page from lastVerifiedTimestamp + 1" but the query had no such arg — large-org chains could not be paged. - verify_integrity: drop unsignedScrubSubjects Set (security-flavored dead code; unsignedScrubCount alone tracks the metric). The set was populated but never read; the actual gate is `!hasSigningKey`. Comment clarified. - verify_integrity: type entries as Doc<'auditLogs'>[] instead of an open `[key: string]: unknown` index signature. - audit_logs/internal_mutations: delete archiveOldLogs deprecated re-export — zero callers, AGENTS.md prohibits @deprecated tombstones. - audit_logs/helpers (createAuditLog): introduce buildAuditRecordHashInput as single source of truth for the canonical record payload — both writer and self-check call it, eliminating drift risk that schema additions could change the hash output across writes vs verify. - audit_logs/helpers (createAuditLog): genesis sentinel — read + patch the per-org auditLogChainGenesis row before each write. This forces OCC contention on a real document for the first audit write per org, closing the genesis-fork race where two concurrent first-writers both observe lastEntry=null and commit two roots with previousHash=''. Round-2 review CRITICAL #10. - audit_logs/helpers (createAuditLog): inline self-check on every write recomputes the prior row's integrity hash and console.errors on mismatch. Catches naive scenario-1 tampering (field changed, hash not updated) at the next legitimate audit write — the only automated tamper detection today. Wrapped in try/catch and skips piiScrubbed rows so it cannot affect the legitimate write path. Round-2 review C.5. Lint + typecheck clean. Convex codegen succeeded.
larryro
added a commit
that referenced
this pull request
May 9, 2026
- audit_hash.test: add three regression tests for round-2 CRITICAL #8. (1) lifecycleStatus + statusChangedAt are excluded from canonicalize (2) chainSuccessor + piiScrubbedAt are excluded (3) writer-vs-verifier hash equality after retention soft-delete patches the audit row. Pre-fix, soft-deleting any audit row poisoned the chain. - delete_document.test: replace the "404 returns error" assertion with the post-fix contract: 404 is treated as `success: true, deletedCount: 0` for idempotency on retention re-runs. The previous test asserted 400 to dodge the bug; real RAG returns 404 for not-found. - audit_logs/helpers.test: extend the mock to support the new `auditLogChainGenesis` query + insert (the genesis-fork sentinel from CRITICAL #10) and filter assertions to count only `auditLogs` inserts so the sentinel write doesn't break the count expectations. - delete_chat_thread.test: support `for await` over the agentWebhookUserThreads query — the cascade now iterates instead of collecting. Async iterator returns the pre-seeded mappings. All 3155 server tests pass.
larryro
added a commit
that referenced
this pull request
May 9, 2026
…OCTOU Adds assertSafeRetentionDelete to internal_mutations_retention.ts and threads the new cutoffMs through every retention dispatcher call site. The guard runs in-mutation (V8 transaction) and closes three round-2 findings simultaneously: - W1 #4 — snapshot-race vs legal hold. The dispatcher's loadActiveHoldsForOrg snapshot is up to 25 min stale; a hold placed AFTER snapshot but BEFORE the per-row mutation otherwise had zero protection. Re-reading inside each mutation makes the in-mutation read the only authoritative gate. - W6 #10 — cross-org corruption. Every deleteExpired* trusted args.organizationId blindly; a swapped id silently deleted org A's row while logging the audit under org B (and forking the per-org hash chain). Now we assert row.organizationId === args.organizationId. - W4 #13 — cutoff TOCTOU. Between the dispatcher's listExpired query and the per-row delete, a user can re-touch the row (chat thread.updatedAt bump, document patch). The mutation now re-evaluates cutoff against (updatedAt ?? _creationTime) and skips when the row is no longer eligible. Mutations updated (each accepts new optional cutoffMs): - deleteExpiredDocument (targetType: 'document') - deleteExpiredThread (targetType: 'thread') - deleteExpiredWorkflowExecution (targetType: 'execution') - deleteExpiredWorkflowTriggerLog (orgId + cutoff only) - deleteExpiredCustomer / Vendor / ExternalConversation / PromptTemplate / MessageFeedback / MemoryAuditRow / ChatFilterEvent / UsageLedgerRow (orgId + cutoff) Out-of-scope here: - deleteExpiredTempFile, deleteExpiredMessageMetadata, deleteExpiredTwoFactorAttempt, deleteExpiredLoginAttempt, deleteExpiredLoginBlockCounter — none of these carry organizationId on their args today (they're either cross-org-by-shape or have an indirect link). The orgHeld / cross-org checks are not applicable without an org backreference, which is itself a separate phase-10 follow-up. Their existing absence-of-row idempotence is preserved.
larryro
added a commit
that referenced
this pull request
May 9, 2026
Three round-2-confirmed gaps in the dual-control flow. placeLegalHold (W1 #11) - targetType: 'org' is the nuclear "halt all retention" hold; a single admin placing it freezes cleanup org-wide, defeating dual-control. Refuse by default; deployments running a single admin opt back in via the existing TALE_LEGAL_HOLD_SINGLE_ADMIN_OK env. A future maker- checker placement flow (legalHoldPlaceRequests + approveLegalHoldPlacement) supersedes this; today's commit restores the contract. approveLegalHoldRelease (W1 #12) - 5-min minimum delay between request and approval (constant RELEASE_APPROVAL_MIN_DELAY_MS). Defeats the chained-call attack where one admin requests + a second admin instantly approves via the same automation flow. Skipped when self-approval is in effect (the escape hatch already opted out of dual-control entirely). - Re-check that the original requester is still an org admin at approval time. A demoted/removed requester whose request lingered loses the ability to retroactively gate a destructive change. closeLegalMatter (W1 #10 + W4 #6) - Implements the docstring promise (round-2 v08 bonus): closing a matter now fans out PENDING release requests for every active hold linked via matterRef, leveraging the new by_organizationId_matterRef index. Approval still requires a second admin via approveLegalHoldRelease — matter-close does NOT auto-release, dual-control survives. - Returns releaseRequestsFiled count so the UI can confirm the cascade. Audit row records linkedHolds + releaseRequestsFiled. - Skips holds with an existing pending or approved request (idempotent against re-running close on a half-closed matter).
larryro
added a commit
that referenced
this pull request
May 9, 2026
…eout Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs + /redoc unauthenticated, RAG container ran as root, default token baked into image ENV, strict-mode env name diverged across the wire, non-constant-time token compare, plus three SSRF-guard gaps. services/rag/app/auth.py - W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes the dead-code EXEMPT_PATHS frozenset. services/rag/app/routers/health.py - W7 #1: split into public_router (`/`, `/health`) and protected_router (`/config`). main.py mounts the protected one under Depends(verify_internal_token). Old `router` re-export stays for backwards compat. services/rag/app/main.py - W7 #2: docs_url / redoc_url / openapi_url are None outside debug. - W7 #4: CORS allow_credentials flipped to False (bearer rides Authorization, never cookies). - W7 #1 wiring: mount health-public + health-protected separately. services/rag/app/config.py - W7 #8: require_custom_internal_token accepts BOTH RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN via pydantic AliasChoices. services/rag/Dockerfile + services/convex/Dockerfile - W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`). RAG ingests untrusted PDFs/DOCX through native parsers; biggest blast radius in the stack, now hardened. - W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from both runtime + scratch-squash stages and the matching bake in services/convex/Dockerfile. Operators MUST supply via env / compose / k8s secret. services/platform/convex/lib/helpers/rag_config.ts - W7 #9 F1: `redirect: 'manual'` on every ragFetch. - W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2). - W7 #9 F3: strip trailing `.` before hostname blocklist lookup. - W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding + env rotation mitigation). - W7 #9 F9: deleted path.startsWith('http') override branch (future- bypass foot-gun). services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts - W7 #10: pass timeoutMs=60_000 (default 10s was a regression). - Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to defend against an adversarial RAG response.
larryro
added a commit
that referenced
this pull request
May 9, 2026
Round-2 W8 batch (six independent UI / docs items, all small). i18n keys (W8 #1) - Add governance.retentionPolicy.group.deletionBehavior.{title,description} to en/de/fr — previously missing in all three locales, retention editor fell back to the inline English string in every locale. - Add governance.retentionPolicy.chatFilterEvents.{title,description, placeholder,helper} to en/de/fr — previously the row title rendered the literal id "chatFilterEvents". useTranslation namespace (W8 #2) - use-data-classification-notice.ts now binds useTranslation('dataNotice') so the dataNotice.default fallback string is reachable; the prior default-namespace bind meant the DE/FR fallbacks were unreachable. UI delete copy (W8 #7) - en/de/fr.json deletePermanentMessage now matches the actual user-visible behavior (Trash + grace + retention) instead of saying "permanently deleted" while the code soft-trashes. Stale test (W8 #10) - delete_chat_thread.test.ts was asserting the legacy `{ status: 'deleted' }` patch shape; the helper now writes `{ status: 'trashed', statusChangedAt: <ms> }` in user-trash mode. Docs cron + maker-checker (W8 #8) - docs/{en,de,fr}/self-hosted/configuration/retention.md: cron 03:00 UTC → 04:00 UTC (matches crons.ts), plus a sibling note about the new 01:00 UTC effectReleasesOnly cron. - en docs: replace the non-existent `releaseLegalHold` reference with the correct dual-control flow names plus the new RELEASE_APPROVAL_MIN_DELAY_MS, ORG_HOLD_REQUIRES_DUAL_CONTROL, and closeLegalMatter fan-out behavior. - retention_cleanup.ts stale "03:00 UTC" comment fixed.
larryro
added a commit
that referenced
this pull request
May 9, 2026
…stodian cascade Round-1 #10/#12/#13 + round-2 V3 confirmed three custodian-cascade gaps (P0-6 retention guard, P0-7 deleteDocumentById, P0-8 cleanupMessageMetadata + cleanupChatFilterEvents). The user's observation made the deeper truth clear: in production, only `org` and `userMembership` hold target types are operator-facing — `thread` / `document` / `execution` were modeled in the original fine-grained hold design but never wired into the place-hold UI (`PICKER_TARGET_TYPES = ['userMembership', 'org']`), and `usePlaceLegalHold` is called from a single dialog. The corresponding `holds.threadIds` / `documentIds` / `executionIds` Sets in `ActiveHolds` therefore stayed empty in every deployment, while ~80 LOC of dead per-row checks proliferated through retention, erasure, restore, folders, and the cascade helper. This commit collapses the model to its actual shape. Schema-level narrowing (legal_hold.ts): - HOLD_TARGET_TYPES = ['userMembership', 'org'] (write-side) - ActiveHolds = { orgHeld, userMembershipIds } - loadActiveHolds switch reduces to 2 cases; legacy thread/document/ execution rows in the table (if any) are intentionally ignored — they're surfaced in admin UI as `(legacy)` via the read-side validator in legal_hold_queries.ts. - resolveAndAssertTarget keeps only `org` + `userMembership` cases. - isHeld() simplified to org + custodian. Guard simplification (legal_hold_guard.ts): - assertNotHeld drops the per-row branches; `targetType`/`targetId` are now error-message context only. The user-custodian cascade via `authorUserId` is the sole remaining gate beyond org-wide. assertSafeRetentionDelete (internal_mutations_retention.ts): - Drops `targetType?` / `targetId?` args; adds `authorUserId?`. - Internal handler is now: cross-org check → TOCTOU check → org-wide hold → user-custodian cascade. Single, clear contract. P0-6 cascade wiring — every retention mutation passes the row's author user id (~15 call sites in internal_mutations_retention.ts): - documents → doc.createdBy - threads → thread.userId - workflowExecution → execution.userId - workflowTriggerLog → exec.userId via wfExecutionId join - messageFeedback → row.userId - memoryAudit → row.subjectUserId (with thread-fallback) - chatFilterEvent → parent thread's userId via index lookup - usageLedger → row.userId - promptTemplate → row.createdBy - messageMetadata → parent thread's userId (closes P0-8 directly) - tempFile, customer, vendor, externalConversation: org-only (no author concept on the row) P0-8 cascade fix — cleanupMessageMetadata + cleanupChatFilterEvents no longer rely on the now-empty `holds.threadIds` Set; both look up the parent thread's `userId` for the user-membership cascade (messageMetadata cleanup mutation was already P0-8 fixed in commit 1). Other dead-check removals: - retention_cleanup.ts: ~10 dead `holds.threadIds.has` / `documentIds.has` / `executionIds.has` checks removed; per-row custodian cascade either replaces them or the mutation re-checks. - governance/restore.ts: drops dead per-row branches; user-custodian cascade in restoreSoftDeletedRow (commit 1) covers the rest. - threads/cascade_helpers.ts: per-thread hold check replaced with a thread-metadata lookup → `userMembershipIds.has(thread.userId)`. - threads/delete_chat_thread.ts + restore_chat_thread.ts: drop `holds.threadIds.has(threadId)`; ownerHeld + orgHeld cover all refusal cases. - folders/mutations.ts assertNoHeldDescendantDocs: per-document hold check replaced with `userMembershipIds.has(doc.createdBy)`. - governance/erasure.ts requestErasure: drops heldThreadIds / heldDocumentIds collection — refusal is now `orgHeld || userCustodianHeld`. eraseSubjectDocuments drops the per-row held-document check (subject is by definition the only author in the iteration; user-custodian gate applied at request time). - legal_hold_internal.ts: loadActiveHoldsForOrg returns the narrowed ActiveHolds shape across the action/query boundary. Tests rewritten — `governance/__tests__/retention_legal_hold_gaps.test.ts`: - Drops thread/document/execution per-row hold cases (behaviour deleted on purpose). - Adds 6 cases covering the simplified contract: authorUserId cascade, TOCTOU + cross-org pre-checks, missing authorUserId grandfather, org-wide hold blanket refusal. - Source-grep regressions: every retention mutation passes the right authorUserId; cleanup actions pre-filter via the right `holds.userMembershipIds.has(...)`; per-row Sets stay deleted from the cleanup / erasure / guard sources. Note: this is a one-way migration. Any deployment with active legacy thread/document/execution holds loses enforcement of those holds going forward; the read-side validator in legal_hold_queries keeps showing them as `(legacy)`. Acceptable per project memory (demo stage, no prod use). Pre-prod check: convert to user/org holds first if any deployment has them. Verified: typecheck clean; 851 tests pass across affected dirs; full vitest run shows 5641/5642 pass (1 pre-existing flake in canvas-pane.test.tsx, unrelated).
larryro
added a commit
that referenced
this pull request
May 9, 2026
- auditLogChainGenesis: per-org sentinel patched on every createAuditLog to force concurrent first-writers to conflict on a real document. Closes round-2 review CRITICAL #10 (genesis fork: two concurrent initial audit writes both observe lastEntry=null and commit with previousHash='', forking the chain permanently). - activeLegalHoldClaims: per-target lock row for placeLegalHold to serialize concurrent placements via OCC. Closes CRITICAL #21 (TOCTOU duplicate-active-hold: two concurrent placeLegalHold both observe no active hold and both insert; row-level OCC does not detect range phantoms and Convex has no unique-constraint primitive). - threads.by_org_user index: enables GDPR requestErasure to enumerate a single user's threads via paged scheduler-driven continuation (Phase D.4.1) instead of .collect()-on-org which silently truncates past the 16K per-transaction read limit on large orgs. Schema-only; no migration needed (additive). All three close concrete spoliation/IDOR/integrity risks identified by the review.
larryro
added a commit
that referenced
this pull request
May 9, 2026
…heck - audit_hash: add lifecycleStatus + statusChangedAt to EXCLUDED_FIELDS so retention soft-delete (markRowExpiredGeneric) patching audit log rows doesn't poison the chain hash recompute. Pre-fix, ANY soft-deleted audit row caused verifyIntegrity to fail valid=false from that row forward. Round-2 review CRITICAL #8. - audit_logs/validators: declare lifecycleStatus + statusChangedAt on auditLogItemValidator so query-return validation accepts soft-deleted rows. Defense-in-depth alongside the EXCLUDED_FIELDS fix. - verify_integrity: anchor candidate filter accepts subtype === undefined (legacy retention checkpoints written before subtype field existed). Strict equality dropped them, breaking verifyIntegrity for any deployment that ran retention pre-upgrade. Match canonicalCheckpointPayload's `?? 'retention'` fallback. Round-2 review CRITICAL #9. - verify_integrity: add fromTimestamp arg for paged resumption + suppress isFirstEntry head-anchor when paging mid-chain. Pre-fix, the response promised "page from lastVerifiedTimestamp + 1" but the query had no such arg — large-org chains could not be paged. - verify_integrity: drop unsignedScrubSubjects Set (security-flavored dead code; unsignedScrubCount alone tracks the metric). The set was populated but never read; the actual gate is `!hasSigningKey`. Comment clarified. - verify_integrity: type entries as Doc<'auditLogs'>[] instead of an open `[key: string]: unknown` index signature. - audit_logs/internal_mutations: delete archiveOldLogs deprecated re-export — zero callers, AGENTS.md prohibits @deprecated tombstones. - audit_logs/helpers (createAuditLog): introduce buildAuditRecordHashInput as single source of truth for the canonical record payload — both writer and self-check call it, eliminating drift risk that schema additions could change the hash output across writes vs verify. - audit_logs/helpers (createAuditLog): genesis sentinel — read + patch the per-org auditLogChainGenesis row before each write. This forces OCC contention on a real document for the first audit write per org, closing the genesis-fork race where two concurrent first-writers both observe lastEntry=null and commit two roots with previousHash=''. Round-2 review CRITICAL #10. - audit_logs/helpers (createAuditLog): inline self-check on every write recomputes the prior row's integrity hash and console.errors on mismatch. Catches naive scenario-1 tampering (field changed, hash not updated) at the next legitimate audit write — the only automated tamper detection today. Wrapped in try/catch and skips piiScrubbed rows so it cannot affect the legitimate write path. Round-2 review C.5. Lint + typecheck clean. Convex codegen succeeded.
larryro
added a commit
that referenced
this pull request
May 9, 2026
- audit_hash.test: add three regression tests for round-2 CRITICAL #8. (1) lifecycleStatus + statusChangedAt are excluded from canonicalize (2) chainSuccessor + piiScrubbedAt are excluded (3) writer-vs-verifier hash equality after retention soft-delete patches the audit row. Pre-fix, soft-deleting any audit row poisoned the chain. - delete_document.test: replace the "404 returns error" assertion with the post-fix contract: 404 is treated as `success: true, deletedCount: 0` for idempotency on retention re-runs. The previous test asserted 400 to dodge the bug; real RAG returns 404 for not-found. - audit_logs/helpers.test: extend the mock to support the new `auditLogChainGenesis` query + insert (the genesis-fork sentinel from CRITICAL #10) and filter assertions to count only `auditLogs` inserts so the sentinel write doesn't break the count expectations. - delete_chat_thread.test: support `for await` over the agentWebhookUserThreads query — the cascade now iterates instead of collecting. Async iterator returns the pre-seeded mappings. All 3155 server tests pass.
larryro
added a commit
that referenced
this pull request
May 17, 2026
Closes #9, #10, #11, #12 — cascade correctness + GC durability. - `personalization_cascade.ts:cascadeOnOrgDeleted` swaps delete order: `db.delete` runs FIRST, then `storage.delete` inside the try/catch. Matches the documented contract in `tts/cascade_helpers.ts:55-62` — Convex `_storage` writes are out-of-band and not rolled back on tx abort, so the reverse order leaves a surviving row pointing at a dead storageId (404 on `/api/tts-audio`). - `threads/cascade_helpers.ts` step 7c TTS cleanup gets the same swap, for the same reason. - `cascadeOnTtsForMemberRemoved` per-mutation page cap lowered from 50 (~10K writes) to 30 (~6K writes) to stay under Convex's ~8K per-mutation write budget. `cascadeOnOrgDeleted` gets the same cap reduction. The hourly cron picks up whatever doesn't fit in a single pass — still well inside the 30-day Art 12(3) GDPR window. - `gcOrgTtsChunks` now persists its org-cursor in a new singleton `ttsGcCursor` table between cron runs. A deployment with more orgs than `MAX_ORGS_PER_RUN` now advances through the full org list over successive hours instead of restarting from the lex-first org every time and starving lex-tail orgs forever. On reaching the end of the org list the cursor wraps to null and the next run starts over. - `gcOrgTtsChunks` skip-empty: an org with no rows older than the retention cutoff no longer counts against `MAX_ORGS_PER_RUN`. Without this, a busy tail of stale orgs sandwiched behind quiet lex-leading orgs would never get reaped.
larryro
added a commit
that referenced
this pull request
May 17, 2026
Closes #9, #10, #11, #12 — cascade correctness + GC durability. - `personalization_cascade.ts:cascadeOnOrgDeleted` swaps delete order: `db.delete` runs FIRST, then `storage.delete` inside the try/catch. Matches the documented contract in `tts/cascade_helpers.ts:55-62` — Convex `_storage` writes are out-of-band and not rolled back on tx abort, so the reverse order leaves a surviving row pointing at a dead storageId (404 on `/api/tts-audio`). - `threads/cascade_helpers.ts` step 7c TTS cleanup gets the same swap, for the same reason. - `cascadeOnTtsForMemberRemoved` per-mutation page cap lowered from 50 (~10K writes) to 30 (~6K writes) to stay under Convex's ~8K per-mutation write budget. `cascadeOnOrgDeleted` gets the same cap reduction. The hourly cron picks up whatever doesn't fit in a single pass — still well inside the 30-day Art 12(3) GDPR window. - `gcOrgTtsChunks` now persists its org-cursor in a new singleton `ttsGcCursor` table between cron runs. A deployment with more orgs than `MAX_ORGS_PER_RUN` now advances through the full org list over successive hours instead of restarting from the lex-first org every time and starving lex-tail orgs forever. On reaching the end of the org list the cursor wraps to null and the next run starts over. - `gcOrgTtsChunks` skip-empty: an org with no rows older than the retention cutoff no longer counts against `MAX_ORGS_PER_RUN`. Without this, a busy tail of stale orgs sandwiched behind quiet lex-leading orgs would never get reaped.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Remove 33 documentation files that are no longer needed:
These docs have become stale and are being consolidated or deprecated.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.