optimize rag action #4
…ories
- Move action helper files and types into dedicated /helpers subdirectories across multiple workflow action modules:
  - approval/
  - conversation/
  - crawler/
  - imap/
  - integration/
  - rag/
  - website/
  - websitePages/
  - workflow/
  - workflow_processing_records/
- Simplify RAG action interface:
  - Replace documentId/organizationId params with single recordId
  - Remove forceReupload and includeMetadata options
  - Remove explicit timeout parameters from workflow definitions
  - Update upload_text and upload_document operations accordingly
- Update predefined workflow definitions to use new recordId parameter:
  - customer_rag_sync.ts
  - document_rag_sync.ts
  - product_rag_sync.ts
  - website_pages_rag_sync.ts
- Update all import paths across IMAP library files and action modules
- Remove obsolete README documentation files (imap, product, rag)
- Regenerate Convex API type definitions
- Rename `documentId` to `recordId` across workflow processing records
- Rename `documentCreationTime` to `recordCreationTime`
- Rename `workflowId` to `wfDefinitionId` for consistency with schema
- Add `rootWfDefinitionId` to wfExecutions for tracking workflow family versions
- Update schema indexes: `by_document` -> `by_record`, `by_org_table_workflow_*` -> `by_org_table_wfDefinition_*`
- Update predefined workflows to use `{{rootWfDefinitionId}}` instead of `{{workflowId}}`
- Change default RAG sync schedules from hourly to every 20 minutes
- Remove unused `create_execution.ts` and related types
- Rename helper function `isDocumentProcessed` to `isRecordProcessed`
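To make the renames above concrete, here is a minimal in-memory sketch of a record-centric processing check. Only the field names (`recordId`, `recordCreationTime`, `wfDefinitionId`) and the helper name `isRecordProcessed` come from the list above; the `ProcessingRecord` shape, the composite key, and the array-based lookup are assumptions for illustration and are not the actual Convex query.

```typescript
// Hypothetical stand-in for a row in the workflow processing records table.
interface ProcessingRecord {
  tableName: string;
  recordId: string; // formerly documentId
  wfDefinitionId: string; // formerly workflowId
  recordCreationTime: number; // formerly documentCreationTime
}

type RecordProbe = Pick<ProcessingRecord, "tableName" | "recordId" | "wfDefinitionId">;

// Assumed composite key; the real `by_record` index may be shaped differently.
function recordKey(r: RecordProbe): string {
  return `${r.tableName}:${r.recordId}:${r.wfDefinitionId}`;
}

// Returns true when a matching processing record already exists.
function isRecordProcessed(records: ProcessingRecord[], probe: RecordProbe): boolean {
  const seen = new Set(records.map(recordKey));
  return seen.has(recordKey(probe));
}
```

A caller would probe with the table name, the record's id, and the workflow definition id before scheduling another sync for that record.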
c8f6e70 to e2378b4
Caution: Review failed. The pull request is closed.
Walkthrough
This PR systematically refactors the workflow processing and execution system by renaming fields from document-centric to record-centric terminology (documentId → recordId, documentCreationTime → recordCreationTime), renaming workflowId to wfDefinitionId, and introducing rootWfDefinitionId to track workflow versions. It reorganizes helper modules into ./helpers/ subdirectories across multiple action types, refactors RAG upload APIs from multi-parameter to object-based signatures, removes the createExecution function, updates database schema indexes to reflect record terminology, and adjusts default RAG sync workflow schedules from hourly to 20 minutes. Import paths throughout are updated to accommodate the new directory structure.
Estimated code review effort: 4 (Complex), ~75 minutes
…volos) Replaces the runtime patch_sops.py monkey-patch with an in-source change. Reads existing *.secrets.json as plain JSON (seeded by tale-init.sh from OPENROUTER_API_KEY) with OPENROUTER_API_KEY / OPENAI_API_KEY env fallback; saveProviderSecret writes plain JSON instead of SOPS-encrypting. Also documented in FORK.md as patch tale-project#4.
Round-2-confirmed critical bug. The docstring on `requestErasure`
advertised `LEGAL_HOLD_BLOCKS_ERASURE` as a refusal code; the body had
`const blocked: string[] = []` hard-coded as a "phase-8 placeholder"
with `loadActiveHolds` commented out. Result: an org admin (or anyone
with a compromised admin token) could spoliate litigation-hold data via
this mutation, and the audit row would record `threadsBlocked: []` —
actively misleading. Direct GDPR Art 17(3)(e) and FRCP 37(e) violation.
`loadActiveHolds` ships in the same branch (legal_hold.ts:65) and is
already consumed by retention_cleanup; only erasure ignored it.
This commit:
- Calls `loadActiveHolds` and partitions the subject's threads into
held vs erasable.
- Fails the whole request closed when `holds.orgHeld === true` or any
of the subject's threads is held. Emits a `gdpr_erasure_blocked_by_hold`
audit row before throwing so the refusal itself is logged in the
tamper-evident chain (compliance teams need this).
- Throws `ConvexError({ code: 'LEGAL_HOLD_BLOCKS_ERASURE', orgHeld,
heldThreadIds })` matching the docstring contract.
- Tracks per-thread cascade completion explicitly. The previous loop
ran cascadeDeleteThreadChildren up to 50 times then incremented
`erased` regardless of outcome; a thread with >50 pages of children
was reported as erased while leaving residue. Now only counts threads
whose cascade actually returned `done: true`, and emits the audit row
with `status: 'failure'` + an explicit error message when any thread
was partial.
- Audit row records `heldThreadIds` (when present) and
`threadsTargeted` (so reviewers can spot partial erasures).
Out-of-scope here (separate work-streams):
- Sub-thread cascade hold check (W1 #3 — cascade_helpers).
- Snapshot-race fix in deleteExpired* mutations (W1 #4).
- Subject scope expansion (W2: userMemories, BetterAuth, audit-PII
scrub, RAG propagation, gdprErasureRequests state machine).
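The fail-closed gate and the per-thread completion tracking described in this commit can be sketched as follows. This is a dependency-free illustration, not the code in erasure.ts: `ErasureBlockedError` stands in for `ConvexError`, and the `ActiveHolds.threadIds` field, the `AuditRow` shape, and the function names `gateErasure` and `cascadeThread` are assumptions. Only the error code, the audit action name, `orgHeld`, `heldThreadIds`, `threadsTargeted`, and the 50-page bound come from the message above.

```typescript
interface ActiveHolds {
  orgHeld: boolean;
  threadIds: Set<string>; // assumed field name for per-thread holds
}

interface AuditRow {
  action: string;
  orgHeld: boolean;
  heldThreadIds: string[];
  threadsTargeted: number;
}

// Stand-in for ConvexError so the sketch has no imports.
class ErasureBlockedError extends Error {
  code: string;
  orgHeld: boolean;
  heldThreadIds: string[];
  constructor(orgHeld: boolean, heldThreadIds: string[]) {
    super("LEGAL_HOLD_BLOCKS_ERASURE");
    this.code = "LEGAL_HOLD_BLOCKS_ERASURE";
    this.orgHeld = orgHeld;
    this.heldThreadIds = heldThreadIds;
  }
}

// Refuses the whole request if the org or any targeted thread is held.
// The refusal is audit-logged before the error is thrown.
function gateErasure(
  threadIds: string[],
  holds: ActiveHolds,
  audit: (row: AuditRow) => void,
): string[] {
  const held = threadIds.filter((id) => holds.threadIds.has(id));
  if (holds.orgHeld || held.length > 0) {
    audit({
      action: "gdpr_erasure_blocked_by_hold",
      orgHeld: holds.orgHeld,
      heldThreadIds: held,
      threadsTargeted: threadIds.length,
    });
    throw new ErasureBlockedError(holds.orgHeld, held);
  }
  return threadIds;
}

// Bounded cascade: a thread counts as erased only if some page reports done.
function cascadeThread(deletePage: () => { done: boolean }, maxPages = 50): boolean {
  for (let i = 0; i < maxPages; i++) {
    if (deletePage().done) return true;
  }
  return false; // residue remains; the caller should report a partial erasure
}
```

Returning `false` from `cascadeThread` instead of silently counting the thread is what lets the caller emit a failure audit row when any thread was only partially erased.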
…ascade
Round-1 reviewer-confirmed: 12 of 14 retention cleanup categories did not
consult holds.orgHeld, and cascadeDeleteThreadChildren recursed into
sub-threads with no hold check. Both let a litigation/preservation hold
be silently bypassed for the table that records *why* the hold exists
(auditLogs), every PII-bearing table, and any held sub-thread when its
parent ages out: a direct US FRCP 37(e) / EU GDPR Art 21 spoliation risk.
retention_cleanup.ts (W1 #2)
- Added `holds: ActiveHolds` to every cleanup category that lacked it:
  cleanupTempFiles (user + agent), cleanupAuditLogs, cleanupWorkflowLogs,
  cleanupUsageLedger, cleanupChatFilterEvents, cleanupPromptTemplates,
  cleanupMessageFeedback, cleanupMemoryAudit, cleanupCustomers,
  cleanupVendors, cleanupExternalConversations, cleanupMessageMetadata.
- Each short-circuits with a clear info log when holds.orgHeld is true.
- cleanupWorkflowLogs additionally consults holds.executionIds for
  per-execution holds (`targetType: 'execution'`); until now those rows
  were silently ignored by cleanup.
- The dispatcher's category list updated to thread `holds` to all 15
  category invocations.
cascade_helpers.ts (W1 #3 + partial W1 #4)
- cascadeDeleteThreadChildren accepts an optional `holds` snapshot. If
  omitted AND organizationId is set, the helper re-loads `legalHolds`
  itself, closing the snapshot-once race where a hold placed mid-run
  provided zero protection because the dispatcher's pre-fetched Set was
  already stale by the time per-thread cascade fired.
- Sub-thread recursion now passes the snapshot through, so the
  per-sub-thread hold check uses the same authoritative read.
- The helper returns `{ done: true, remaining: 0 }` (no-op) on a held
  thread; callers (retention / erasure) treat that as "skip and
  continue" rather than "delete completed".
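The short-circuit pattern for a cleanup category might look like the sketch below, using cleanupWorkflowLogs as the example because it also honours per-execution holds. The row shape, the `CleanupResult` return type, and the logger parameter are hypothetical; `holds.orgHeld` and `holds.executionIds` are the only names taken from the commit message, and the real function operates on Convex tables rather than an array.

```typescript
interface ActiveHolds {
  orgHeld: boolean;
  executionIds: Set<string>;
}

interface WorkflowLogRow {
  id: string;
  executionId: string;
}

interface CleanupResult {
  deletedIds: string[];
  skippedReason?: "org_held"; // hypothetical marker for the short-circuit
}

function cleanupWorkflowLogs(
  expiredRows: WorkflowLogRow[],
  holds: ActiveHolds,
  log: (message: string) => void,
): CleanupResult {
  // An org-wide hold freezes the whole category.
  if (holds.orgHeld) {
    log("cleanupWorkflowLogs skipped: organization is under legal hold");
    return { deletedIds: [], skippedReason: "org_held" };
  }
  const deletedIds: string[] = [];
  for (const row of expiredRows) {
    // A per-execution hold protects only the rows of that execution.
    if (holds.executionIds.has(row.executionId)) continue;
    deletedIds.push(row.id);
  }
  return { deletedIds };
}
```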
…OCTOU
Adds assertSafeRetentionDelete to internal_mutations_retention.ts and
threads the new cutoffMs through every retention dispatcher call site.
The guard runs in-mutation (V8 transaction) and closes three round-2
findings simultaneously:
- W1 #4: snapshot-race vs legal hold. The dispatcher's
  loadActiveHoldsForOrg snapshot is up to 25 min stale; a hold placed
  AFTER snapshot but BEFORE the per-row mutation otherwise had zero
  protection. Re-reading inside each mutation makes the in-mutation read
  the only authoritative gate.
- W6 #10: cross-org corruption. Every deleteExpired* trusted
  args.organizationId blindly; a swapped id silently deleted org A's row
  while logging the audit under org B (and forking the per-org hash
  chain). Now we assert row.organizationId === args.organizationId.
- W4 #13: cutoff TOCTOU. Between the dispatcher's listExpired query and
  the per-row delete, a user can re-touch the row (chat thread.updatedAt
  bump, document patch). The mutation now re-evaluates cutoff against
  (updatedAt ?? _creationTime) and skips when the row is no longer
  eligible.
Mutations updated (each accepts new optional cutoffMs):
- deleteExpiredDocument (targetType: 'document')
- deleteExpiredThread (targetType: 'thread')
- deleteExpiredWorkflowExecution (targetType: 'execution')
- deleteExpiredWorkflowTriggerLog (orgId + cutoff only)
- deleteExpiredCustomer / Vendor / ExternalConversation / PromptTemplate /
  MessageFeedback / MemoryAuditRow / ChatFilterEvent / UsageLedgerRow
  (orgId + cutoff)
Out-of-scope here:
- deleteExpiredTempFile, deleteExpiredMessageMetadata,
  deleteExpiredTwoFactorAttempt, deleteExpiredLoginAttempt,
  deleteExpiredLoginBlockCounter: none of these carry organizationId on
  their args today (they're either cross-org-by-shape or have an indirect
  link). The orgHeld / cross-org checks are not applicable without an org
  backreference, which is itself a separate phase-10 follow-up. Their
  existing absence-of-row idempotence is preserved.
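The three checks the guard performs can be sketched as one pure function. This is an assumption-laden illustration: the name `checkSafeRetentionDelete`, the result type, and the `isHeld` callback are hypothetical, and the sketch returns a reason rather than throwing, whereas the real assertSafeRetentionDelete may behave differently. It also assumes a row is eligible only when its last-touched time is strictly before `cutoffMs`.

```typescript
interface RetentionRow {
  _id: string;
  _creationTime: number;
  updatedAt?: number;
  organizationId: string;
}

type GuardResult =
  | { ok: true }
  | { ok: false; reason: "row_missing" | "org_mismatch" | "held" | "not_expired" };

function checkSafeRetentionDelete(args: {
  row: RetentionRow | null;
  organizationId: string;
  cutoffMs?: number;
  isHeld: (row: RetentionRow) => boolean; // stands in for a fresh legal-hold read
}): GuardResult {
  const { row, organizationId, cutoffMs, isHeld } = args;
  // Absence of the row keeps the delete idempotent.
  if (row === null) return { ok: false, reason: "row_missing" };
  // W6 #10: never delete another organization's row.
  if (row.organizationId !== organizationId) return { ok: false, reason: "org_mismatch" };
  // W1 #4: re-read holds inside the mutation instead of trusting a snapshot.
  if (isHeld(row)) return { ok: false, reason: "held" };
  // W4 #13: re-evaluate the cutoff in case the row was touched after listing.
  if (cutoffMs !== undefined) {
    const lastTouched = row.updatedAt ?? row._creationTime;
    if (lastTouched >= cutoffMs) return { ok: false, reason: "not_expired" };
  }
  return { ok: true };
}
```

Because all three checks run in the same transaction as the delete, a hold placed or a row touched after the dispatcher's listing is still seen before anything is removed.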
…eout
Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs +
/redoc unauthenticated, RAG container ran as root, default token baked
into image ENV, strict-mode env name diverged across the wire,
non-constant-time token compare, plus three SSRF-guard gaps.
services/rag/app/auth.py
- W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes
  the dead-code EXEMPT_PATHS frozenset.
services/rag/app/routers/health.py
- W7 #1: split into public_router (`/`, `/health`) and protected_router
  (`/config`). main.py mounts the protected one under
  Depends(verify_internal_token). Old `router` re-export stays for
  backwards compat.
services/rag/app/main.py
- W7 #2: docs_url / redoc_url / openapi_url are None outside debug.
- W7 #4: CORS allow_credentials flipped to False (bearer rides
  Authorization, never cookies).
- W7 #1 wiring: mount health-public + health-protected separately.
services/rag/app/config.py
- W7 #8: require_custom_internal_token accepts BOTH
  RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN via
  pydantic AliasChoices.
services/rag/Dockerfile + services/convex/Dockerfile
- W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`). RAG
  ingests untrusted PDFs/DOCX through native parsers; biggest blast
  radius in the stack, now hardened.
- W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from both
  runtime + scratch-squash stages and the matching bake in
  services/convex/Dockerfile. Operators MUST supply via env / compose /
  k8s secret.
services/platform/convex/lib/helpers/rag_config.ts
- W7 #9 F1: `redirect: 'manual'` on every ragFetch.
- W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2).
- W7 #9 F3: strip trailing `.` before hostname blocklist lookup.
- W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding + env
  rotation mitigation).
- W7 #9 F9: deleted path.startsWith('http') override branch
  (future-bypass foot-gun).
services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts
- W7 #10: pass timeoutMs=60_000 (default 10s was a regression).
- Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to
  defend against an adversarial RAG response.
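The two defences in fetch_document_chunks.ts, an iteration cap and a break when the cursor stops advancing, can be sketched as a generic pagination loop. The `ChunkPage` shape, the `fetchPage` callback, and the `truncated` flag are assumptions made for the example; only `MAX_ITERATIONS = 30` and the "cursor did not advance" rule come from the commit message.

```typescript
interface ChunkPage {
  chunks: string[];
  nextCursor: string | null; // null means there are no more pages
}

const MAX_ITERATIONS = 30;

async function fetchAllChunks(
  fetchPage: (cursor: string | null) => Promise<ChunkPage>,
): Promise<{ chunks: string[]; truncated: boolean }> {
  const chunks: string[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const page: ChunkPage = await fetchPage(cursor);
    chunks.push(...page.chunks);
    // Normal termination: the service reports no further pages.
    if (page.nextCursor === null) return { chunks, truncated: false };
    // A repeated cursor would otherwise loop until the cap; stop early.
    if (page.nextCursor === cursor) return { chunks, truncated: true };
    cursor = page.nextCursor;
  }
  // Cap reached: return what was collected and flag it as incomplete.
  return { chunks, truncated: true };
}
```

Surfacing `truncated` rather than throwing is a design choice of this sketch; the real helper may report the condition differently.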
… rag propagation
W2 — completes the round-2-confirmed gaps in requestErasure:
- v11 C2: erasure was a mutation, structurally unable to call the RAG
service. Vector chunks + indexed text persisted after a "successful"
erasure receipt.
- v11 C2/C3: bounded loop reported partial cascades as success; receipt
lied about coverage. No durable receipt for the subject (Art 19).
- v12 H3: single-mutation cascade hit Convex transaction limits.
Schema (gdprErasureRequestsTable)
- New per-request state machine: pending → running → done | partial |
failed. Carries threadsTargeted snapshot, threadsErased counter,
threadsBlockedByHold, ragDocumentsRemoved, slaDeadlineAt
(= requestedAt + 30 days per Art 12(3)), startedAt / completedAt.
erasure.ts — split-phase pipeline
- Public mutation requestErasure: auth + org-admin gate + legal-hold
gate (refusal also audit-logged) + inserts request row + schedules
the processor + returns { requestId, threadsTargeted } so the admin
can poll the row for progress.
- Internal action processErasureRequest: flips row to 'running',
cascades each thread via cascadeDeleteThreadChildren (up to 50 pages
per thread), propagates to RAG via ragFetch DELETE per subject-owned
document, then finalizeProcessing transitions the row to done |
partial | failed with explicit counts.
- Four supporting internal mutations: beginProcessing,
eraseThreadById, finalizeProcessing, listSubjectDocuments.
Subject scope
- Today: chat threads + their RAG-indexed descendants (departing-
employee case).
- Out of scope (documented in file header): userMemories,
userPreferences, feedback, fileMetadata blobs, audit-chain PII
scrub (W3 #4), BetterAuth tables, loginAttempts / twoFactorAttempts,
policyAcknowledgements.
Receipt = the gdprErasureRequests row. The admin UI can render the
full state machine without scraping audit logs; SLA deadline is
durable.
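The request state machine and SLA deadline can be sketched in a few lines. The status names and the 30-day deadline come from the schema description above; the transition table, the function names, and in particular the rule `finalStatus` uses to choose between done, partial, and failed are assumptions, since the commit message does not spell that rule out.

```typescript
type ErasureStatus = "pending" | "running" | "done" | "partial" | "failed";

// pending → running → done | partial | failed; terminal states have no exits.
const ALLOWED_TRANSITIONS: Record<ErasureStatus, ErasureStatus[]> = {
  pending: ["running"],
  running: ["done", "partial", "failed"],
  done: [],
  partial: [],
  failed: [],
};

function canTransition(from: ErasureStatus, to: ErasureStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// slaDeadlineAt = requestedAt + 30 days, both in epoch milliseconds.
function slaDeadlineAt(requestedAt: number): number {
  return requestedAt + THIRTY_DAYS_MS;
}

// Assumed rule: everything erased is done, some erased is partial, none is failed.
function finalStatus(threadsTargeted: number, threadsErased: number): ErasureStatus {
  if (threadsErased >= threadsTargeted) return "done";
  return threadsErased > 0 ? "partial" : "failed";
}
```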
Two work-streams in one commit because they share the erasure
processor's per-table cleanup path.
Subject-scope expansion (W2 follow-up)
- Adds eight per-table internal mutations called sequentially after
the chat-thread cascade + RAG propagation:
* eraseSubjectUserMemories
* eraseSubjectUserPreferences
* eraseSubjectMessageFeedback
* eraseSubjectFileMetadata (deletes _storage blobs too)
* eraseSubjectUsageLedger
* eraseSubjectTwoFactorAttempts
* eraseSubjectPolicyAcknowledgements
* eraseSubjectOnedrive
- loginAttempts / loginBlockCounters are email-keyed; new
lookupSubjectEmail mutation reads the BetterAuth user row first,
then eraseSubjectLoginAttempts wipes both tables.
- BetterAuth `user`/`account`/`session`/`verification` rows are
intentionally NOT touched here — the auth component owns its own
delete-user flow and direct table manipulation could leave dangling
sessions in flight. Documented as out-of-scope in the file header
for a follow-up that goes through `authComponent`.
Audit-chain PII scrub (W3 #4)
- New `auditLogs.piiScrubbed` (+ `piiScrubbedAt`) field. The processor
blanks `actorEmail`, `actorRole`, `ipAddress`, `userAgent`,
`previousState`, `newState`, `metadata` on every row where
`actorId === targetUserId`. Row existence + timestamp + action are
preserved per Art 17(3)(b)/(e) accountability requirements.
- New `auditLogCheckpoints.subtype` field discriminates 'retention'
(existing) from 'pii_scrub'. The scrub mutation inserts a signed
checkpoint with `scrubbedSubjectId` + `scrubbedRowCount` so an
external auditor can confirm the divergence is bounded and
intentional.
- `verifyIntegrity` updated to recognize the boundary: rows with
`piiScrubbed: true` (or whose actorId is in any pii_scrub
checkpoint's subjectId set) skip the SHA-256 recompute step. Chain
forward-link via `previousHash` is still validated — the scrub does
not break order, only blanks the body.
- `scrubSubjectAuditLogs` is in audit_logs/internal_mutations.ts so
it can use the existing `signCheckpoint` HMAC helper directly.
The processor finalizeProcessing audit log on a 'done' run now
implicitly covers the scrub-marker checkpoint (admins reviewing the
erasure see both rows in the chain). No client-callable surface added
— scrub fires only as part of an authorized requestErasure flow.
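A minimal sketch of the verification boundary described above: the forward link through `previousHash` is always checked, while the SHA-256 recompute is skipped for rows marked `piiScrubbed`. The row shape, the hashed fields, and the function names are assumptions for illustration. The signed `pii_scrub` checkpoint and its HMAC are not modelled here.

```typescript
import { createHash } from "node:crypto";

interface AuditRow {
  action: string;
  timestamp: number;
  actorEmail: string | null;
  previousHash: string;
  hash: string;
  piiScrubbed?: boolean;
}

// Hypothetical body hash; the real chain may cover different fields.
function hashRow(row: Pick<AuditRow, "previousHash" | "action" | "timestamp" | "actorEmail">): string {
  const body = JSON.stringify([row.previousHash, row.action, row.timestamp, row.actorEmail]);
  return createHash("sha256").update(body).digest("hex");
}

// Blanks the PII field but keeps the stored hash, so later rows still link to it.
function scrubRow(row: AuditRow): AuditRow {
  return { ...row, actorEmail: null, piiScrubbed: true };
}

function verifyChain(rows: AuditRow[], genesisHash = ""): boolean {
  let previous = genesisHash;
  for (const row of rows) {
    // Order is always validated, scrubbed or not.
    if (row.previousHash !== previous) return false;
    // The body can no longer be recomputed once PII is blanked, so skip it.
    if (!row.piiScrubbed && hashRow(row) !== row.hash) return false;
    previous = row.hash;
  }
  return true;
}
```

Keeping the stored hash on a scrubbed row is what preserves ordering: the next row's `previousHash` still matches even though the scrubbed body no longer reproduces that hash.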
Round-2-confirmed critical bug. The docstring on `requestErasure`
advertised `LEGAL_HOLD_BLOCKS_ERASURE` as a refusal code; the body had
`const blocked: string[] = []` hard-coded as a "phase-8 placeholder"
with `loadActiveHolds` commented out. Result: an org admin (or anyone
with a compromised admin token) could spoliate litigation-hold data via
this mutation, and the audit row would record `threadsBlocked: []` —
actively misleading. Direct GDPR Art 17(3)(e) and FRCP 37(e) violation.
`loadActiveHolds` ships in the same branch (legal_hold.ts:65) and is
already consumed by retention_cleanup; only erasure ignored it.
This commit:
- Calls `loadActiveHolds` and partitions the subject's threads into
held vs erasable.
- Fails the whole request closed when `holds.orgHeld === true` or any
of the subject's threads is held. Emits a `gdpr_erasure_blocked_by_hold`
audit row before throwing so the refusal itself is logged in the
tamper-evident chain (compliance teams need this).
- Throws `ConvexError({ code: 'LEGAL_HOLD_BLOCKS_ERASURE', orgHeld,
heldThreadIds })` matching the docstring contract.
- Tracks per-thread cascade completion explicitly. The previous loop
ran cascadeDeleteThreadChildren up to 50 times then incremented
`erased` regardless of outcome; a thread with >50 pages of children
was reported as erased while leaving residue. Now only counts threads
whose cascade actually returned `done: true`, and emits the audit row
with `status: 'failure'` + an explicit error message when any thread
was partial.
- Audit row records `heldThreadIds` (when present) and
`threadsTargeted` (so reviewers can spot partial erasures).
Out-of-scope here (separate work-streams):
- Sub-thread cascade hold check (W1 #3 — cascade_helpers).
- Snapshot-race fix in deleteExpired* mutations (W1 #4).
- Subject scope expansion (W2: userMemories, BetterAuth, audit-PII
scrub, RAG propagation, gdprErasureRequests state machine).
…ascade Round-1 reviewer-confirmed: 12 of 14 retention cleanup categories did not consult holds.orgHeld, and cascadeDeleteThreadChildren recursed into sub-threads with no hold check. Both let a litigation/preservation hold be silently bypassed for the table that records *why* the hold exists (auditLogs), every PII-bearing table, and any held sub-thread when its parent ages out — direct US FRCP 37(e) / EU GDPR Art 21 spoliation risk. retention_cleanup.ts (W1 #2) - Added `holds: ActiveHolds` to every cleanup category that lacked it: cleanupTempFiles (user + agent), cleanupAuditLogs, cleanupWorkflowLogs, cleanupUsageLedger, cleanupChatFilterEvents, cleanupPromptTemplates, cleanupMessageFeedback, cleanupMemoryAudit, cleanupCustomers, cleanupVendors, cleanupExternalConversations, cleanupMessageMetadata. - Each short-circuits with a clear info log when holds.orgHeld is true. - cleanupWorkflowLogs additionally consults holds.executionIds for per-execution holds (`targetType: 'execution'`) — until now those rows were silently ignored by cleanup. - The dispatcher's category list updated to thread `holds` to all 15 category invocations. cascade_helpers.ts (W1 #3 + partial W1 #4) - cascadeDeleteThreadChildren accepts an optional `holds` snapshot. If omitted AND organizationId is set, the helper re-loads `legalHolds` itself — closing the snapshot-once race where a hold placed mid-run provided zero protection because the dispatcher's pre-fetched Set was already stale by the time per-thread cascade fired. - Sub-thread recursion now passes the snapshot through, so the per-sub-thread hold check uses the same authoritative read. - The helper returns `{ done: true, remaining: 0 }` (no-op) on a held thread; callers (retention / erasure) treat that as "skip and continue" rather than "delete completed".
…OCTOU Adds assertSafeRetentionDelete to internal_mutations_retention.ts and threads the new cutoffMs through every retention dispatcher call site. The guard runs in-mutation (V8 transaction) and closes three round-2 findings simultaneously: - W1 #4 — snapshot-race vs legal hold. The dispatcher's loadActiveHoldsForOrg snapshot is up to 25 min stale; a hold placed AFTER snapshot but BEFORE the per-row mutation otherwise had zero protection. Re-reading inside each mutation makes the in-mutation read the only authoritative gate. - W6 #10 — cross-org corruption. Every deleteExpired* trusted args.organizationId blindly; a swapped id silently deleted org A's row while logging the audit under org B (and forking the per-org hash chain). Now we assert row.organizationId === args.organizationId. - W4 #13 — cutoff TOCTOU. Between the dispatcher's listExpired query and the per-row delete, a user can re-touch the row (chat thread.updatedAt bump, document patch). The mutation now re-evaluates cutoff against (updatedAt ?? _creationTime) and skips when the row is no longer eligible. Mutations updated (each accepts new optional cutoffMs): - deleteExpiredDocument (targetType: 'document') - deleteExpiredThread (targetType: 'thread') - deleteExpiredWorkflowExecution (targetType: 'execution') - deleteExpiredWorkflowTriggerLog (orgId + cutoff only) - deleteExpiredCustomer / Vendor / ExternalConversation / PromptTemplate / MessageFeedback / MemoryAuditRow / ChatFilterEvent / UsageLedgerRow (orgId + cutoff) Out-of-scope here: - deleteExpiredTempFile, deleteExpiredMessageMetadata, deleteExpiredTwoFactorAttempt, deleteExpiredLoginAttempt, deleteExpiredLoginBlockCounter — none of these carry organizationId on their args today (they're either cross-org-by-shape or have an indirect link). The orgHeld / cross-org checks are not applicable without an org backreference, which is itself a separate phase-10 follow-up. Their existing absence-of-row idempotence is preserved.
…eout
Round-2 v15 confirmed: /config unauthenticated, /openapi.json + /docs +
/redoc unauthenticated, RAG container ran as root, default token baked
into image ENV, strict-mode env name diverged across the wire,
non-constant-time token compare, plus three SSRF-guard gaps.
services/rag/app/auth.py
- W7 #3: hmac.compare_digest replaces == on the bearer compare. Removes
the dead-code EXEMPT_PATHS frozenset.
services/rag/app/routers/health.py
- W7 #1: split into public_router (`/`, `/health`) and protected_router
(`/config`). main.py mounts the protected one under
Depends(verify_internal_token). Old `router` re-export stays for
backwards compat.
services/rag/app/main.py
- W7 #2: docs_url / redoc_url / openapi_url are None outside debug.
- W7 #4: CORS allow_credentials flipped to False (bearer rides
Authorization, never cookies).
- W7 #1 wiring: mount health-public + health-protected separately.
services/rag/app/config.py
- W7 #8: require_custom_internal_token accepts BOTH
RAG_REQUIRE_CUSTOM_INTERNAL_TOKEN and TALE_REQUIRE_CUSTOM_RAG_TOKEN via
pydantic AliasChoices.
services/rag/Dockerfile + services/convex/Dockerfile
- W7 #5: RAG container runs as non-root (uid:gid 1001:1001 `app`). RAG
ingests untrusted PDFs/DOCX through native parsers; biggest blast radius
in the stack, now hardened.
- W7 #6: removed RAG_INTERNAL_TOKEN=tale-rag-dev-only ENV bake from both
runtime + scratch-squash stages and the matching bake in
services/convex/Dockerfile. Operators MUST supply via env / compose /
k8s secret.
services/platform/convex/lib/helpers/rag_config.ts
- W7 #9 F1: `redirect: 'manual'` on every ragFetch.
- W7 #9 F2: added fc00::/7 (IPv6 ULA) to v6 blocklist (AWS IPv6 IMDSv2).
- W7 #9 F3: strip trailing `.` before hostname blocklist lookup.
- W7 #9 F4: re-validate URL per ragFetch invocation (DNS rebinding + env
rotation mitigation).
- W7 #9 F9: deleted path.startsWith('http') override branch
(future-bypass foot-gun).
services/platform/convex/agent_tools/rag/helpers/fetch_document_chunks.ts
- W7 #10: pass timeoutMs=60_000 (default 10s was a regression).
- Plus MAX_ITERATIONS=30 cap and "cursor did not advance" break to
defend against an adversarial RAG response.
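Two of the SSRF-guard fixes (F2 and F3) are easy to show in isolation. The sketch below is illustrative only: the function names and the contents of `BLOCKED_HOSTNAMES` are assumptions, not what rag_config.ts actually contains. It normalizes a hostname by stripping a trailing dot, then checks whether an IPv6 literal falls inside fc00::/7.

```typescript
// Illustrative blocklist; the real list in rag_config.ts is not shown in the source.
const BLOCKED_HOSTNAMES = new Set<string>(["localhost", "metadata.google.internal"]);

// F3: lower-case, drop URL-style brackets, and strip trailing dots so that
// "localhost." cannot slip past an exact-match lookup.
function normalizeHostname(host: string): string {
  return host.trim().toLowerCase().replace(/^\[|\]$/g, "").replace(/\.+$/, "");
}

// F2: fc00::/7 means the top 7 bits of the first 16-bit group are 1111110,
// i.e. the group lies in 0xfc00..0xfdff.
function isIpv6UniqueLocal(host: string): boolean {
  const normalized = normalizeHostname(host);
  if (!normalized.includes(":")) return false; // not an IPv6 literal
  const first = normalized.split(":")[0] ?? "";
  if (!/^[0-9a-f]{1,4}$/.test(first)) return false;
  return (parseInt(first, 16) & 0xfe00) === 0xfc00;
}

function isBlockedHost(host: string): boolean {
  const normalized = normalizeHostname(host);
  return BLOCKED_HOSTNAMES.has(normalized) || isIpv6UniqueLocal(normalized);
}
```

A string check like this only covers literal addresses; the per-invocation re-validation in F4 is what addresses hostnames that resolve to such addresses later.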
… rag propagation
W2 — completes the round-2-confirmed gaps in requestErasure:
- v11 C2: erasure was a mutation, structurally unable to call the RAG
service. Vector chunks + indexed text persisted after a "successful"
erasure receipt.
- v11 C2/C3: bounded loop reported partial cascades as success; receipt
lied about coverage. No durable receipt for the subject (Art 19).
- v12 H3: single-mutation cascade hit Convex transaction limits.
Schema (gdprErasureRequestsTable)
- New per-request state machine: pending → running → done | partial |
failed. Carries threadsTargeted snapshot, threadsErased counter,
threadsBlockedByHold, ragDocumentsRemoved, slaDeadlineAt
(= requestedAt + 30 days per Art 12(3)), startedAt / completedAt.
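The request state machine and SLA deadline can be sketched as a transition table. The status names and the 30-day offset come from the description above; the `ALLOWED` table, `canTransition`, and `slaDeadlineAt` helper are hypothetical names introduced for this example.

```typescript
type ErasureStatus = "pending" | "running" | "done" | "partial" | "failed";

// pending → running → done | partial | failed; terminal states have no exits.
const ALLOWED: Record<ErasureStatus, readonly ErasureStatus[]> = {
  pending: ["running"],
  running: ["done", "partial", "failed"],
  done: [],
  partial: [],
  failed: [],
};

function canTransition(from: ErasureStatus, to: ErasureStatus): boolean {
  return ALLOWED[from].includes(to);
}

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// slaDeadlineAt = requestedAt + 30 days, both in epoch milliseconds.
function slaDeadlineAt(requestedAt: number): number {
  return requestedAt + THIRTY_DAYS_MS;
}
```

Storing the deadline on the row at insert time, rather than deriving it on read, is what makes it durable even if the SLA constant changes later.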
erasure.ts — split-phase pipeline
- Public mutation requestErasure: auth + org-admin gate + legal-hold
gate (refusal also audit-logged) + inserts request row + schedules
the processor + returns { requestId, threadsTargeted } so the admin
can poll the row for progress.
- Internal action processErasureRequest: flips row to 'running',
cascades each thread via cascadeDeleteThreadChildren (up to 50 pages
per thread), propagates to RAG via ragFetch DELETE per subject-owned
document, then finalizeProcessing transitions the row to done |
partial | failed with explicit counts.
- Four supporting internal mutations: beginProcessing,
eraseThreadById, finalizeProcessing, listSubjectDocuments.
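The bounded per-thread cascade and the final status decision might look roughly like this. It is a simplified, synchronous sketch: in the real pipeline the cascade step would be an async call from the action into a mutation, and the rule in `deriveFinalStatus` is an assumption, since the source only says the row ends as done, partial, or failed "with explicit counts".

```typescript
type FinalStatus = "done" | "partial" | "failed";

interface CascadeResult {
  done: boolean;
  remaining: number;
}

const MAX_PAGES_PER_THREAD = 50;

// Runs up to 50 cascade pages for one thread and reports whether the
// cascade actually finished, instead of assuming success after the loop.
function eraseThreadSketch(
  threadId: string,
  cascadePage: (threadId: string) => CascadeResult, // hypothetical callback
): boolean {
  for (let page = 0; page < MAX_PAGES_PER_THREAD; page++) {
    if (cascadePage(threadId).done) return true;
  }
  return false;
}

// Assumed rule: everything erased → done; nothing erased → failed; else partial.
function deriveFinalStatus(threadsTargeted: number, threadsErased: number): FinalStatus {
  if (threadsErased === threadsTargeted) return "done";
  return threadsErased > 0 ? "partial" : "failed";
}
```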
Subject scope
- Today: chat threads + their RAG-indexed descendants (departing-
employee case).
- Out of scope (documented in file header): userMemories,
userPreferences, feedback, fileMetadata blobs, audit-chain PII
scrub (W3 #4), BetterAuth tables, loginAttempts / twoFactorAttempts,
policyAcknowledgements.
Receipt = the gdprErasureRequests row. The admin UI can render the
full state machine without scraping audit logs; SLA deadline is
durable.
Two work-streams in one commit because they share the erasure
processor's per-table cleanup path.
Subject-scope expansion (W2 follow-up)
- Adds eight per-table internal mutations called sequentially after
the chat-thread cascade + RAG propagation:
* eraseSubjectUserMemories
* eraseSubjectUserPreferences
* eraseSubjectMessageFeedback
* eraseSubjectFileMetadata (deletes _storage blobs too)
* eraseSubjectUsageLedger
* eraseSubjectTwoFactorAttempts
* eraseSubjectPolicyAcknowledgements
* eraseSubjectOnedrive
- loginAttempts / loginBlockCounters are email-keyed; new
lookupSubjectEmail mutation reads the BetterAuth user row first,
then eraseSubjectLoginAttempts wipes both tables.
- BetterAuth `user`/`account`/`session`/`verification` rows are
intentionally NOT touched here — the auth component owns its own
delete-user flow and direct table manipulation could leave dangling
sessions in flight. Documented as out-of-scope in the file header
for a follow-up that goes through `authComponent`.
Audit-chain PII scrub (W3 #4)
- New `auditLogs.piiScrubbed` (+ `piiScrubbedAt`) field. The processor
blanks `actorEmail`, `actorRole`, `ipAddress`, `userAgent`,
`previousState`, `newState`, `metadata` on every row where
`actorId === targetUserId`. Row existence + timestamp + action are
preserved per Art 17(3)(b)/(e) accountability requirements.
- New `auditLogCheckpoints.subtype` field discriminates 'retention'
(existing) from 'pii_scrub'. The scrub mutation inserts a signed
checkpoint with `scrubbedSubjectId` + `scrubbedRowCount` so an
external auditor can confirm the divergence is bounded and
intentional.
- `verifyIntegrity` updated to recognize the boundary: rows with
`piiScrubbed: true` (or whose actorId is in any pii_scrub
checkpoint's subjectId set) skip the SHA-256 recompute step. Chain
forward-link via `previousHash` is still validated — the scrub does
not break order, only blanks the body.
- `scrubSubjectAuditLogs` is in audit_logs/internal_mutations.ts so
it can use the existing `signCheckpoint` HMAC helper directly.
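The scrub-aware integrity check can be illustrated with a toy hash chain. This is a sketch only: the `AuditRow` shape, the single `body` string standing in for the hashed fields, and the way the hash input is assembled are assumptions; the real `verifyIntegrity` also consults pii_scrub checkpoints, which is omitted here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape: body stands in for the serialized hashed fields.
interface AuditRow {
  previousHash: string;
  hash: string;
  body: string;
  piiScrubbed?: boolean;
}

function hashRow(previousHash: string, body: string): string {
  return createHash("sha256").update(previousHash).update(body).digest("hex");
}

function verifyChainSketch(rows: AuditRow[], genesisHash = ""): boolean {
  let expectedPrevious = genesisHash;
  for (const row of rows) {
    // The forward link is validated for every row, scrubbed or not.
    if (row.previousHash !== expectedPrevious) return false;
    // The SHA-256 recompute is skipped only for rows marked as scrubbed.
    if (!row.piiScrubbed && hashRow(row.previousHash, row.body) !== row.hash) {
      return false;
    }
    expectedPrevious = row.hash;
  }
  return true;
}
```

The stored `hash` of a scrubbed row is left untouched, so later rows still link to it and ordering remains verifiable even though the body can no longer be re-hashed.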
The processor finalizeProcessing audit log on a 'done' run now
implicitly covers the scrub-marker checkpoint (admins reviewing the
erasure see both rows in the chain). No client-callable surface added
— scrub fires only as part of an authorized requestErasure flow.
Round-2-confirmed critical bug. The docstring on `requestErasure`
advertised `LEGAL_HOLD_BLOCKS_ERASURE` as a refusal code; the body had
`const blocked: string[] = []` hard-coded as a "phase-8 placeholder"
with `loadActiveHolds` commented out. Result: an org admin (or anyone
with a compromised admin token) could spoliate litigation-hold data via
this mutation, and the audit row would record `threadsBlocked: []` —
actively misleading. Direct GDPR Art 17(3)(e) and FRCP 37(e) violation.
`loadActiveHolds` ships in the same branch (legal_hold.ts:65) and is
already consumed by retention_cleanup; only erasure ignored it.
This commit:
- Calls `loadActiveHolds` and partitions the subject's threads into
held vs erasable.
- Fails the whole request closed when `holds.orgHeld === true` or any
of the subject's threads is held. Emits a `gdpr_erasure_blocked_by_hold`
audit row before throwing so the refusal itself is logged in the
tamper-evident chain (compliance teams need this).
- Throws `ConvexError({ code: 'LEGAL_HOLD_BLOCKS_ERASURE', orgHeld,
heldThreadIds })` matching the docstring contract.
- Tracks per-thread cascade completion explicitly. The previous loop
ran cascadeDeleteThreadChildren up to 50 times then incremented
`erased` regardless of outcome; a thread with >50 pages of children
was reported as erased while leaving residue. Now only counts threads
whose cascade actually returned `done: true`, and emits the audit row
with `status: 'failure'` + an explicit error message when any thread
was partial.
- Audit row records `heldThreadIds` (when present) and
`threadsTargeted` (so reviewers can spot partial erasures).
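The fail-closed hold gate and the stricter erased count can be sketched as two small helpers. Everything here is illustrative: the source throws a `ConvexError`, which is replaced below by a plain `Error` carrying the same fields so the example has no dependencies; the `threadIds` field on `ActiveHolds`, the helper names, and the `"success"` status label are assumptions.

```typescript
// Hypothetical shape; only orgHeld is named explicitly in the commit message.
interface ActiveHolds {
  orgHeld: boolean;
  threadIds: Set<string>;
}

// Stand-in for ConvexError({ code, orgHeld, heldThreadIds }).
function holdError(orgHeld: boolean, heldThreadIds: string[]): Error {
  return Object.assign(new Error("LEGAL_HOLD_BLOCKS_ERASURE"), {
    code: "LEGAL_HOLD_BLOCKS_ERASURE",
    orgHeld,
    heldThreadIds,
  });
}

// Fail the whole request closed if the org or any subject thread is held.
function assertNoHoldsSketch(subjectThreadIds: string[], holds: ActiveHolds): void {
  const held = subjectThreadIds.filter((id) => holds.threadIds.has(id));
  if (holds.orgHeld || held.length > 0) {
    throw holdError(holds.orgHeld, held);
  }
}

// Count only threads whose cascade reported done: true.
function summarizeErasure(results: { threadId: string; done: boolean }[]) {
  const erased = results.filter((r) => r.done).length;
  return {
    threadsTargeted: results.length,
    erased,
    status: erased === results.length ? "success" : "failure",
  };
}
```

In the real mutation the refusal is also written to the audit chain before the throw; that step is left out of the sketch.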
Out-of-scope here (separate work-streams):
- Sub-thread cascade hold check (W1 #3 — cascade_helpers).
- Snapshot-race fix in deleteExpired* mutations (W1 #4).
- Subject scope expansion (W2: userMemories, BetterAuth, audit-PII
scrub, RAG propagation, gdprErasureRequests state machine).
Summary by CodeRabbit
Release Notes