
OUT-3782: prevent Dropbox↔Assembly ping-pong on file create #107

Merged
SandipBajracharya merged 3 commits into main from OUT-3782
May 29, 2026

Conversation

@SandipBajracharya

Collaborator

Summary

Closes a race where a Dropbox→Assembly create can echo back via Assembly's file.created webhook and produce duplicate files on both sides. Stacks on top of OUT-3645 (#106).

What's in this PR

1. Early-stamp assemblyFileId (Sync.service.ts#completePendingAssemblyCreate)
Persists assemblyFileId via updateFileMap immediately after copilotApi.createFile() returns, before the (potentially seconds-long) S3 upload. Shrinks the race window from the upload duration to a single UPDATE roundtrip, so the Assembly webhook controller's 800ms sleep + existingFile lookup actually catches the echo.

2. itemPath-based dedupe in the Assembly webhook controller (webhook.controller.ts)
Adds a second lookup on file_folder_sync matching (portalId, normalized itemPath, pendingAction = CREATE, pendingActionTarget = ASSEMBLY, deletedAt IS NULL). Catches the residual race where Assembly fires file.created before our createFile HTTP response returns — at that point we haven't stamped assemblyFileId, but the pre-inserted tombstone row already has the itemPath populated, so we can still match. Stored itemPath has a leading /; Copilot's data.path doesn't, so the lookup normalizes.

3. Handler ordering swap (processFileSync.ts#handleChannelFileChanges)
Process deletes before creates. With the partial unique indexes from OUT-3778, a rename flow can otherwise fail with "there is no unique or exclusion constraint matching the ON CONFLICT specification" if the new path's create races the old path's delete.
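
The ordering swap in item 3 can be sketched as a stable partition of a change batch. This is a minimal illustration, not the code in processFileSync.ts: the `FileChange` type and `orderDeletesFirst` helper are hypothetical names, and the real handler triggers batches rather than sorting an array.

```typescript
// Hypothetical change shape, for illustration only.
type FileChange = { kind: "create" | "delete"; itemPath: string };

// Stable partition: every delete first, then every create.
// Each group keeps its original relative order.
function orderDeletesFirst(changes: FileChange[]): FileChange[] {
  const deletes = changes.filter((c) => c.kind === "delete");
  const creates = changes.filter((c) => c.kind === "create");
  return [...deletes, ...creates];
}

// A rename arrives as a create at the new path plus a delete at the old one.
const renameBatch: FileChange[] = [
  { kind: "create", itemPath: "/reports/q2-final.pdf" },
  { kind: "delete", itemPath: "/reports/q2.pdf" },
];

const orderedBatch = orderDeletesFirst(renameBatch);
console.log(orderedBatch.map((c) => `${c.kind} ${c.itemPath}`));
```

Applying the delete first means the old row is already soft-deleted when the create for the new path runs, so the create no longer races the delete.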

Why the ping-pong happens

1. copilotApi.createFile     → Assembly creates file, fires file.created (~100ms)
2. uploadFileInAssembly      → S3 upload (seconds for large files)
3. markUpdated(assemblyFileId)  ← old code only stamped here

If the Assembly echo arrives between (1) and (3), our existing dedupe (which keys off assemblyFileId) misses → we trigger syncAssemblyFileToDropbox → duplicate appears in Dropbox → loop.
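
The effect of moving the stamp ahead of the upload can be modeled in a few lines. This sketch makes several assumptions: the row lives in memory, the API calls are synchronous stand-ins, and the echo is modeled as an id lookup made while the upload is running. `completePendingCreate` and `echoIsDeduped` are hypothetical names, not the functions in the repository.

```typescript
// In-memory stand-in for the sync row; illustrative only.
type SyncRow = { assemblyFileId: string | null; pendingAction: "CREATE" | null };

// Synchronous stand-ins for the Assembly API calls (the real ones are async).
type AssemblyApi = {
  createFile(): { id: string };
  uploadFile(fileId: string): void;
};

function completePendingCreate(row: SyncRow, api: AssemblyApi, earlyStamp: boolean): void {
  const created = api.createFile();
  if (earlyStamp) {
    row.assemblyFileId = created.id; // stamp before the slow upload
  }
  api.uploadFile(created.id);
  row.assemblyFileId = created.id; // the old ordering only stamped here
  row.pendingAction = null;
}

// Returns whether an id-based lookup made mid-upload would find the row.
function echoIsDeduped(earlyStamp: boolean): boolean {
  const row: SyncRow = { assemblyFileId: null, pendingAction: "CREATE" };
  let matchedDuringUpload = false;
  const api: AssemblyApi = {
    createFile: () => ({ id: "file_X" }),
    // The webhook echo is modeled as arriving while the upload is in flight.
    uploadFile: (fileId) => {
      matchedDuringUpload = row.assemblyFileId === fileId;
    },
  };
  completePendingCreate(row, api, earlyStamp);
  return matchedDuringUpload;
}

console.log("old ordering deduped:", echoIsDeduped(false));
console.log("early stamp deduped:", echoIsDeduped(true));
```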

Test plan

  • pnpm typecheck clean
  • pnpm lint clean
  • pnpm test passes
  • Manual: upload a file to Dropbox, verify exactly one copy appears in Assembly (and stays at one — no echo).
  • Manual: upload a large file (>10 MB so upload takes seconds), verify no duplicate is created during the upload window.
  • Manual: rename a file in Dropbox, verify the delete + create sequence applies cleanly without ON CONFLICT errors on the partial unique indexes.

🤖 Generated with Claude Code

@linear-code

linear-code Bot commented May 27, 2026


OUT-3782

@vercel

vercel Bot commented May 27, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
--- | --- | --- | ---
dropbox-integration | Ready | Preview, Comment | May 29, 2026 9:56am


@SandipBajracharya changed the title from "fix(OUT-3782): prevent Dropbox↔Assembly ping-pong on file create" to "OUT-3782: prevent Dropbox↔Assembly ping-pong on file create" on May 27, 2026
@greptile-apps

greptile-apps Bot commented May 27, 2026


Greptile Summary

This PR closes the Dropbox↔Assembly ping-pong race by deploying three coordinated fixes: an early assemblyFileId stamp in completePendingAssemblyCreate (before the S3 upload), a path-based in-flight dedupe lookup in the webhook controller, and a delete-before-create ordering swap to prevent partial unique index violations on renames.

  • Early stamp (Sync.service.ts): updateFileMap writes assemblyFileId immediately after copilotApi.createFile() returns, shrinking the race window from "entire upload duration" to one DB round-trip so the controller's existingFile lookup can catch the echo.
  • Path-based dedupe (webhook.controller.ts + webhook.service.ts): findInFlightAssemblyCreate covers the residual sub-round-trip window by matching the pre-inserted tombstone row on (portalId, channelSyncId, itemPath, pendingAction=CREATE, assemblyFileId IS NULL), correctly scoped per channel via a two-query pattern.
  • Order swap (processFileSync.ts): Processes deletes before creates to avoid ON CONFLICT errors on partial unique indexes during rename flows.
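
The path-based match described above can be sketched against an in-memory list of rows. Field names follow the PR text, but the `SyncRow` type, the `findInFlightCreate` helper, and the sample data are assumptions made for illustration. The real `findInFlightAssemblyCreate` resolves the channel in a first query and matches the row in a second, which this sketch collapses into a single filter.

```typescript
// Illustrative row shape; the type itself is an assumption.
type SyncRow = {
  portalId: string;
  channelSyncId: string;
  itemPath: string; // stored with a leading "/"
  pendingAction: string | null;
  pendingActionTarget: string | null;
  assemblyFileId: string | null;
  deletedAt: Date | null;
};

// Webhook paths arrive without the leading "/"; stored paths have one.
function normalizeItemPath(path: string): string {
  return path.startsWith("/") ? path : `/${path}`;
}

function findInFlightCreate(
  rows: SyncRow[],
  portalId: string,
  channelSyncId: string,
  webhookPath: string,
): SyncRow | undefined {
  const itemPath = normalizeItemPath(webhookPath);
  return rows.find(
    (r) =>
      r.portalId === portalId &&
      r.channelSyncId === channelSyncId && // scope the match to one channel
      r.itemPath === itemPath &&
      r.pendingAction === "CREATE" &&
      r.pendingActionTarget === "ASSEMBLY" &&
      r.assemblyFileId === null && // only rows that are not stamped yet
      r.deletedAt === null,
  );
}

const baseRow = { portalId: "p1", pendingAction: "CREATE", pendingActionTarget: "ASSEMBLY", deletedAt: null };
const sampleRows: SyncRow[] = [
  { ...baseRow, channelSyncId: "chA", itemPath: "/docs/a.pdf", assemblyFileId: null },
  { ...baseRow, channelSyncId: "chA", itemPath: "/docs/stuck.pdf", assemblyFileId: "file_B" },
];

console.log(findInFlightCreate(sampleRows, "p1", "chA", "docs/a.pdf") !== undefined);
console.log(findInFlightCreate(sampleRows, "p1", "chB", "docs/a.pdf") !== undefined);
```

The second sample row stands in for a create whose upload failed after the early stamp; the `assemblyFileId === null` condition keeps it from suppressing a later legitimate event at the same path.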

Confidence Score: 4/5

Safe to merge; all three previously raised issues are addressed and the dedupe logic correctly covers both the post-stamp and pre-stamp windows.

The three-layered race fix (early stamp + path-based lookup + delete-before-create ordering) is sound and all previous thread concerns have been resolved. The only new gap is observability: when pendingCreate suppresses a create event, nothing is logged, which would make it hard to confirm the dedupe is firing correctly in production. The core logic is correct.

webhook.controller.ts — the suppression branch at line 71 has no log emit; worth a one-liner before shipping to production.
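
One way to close the observability gap is to log a distinct reason on each suppression path. The sketch below is hypothetical: `shouldHandleCreate`, the `Logger` type, and the message strings are illustrative names, and the controller's actual structure and logging utility may differ.

```typescript
// Hypothetical logger signature; swap in the project's own logger.
type Logger = (message: string, context: Record<string, unknown>) => void;

const consoleLogger: Logger = (message, context) => console.info(message, context);

// Decides whether a file.created event should reach the create handler,
// logging a distinct reason whenever the event is suppressed.
function shouldHandleCreate(
  event: { fileId: string; path: string },
  lookups: { existingFile: boolean; pendingCreate: boolean },
  log: Logger = consoleLogger,
): boolean {
  if (lookups.existingFile) {
    log("file.created suppressed: assemblyFileId already mapped", { ...event });
    return false;
  }
  if (lookups.pendingCreate) {
    log("file.created suppressed: in-flight create matched by path", { ...event });
    return false;
  }
  return true;
}

const sampleEvent = { fileId: "file_X", path: "docs/a.pdf" };
console.log(shouldHandleCreate(sampleEvent, { existingFile: false, pendingCreate: true }));
```

Keeping the two reasons separate would make it possible to tell from logs which of the two dedupe layers caught a given echo.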

Important Files Changed

Filename | Overview
--- | ---
src/features/sync/lib/Sync.service.ts | Early-stamps assemblyFileId via updateFileMap immediately after copilotApi.createFile() returns, before the S3 upload. Shrinks the race window to a single DB round-trip so the controller's existingFile lookup can catch the echo.
src/features/webhook/assembly/api/webhook.controller.ts | Adds isCreateEvent guard and calls findInFlightAssemblyCreate only for create events. Guards handleFileCreated with both existingFile and pendingCreate checks. No log on suppression path is the only minor gap.
src/features/webhook/assembly/lib/webhook.service.ts | New findInFlightAssemblyCreate correctly scopes to (portalId, channelSyncId, itemPath, pendingAction=CREATE, pendingActionTarget=ASSEMBLY, assemblyFileId IS NULL, deletedAt IS NULL) via a two-query pattern that addresses the previous cross-channel false-positive concern.
src/trigger/processFileSync.ts | Swaps batch-trigger order from create-then-delete to delete-then-create to prevent ON CONFLICT errors on partial unique indexes during rename flows.

Sequence Diagram

sequenceDiagram
    participant DBX as Dropbox
    participant Sync as SyncService
    participant DB as Database
    participant ASM as Assembly API
    participant WH as Webhook Controller

    DBX->>Sync: file change detected
    Sync->>DB: "pre-insert row (pendingAction=CREATE, assemblyFileId=NULL)"
    Sync->>ASM: copilotApi.createFile()
    ASM-->>Sync: "fileCreateResponse (id=X)"
    Sync->>DB: "updateFileMap(assemblyFileId=X) ← early stamp"
    ASM->>WH: "file.created echo (data.id=X) ~100ms later"
    WH->>WH: sleep(800ms)
    WH->>DB: "existingFile lookup (assemblyFileId=X)"
    DB-->>WH: row found → skip handleFileCreated ✓
    Sync->>ASM: uploadFileInAssembly (S3, seconds)
    Sync->>DB: "markUpdated(assemblyFileId=X, pendingAction=null)"

    note over WH,DB: Residual window: stamp not yet committed
    ASM->>WH: file.created echo arrives before stamp
    WH->>DB: existingFile lookup → null
    WH->>DB: findInFlightAssemblyCreate(channelId, path)
    DB-->>WH: pre-insert row matched (assemblyFileId IS NULL) → skip ✓

Reviews (3): Last reviewed commit: "fix(OUT-3782): scope in-flight dedupe to..." | Re-trigger Greptile

Comment thread src/features/webhook/assembly/api/webhook.controller.ts
Comment thread src/features/webhook/assembly/api/webhook.controller.ts
@SandipBajracharya

Collaborator Author

@greptileai

Comment thread src/features/webhook/assembly/lib/webhook.service.ts
@SandipBajracharya

Collaborator Author

@greptileai

@priosshrsth priosshrsth left a comment

Collaborator


lgtm

SandipBajracharya and others added 3 commits May 29, 2026 15:39
- Stamp assemblyFileId via updateFileMap immediately after
  copilotApi.createFile in completePendingAssemblyCreate, before
  uploadFileInAssembly. Shrinks the race window from upload-duration to
  one UPDATE so the controller's existingFile lookup dedupes the
  Assembly file.created echo.
- Add a path-based dedupe in the Assembly webhook controller: lookup
  file_folder_sync by (portalId, normalized itemPath,
  pendingAction=CREATE, pendingActionTarget=ASSEMBLY, deletedAt IS NULL)
  and skip handleFileCreated/FolderCreated when a pre-inserted row
  matches. Stored itemPath has a leading "/"; Copilot's data.path does
  not, so we normalize.
- Swap order in handleChannelFileChanges: process deletes before creates
  so the (portal, channel, dbx_file_id) partial unique index from
  OUT-3778 isn't violated during rename flows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rvice

Move the channel lookup, path normalization, and pre-insert row match
out of webhook.controller.ts and into a single method:

  AssemblyWebhookService.findInFlightAssemblyCreate(
    assemblyChannelId,
    filePath,
  )

- Dedupe is now scoped to (portalId, channelSyncId, itemPath) rather
  than (portalId, itemPath). Path uniqueness lives at per-channel
  granularity (matches the partial unique indexes from OUT-3778), so
  the previous query could silently drop legitimate file.created
  events for a different channel that shared a path.
- Normalization (leading "/") moved inside the service method. It now
  only runs for create events instead of for every webhook, addressing
  the latent crash on non-create events that omit data.path.
- Controller becomes one call: lighter to read, easier to test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…FileId

The pendingCreate lookup in findInFlightAssemblyCreate matched any row
with pendingAction=CREATE, target=ASSEMBLY, and a matching itemPath,
ignoring whether assemblyFileId was set. The JSDoc already documented
the intent as "row exists with assemblyFileId IS NULL", but the query
omitted the filter.

Defensive correctness fix: if completePendingAssemblyCreate gets past
the early updateFileMap (stamping assemblyFileId) but uploadFileInAssembly
then throws, the row stays at (pendingAction=CREATE, assemblyFileId=B).
The normal delete-webhook cleanup chain usually soft-deletes such rows
when the orphaned Assembly file is removed — but if that chain breaks
(lost webhook, errored handler, sweeper exhausted at MAX_ATTEMPTS), a
later legitimate file.created event for the same path would falsely
match the stuck row and silently skip handleFileCreated.

Adding isNull(t.assemblyFileId) to the query restricts the match to
the actual in-flight window (post-insertCreatePending, pre-updateFileMap)
that the dedupe was designed for.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@SandipBajracharya SandipBajracharya changed the base branch from OUT-3645 to main May 29, 2026 09:55
@SandipBajracharya SandipBajracharya merged commit 70d2db8 into main May 29, 2026
4 checks passed

2 participants