OUT-3800: dedupe parallel folder creation in Dropbox→Assembly sync#111
Conversation
Sibling files syncing in parallel both create their shared parent folder. Assembly's folder create is path-idempotent and returns the SAME assemblyFileId to both racers, so the second insertFileMap collided on file_folder_sync_portal_channel_assembly_unique and threw. - insertFileMap uses onConflictDoNothing against the existing assembly unique index (no new index), returns null on conflict; synced count only increments on a real insert. - createAndUploadFileToAssembly pre-checks the mapping table by path to skip the redundant Assembly create in the common sequential case. - Insert-race-loser stamps dbxFileId via handleFolderCreatedCase so a folder entry that loses to an intermediate-segment writer isn't left null. - Split createAndUploadFileToAssembly into createLeafFileInAssembly and createFolderInAssembly; public method is now a thin dispatcher. - Cover insertFileMap conflict target, predicate, and null path in tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryFixes a
Confidence Score: 4/5Safe to merge; the deduplication logic is correct, the partial-index predicate exactly mirrors the schema, and the synced-files count increments exactly once per unique folder. The partial-index predicate in insertFileMap exactly mirrors the schema so the deduplication fires correctly in Postgres. The entire concurrent-race branch relies on Assembly returning the same assemblyFileId for the same folder path, a property confirmed empirically but not enforced in code. If that contract breaks, silent duplicate rows could be inserted. Test coverage for the new insertFileMap behaviour is thorough. src/features/sync/lib/Sync.service.ts — specifically the concurrent folder-creation path in createFolderInAssembly and the assumptions around Assembly idempotency. Important Files Changed
Reviews (1): Last reviewed commit: "fix(OUT-3800): dedupe parallel folder cr..." | Re-trigger Greptile |
Note that handleFolderCreatedCase covers both sub-cases: needed when the winner was an intermediate-segment writer (dbxFileId=null), no-op otherwise. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Defensive guard: getDbxMappedFileFromPath already fetches all live rows for a path, so warn when >1 exists. Surfaces the case where the onConflictDoNothing dedupe fails because Assembly stopped returning the same assemblyFileId on duplicate folder create. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
arpandhakal
left a comment
There was a problem hiding this comment.
good to go. Lets reduce the number of comments and make them simple. claude pollutes the codebase with extra comments as of now.
Problem
Sibling files syncing in parallel (e.g.
/abc/file1.txt,/abc/file2.txt) both try to create their shared parent folder/abc. Assembly's folder create is path-idempotent and returns the sameassemblyFileIdto both racers, so the secondinsertFileMapcollided on thefile_folder_sync_portal_channel_assembly_uniquepartial index and threw theFailed query: insert into file_folder_sync ...error in Sentry.Fix
insertFileMapnow usesonConflictDoNothingagainst the existing assembly unique index (no new index added) and returnsnullon conflict. The synced-files count only increments when a row was actually inserted.createAndUploadFileToAssemblypre-checks the mapping table by path (getDbxMappedFileFromPath) to skip the redundant Assembly create in the common sequential case.onConflictDoNothingis the safety net for the true concurrent race.dbxFileIdviahandleFolderCreatedCase, so a folder entry that loses the insert race to an intermediate-segment writer isn't left withdbxFileId=null(which would break later deletes/moves).createAndUploadFileToAssemblyintocreateLeafFileInAssemblyandcreateFolderInAssembly; the public method is now a thin dispatcher.Notes
file_folder_sync_portal_channel_assembly_unique.assemblyFileIdfor a duplicate folder (confirmed by the Sentry error). The existing"Folder already exists"catch covers the sequential variant.Tests
Added coverage in
MapFiles.tombstone.test.tsfor the conflict target columns, the partial-index predicate, and the inserted / null-on-conflict return paths.pnpm typecheck,pnpm lint, and all 14 tests pass.🤖 Generated with Claude Code