Fix UNIQUE constraint crash on large C/C++ projects #21
Merged
Node insertion used plain INSERT, which crashes on duplicate IDs. In large C/C++ projects, tree-sitter can produce duplicate nodes for the same symbol (e.g. typedef struct, where both struct_specifier and type_definition resolve to the same name/kind/line, or multiple anonymous constructs on the same line).

- Change nodes INSERT to INSERT OR REPLACE (idempotent, same data)
- Change edges INSERT to INSERT OR IGNORE (skip duplicate edges)

The node ID is sha256(filePath:kind:name:line), which already uses full relative paths, so cross-directory collisions are not the issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
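The failure and the fix described in this commit message can be reproduced in isolation. The following is a minimal sketch in Python using the standard-library sqlite3 module, not the project's actual code: the table layout, the column names, the sample row values, and the UNIQUE constraint on edges are all assumptions made for illustration, since the real schema is not shown in this PR.

```python
import sqlite3

# Hypothetical minimal schema; the real codegraph tables are not shown in the PR.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id TEXT PRIMARY KEY, name TEXT, kind TEXT, line INTEGER);
    CREATE TABLE edges (source TEXT, target TEXT, kind TEXT,
                        UNIQUE (source, target, kind));
""")

# Placeholder ID and symbol; stands in for a node that is extracted twice.
node = ("abc123", "point_t", "struct", 10)

# Plain INSERT: the second write of the same ID raises an IntegrityError.
conn.execute("INSERT INTO nodes (id, name, kind, line) VALUES (?, ?, ?, ?)", node)
try:
    conn.execute("INSERT INTO nodes (id, name, kind, line) VALUES (?, ?, ?, ?)", node)
    crash = None
except sqlite3.IntegrityError as exc:
    crash = str(exc)

# INSERT OR REPLACE: the duplicate write succeeds and leaves a single row.
conn.execute(
    "INSERT OR REPLACE INTO nodes (id, name, kind, line) VALUES (?, ?, ?, ?)", node
)
node_count = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]

# INSERT OR IGNORE: the duplicate edge is skipped without raising.
edge = ("abc123", "def456", "references")
for _ in range(2):
    conn.execute(
        "INSERT OR IGNORE INTO edges (source, target, kind) VALUES (?, ?, ?)", edge
    )
edge_count = conn.execute("SELECT COUNT(*) FROM edges").fetchone()[0]

print(crash, node_count, edge_count)
```

SQLite resolves an OR REPLACE conflict by deleting the existing row before inserting the new one, so the sketch relies on the duplicate carrying the same data, as the commit message states. OR IGNORE only has an effect when the table has a uniqueness constraint to conflict with, which is why the sketch assumes one on edges.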
jorgerobles pushed a commit to jorgerobles/codegraph that referenced this pull request on Jun 1, 2026
…raint-collision Fix UNIQUE constraint crash on large C/C++ projects
Closes #11
Summary
- Fixes the "UNIQUE constraint failed: nodes.id" crash reported on large C/C++ and React/JSX projects
- Node insertion used plain INSERT, which crashes on duplicate IDs
- Duplicates arise with typedef struct in C/C++, where both struct_specifier and type_definition resolve to the same name/kind/line, or in React projects with many barrel index.js files and similarly-named components/hooks across directories

Changes
- INSERT OR REPLACE INTO nodes: if the same symbol ID appears twice, the second write wins (idempotent, same data anyway)
- INSERT OR IGNORE INTO edges: silently skip duplicate edges

Why this is safe
The node ID is sha256(filePath:kind:name:line), which already uses full relative paths (not just filenames), so cross-directory collisions were never the issue. A duplicate ID therefore means it's genuinely the same symbol extracted twice from the same location, so replacing is correct.

Test plan
🤖 Generated with Claude Code
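The ID scheme described under "Why this is safe" can be illustrated with a short sketch. It is written in Python for self-containment and is not the project's implementation: the function name node_id, the UTF-8 encoding, the hex digest, and the sample paths are assumptions; only the colon-joined field order filePath:kind:name:line comes from the PR.

```python
import hashlib


def node_id(file_path: str, kind: str, name: str, line: int) -> str:
    """Hash the colon-joined fields, following sha256(filePath:kind:name:line)."""
    key = f"{file_path}:{kind}:{name}:{line}"
    # Encoding and hex output are assumptions; the PR only names the hash and fields.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# The same symbol extracted twice from the same location yields the same ID.
first = node_id("src/geom/point.h", "struct", "point_t", 10)
second = node_id("src/geom/point.h", "struct", "point_t", 10)

# A same-named symbol in a different directory yields a different ID,
# because the full relative path is part of the hashed key.
other_dir = node_id("src/ui/point.h", "struct", "point_t", 10)

print(first == second, first == other_dir)
```

Under these assumptions, two identical IDs can only come from identical path, kind, name, and line values, which is the case the PR handles by replacing the node.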