
fix: duplicate edges on re-ingestion — replace CREATE with MERGE in upsert_relations() and add_graph_documents() #21

@polaz

Problem

Re-ingesting the same data creates duplicate edges. Every call to upsert_relations() (LlamaIndex) or add_graph_documents() (LangChain) creates new edges unconditionally:

# First ingest
store.upsert_relations([Relation(label="KNOWS", source_id="alice", target_id="bob")])
# Second ingest — creates a DUPLICATE edge
store.upsert_relations([Relation(label="KNOWS", source_id="alice", target_id="bob")])

This makes repeated indexing non-idempotent and bloats the graph.

Root cause

CoordiNode Cypher does not support MERGE for relationship patterns — only for node patterns (NodeScan). The adapters work around this with CREATE, accepting duplicate edges as a known limitation.

Tracked as G072 in CoordiNode DB repository: MERGE (src)-[r:TYPE]->(dst) fails with "MERGE create from non-NodeScan pattern".
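To make the difference concrete, here is a minimal in-memory sketch of CREATE versus MERGE semantics for edges. It does not touch CoordiNode at all; `Edge`, `create_edge` and `merge_edge` are hypothetical names used only for illustration, and the model treats an edge as identical when source, label and target all match.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """Illustrative stand-in for a graph edge (not an SDK type)."""
    source_id: str
    label: str
    target_id: str


def create_edge(edges: list, edge: Edge) -> None:
    """CREATE semantics: always append, even if an identical edge exists."""
    edges.append(edge)


def merge_edge(edges: list, edge: Edge) -> None:
    """MERGE semantics: append only when no identical edge exists yet."""
    if edge not in edges:
        edges.append(edge)


knows = Edge("alice", "KNOWS", "bob")

# Two ingests with CREATE semantics leave two identical edges.
created: list = []
create_edge(created, knows)
create_edge(created, knows)

# Two ingests with MERGE semantics leave a single edge.
merged: list = []
merge_edge(merged, knows)
merge_edge(merged, knows)

print(len(created), len(merged))
```

The second case is the behavior the adapters should have once relationship MERGE is available in the DB.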

SDK changes (after G072 is fixed in DB)

Once G072 is resolved:

llama-index-coordinode/llama_index/graph_stores/coordinode/base.py

  • upsert_relations(): replace CREATE (src)-[r:{label}]->(dst) with MERGE (src)-[r:{label}]->(dst)
  • Remove the comment block explaining the CREATE fallback
  • Remove the SET r += $props workaround comment
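A rough sketch of what the query change could look like. This is not the actual code in base.py: `build_relation_query` is a hypothetical helper, and the MATCH clause with `$src_id` / `$dst_id` parameters is an assumption about how the adapter locates the endpoint nodes. Only the `(src)-[r:{label}]->(dst)` pattern and `SET r += $props` come from this issue. The label is validated because it is interpolated into the query text rather than passed as a parameter.

```python
import re

# Conservative identifier check, since the label is interpolated into the query.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_relation_query(label: str, *, use_merge: bool) -> str:
    """Build a Cypher statement that links two already-matched nodes.

    use_merge=False reproduces the current CREATE fallback;
    use_merge=True is the form proposed once G072 is fixed.
    """
    if not _IDENT.fullmatch(label):
        raise ValueError(f"unsafe relationship label: {label!r}")
    clause = "MERGE" if use_merge else "CREATE"
    return (
        # Assumed node lookup; the real adapter may match nodes differently.
        "MATCH (src {id: $src_id}), (dst {id: $dst_id}) "
        f"{clause} (src)-[r:{label}]->(dst) "
        "SET r += $props"
    )


print(build_relation_query("KNOWS", use_merge=True))
```

Keeping the clause behind a single switch would make the change a one-line flip when the fixed CoordiNode release lands, but simply editing the query string in place works just as well.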

langchain-coordinode/langchain_coordinode/graph.py

  • _create_edge(): replace CREATE (src)-[r:{rel_type}]->(dst) with MERGE (src)-[r:{rel_type}]->(dst)
  • Rename _create_edge() to _upsert_edge() (or keep name, update docstring)
  • _link_document_to_entities(): replace CREATE (d)-[:MENTIONS]->(n) with MERGE (d)-[:MENTIONS]->(n)
  • add_graph_documents() docstring: remove duplicate-edge warning

Tests

  • test_add_graph_documents_idempotent: change assertion from cnt >= 1 to cnt == 1
  • Add similar idempotency test for LlamaIndex upsert_relations()
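One possible shape for the idempotency tests, sketched with a fake store so it runs without a database. `assert_idempotent` and `FakeStore` are hypothetical; a real test would pass callables that invoke upsert_relations() or add_graph_documents() against a live CoordiNode instance and count edges with a Cypher query.

```python
from typing import Callable


def assert_idempotent(
    ingest: Callable[[], None],
    count_edges: Callable[[], int],
    expected: int = 1,
) -> None:
    """Run the same ingest twice and require the same exact edge count."""
    ingest()
    first = count_edges()
    ingest()
    second = count_edges()
    assert first == expected, f"after first ingest: {first} edges"
    assert second == expected, f"after re-ingest: {second} edges"


class FakeStore:
    """In-memory stand-in with MERGE-like behavior (a set drops duplicates)."""

    def __init__(self) -> None:
        self.edges = set()

    def upsert_relations(self, relations) -> None:
        for relation in relations:
            self.edges.add(relation)


store = FakeStore()
relation = ("alice", "KNOWS", "bob")  # (source_id, label, target_id)
assert_idempotent(
    lambda: store.upsert_relations([relation]),
    lambda: len(store.edges),
)
```

The exact-count check mirrors the cnt == 1 assertion above: a cnt >= 1 check would still pass with duplicates present.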

Acceptance criteria

  • Re-ingesting the same document creates exactly 1 edge (not N)
  • upsert_relations() is idempotent
  • add_graph_documents() is idempotent for both nodes and edges
  • All existing tests pass

Gate

Blocked by G072 — requires CoordiNode DB fix: MERGE clause for relationship patterns in Cypher executor.
Do not start SDK implementation until a CoordiNode release with G072 fixed is available.
