Skip to content

feat: similarity_search() for LangChain + upsert_relations() idempotency test#27

Merged
polaz merged 5 commits intomainfrom
feat/#26-similarity-search-upsert-idempotency
Apr 12, 2026
Merged

feat: similarity_search() for LangChain + upsert_relations() idempotency test#27
polaz merged 5 commits intomainfrom
feat/#26-similarity-search-upsert-idempotency

Conversation

@polaz
Copy link
Copy Markdown
Member

@polaz polaz commented Apr 12, 2026

Summary

  • Add similarity_search(query_vector, k, label, property) to CoordinodeGraph (LangChain adapter), wrapping CoordinodeClient.vector_search() with label/property defaults
  • Add test_upsert_relations_idempotent integration test for LlamaIndex adapter, verifying exactly 1 edge after double upsert_relations() call (MERGE semantics)

Technical Details

similarity_search():

  • Returns [{"id": ..., "node": ..., "distance": ...}, ...] sorted by ascending cosine distance
  • Guards against empty query_vector (server returns INVALID_ARGUMENT for empty vectors)
  • Two integration tests: seeded :LCSim node found in top-k results, empty vector returns []

Idempotency test:

  • Seeds src/dst nodes, calls upsert_relations() twice with the same Relation
  • Verifies count(r) == 1 via Cypher — MERGE semantics confirmed

Test Plan

  • 55/55 integration tests pass against live CoordiNode instance

Closes #20
Closes #21

polaz added 2 commits April 12, 2026 14:36
Verifies that calling upsert_relations() twice with the same Relation
produces exactly one edge (MERGE semantics, not CREATE).

Closes #21
Wraps CoordinodeClient.vector_search() with label/property defaults,
returning [{id, node, distance}, ...] sorted by ascending distance.
Guards against empty query_vector to match server validation behaviour.
Adds two integration tests: one seeding a :LCSim node and verifying the
seeded node appears in top-k results, one verifying empty-vector returns [].

Closes #20
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c4a09e0f-348d-437b-aaaf-2e09cb2458fe

📥 Commits

Reviewing files that changed from the base of the PR and between 951b487 and ab3559e.

📒 Files selected for processing (3)
  • langchain-coordinode/langchain_coordinode/graph.py
  • tests/integration/adapters/test_langchain.py
  • tests/integration/adapters/test_llama_index.py

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added vector similarity search to find and return semantically similar content.
  • Bug Fixes / Improvements

    • Immediately returns [] for empty query vectors; results are ordered by distance and include per-item id, node data, and distance.
  • Tests

    • Added integration tests for similarity search (including empty-vector handling) and for relation upsert idempotency.

Walkthrough

Added CoordinodeGraph.similarity_search(...) which delegates to CoordinodeClient.vector_search(...), returns [] for empty query vectors, sorts results by ascending distance, and maps results to dicts with keys id, node, and distance. Also added integration tests for similarity search and relation-upsert idempotency.

Changes

Cohort / File(s) Summary
LangChain adapter
langchain-coordinode/langchain_coordinode/graph.py
Added similarity_search(self, query_vector: Sequence[float], k: int = 5, label: str = "Chunk", property: str = "embedding") which early-returns [] for empty vectors, delegates to self._client.vector_search(...), sorts results by ascending distance, and maps each result to {"id": id, "node": properties, "distance": distance}.
LangChain integration tests
tests/integration/adapters/test_langchain.py
Added two tests: one seeds a deterministic :LCSim node with an embedding, verifies similarity_search(...) returns results containing the seeded node and ordered by non-decreasing distance, and cleans up; second verifies similarity_search([]) returns [].
LlamaIndex integration test
tests/integration/adapters/test_llama_index.py
Added test_upsert_relations_idempotent which upserts the same relation twice and asserts exactly one relationship edge exists between the source and target node.

Sequence Diagram(s)

sequenceDiagram
    participant LangChain as CoordinodeGraph
    participant SDK as CoordinodeClient
    participant DB as CoordiNodeDB
    LangChain->>SDK: vector_search(label, property, vector, top_k)
    SDK->>DB: VectorService RPC (query vector, params)
    DB-->>SDK: results (nodes + distances)
    SDK-->>LangChain: raw results
    LangChain->>LangChain: sort by distance, map to {id, node, distance}
    LangChain-->>Caller: return formatted list
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title clearly summarizes the main changes: adding similarity_search() to LangChain and an idempotency test for upsert_relations() in LlamaIndex, which directly matches the file modifications.
Description check ✅ Passed The description comprehensively documents both features added: similarity_search() implementation details with test coverage, and the idempotency test for upsert_relations(), directly relating to the changeset.
Linked Issues check ✅ Passed The pull request addresses #20 by implementing similarity_search() in CoordinodeGraph with proper result formatting and empty-vector handling, and addresses #21 by adding the idempotency test for upsert_relations(), meeting the coding requirements of both issues.
Out of Scope Changes check ✅ Passed All changes are in scope: similarity_search() implementation and tests directly address #20, the idempotency test directly addresses #21, with no unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/#26-similarity-search-upsert-idempotency

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a LangChain-facing similarity_search() API on CoordinodeGraph and expands integration test coverage to ensure relationship upserts are idempotent.

Changes:

  • Implement CoordinodeGraph.similarity_search() as a thin wrapper over CoordinodeClient.vector_search() with sensible defaults.
  • Add LangChain integration tests for similarity_search() (happy path + empty-vector guard).
  • Add LlamaIndex integration test verifying upsert_relations() MERGE/idempotency semantics.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
langchain-coordinode/langchain_coordinode/graph.py Adds similarity_search() wrapper around the SDK’s vector_search().
tests/integration/adapters/test_langchain.py Adds integration coverage for similarity_search() results + empty-vector behavior.
tests/integration/adapters/test_llama_index.py Adds integration test to ensure upsert_relations() does not create duplicate edges.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 222-228: The similarity_search method promises results ordered by
ascending distance but currently returns backend order; after calling
self._client.vector_search (the results variable) sort results client-side by
the distance attribute (r.distance) in ascending order before mapping to the
returned list of dicts (the current list comprehension that yields {"id":
r.node.id, "node": r.node.properties, "distance": r.distance}) so the method is
deterministic regardless of backend ordering.

In `@tests/integration/adapters/test_langchain.py`:
- Around line 168-172: The test
test_similarity_search_empty_vector_returns_empty currently only asserts the
type of results from graph.similarity_search([], k=5); update it to also assert
the empty-vector contract by adding an equality assertion that results == [] so
the test fails if any items are returned, keeping the existing type check if you
want but ensuring an explicit empty-list equality check against
graph.similarity_search.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 92a6a461-1e85-4adc-933f-77351e06b86b

📥 Commits

Reviewing files that changed from the base of the PR and between 1101ac8 and 7c4d4c0.

📒 Files selected for processing (3)
  • langchain-coordinode/langchain_coordinode/graph.py
  • tests/integration/adapters/test_langchain.py
  • tests/integration/adapters/test_llama_index.py

…n test

- Sort vector_search results client-side by r.distance (ascending) to
  match the public contract promised in the docstring
- Replace isinstance(results, list) with results == [] assertion to
  explicitly enforce the empty-vector guard contract
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration/adapters/test_langchain.py`:
- Around line 151-163: The test stores the returned node object in
seeded_internal_id but similarity_search returns integer internal IDs; change
the assignment so seeded_internal_id is the node's internal ID (use
seed_rows[0]["nid"].id_) after the graph.query CREATE ... RETURN n call,
ensuring the assertion compares the same integer ID type when checking
any(r["id"] == seeded_internal_id for r in results); update references to
seeded_internal_id accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b7b5379e-d975-4149-abfb-6a471f1c558d

📥 Commits

Reviewing files that changed from the base of the PR and between 7c4d4c0 and c9246ac.

📒 Files selected for processing (2)
  • langchain-coordinode/langchain_coordinode/graph.py
  • tests/integration/adapters/test_langchain.py

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…tibility

similarity_search() now uses len(query_vector) == 0 instead of truthiness
to avoid ValueError on numpy.ndarray and other Sequence types.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@langchain-coordinode/langchain_coordinode/graph.py`:
- Around line 199-200: The default parameter k in the similarity search API is
set to 10 but the linked issue and intended behavior expect k=5; update the
function signature in graph.py (the parameter list containing query_vector:
list[float], k: int = 10) to use k: int = 5 and ensure any related docstrings,
tests, or examples that assume similarity_search(..., k=5) are updated to
reflect this default so the API and issue objective remain consistent.

In `@tests/integration/adapters/test_langchain.py`:
- Around line 163-167: The test currently only checks non-negativity for
results[0]["distance"] but must assert the ascending sort contract; update the
test that uses the results variable (and seeded_internal_id) to explicitly
assert ordering by checking adjacent distances are non-decreasing (e.g., assert
results[i]["distance"] <= results[i+1]["distance"] for the first couple of pairs
or across the whole list) so the search result list is verified to be ordered
ascending by distance.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e93e83a7-e449-4781-ba78-cc09499d83f9

📥 Commits

Reviewing files that changed from the base of the PR and between c9246ac and 951b487.

📒 Files selected for processing (2)
  • langchain-coordinode/langchain_coordinode/graph.py
  • tests/integration/adapters/test_langchain.py

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…tocol and issue spec

- Change query_vector type annotation from list[float] to Sequence[float]
  (matches existing code comment that explicitly documents numpy.ndarray support
  and aligns with CoordinodeClient.vector_search() signature)
- Change default k from 10 to 5 (matches issue #20 acceptance criteria)
- Strengthen test_similarity_search_returns_results: replace non-negativity
  check with full ascending-order assertion on adjacent distances
- Use store.structured_query(param_map=...) in test_upsert_relations_idempotent
  instead of store._client.cypher() — keeps integration test stable vs internal API
@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@polaz polaz merged commit f0ad603 into main Apr 12, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants