Add FlashRank reranker to HybridRetriever to improve retrieval quality #116
Open
GovindhKishore wants to merge 1 commit into reactome:main from
Conversation
Author
Hi @adamjohnwright @GFJHogue, just flagging this PR for your attention when you get a chance. This directly addresses the retrieval noise issue mentioned across several issues, and since it touches
Happy to:
Looking forward to any feedback!
Contributor
@heliamoh are you able to take a look to see if this resolves the issue(s)?
This was referenced Mar 4, 2026
This was referenced Mar 9, 2026
Summary
Adds a reranking layer to HybridRetriever in csv_chroma.py to address the issue of responses becoming increasingly long and noisy as more data sources are integrated into the retrieval pipeline.
Problem
The current pipeline retrieves documents from multiple subdirectories using BM25 + SelfQuery + MultiQuery expansion, resulting in ~90 documents being passed directly to create_stuff_documents_chain. There is no cross-subdirectory relevance filtering: all retrieved documents are stuffed into the LLM prompt regardless of how relevant they are to the original user query. This causes:
Solution
A new module, src/retrievers/reranker.py, is introduced using FlashRank (ms-marco-MiniLM-L-12-v2). After weighted_reciprocal_rank merges results across all subdirectories, the reranker scores every retrieved document against the original user query using a cross-encoder model and returns only the top N most relevant documents.
Two functions are provided:
rerank() - sync, called by retrieve_documents()
arerank() - async, called by aretrieve_documents()
arerank() uses asyncio.to_thread to run the blocking FlashRank inference in a background thread without freezing the async event loop.
Changes
src/retrievers/reranker.py - new module containing reranking logic
src/retrievers/csv_chroma.py - import reranker, update return statements in both retrieve_documents() and aretrieve_documents()
config_default.yml - add reranker configuration block
pyproject.toml / poetry.lock - add flashrank dependency
Why FlashRank
list[Document] type returned throughout
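For reviewers, the sync/async pair described under Solution can be sketched roughly as below. This is an illustration only, not the code in src/retrievers/reranker.py: the Document dataclass is a stand-in for the LangChain type, overlap_score is a placeholder for the FlashRank cross-encoder, and the top_n and score_fn parameters are hypothetical names chosen for the example.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Stand-in for LangChain's Document so the sketch has no dependencies."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def overlap_score(query: str, text: str) -> float:
    """Placeholder scorer (assumption): fraction of query tokens present in text.

    In the PR this role is played by the FlashRank cross-encoder model.
    """
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(
    query: str,
    docs: list[Document],
    top_n: int = 10,
    score_fn: Callable[[str, str], float] = overlap_score,
) -> list[Document]:
    """Score every document against the original query and keep the top_n best.

    sorted() is stable, so documents with equal scores keep their merged order.
    """
    ranked = sorted(
        docs, key=lambda d: score_fn(query, d.page_content), reverse=True
    )
    return ranked[:top_n]


async def arerank(
    query: str,
    docs: list[Document],
    top_n: int = 10,
    score_fn: Callable[[str, str], float] = overlap_score,
) -> list[Document]:
    """Run the blocking rerank() in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(rerank, query, docs, top_n, score_fn)
```

Both entry points take and return list[Document], so the sync and async retrieval paths can share one scoring routine, and the async path only adds the thread hop via asyncio.to_thread (available from Python 3.9).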
Impact
Since csv_chroma.py is shared by both Reactome and UniProt retrievers, reranking applies automatically to all current and future database integrations without any additional changes.
Test
Note
This contribution was developed with AI assistance (Claude) for understanding the codebase and implementation guidance. All code has been reviewed and understood.
Closes #115
Happy to make any changes based on maintainer feedback.