BM25 case-sensitive retrieval causing failed retrievals #92

@GFJHogue

Likely root cause of #91.

The only attempt to fix it so far seems to be this modification to the rephrase prompt: 5f17d2d#diff-b0a23d0d0f62aac4e3392fd57d495b2e5311b90f0b58435efb296c210c3c5e5fR8

Current location on main:

- Optimized for both vector search (semantic meaning) and case-sensitive keyword search

This needs an actual text-processing fix: case-normalize the BM25 retrieval.
BM25Retriever supports a custom preprocessing function in place of its default one, which could be used to apply the normalization.
This probably entails case-normalizing the CSVs as well.
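
A minimal sketch of the kind of preprocessing function this could use. It is not tied to any library: `default_like_preprocess` only assumes the default behaviour is plain whitespace splitting, and `casefold_preprocess` and `shared_tokens` are hypothetical names for illustration. The point is that the same normalization has to be applied to both the query and the indexed text for keyword matches to line up.

```python
from typing import Callable, List, Set


def default_like_preprocess(text: str) -> List[str]:
    """Assumed stand-in for a default preprocessor: whitespace split, case preserved."""
    return text.split()


def casefold_preprocess(text: str) -> List[str]:
    """Case-normalize first, then split on whitespace."""
    return text.casefold().split()


def shared_tokens(query: str, document: str,
                  preprocess: Callable[[str], List[str]]) -> Set[str]:
    """Tokens that a keyword matcher could match between query and document."""
    return set(preprocess(query)) & set(preprocess(document))


# Hypothetical query and CSV cell that differ only in letter case.
query = "acme widget"
row = "ACME Widget 3000"

# Without normalization there is no token overlap, so BM25 has nothing to score.
print(shared_tokens(query, row, default_like_preprocess))  # set()

# With case folding on both sides, the keyword terms match.
print(sorted(shared_tokens(query, row, casefold_preprocess)))  # ['acme', 'widget']
```

If the retriever applies its preprocessing function to the documents at index time as well as to the query, passing a function like `casefold_preprocess` may be enough on its own. Otherwise the CSV contents would need the same normalization before indexing.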

Labels: bug (Something isn't working)