Hello, Pinecone community,
I’m writing to ask whether it’s possible to do a vector similarity search with a score threshold instead of just a top K, similar to what Weaviate offers (Vector similarity search | Weaviate - Vector Database). For my use case, I need to return all of the relevant documents to ingest into my LLM pipeline, whether that’s 100 or 10,000.
Hi @johnkongtcheu, and thank you for your question!
While this is not currently supported natively, you can achieve the same result with a post-retrieval filtering step, like so:
# Perform the search; top_k caps how many matches come back,
# so set it high enough to cover every document you might keep
response = index.query(queries=[query_embedding], top_k=10, include_values=True)
# Define the score cutoff
score_cutoff = 0.8
# Keep only matches at or above the cutoff
filtered_results = [result for result in response['results'][0]['matches'] if result['score'] >= score_cutoff]
# Print the filtered results
for result in filtered_results:
    print(f"ID: {result['id']}, Score: {result['score']}")
I hope that helps!
Best,
Zack
Hi @ZacharyProser
Is it possible to set a threshold in the function index.query()?
@loganju2000 we don’t have this functionality today. What is your use case for a threshold query?
Hi, thank you @gdj0nes and @ZacharyProser for the help on this question. I’m aware of the post-processing that can be done with Pinecone. However, in my use case I may need to retrieve 100K+ documents from Pinecone at a time for LLM ingestion, and performing the cutoff as a post-processing step incurs a latency penalty from the network transfer, which limits the usefulness of Pinecone’s fast queries.
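Concretely, the workaround today looks something like this (a rough sketch; the top_k and cutoff values are placeholders), and every one of those matches still has to travel over the network before the filter runs:
# Over-fetch with a large top_k, keep the payload minimal, then apply the cutoff client-side
# (top_k and the 0.8 cutoff below are placeholder values)
response = index.query(queries=[query_embedding], top_k=10000, include_values=False, include_metadata=False)
matches = response['results'][0]['matches']
relevant = [m for m in matches if m['score'] >= 0.8]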
@gdj0nes I was trying to integrate RAG into an LLM pipeline, where I need to retrieve the most relevant data from the index.