We’re relatively new to using Pinecone and are looking to optimize our queries by filtering results based on score, akin to a min_score filter. Our aim is to retrieve only relevant results without resorting to post-filtering. Additionally, we’re interested in extracting statistics for specific topics, prioritizing count over individual results.
Could someone advise on whether Pinecone offers a performant way to achieve these objectives? Any insights or guidance would be greatly appreciated. Thank you!
We do not currently offer a minimum score filter. Pinecone queries will always return the top_k most similar records in the query result.
If you want to exclude records that fall below a certain score threshold, you will need a post-processing step. Fortunately, the top_k query results are ordered from most similar to least similar, so you can start checking scores at the last result and stop as soon as you reach a record with an acceptable score, since every record ranked above it also meets the threshold. This should help speed up the extra step.
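As a rough illustration of that post-processing step, here is a minimal sketch in Python. It assumes the matches have already been pulled out of the query response into a list of dicts with `id` and `score` keys, ordered from most to least similar, and that a higher score means more similar (as with cosine or dot product; for a distance-style metric the comparison would need to be flipped). The helper name `filter_by_min_score` and the sample data are hypothetical and not part of the Pinecone client.

```python
from typing import Any, Dict, List


def filter_by_min_score(matches: List[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Return only the matches whose score is at least min_score.

    Assumes `matches` is sorted from highest to lowest score, so the scan
    starts at the end and stops at the first record that meets the threshold.
    """
    cutoff = len(matches)
    # Walk backwards past every trailing record that is below the threshold.
    while cutoff > 0 and matches[cutoff - 1]["score"] < min_score:
        cutoff -= 1
    # Everything before the cutoff is ranked higher, so it also qualifies.
    return matches[:cutoff]


# Hypothetical matches, shaped like the id/score pairs in a query result.
sample_matches = [
    {"id": "doc-1", "score": 0.92},
    {"id": "doc-2", "score": 0.81},
    {"id": "doc-3", "score": 0.40},
    {"id": "doc-4", "score": 0.12},
]

relevant = filter_by_min_score(sample_matches, min_score=0.5)
print([m["id"] for m in relevant], len(relevant))
```

The length of the filtered list gives a count of qualifying records, but only among the top_k results that were returned, so it is not a count over the whole index.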
I encourage you to share your interest in this feature in the Feature Requests section of the forum. Our product team will see your request, and other users can provide votes to express their interest.