@stanislav.novokhatko and @jocelyn, I’m encountering the same issue, with a slight difference in the input parameters: my query includes both a vector and a metadata filter.
query_dense = [......]  # (openai/text-embedding-small)
topK = 5000
results = index.query(
    top_k=topK,
    include_metadata=True,
    vector=query_dense,
    namespace=namespace,
    filter={
        "category_child_id": {
            "$in": [
                102513,
                102744,
                # ..........
            ]
        }
    }
)
Issue:
- I have 334 IDs of type Number, all of which exist in the index. However, the query returned only 304 results, omitting 30 IDs that are present in the index.
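For anyone who wants to reproduce the count, here is a minimal sketch of how the omitted IDs can be listed. It assumes each match is available as a plain dict with a `metadata` key (for example after converting the response to a dict); the helper name `missing_filter_ids` is hypothetical and not part of the Pinecone client.

```python
def missing_filter_ids(matches, expected_ids, field="category_child_id"):
    """Return the expected IDs that no returned match carries in its metadata.

    Assumption: `matches` is an iterable of plain dicts shaped like
    {"id": ..., "score": ..., "metadata": {...}}.
    """
    returned = set()
    for match in matches:
        metadata = match.get("metadata") or {}
        if field in metadata:
            # Numeric metadata may come back as floats; 102513.0 == 102513
            # in Python, so set membership still lines up with int IDs.
            returned.add(metadata[field])
    return sorted(set(expected_ids) - returned)
```

Calling it with the matches from the query above and the full list passed to `$in` gives the IDs that were filtered for but never came back.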
Questions:
- Does Pinecone intentionally prune results?
- I ran a simple experiment: I created a namespace with 50 unique records (988 records in the namespace in total). When I queried it with a vector and a metadata filter on all 50 IDs (same code as above, same topK), only 21 results came back.
- Does Pinecone apply any internal threshold or weighting to the result set based on the input vector?
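One experiment that might help narrow this down is to split the `$in` list into smaller chunks, query each chunk separately, and merge the results. If the merged set contains the records that the single large query dropped, the omission depends on how the query is issued rather than on the data. The sketch below is only an illustration under assumptions: `query_in_chunks`, `chunk_size`, and the default `top_k` are hypothetical choices, and it assumes the response and its matches support dict-style access (`response["matches"]`, `match["id"]`, `match["score"]`).

```python
def query_in_chunks(index, vector, ids, namespace, chunk_size=10, top_k=1000,
                    field="category_child_id"):
    """Query once per chunk of filter IDs and merge matches by record ID.

    Assumption: `index.query` accepts the same keyword arguments as in the
    snippet above and returns an object with a "matches" list.
    """
    merged = {}
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        response = index.query(
            vector=vector,
            top_k=top_k,
            namespace=namespace,
            include_metadata=True,
            filter={field: {"$in": chunk}},
        )
        for match in response["matches"]:
            # Keyed by record ID so a record is kept only once.
            merged[match["id"]] = match
    # Highest score first, mirroring the order of a single query.
    return sorted(merged.values(), key=lambda m: m["score"], reverse=True)
```

Comparing the IDs in the merged result against the output of the single query shows whether the chunked approach recovers the missing records.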