Metadata Filter Not Excluding Specific Record in Query Results

I’m encountering an issue when querying my Pinecone index with a metadata filter intended to exclude a specific record. Despite applying the $ne filter on the opportunity_id field in the metadata, the record with the excluded opportunity_id still appears in the query results.

Query Setup:

query_response = index.query(
    id=semantic_vector_record_id,
    top_k=10,
    include_metadata=True,
    filter={"opportunity_id": {"$ne": semantic_vector_record_id}}
)

Metadata Structure Example:
Here’s an example of the metadata for one of the records in the index:

{
    "opportunity_id": "bBvlrCoafmqbicUnTVSc",
    "original_text": "We are seeking to publish Match-3 puzzle games targeted for the mobile platform across Europe. We are looking for developers who are interested in partnering with a publisher to expand their game's reach and performance in the European market."
}

Issue:
The record with opportunity_id: "bBvlrCoafmqbicUnTVSc" is still being included in the query results even though the filter explicitly excludes it.

Expected Behavior:
The query results should exclude any record with opportunity_id matching the value specified in the $ne filter.

Actual Behavior:
The record with the excluded opportunity_id is still returned in the query results.

Based on the documentation, the $ne operator is supported for metadata filtering in Pinecone and should match vectors with metadata values that are not equal to a specified value. The operator works with number, string, and boolean data types.

Here’s the proper syntax for using the $ne operator according to the documentation:

index.query(
    namespace="example-namespace",
    vector=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    filter={"genre": {"$ne": "documentary"}},
    top_k=1,
    include_metadata=True
)

A few things to check:

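  • Verify that the metadata field name and value exactly match what’s stored in your index. In the query above, the filter compares opportunity_id against semantic_vector_record_id, so the record is excluded only if its record ID is the same string as its stored opportunity_id
  • Ensure the stored metadata value has the same type as the value in the filter (for example, a string if you filter with a string)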
  • For serverless indexes specifically, be aware that highly selective metadata filters (filters that reject the majority of records) may affect the accuracy of results
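
The first two checks can be sketched as a small local comparison. This is only an illustration, not part of the Pinecone client: diagnose_ne_filter is a hypothetical helper, and it assumes you have already retrieved the record's stored metadata as a plain dict (for instance via the client's fetch call). It does not reproduce how Pinecone evaluates filters; it only reports whether the field, type, and value line up with what the filter uses.

```python
def diagnose_ne_filter(metadata, field, excluded_value):
    """Explain why {field: {"$ne": excluded_value}} may not exclude a record.

    `metadata` is the record's stored metadata as a plain dict.
    Returns a short human-readable finding.
    """
    if field not in metadata:
        return f"field {field!r} is missing from the stored metadata"
    stored = metadata[field]
    # Strict type comparison: "123" (string) and 123 (number) are different values.
    if type(stored) is not type(excluded_value):
        return (f"type mismatch: stored {type(stored).__name__}, "
                f"filter uses {type(excluded_value).__name__}")
    if stored != excluded_value:
        return f"values differ: stored {stored!r}, filter uses {excluded_value!r}"
    return "field, type, and value all match; the filter should exclude this record"


# Metadata copied from the example record in the question (original_text omitted).
stored_metadata = {"opportunity_id": "bBvlrCoafmqbicUnTVSc"}

# Hypothetical case: the value passed to $ne is a record ID that differs
# from the stored opportunity_id.
print(diagnose_ne_filter(stored_metadata, "opportunity_id", "some-other-record-id"))

# Case where the filter value is the stored opportunity_id itself.
print(diagnose_ne_filter(stored_metadata, "opportunity_id", "bBvlrCoafmqbicUnTVSc"))
```

If the first case describes your data, passing the stored opportunity_id value to $ne, rather than the record ID, is the change to try.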

If you continue experiencing issues, you may want to:

  1. Double-check that the metadata structure in your index is correct
  2. Verify the metadata size is within the 40KB per vector limit
  3. Consider whether your use case might benefit from using namespaces instead of metadata filtering for data segmentation
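
If the filter still lets the record through, one possible stopgap is to drop it on the client side after the query returns. The sketch below is a workaround under stated assumptions, not a fix for the filter: exclude_record is a hypothetical helper, the matches are assumed to be plain dicts with "id", "score", and "metadata" keys (a simplification of the client's response objects), and the sample data is made up for illustration. Requesting one extra result (top_k + 1) leaves room to drop the unwanted record and still return top_k matches.

```python
def exclude_record(matches, field, excluded_value, top_k):
    """Drop matches whose metadata[field] equals excluded_value; keep at most top_k."""
    kept = [
        m for m in matches
        # Treat missing or None metadata as an empty dict so .get() is safe.
        if (m.get("metadata") or {}).get(field) != excluded_value
    ]
    return kept[:top_k]


# Hypothetical query results, ordered by score as a query would return them.
sample_matches = [
    {"id": "rec-1", "score": 0.99, "metadata": {"opportunity_id": "bBvlrCoafmqbicUnTVSc"}},
    {"id": "rec-2", "score": 0.87, "metadata": {"opportunity_id": "example-other-id"}},
    {"id": "rec-3", "score": 0.80, "metadata": None},
]

filtered = exclude_record(sample_matches, "opportunity_id", "bBvlrCoafmqbicUnTVSc", top_k=2)
print([m["id"] for m in filtered])
```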