Metadata Limit Error

Hi,

I am new to Pinecone and am using it to store sentence embeddings, along with some metadata (including the original text), for a similarity search use case.

I am trying to upsert my embeddings and their metadata in batches of 100, but I am running into the following error:

...

HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Fri, 03 Jun 2022 17:48:20 GMT', 'x-envoy-upstream-service-time': '4', 'content-length': '113', 'server': 'envoy'})
HTTP response body: {"code":3,"message":"metadata size is 7271 bytes, which exceeds the limit of 5120 bytes per vector","details":[]}

It looks like there’s a 5MB limit on metadata, so I’m not sure why I’m hitting the limit here. I assumed the 5MB limit applied to each vector. Is it instead shared across all vectors?


We are increasing the metadata limit from 5 KB to 10 KB as of today’s release, which should resolve the error you are seeing. To answer the second part of your question: the limit applies to each vector individually, so every vector can carry up to 10 KB of metadata.
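
If it helps, here is a rough sketch (not an official Pinecone utility) of how you could check a batch for oversized metadata before upserting. It assumes the size is close to the UTF-8 byte length of the metadata serialized as compact JSON; the server may count bytes differently, so treat the result as an estimate. The names `METADATA_LIMIT_BYTES`, `metadata_size_bytes`, and `oversized_ids` are made up for this example.

```python
import json

# Assumed limit: the 10 KB per-vector figure from this reply; adjust if yours differs.
METADATA_LIMIT_BYTES = 10 * 1024


def metadata_size_bytes(metadata: dict) -> int:
    """Estimate metadata size as the UTF-8 byte length of compact JSON.

    This is an approximation; the service may measure size differently.
    """
    encoded = json.dumps(metadata, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def oversized_ids(vectors, limit=METADATA_LIMIT_BYTES):
    """Return ids of (id, values, metadata) tuples whose metadata exceeds the limit."""
    return [
        vec_id
        for vec_id, _values, metadata in vectors
        if metadata_size_bytes(metadata) > limit
    ]


# Example batch: the second vector stores a very long original text.
batch = [
    ("a", [0.1, 0.2], {"text": "short sentence"}),
    ("b", [0.3, 0.4], {"text": "x" * 12000}),
]
print(oversized_ids(batch))
```

Any ids this reports are candidates for truncating the stored text or keeping the full text in a separate store keyed by vector id.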

We are also introducing selective metadata indexing, which lets you choose which metadata fields are indexed and which are only stored. Indexed fields are filterable and can be used at query time to limit the results. Stored fields are retrievable but not filterable, so they can be returned with the results for your application but cannot be used to limit them. The purpose of this feature is to minimize the overall footprint of your index while keeping the metadata useful.

See: https://www.pinecone.io/docs/manage-indexes/#selective-metadata-indexing
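
As a minimal sketch of what that could look like for your use case, assuming the `metadata_config` parameter of `create_index` in the Pinecone Python client as described in the docs linked above: the index name, dimension, and field names (`category`, `text`) are placeholders, so please check the linked page for the exact options in your client version.

```python
# Hypothetical field names: "category" is used in filters, while "text"
# (the original sentence) is only returned with results, so it is left unindexed.
metadata_config = {"indexed": ["category"]}


def create_selective_index(api_key, environment, name="sentence-search", dimension=768):
    """Create an index that only indexes the fields listed in metadata_config.

    The name and dimension defaults are placeholders for illustration.
    """
    # Imported lazily so the config above can be inspected without the client installed.
    import pinecone

    pinecone.init(api_key=api_key, environment=environment)
    pinecone.create_index(name, dimension=dimension, metadata_config=metadata_config)
```

With a setup like this, `text` would still come back in query results but could not be used in a filter.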

Pinecone
