Approximate amount of text info in one vector

So, as stated in the docs here, an index on the starter plan can store 100,000 vectors with 1536-dimensional embeddings and metadata. When you upgrade to the next plan level, an s1 pod has capacity for around 5M vectors of 768 dimensions.
So my question is: given that I am using ‘text-embedding-ada-002’ to generate embeddings for the text, how do I calculate the maximum size of text that I can store in the index as metadata, along with the generated embeddings for it? I need to understand the following: if I have, let’s say, 10 MB of text, generate embeddings for all of it, and want to store it as vectors with 1536 dimensions in this index, will one index fit all of that? What would be a scheme to calculate that?
Thanks)

Hi @alanor87

As per the documentation, Pinecone supports 40 KB of metadata per vector.

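If your only metadata will be text, then you can roughly estimate how many vectors you would need and then see if 2.5 million is enough (1536-dimensional vectors are twice the size of 768-dimensional ones, so the ~5M figure for an s1 pod is roughly halved) :slight_smile: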
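
To make that estimate concrete, here is a rough Python sketch of the arithmetic. It is only a back-of-the-envelope calculation under some assumptions of mine: the text is split into fixed-size chunks, each chunk is stored whole as the metadata of one vector, 1 KB is taken as 1024 bytes, and the overhead of metadata keys or extra fields is ignored. The 2,000-byte chunk size and the function name `estimate_vectors` are hypothetical; the 40 KB limit and the capacity figures are the ones quoted in this thread.

```python
import math

# Figures quoted in this thread (assuming 1 KB = 1024 bytes).
METADATA_LIMIT_BYTES = 40 * 1024      # 40 KB of metadata per vector
STARTER_CAPACITY_VECTORS = 100_000    # starter plan, 1536 dimensions
S1_CAPACITY_VECTORS = 2_500_000       # rough s1 estimate at 1536 dimensions


def estimate_vectors(total_text_bytes: int, chunk_bytes: int) -> int:
    """Return how many vectors are needed if each chunk becomes one vector.

    Hypothetical helper: assumes the chunk text is the only metadata stored.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    if chunk_bytes > METADATA_LIMIT_BYTES:
        raise ValueError("a chunk would exceed the per-vector metadata limit")
    return math.ceil(total_text_bytes / chunk_bytes)


total_text_bytes = 10 * 1024 * 1024   # the 10 MB of text from the question
chunk_bytes = 2_000                   # hypothetical chunk size

needed = estimate_vectors(total_text_bytes, chunk_bytes)
print(f"vectors needed: {needed}")
print(f"fits starter plan: {needed <= STARTER_CAPACITY_VECTORS}")
print(f"fits s1 pod: {needed <= S1_CAPACITY_VECTORS}")
```

With these numbers, 10 MB of text comes out to a few thousand vectors, so the vector count is unlikely to be the limiting factor; the choice of chunk size matters more, since it has to stay under the per-vector metadata limit.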

Hope this helps
