Hi,
When you insert vectors into a serverless index, it takes some time for those vectors to become available for querying. This is a known issue, and the recommended workaround is to repeatedly call describe_index_stats after the insertion until the total number of vectors in the namespace matches what you’d expect.
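For reference, a minimal sketch of that polling workaround. It assumes an index handle from the Pinecone Python client whose describe_index_stats() response exposes a namespaces mapping with a vector_count per namespace. The function name wait_for_vector_count and its timeout and interval parameters are made up for illustration.

```python
import time


def wait_for_vector_count(index, namespace, expected_count,
                          timeout_s=60.0, poll_interval_s=1.0):
    """Poll describe_index_stats() until `namespace` reports at least
    `expected_count` vectors. Returns True on success, False on timeout.

    Assumption: `index` behaves like a Pinecone Python client Index, i.e.
    describe_index_stats() returns an object with a `namespaces` dict whose
    values have a `vector_count` attribute.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        stats = index.describe_index_stats()
        summary = stats.namespaces.get(namespace)
        # A namespace that has not appeared in the stats yet counts as empty.
        current = summary.vector_count if summary is not None else 0
        if current >= expected_count:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

Note that expected_count has to be the namespace-wide total (count before the insert plus the number of vectors inserted), so the check only means something when a single job is writing to the namespace.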
This works fine until you have multiple jobs inserting vectors into the same namespace, e.g. if you are indexing different groups of documents at the same time. The delta in total vector count won’t tell you if a specific indexing job is finished or not. This wouldn’t be a problem if I could use metadata filters in describe_index_stats, since I could just filter by job ID, but filters aren’t available in serverless indexes for some reason.
What should I do here? I could repeatedly fetch every vector in the index until I get a valid response, but this seems ridiculous if you’re dealing with millions of vectors. Please let me know if there’s a better way of knowing when my vectors will actually become usable after insertion.
Thank you.