We are experiencing very high latency, up to minutes, when querying our index.
We are using:
- serverless index.
- dimensions 768.
- about 42 000 000 vectors with metadata.
We’re using the Python SDK (gRPC and REST gave the same results).
It often occurs when no queries have been performed for some time. Our guess is that the delay is caused by the time needed to load the vectors, as explained here. However, a delay of more than a few seconds makes it completely unfit for production.
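
For reference, here is a minimal sketch of how the gap between the first query after an idle period and the queries that follow can be measured. `query_fn` is a hypothetical zero-argument wrapper around whatever query call is in use, and the helper names `time_calls` and `summarize` are illustrative only. The code does not call any SDK itself.

```python
import statistics
import time
from typing import Callable, List, Optional


def time_calls(query_fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Call query_fn `repeats` times back to back; return each latency in seconds."""
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        query_fn()  # hypothetical wrapper around the real index query
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> dict:
    """Separate the first (possibly cold) call from the later (warm) calls."""
    warm = latencies[1:]
    warm_median: Optional[float] = statistics.median(warm) if warm else None
    return {"first_call_s": latencies[0], "warm_median_s": warm_median}


if __name__ == "__main__":
    # Stand-in for the real query; replace the body with the actual call.
    def fake_query() -> None:
        time.sleep(0.01)

    print(summarize(time_calls(fake_query, repeats=3)))
```

Running this once right after an idle period and once shortly afterwards should show whether only the first call is slow, which would be consistent with the loading hypothesis above.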