I was wondering if there is documentation on how to run asynchronous queries (like the example for parallel upserts). My index is relatively small (20k vectors), but I need to query millions of ‘unseen’ vectors against the index to find similar vectors.
First, welcome to the Pinecone forums!
Queries are run in the order they’re received, and are non-blocking read operations. So you shouldn’t have to include any asynchronous logic when running them.
Can you share more details about the types of queries you’re running? If you’re running hundreds at a time, note that we don’t have a mechanism for batched reads like we do for upserts. There’s a deprecated method in the docs for querying multiple vectors simultaneously, but we don’t recommend using it.
You might be able to average your vectors in groups, run a single read on the resulting vector, then only query the constituent vectors if that one meets a certain threshold. This may take some experimenting to get right: if your outliers are too close to your corpus of expected values, they may get drowned out when averaged with too many expected values.
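To make the averaging idea concrete, here is a rough sketch in plain Python. It is not Pinecone client code: `top_score` is a hypothetical callable you would write yourself to wrap a single index query and return the best match score, and `local_top_score`, `group_threshold`, and `match_threshold` are names made up for this illustration. The local cosine-similarity stand-in is only there so the sketch runs without an index.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Element-wise average of a non-empty group of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def screen_by_group(
    vectors: Sequence[Vector],
    top_score: Callable[[Vector], float],  # hypothetical wrapper around one query
    group_size: int,
    group_threshold: float,
    match_threshold: float,
) -> List[int]:
    """Return positions of vectors whose own best score meets match_threshold.

    Each group is first screened with one query on its averaged vector; the
    members are queried individually only if that averaged query meets
    group_threshold.
    """
    hits: List[int] = []
    for start in range(0, len(vectors), group_size):
        group = vectors[start:start + group_size]
        if top_score(mean_vector(group)) < group_threshold:
            continue  # skip the per-vector queries for this group
        for offset, vec in enumerate(group):
            if top_score(vec) >= match_threshold:
                hits.append(start + offset)
    return hits


def local_top_score(corpus: Sequence[Vector]) -> Callable[[Vector], float]:
    """Stand-in for a real index: best cosine similarity against a small list."""
    def cosine(a: Vector, b: Vector) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    return lambda vec: max(cosine(vec, c) for c in corpus)


if __name__ == "__main__":
    scorer = local_top_score([[1.0, 0.0], [0.0, 1.0]])
    unseen = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]]
    print(screen_by_group(unseen, scorer, group_size=2,
                          group_threshold=0.9, match_threshold=0.9))
```

Keeping `group_threshold` looser than `match_threshold` is one way to reduce the risk of a good vector being drowned out by its group, at the cost of more follow-up queries. The right group size and thresholds depend on your data.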