The Pinecone Serverless docs say “In a multitenant solution, you need to ensure that the queries of one tenant do not affect the experience of other tenants/customers. To achieve this in Pinecone, target each tenant’s queries at the namespace for the tenancy”, but the limit on query read units per second is set per index, at 2000. I believe this means that if we are submitting 2000 queries per second on behalf of one tenant, any queries for other tenants will be throttled. (Source: Pinecone Database limits, Pinecone Docs.)
Can queries be rate limited per namespace?
For example, record updates are already rate limited on a per namespace basis: “Update records per second per namespace=100”
If rate limiting per namespace does not make sense, can you please explain why this rate limit is imposed at the index level?
Yes, your assessment is correct. If you segment users by namespace and usage is asymmetric across users, such that one user’s QPS takes up most of the index-level rate limit, the experience of the remaining users will suffer. There is currently no way to set limits on individual namespaces.
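Since the limit is shared by the whole index, one possible stopgap is to cap each tenant in your own application before the query reaches Pinecone. The following is only a minimal sketch of a client-side token bucket keyed by tenant. It is not a Pinecone feature, and the class name, the rate and burst values, and the idea of charging one token per query are all assumptions made for illustration (the real limit is counted in read units, not queries).

```python
import time


class TenantRateLimiter:
    """Client-side token bucket per tenant. Hypothetical helper, not a Pinecone API."""

    def __init__(self, rate_per_sec: float, burst: float, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens added per second, per tenant
        self.burst = burst         # maximum tokens a tenant can accumulate
        self.clock = clock         # injectable clock, which makes testing easier
        self._buckets = {}         # tenant_id -> (tokens, last_refill_time)

    def allow(self, tenant_id: str, cost: float = 1.0) -> bool:
        """Return True if this tenant may send a query now, else False."""
        now = self.clock()
        # A tenant seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(tenant_id, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= cost:
            self._buckets[tenant_id] = (tokens - cost, now)
            return True
        self._buckets[tenant_id] = (tokens, now)
        return False
```

A caller would check allow(tenant_id) before each query and queue or reject the request when it returns False. This sketch keeps its state in one process, so an application running several instances would need a shared store to enforce the same cap.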
I consulted our engineering team, and they suggested either isolating users with heavy workloads in separate indexes or splitting your users across several indexes to avoid this issue. You can create several indexes within a project, which may help too!
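As a rough illustration of that suggestion, the sketch below routes each tenant to an index: known heavy tenants get a dedicated index, and everyone else is spread across a small pool of shared indexes by a stable hash. The index names, the tenant ID, and both helper functions are hypothetical and only show the shape of the idea.

```python
import hashlib

# Hypothetical pool of shared indexes; the names are not from this thread.
SHARED_INDEXES = ["tenants-shard-0", "tenants-shard-1", "tenants-shard-2"]

# Hypothetical mapping of known heavy tenants to dedicated indexes.
DEDICATED_INDEXES = {"acme-corp": "tenant-acme-corp"}


def index_for_tenant(tenant_id: str) -> str:
    """Return the name of the index that should serve this tenant."""
    if tenant_id in DEDICATED_INDEXES:
        return DEDICATED_INDEXES[tenant_id]
    # Use a stable hash (Python's built-in hash() is salted per process),
    # so the same tenant maps to the same index after a restart.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % len(SHARED_INDEXES)
    return SHARED_INDEXES[shard]


def route(tenant_id: str) -> tuple[str, str]:
    """Return (index_name, namespace); each tenant keeps its own namespace."""
    return index_for_tenant(tenant_id), tenant_id
```

With the Pinecone Python client, the returned pair would typically be used along the lines of pc.Index(index_name).query(..., namespace=namespace). Note that changing the number of shared indexes remaps most tenants under this modulo scheme, so their records would have to be moved; a lookup table stored alongside your tenant metadata avoids that at the cost of extra bookkeeping.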
If you are experiencing this issue near/in prod, I’m happy to connect you with our solutions engineering team to address your situation specifically.
Thanks, Arjun. Yes, this helps. If I cannot know in advance which tenants will have the heavier workloads, or when, then I may need to resort to creating one index per tenant, which may put me above the indexes-per-project quota. As long as increasing that quota is no problem, I should be good!
Heads up: you should be able to create up to 5 indexes in 1 project on the Starter tier, and up to 20 projects per org with 20 indexes each on the Standard tier.
I just corresponded with our solutions eng/support team, and they agree that you should try to start with namespaces/index splits first. The majority of our users experience workloads that comfortably fit in these.
If you do reach that point and need a quota increase, our solutions engineers and account managers can help with that. Please follow up here if that is the case!