Maximum number of vectors in any index

sandeep · September 29, 2023, 7:35am

What is the max number of vectors possible in an index?

If I pick s1.x8, is the max limit going to be 40M?

Jasper · September 29, 2023, 8:15am

As per the documentation:

Each s1 pod has enough capacity for around 5M vectors of 768 dimensions.
(Understanding indexes)

and

By changing the pod size, you can scale to x2, x4, and x8 pod sizes, which means you are doubling your capacity at each step.
(Scale indexes)

So capacity depends on your index dimension. But yes, for 768-dimensional vectors, 5M × 8 = 40M would be the estimate.
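To make the arithmetic explicit, here is a minimal sketch using only the two figures quoted above (about 5M vectors of 768 dimensions per s1 pod, doubling at each pod size). The function name and the idea that capacity scales inversely with dimension are assumptions for illustration, not something the documentation quotes here, so treat the output as a ballpark.

```python
# Rough capacity estimate built only from the figures quoted in this thread.
S1_BASE_VECTORS = 5_000_000   # ~5M vectors per s1.x1 pod (quoted from the docs)
S1_BASE_DIMENSION = 768       # dimension the 5M figure refers to
POD_SIZES = (1, 2, 4, 8)      # x1, x2, x4, x8


def estimate_capacity(size_multiplier: int, dimension: int = S1_BASE_DIMENSION) -> int:
    """Estimate how many vectors a single s1 pod of the given size can hold.

    ASSUMPTION: capacity scales inversely with vector dimension. The thread
    only says capacity depends on dimension, not how.
    """
    if size_multiplier not in POD_SIZES:
        raise ValueError(f"pod size must be one of {POD_SIZES}")
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    # Capacity doubles at each pod size step, so it is linear in the multiplier.
    return S1_BASE_VECTORS * size_multiplier * S1_BASE_DIMENSION // dimension


if __name__ == "__main__":
    for size in POD_SIZES:
        print(f"s1.x{size}: ~{estimate_capacity(size):,} vectors at 768 dimensions")
```

Under these assumptions an s1.x8 pod comes out at about 40M vectors of 768 dimensions, and about half that for 1536-dimensional vectors.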

Hope this helps

sandeep · September 29, 2023, 8:21am

We have a use case where the number of vectors to compare against could exceed 40M.

How do I solve for that?

Jasper · September 29, 2023, 8:30am

You can increase the total number of pods per index. So instead of 1 pod of type s1.x8, you can have 3 pods of type s1.x4 (roughly 60M vectors by the same estimate).

Documentation: Scale indexes

sandeep · September 29, 2023, 9:03am

Interesting, I didn't know that.

So what's the max number possible? Is there a limit, or no limit whatsoever?

Jasper · September 29, 2023, 9:07am

Not sure, but I would guess there is a limit besides the budget. The documentation talks about 10 pods of some type, so I think you should be fine even with more than 40M vectors.

gdj0nes · September 29, 2023, 10:12am

An index can have multiple pods. If you can fit 40 million vectors on an s1.x8, then a 4-pod s1.x8 index would fit 160 million. We recommend larger pod sizes over more pods. The tradeoff is that you can’t change the number of pods on a running index. So if you anticipate the size of the index growing over time, it may be better to choose more pods at a smaller pod size, which you can then increase later.
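The tradeoff above can be sketched as a small calculation. It assumes the roughly 5M vectors per s1.x1 pod figure quoted earlier in the thread (768 dimensions) and that capacity is simply additive across pods; the helper names are hypothetical, and none of this calls the Pinecone API.

```python
# Back-of-the-envelope planning for multi-pod s1 indexes.
BASE_VECTORS_PER_POD = 5_000_000  # ~5M vectors per s1.x1 pod (quoted from the docs)
POD_SIZES = (1, 2, 4, 8)          # x1, x2, x4, x8


def index_capacity(pods: int, size_multiplier: int) -> int:
    """Estimated vectors for an index with `pods` pods of the given size.

    ASSUMPTION: capacity adds up linearly across pods.
    """
    return pods * size_multiplier * BASE_VECTORS_PER_POD


def pods_needed(target_vectors: int, size_multiplier: int) -> int:
    """Smallest pod count of the given size that fits `target_vectors`."""
    per_pod = size_multiplier * BASE_VECTORS_PER_POD
    return -(-target_vectors // per_pod)  # integer ceiling division


def growth_ceiling(pods: int) -> int:
    """Capacity reachable by resizing to x8 with the pod count held fixed."""
    return index_capacity(pods, max(POD_SIZES))


if __name__ == "__main__":
    target = 160_000_000
    for size in POD_SIZES:
        n = pods_needed(target, size)
        print(f"{n} x s1.x{size} -> ~{index_capacity(n, size):,} vectors")
    # Starting with 4 pods at x2 holds ~40M now and leaves room to grow.
    print(f"4 pods can grow to ~{growth_ceiling(4):,} vectors at x8")
```

For example, 4 pods at s1.x2 hold about 40M vectors today, the same as a single s1.x8, but can later be resized up to about 160M without changing the pod count.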

sandeep · September 29, 2023, 10:15am

What is the maximum number of pods I can have per index?

Is there a limit on the number of pods?

gdj0nes · September 29, 2023, 10:23am

I’m unsure of the actual limit, but it’s in the hundreds of pods. If you plan to run a continuous workload in the hundreds of millions of vectors, consider upgrading and contacting support for further advice.

sandeep · September 29, 2023, 10:47am

Yes, we will contact support.

First, I just wanted to ensure the limit is not too low given our use case.

silas · October 3, 2023, 11:01pm

We put together a best-fit sizing tool that looks for the lowest-cost deployment for a use case, based on storage needs, QPS, and cost, with headroom for growth.

We’d be happy to get together with you (or anyone) and help with sizing/estimating.

You can reach out to me at silas@ninetack.io.

sandeep · October 4, 2023, 5:17am

What does the Rank column mean?

silas · October 4, 2023, 5:22am

It’s our ranking of the various configurations that satisfy the use case for the best cost, while providing sufficient headroom for storage capacity growth and query capacity growth and/or spikes.

Note, these numbers are not based on your use case – we’d need to sit down with you and get into the numbers, including expected query performance requirements.

mario · October 16, 2023, 8:29pm

Thanks for sharing the spreadsheet. This tool was very useful.