These may seem like newbie questions. They are basic, but nevertheless important when planning the UI and the interaction with Pinecone.
- What actually is an Index? Is it a separate DB, a separate part of a DB, or some kind of artificial boundary around the data?
- If a user is a company with 10 employees, do all of them need to use the same Index? Or, simply put, is an Index something that organizes a specific corpus of data on a Subject X?
- Why would a scenario like the one above, or any other scenario, need multiple Indexes?
I highly recommend going through the documentation: Understanding indexes
As per the documentation: “An index is the highest-level organizational unit of vector data in Pinecone. It accepts and stores vectors, serves queries over the vectors it contains, and does other vector operations over its contents. Each index runs on at least one pod.”
No, they do not need to share the same index. But if you want them to access the same data, sharing one index is recommended, since each index (or, more accurately, each pod) is charged separately. Multitenancy can sometimes be kind of a dilemma, see: Understanding multitenancy. How you use an index and how you organize the data inside it is up to you, but I recommend not storing the same subject in multiple indexes, as searching several indexes for the same subject every time your user runs a search would be a waste (imho).
I personally need multiple indexes purely for organizational reasons, to keep some data separated (I could just use namespaces instead, but I have enough data that I would need another pod anyway). Some clients may not want their data to be “shared” in the same index as other clients' data. You might also end up with too much data for one index and have to create a new index to store it.
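To make the index-versus-namespace choice a bit more concrete, here is a minimal sketch of how an application might decide where a company's vectors live. It is plain Python, not the Pinecone client: `route`, `Target`, `SHARED_INDEX`, the company names and the `"<company>-index"` naming scheme are all hypothetical, made up for illustration. The assumption is that most companies share one index and are separated by namespace, while clients who do not want to share get an index of their own.

```python
from dataclasses import dataclass

# Hypothetical name for the one index most companies share.
SHARED_INDEX = "shared-index"


@dataclass(frozen=True)
class Target:
    """Where a company's vectors are stored (illustrative only)."""
    index_name: str
    namespace: str


def route(company: str, isolated_companies: set[str]) -> Target:
    """Pick the index and namespace for a company.

    Every employee of the same company resolves to the same target,
    so they all search the same corpus.
    """
    if company in isolated_companies:
        # This client does not want to share an index: dedicated index,
        # and no per-company namespace is needed inside it (assumed empty).
        return Target(index_name=f"{company}-index", namespace="")
    # Default: one shared index, partitioned by a per-company namespace.
    return Target(index_name=SHARED_INDEX, namespace=company)


isolated = {"acme"}
print(route("globex", isolated))  # shared index, namespace "globex"
print(route("acme", isolated))    # dedicated index for "acme"
```

The resulting index name and namespace would then be passed to whatever upsert and query calls you make. The upside of the shared-index default is that you only pay for extra pods when a client really needs isolation or the data no longer fits.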
Hope this helps