Sub-index upsert and search


I am wondering if there is a way to create sub-indexes. The reason I want this is that I would like to have small collections of embeddings that can be referenced together, because they are all embeddings of the same document. If this document changes, I want to be able to re-embed it, which may turn out to give a different number of vectors if it grows or shrinks in size, and upsert the new collection of embeddings all at once, replacing the old one.

Is this doable? And can it be done efficiently with Pinecone?

I’d also like this to work such that when I run a cosine similarity query with some other embedding against my whole index, it can select individual embeddings from different sub-indexes, just as it would if they were all grouped together.

I see I can maybe use namespaces. But is there an issue if I am creating many namespaces? And it does not look like there is a way for me to run a query against multiple namespaces at once.

You may want to consider having a parallel database in MySQL or similar that keeps track of the embedding IDs and links them to a master document ID. Then when you query Pinecone, you can get the ID of the embedding and look up which document it came from in a separate query. Unless I am not understanding your question.
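
Here is a rough sketch of what that mapping table could look like. It uses Python's built-in sqlite3 as a stand-in for MySQL so it runs without a server. The table name, the function names, and the "doc-1#0" style of embedding ID are all made up for illustration, and nothing here calls Pinecone itself: the stale IDs it returns are the ones you would then delete from your index yourself.

```python
import sqlite3


def open_mapping(path=":memory:"):
    """Open (or create) the side table linking embedding IDs to document IDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        "embedding_id TEXT PRIMARY KEY, "
        "document_id TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_document ON embeddings(document_id)"
    )
    return conn


def replace_document(conn, document_id, new_embedding_ids):
    """Record the new embedding IDs for a document and return the stale ones.

    The stale IDs are those that belonged to the old version of the document
    and are not reused by the new version, e.g. because the document shrank.
    """
    old_ids = [
        row[0]
        for row in conn.execute(
            "SELECT embedding_id FROM embeddings "
            "WHERE document_id = ? ORDER BY embedding_id",
            (document_id,),
        )
    ]
    with conn:  # commit both statements together, or roll back on error
        conn.execute(
            "DELETE FROM embeddings WHERE document_id = ?", (document_id,)
        )
        conn.executemany(
            "INSERT INTO embeddings (embedding_id, document_id) VALUES (?, ?)",
            [(eid, document_id) for eid in new_embedding_ids],
        )
    keep = set(new_embedding_ids)
    return [eid for eid in old_ids if eid not in keep]


def document_for(conn, embedding_id):
    """Return the document ID for an embedding ID, or None if unknown."""
    row = conn.execute(
        "SELECT document_id FROM embeddings WHERE embedding_id = ?",
        (embedding_id,),
    ).fetchone()
    return row[0] if row else None


conn = open_mapping()
replace_document(conn, "doc-1", ["doc-1#0", "doc-1#1", "doc-1#2"])
# The document shrinks from three chunks to two after an edit.
stale = replace_document(conn, "doc-1", ["doc-1#0", "doc-1#1"])
print(stale)  # ['doc-1#2']
print(document_for(conn, "doc-1#1"))  # doc-1
```

With this in place everything stays in one index, so a similarity query still searches across all documents at once, and the side table is only used to group results by document and to clean up leftover vectors after a re-embed.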