I have a question regarding the namespace values for vectors:
Say I insert a handful of vectors with a namespace of “test-1”, and another handful with a namespace of “test-2”.
When I run a query without passing a namespace value, it appears that the namespaces test-1 and test-2 are left out of my query: their vectors are neither checked for similarity nor returned.
It is only after I add the namespace=test-1 parameter to my query that Pinecone returns results from the specified namespace. Having said that, what if I want to query through all vectors regardless of namespace? Or perhaps I want all where namespace = test-1 OR test-2? Is that possible?
That’s correct, queries can only be run in one namespace at a time. This includes the null or blank namespace, which is separate from any namespaces that are actually named.
We don’t have a mechanism to search across multiple namespaces or to perform a JOIN-like action. But I’m happy to share this with our product team as a potential future enhancement. Can you share more of what you’re thinking of when searching multiple namespaces? What’s the use case you’re envisioning that would be improved by that? Any additional color you can share would help with design and prioritization.
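Since each query runs against a single namespace, one possible workaround for the "test-1 OR test-2" case is to issue one query per namespace and merge the results on the client. The sketch below is only an illustration, not a Pinecone feature: `merge_namespace_results` is a hypothetical helper, the sample results are made up, and it assumes all scores come from the same index and metric, with a higher score meaning a closer match.

```python
import heapq


def merge_namespace_results(results_by_namespace, top_k):
    """Merge per-namespace match lists into one global top_k list.

    results_by_namespace maps a namespace name to a list of
    (vector_id, score) pairs, as if one query had been run per namespace.
    """
    merged = [
        (score, namespace, vector_id)
        for namespace, matches in results_by_namespace.items()
        for vector_id, score in matches
    ]
    # Assumption: a higher score means a closer match (e.g. cosine similarity).
    best = heapq.nlargest(top_k, merged, key=lambda item: item[0])
    return [(namespace, vector_id, score) for score, namespace, vector_id in best]


# Hypothetical results standing in for two separate queries.
results = {
    "test-1": [("a", 0.91), ("b", 0.40)],
    "test-2": [("c", 0.75), ("d", 0.10)],
}
top = merge_namespace_results(results, top_k=3)
```

The cost is one API call per namespace, which is why the metadata approach may be the better fit when the groups are always searched together.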
Thank you ever so much for your time & response. I’m not necessarily suggesting a new feature; it’s just that I’m new to Pinecone and vector indexing, and I expected that if no namespace was passed the query would cover all namespaces within the index, not just the vectors with no namespace specified. So really I was just trying to better understand how I should think about my architecture and how to set up my data within the indexes.
As an example use case, let’s assume my goal is to create a semantic search engine for my website.
Each page has:
Article Content
Title Tag
Keywords Tag
Meta Description Tag
Suppose I wanted to weight each of these elements a little differently (for example, say the document title weighed more heavily than the content). Would you recommend putting each of these in its own namespace, thus requiring 4 API queries per user search, or should this type of use case be handled within the metadata of each vector, for example by adding a “data-type” metadata field?
I see. Yeah, in this case, if you’re going to search across multiple tags at once, I would use metadata to separate them. Then you can use the $in operator to match several values of that field at once. Or not filter at all and search all of them.
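As a rough sketch of that suggestion: the filter below uses the $in operator on the “data-type” metadata field from the question, and the weighting is applied on the client after the query returns. The helper names, the weights, and the sample matches are all hypothetical, and the weighting step is just one way to do it, not something Pinecone does for you.

```python
def build_type_filter(data_types):
    """Build a metadata filter matching any of the given data-type values.

    Returns None when no types are given, meaning "do not filter at all".
    """
    if not data_types:
        return None
    return {"data-type": {"$in": list(data_types)}}


def apply_weights(matches, weights, default_weight=1.0):
    """Re-rank matches by multiplying each score by its data-type weight."""
    rescored = []
    for match in matches:
        data_type = match["metadata"]["data-type"]
        weight = weights.get(data_type, default_weight)
        rescored.append({**match, "weighted_score": match["score"] * weight})
    return sorted(rescored, key=lambda m: m["weighted_score"], reverse=True)


# Hypothetical weights: a title match counts twice as much as a content match.
weights = {"title": 2.0, "content": 1.0}

# This dict would be passed as the query's metadata filter.
query_filter = build_type_filter(weights)

# Made-up matches standing in for a query response with metadata included.
matches = [
    {"id": "page-1#content", "score": 0.80, "metadata": {"data-type": "content"}},
    {"id": "page-2#title", "score": 0.50, "metadata": {"data-type": "title"}},
]
ranked = apply_weights(matches, weights)
```

With this layout a user search needs a single query, and the relative importance of title, keywords, description, and content can be tuned without re-indexing anything.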