[Use-case] How can I limit the search results based on metadata such as a user role?

I want to build an application that enables organizations to query their documents with natural language.
The basic solution would be to upload all documents to the vector DB and then query for the nearest neighbors. The issue is that not all users in the organization have access to all documents.
Ideally, we can limit the search over documents from Pinecone based on the role of the user. Is this possible? Is this functionality on the roadmap?

Example data:

{
	"id": "doc1",
	"metadata": {
		"allowed_roles": "hr"
	},
	"values": [-0.031287473, -0.024716083, -0.0017911823, ...]
},
{
	"id": "doc2",
	"metadata": {
		"allowed_roles": "finance"
	},
	"values": [-0.031287473, -0.024716083, -0.0017911823, ...]
}

Example query:

{
    "vector": [0.0040269415, -0.028688831, 0.015932681, -0.02544977,  ...],
    "filter": {
        "allowed_roles": ["hr"]
    }
}



Very interested in this. Did you find a solution? I assumed that namespaces or metadata could be used for this, but what about the growing size of the index?

Hi @joshua.milkovitch

Yes! This is absolutely the way to go about what you want to do. I would suggest you take a look at Metadata filtering, which I think is exactly what you are looking for :slight_smile:

What I would do, just for future-proofing, is make "allowed_roles" an array of roles, in case some documents need to be available to more than one role.
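To make the array idea concrete, here is a minimal sketch in Python. It assumes the `$in` operator from Pinecone's metadata filtering, which as far as I understand matches a list-valued field when at least one of its elements is in the given list; please verify that against the current docs. `build_role_filter`, `matches`, and `doc3` are hypothetical names made up for illustration. `matches` only emulates the filter locally so the snippet runs without an index.

```python
from typing import Any, Dict, Iterable

def build_role_filter(user_roles: Iterable[str]) -> Dict[str, Any]:
    """Build a filter for documents that share at least one role with the user."""
    roles = sorted(set(user_roles))
    if not roles:
        # Refuse to build a filter that would silently match nothing (or everything).
        raise ValueError("user must have at least one role")
    return {"allowed_roles": {"$in": roles}}

def matches(metadata: Dict[str, Any], role_filter: Dict[str, Any]) -> bool:
    """Local emulation of the assumed `$in` semantics, for illustration only."""
    allowed = metadata.get("allowed_roles", [])
    if isinstance(allowed, str):
        # Tolerate the single-string form used in the original example data.
        allowed = [allowed]
    wanted = role_filter["allowed_roles"]["$in"]
    return any(role in wanted for role in allowed)

# doc1 and doc2 mirror the example data above; doc3 is a hypothetical shared document.
docs = [
    {"id": "doc1", "metadata": {"allowed_roles": ["hr"]}},
    {"id": "doc2", "metadata": {"allowed_roles": ["finance"]}},
    {"id": "doc3", "metadata": {"allowed_roles": ["hr", "finance"]}},
]

role_filter = build_role_filter(["hr"])
visible = [d["id"] for d in docs if matches(d["metadata"], role_filter)]
print(visible)  # ['doc1', 'doc3']

# Against a real index, the same dict would be passed as the query filter,
# e.g. index.query(vector=query_vector, top_k=5, filter=role_filter)
# (assumed client call; check the signature for your client version).
```

The important design point is that the filter is built server-side from the authenticated user's roles, never taken from the client request, otherwise a user could simply ask for another role's documents.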

If you get stuck do not hesitate to ping me :slight_smile:

Hope this helps

Thank you for the provided information. This is very helpful.
What should be the approach in cases where new roles are added/removed for existing documents? I mean for docs that are already added/indexed in Pinecone with previous roles. Are you familiar with a commercial solution that resolves this matter of dynamic access provisioning?