Is there a way to query all the vectors and/or metadata from a namespace?

Hello,

I have been using Pinecone for 2 months now and I really like it! I saw in the docs that you can fetch a maximum of 1000 vectors or metadata at a time, and only by ID. In my case, though, I occasionally need to get all the vectors from a namespace to do my own math… One option is to create a collection every time I need this (but that would create a collection for every namespace). Another is to duplicate my vectors in a third-party store, but that is really not optimal and can lead to errors and other problems…

Do you have any advice or tricks that I can use?

Thanks a lot for your time!

We are working on a brand new architecture which will have many more import/export features over the next several months. We are aware of this kind of request from customers and are developing solutions as we speak.

However, this functionality is not available yet.

@kbutler will these updates also allow users to fetch all the data in an index to see what exists? The challenge I’m finding is not knowing what’s in my indexes

Any update on this? I feel this is a basic feature that a lot of other functionality depends on.

Update? Also very interested in this feature!

I finally created a function myself doing this in a dirty way… Please, Pinecone, add this feature officially.

Would you happen to have a snippet? Trying to make the same thing.

I ended up hacking something together using the scripts from here, and added this helper to extract the dynamic ID key:

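def get_first_item(my_dict):
    """Return the value under the first key of my_dict, or None if it is empty or None."""
    if my_dict:
        # Dicts preserve insertion order, so this is the first key that was added.
        first_key = next(iter(my_dict))
        return my_dict[first_key]
    else:
        return None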
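
As a usage sketch: assuming a fetch-style response whose `vectors` field is a dict keyed by vector ID (the sample data below is made up for illustration, not real Pinecone output), the helper pulls out the single record without knowing its ID in advance.

```python
def get_first_item(my_dict):
    # Same helper as above, repeated so this snippet runs on its own.
    if my_dict:
        return my_dict[next(iter(my_dict))]
    return None


# Hypothetical fetch-style response: "vectors" maps each vector ID to its record.
sample_response = {
    "vectors": {
        "doc-42": {"id": "doc-42", "metadata": {"source": "example.txt"}},
    }
}

# The ID "doc-42" is not known ahead of time, so take whatever comes first.
record = get_first_item(sample_response["vectors"])
print(record["metadata"])
```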

This is the dirty implementation:

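    // Get all the vectors from the vector store.
    // Workaround: query with a dummy all-zero vector and a large topK.
    // Only the first topK matches come back, so a larger namespace is not fully covered.
    const allVectorsRequest = {
      topK: 10000,
      vector: new Array(1536).fill(0), // length must match the index dimension (1536 here)
      namespace: namespace,
      includeMetadata: true,
      includeValues: false, // return IDs and metadata only, not the vector values
    };

    const queryResponse = await pineconeIndex.query({
      queryRequest: allVectorsRequest,
    });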

The function works, but Pinecone should implement a cleaner way of incorporating it into their API.
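
For anyone on Python, the same workaround might look roughly like the sketch below. It is only a sketch under stated assumptions: `dump_namespace` is a made-up name, the keyword arguments are assumed to mirror the `query` method of the Pinecone Python client, and the response is treated as a plain dict with a `matches` list, so you may need to adapt the field access to your client version. Like the snippet above, it returns at most `top_k` records.

```python
from typing import Any, Dict, List


def dump_namespace(
    index: Any, namespace: str, dimension: int, top_k: int = 10000
) -> List[Dict[str, Any]]:
    """Return up to top_k {id, metadata} records from one namespace.

    `index` is assumed to expose query(vector=..., top_k=..., namespace=...,
    include_metadata=..., include_values=...) and to return a dict-like
    response with a "matches" list. Both are assumptions of this sketch.
    """
    # The all-zero vector is only a placeholder query; the scores are ignored.
    response = index.query(
        vector=[0.0] * dimension,
        top_k=top_k,
        namespace=namespace,
        include_metadata=True,
        include_values=False,
    )
    return [
        {"id": match["id"], "metadata": match.get("metadata")}
        for match in response["matches"]
    ]
```

Calling `dump_namespace(index, "my-namespace", dimension=1536)` would then give a list of IDs with their metadata, which can be fed into a fetch by ID if the vector values are needed too.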

Hi, I have built a script in my library for this purpose: GitHub - AI-Northstar-Tech/vector-io: Use the universal VDF format for vector datasets to easily export and import data from all vector databases
Example commands here: Dhruv Anand on LinkedIn: Quick Migration to Pinecone Serverless I've been working on a library of…

Please, this feature is required!