Bulk export
For a bulk export endpoint, pagination is important to consider. There are various approaches:

- Limit/offset params (see the sketch after this list). Consistent response ordering is important here, so either the server needs to guarantee an order or the client needs to be able to specify an ORDER BY field.
- Some sort of cursor-based API. This is less desirable in my opinion, especially if it requires additional server resources to maintain cursor state for a client that may never come back for the next set of results. If so, I'd at least want cursor pagination to be "opt-in" so that I don't pay the extra overhead unless I know I'm going to be paging.
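To make the limit/offset flavor concrete, here's a minimal sketch of what a client loop against a bulk-export endpoint could look like. Everything in it (the `/export` path, the `limit`, `offset`, and `order_by` parameters, the response shape) is an assumption for illustration, since no such Pinecone endpoint exists today:

```python
import requests

# Hypothetical bulk-export client loop using limit/offset pagination.
# The /export path and the limit/offset/order_by parameters are illustrative
# assumptions, not an existing Pinecone API.
BASE_URL = "https://my-index.example.com"  # placeholder host
PAGE_SIZE = 1000


def export_all_vectors():
    offset = 0
    while True:
        resp = requests.get(
            f"{BASE_URL}/export",
            params={
                "limit": PAGE_SIZE,
                "offset": offset,
                "order_by": "id",  # a stable order keeps pages from shifting between requests
            },
            timeout=30,
        )
        resp.raise_for_status()
        page = resp.json().get("vectors", [])
        if not page:
            break
        yield from page
        offset += PAGE_SIZE
```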
One issue is how to support clients retrieving all data from an index that is being actively written to, since "all data" is a moving target in that case.

The workarounds currently used to get around the lack of an export API partly address this: they write a unique ID to the metadata that acts as a marker/e-tag indicating a given vector has already been "seen" (ref: Returning list of IDs - #20 by sangbuinhu).
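For reference, a rough sketch of that marker-style workaround, assuming the Pinecone Python client. The index name, dimension, probe vector, and the `export_batch` metadata key are all placeholders, initialization differs between client versions, and it assumes the marker field is written on every vector at upsert time so the `$ne` filter behaves predictably:

```python
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")  # older client style; newer clients initialize differently
index = pinecone.Index("my-index")   # placeholder index name

DIM = 1536                           # placeholder vector dimension
BATCH = 1000
MARKER_KEY = "export_batch"          # metadata field acting as the "seen" marker
MARKER_VAL = "run-001"               # unique ID for this export run


def export_unseen():
    """Sweep the index by repeatedly querying for unmarked vectors, then marking them."""
    probe = [0.01] * DIM             # arbitrary probe vector; any valid vector works for a sweep
    while True:
        res = index.query(
            vector=probe,
            top_k=BATCH,
            include_values=True,
            include_metadata=True,
            filter={MARKER_KEY: {"$ne": MARKER_VAL}},  # only vectors this run hasn't seen yet
        )
        if not res.matches:
            break
        for m in res.matches:
            yield m
            # Mark the vector so later queries skip it. Metadata updates are not
            # instantaneous, so the next pass may briefly return duplicates.
            index.update(id=m.id, set_metadata={MARKER_KEY: MARKER_VAL})
```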
Semantic search
For semantic queries, does pagination make sense? Maybe, but I think it's less critical unless folks frequently run into issues with large top_k values producing huge responses and hitting HTTP timeouts, etc.

If so, I imagine it would look somewhat like the cursor-based approach: you probably only want to run the semantic search once, but allow clients to retrieve the results a page at a time. So maybe you return a query ID that clients use to retrieve the results for a given query, along with limit/offset params.
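As a thought experiment, that query-ID idea might look something like this on the server side. This is purely illustrative; the in-memory cache, the `run_semantic_search` helper, and the parameter names are all assumptions, not an existing Pinecone feature:

```python
import uuid

# Hypothetical server-side pagination of semantic search results.
# run_semantic_search stands in for one Pinecone query with a large top_k.
_query_cache = {}  # query_id -> full list of matches (in-memory for the sketch)


def start_query(vector, top_k=1000):
    """Run the semantic search once, cache the results, and return a query ID."""
    matches = run_semantic_search(vector, top_k)  # assumed helper wrapping index.query()
    query_id = str(uuid.uuid4())
    _query_cache[query_id] = matches
    return query_id


def get_page(query_id, limit=50, offset=0):
    """Return one page of results from a previously executed query."""
    return _query_cache[query_id][offset : offset + limit]
```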
Any updates on this? I built something scrappy to circumvent it, but will probably have to find a new solution if this isn't going to be on the roadmap. It looks like this has been here for a full year and is one of the oldest requests.
For sure! The ability to paginate query results (with a cursor or offset).
If I want to potentially return a thousand results to users, right now I have to query the full thousand each time and then slice them up. This is an expensive query. If I had an "offset" or "cursor" to pass into the query, I could query only 50 or 100 each time instead of the full 1000.

Basically, for potentially large datasets Pinecone doesn't scale, since you end up returning potentially thousands of results each time when you really just want 25-100 max. It's the same concept as any other DB with an offset or cursor for querying.
I’m curious about what you mean by “expensive.” Are you referring to cost, latency, or something else?
One solution you might consider is setting include_values and include_metadata to False. This way, the query will only return a list of IDs. You can then use the Fetch endpoint to retrieve the values and metadata for the specific slice of data you actually need right now.
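That suggestion might look roughly like this with the Python client (the index name, dimension, and query vector are placeholders, and initialization differs between client versions):

```python
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENV")  # older client style; newer clients initialize differently
index = pinecone.Index("my-index")      # placeholder index name
query_vector = [0.1] * 1536             # placeholder query embedding

# Step 1: query for IDs only -- no values or metadata in the response.
res = index.query(
    vector=query_vector,
    top_k=1000,
    include_values=False,
    include_metadata=False,
)
ids = [m.id for m in res.matches]

# Step 2: slice locally, then fetch full values/metadata for just the page you need.
page_ids = ids[0:25]
records = index.fetch(ids=page_ids)     # values and metadata keyed by ID
```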
Got it, thanks for the idea. This isn't a bad short-term solution and I'll look into it. I think your suggestion still doesn't solve the underlying problem, though. Just about every DB out there offers some sort of pagination to skip a certain number of records.

If, for example, I were on record #10,000, I would have to query 10k IDs and then use the Fetch endpoint to get the last 25 records of data I actually need. Versus, if I had an offset, I could just pass in "skip: 9,975" and get the 25 records I need.

With your option I have to run 2 requests and fetch 10k records of data:
1. Get 10k records w/ IDs only.
2. Get the 25 records by ID via the Fetch endpoint.

With offset/skip:
1. Get 25 records w/ one API call.

Skip is faster, transfers less data, takes less code to write, probably costs less financially, and it's a standard DB pattern for querying. This is what I mean by expensive.
Other vector DBs offer this functionality on queries (see offset). I think Pinecone is one of the few vector DBs that don’t offer this querying capability.
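For what it's worth, here's a sketch of emulating skip with today's API (assuming the Pinecone Python client and an already-initialized index), which makes the cost difference concrete: the first call still has to pull `skip + limit` IDs before the second call fetches the one page that's actually wanted. A native skip parameter would collapse this into a single small request.

```python
def query_page(index, vector, limit=25, skip=0):
    """Emulate offset/skip with two requests: an IDs-only query, then a Fetch for one page.

    e.g. query_page(index, query_vector, limit=25, skip=9_975) still queries 10,000 IDs
    just to reach records 9,976-10,000.
    """
    res = index.query(
        vector=vector,
        top_k=skip + limit,              # everything up to and including the page you want
        include_values=False,
        include_metadata=False,
    )
    page_ids = [m.id for m in res.matches][skip : skip + limit]
    return index.fetch(ids=page_ids)     # second request: full records for just `limit` IDs
```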