(Continuing the discussion from Import/export of embeddings.)
For a bulk export endpoint, pagination is important to consider. There are various approaches to pagination.
Limit/offset params. Consistent response ordering is important here, so either the server needs to guarantee order or else the client needs to be able to specify an ORDER BY field.
Some sort of cursor-based API. This is less desirable in my opinion, especially if it requires additional server resources to maintain cursor state for a client that may never come back for the next set of results. If a cursor-based API is offered, I’d at least want it to be “opt-in” so that I don’t have to deal with the extra overhead unless I know I am going to be paging.
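To make the limit/offset option concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `INDEX` is an in-memory stand-in for a real index, and `export_page`, `order_by`, and `next_offset` are names I made up for illustration, not an existing API. The point is only that the server sorts deterministically (with the ID as a tie-breaker) before slicing.

```python
from typing import Any

# Hypothetical in-memory stand-in for an index: id -> record.
INDEX: dict[str, dict[str, Any]] = {
    f"vec-{i:03d}": {"id": f"vec-{i:03d}", "values": [float(i)], "created_at": 1000 - i}
    for i in range(10)
}


def export_page(limit: int, offset: int = 0, order_by: str = "id") -> dict[str, Any]:
    """Return one page of records in a deterministic order (assumed endpoint shape)."""
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    # Sort by the requested field, then by id as a tie-breaker, so records
    # with equal sort keys never swap places between requests.
    ordered = sorted(INDEX.values(), key=lambda r: (r[order_by], r["id"]))
    items = ordered[offset : offset + limit]
    next_offset = offset + limit if offset + limit < len(ordered) else None
    return {"items": items, "next_offset": next_offset}


def export_all(page_size: int = 4, order_by: str = "id") -> list[dict[str, Any]]:
    """Client-side loop: keep requesting pages until next_offset is None."""
    out: list[dict[str, Any]] = []
    offset: int | None = 0
    while offset is not None:
        page = export_page(limit=page_size, offset=offset, order_by=order_by)
        out.extend(page["items"])
        offset = page["next_offset"]
    return out
```

Note that the server keeps no per-client state here; the trade-off is that offsets can shift if records are inserted or deleted between requests.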
One issue is how to support clients retrieving all data from an index that is being actively written to, where “all data” is a moving target.
The workarounds currently used in the absence of export APIs kind of address this, because they write a unique ID to the metadata of each vector that acts as a marker/e-tag indicating that the vector has been “seen” (ref: Returning list of IDs - #20 by sangbuinhu).
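For anyone who hasn't seen that style of workaround, here is roughly how I understand the idea, as a sketch rather than the exact code from the referenced thread. The `index` dict, `fetch_unseen`, the `export_run` metadata key, and the `on_batch` hook are all made up for illustration; in a real setup `fetch_unseen` would be some metadata-filtered query against the index.

```python
import uuid
from typing import Any, Callable, Optional

# Hypothetical in-memory stand-in for an index: id -> {"values", "metadata"}.
index: dict[str, dict[str, Any]] = {
    f"vec-{i}": {"values": [float(i)], "metadata": {}} for i in range(5)
}


def fetch_unseen(run_id: str, limit: int) -> list[str]:
    """Stand-in for a metadata-filtered query: ids not yet tagged with run_id."""
    unseen = [k for k, v in index.items() if v["metadata"].get("export_run") != run_id]
    return unseen[:limit]


def export_with_marker(
    batch_size: int = 2, on_batch: Optional[Callable[[], None]] = None
) -> dict[str, list[float]]:
    """Export everything, tagging each vector with this run's marker once seen."""
    run_id = uuid.uuid4().hex  # unique marker for this export run
    exported: dict[str, list[float]] = {}
    while True:
        batch = fetch_unseen(run_id, batch_size)
        if not batch:
            break  # nothing left without the marker, so the export has caught up
        for vec_id in batch:
            exported[vec_id] = index[vec_id]["values"]
            index[vec_id]["metadata"]["export_run"] = run_id
        if on_batch:
            on_batch()  # hook so a concurrent writer can be simulated
    return exported
```

Because the loop keeps asking for anything without the marker, vectors written while the export is running get picked up on a later pass. The cost is an extra metadata write per vector.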
For semantic query, does pagination make sense? Maybe, but I think it’s less critical unless you frequently see folks running into issues with large top_k values resulting in huge responses and hitting HTTP timeouts, etc.
If so, I imagine it would be somewhat similar to the cursor-based approach, where you probably only want to run the semantic search once but allow clients to fetch the results a page at a time. So maybe you return a query ID that clients use to retrieve the search results for a given query, along with limit/offset params.