Pagination support

(Continuing the discussion from Import/export of embeddings.)

Bulk export
For a bulk export endpoint, pagination is important to consider, and there are a few ways to approach it.

  • Limit/offset params. Consistent response ordering is important here, so either the server needs to guarantee order or else the client needs to be able to specify an ORDER BY field.

  • Some sort of cursor-based API. This is less desirable in my opinion, especially if it requires using additional server resources to maintain cursor state for a client that may never come back and get the next set of results. If so, I’d at least want the cursor pagination to be “opt-in” so that I don’t have to deal with the extra overhead unless I know I am going to be paging.

  • A separate issue is how to let clients retrieve all data from an index that is being actively written to, where “all data” is a moving target.
    The workarounds people currently use for the lack of export APIs partly address this: they write a unique ID to each vector’s metadata, which acts as a marker/e-tag indicating that the vector has been “seen” (ref: Returning list of IDs - #20 by sangbuinhu).
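
To make the limit/offset option concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `export_page`, `export_all`, and the in-memory `records` list are stand-ins I made up for illustration, not an existing API. The point is only that the server sorts on a stable field before slicing, so repeated calls return consistent pages.

```python
from typing import Any


def export_page(
    records: list[dict[str, Any]],
    limit: int,
    offset: int,
    order_by: str = "id",
) -> list[dict[str, Any]]:
    """Hypothetical server side: return one page in a deterministic order."""
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    # Sorting on a stable field is what keeps pages consistent between calls.
    ordered = sorted(records, key=lambda r: r[order_by])
    return ordered[offset:offset + limit]


def export_all(records: list[dict[str, Any]], page_size: int = 2) -> list[dict[str, Any]]:
    """Hypothetical client side: request pages until a short page comes back."""
    out: list[dict[str, Any]] = []
    offset = 0
    while True:
        page = export_page(records, limit=page_size, offset=offset)
        out.extend(page)
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left
        offset += page_size
    return out
```

Note that this sketch does nothing about the moving-target problem: if records are inserted or deleted between calls, offsets shift and a client can miss or repeat items.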

Semantic search
For semantic queries, does pagination make sense? Maybe, but I think it’s less critical unless you frequently see folks running into issues with large top_k values producing huge responses, hitting HTTP timeouts, etc.

If so, I imagine it would be somewhat similar to the cursor-based approach: you probably want to run the semantic search only once but let clients fetch the results a page at a time. So maybe you return a query ID that clients use to retrieve the search results for that query, along with limit/offset params.
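
A minimal sketch of that query ID idea, under my own assumptions: `SearchResultCache`, `run_query`, and `get_page` are hypothetical names, and the "search" is stubbed out as sorting pre-scored (id, score) pairs rather than a real vector search. It just shows the shape: rank once, store the results under an ID, then page through them with limit/offset.

```python
import uuid


class SearchResultCache:
    """Hypothetical server-side store: run a search once, page through it later."""

    def __init__(self) -> None:
        self._results: dict[str, list[tuple[str, float]]] = {}

    def run_query(self, scored_ids: list[tuple[str, float]], top_k: int) -> str:
        """Rank the (stubbed) matches once and keep them under a new query ID."""
        ranked = sorted(scored_ids, key=lambda pair: pair[1], reverse=True)[:top_k]
        query_id = str(uuid.uuid4())
        self._results[query_id] = ranked
        return query_id

    def get_page(self, query_id: str, limit: int, offset: int = 0) -> list[tuple[str, float]]:
        """Return one slice of the stored ranking; raises KeyError for an unknown ID."""
        ranked = self._results[query_id]
        return ranked[offset:offset + limit]
```

This does mean the server holds per-query state, which is the same cost I was wary of with cursors, so a real version would presumably need to expire stored results after some time.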