Inaccurate Freshness

We have been using the freshness check via the API, rather than the Python SDK, for several months now. By that I mean we upsert large batches of vectors, read the x-pinecone-request-lsn value from each response, then call /query to read x-pinecone-max-indexed-lsn, polling until it is at or above the largest LSN we got from any of the upserts.

At some point that seems to have broken, even though our integration code for this part hasn’t really changed since we added it. Consider the following pseudo-code:

target_lsn = upsert_all(…)
wait_for(target_lsn)

while True:
  results = query({ a filter matching the unique metadata of the vectors that were just upserted })
  print(results)
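
For anyone trying to reproduce this, here is a minimal Python sketch of the wait step described above. It is an illustration under assumptions, not our production code: the helper names (header_lsn, target_lsn, wait_for) are made up for this post, and run_query stands in for whatever HTTP client call issues the /query request and returns its response headers. Only the two header names come from the actual responses.

```python
import time
from typing import Callable, Iterable, Mapping

# Header names as described in the post; everything else here is hypothetical.
REQUEST_LSN_HEADER = "x-pinecone-request-lsn"
MAX_INDEXED_LSN_HEADER = "x-pinecone-max-indexed-lsn"


def header_lsn(headers: Mapping[str, str], name: str) -> int:
    """Read an LSN header as an int, matching the header name case-insensitively."""
    for key, value in headers.items():
        if key.lower() == name:
            return int(value)
    raise KeyError(name)


def target_lsn(upsert_headers: Iterable[Mapping[str, str]]) -> int:
    """Return the largest request LSN across all upsert responses."""
    return max(header_lsn(h, REQUEST_LSN_HEADER) for h in upsert_headers)


def wait_for(
    target: int,
    run_query: Callable[[], Mapping[str, str]],
    timeout_s: float = 60.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until the max indexed LSN is at or above `target`.

    `run_query` is an assumed callable that performs one /query request
    and returns the response headers. Returns the indexed LSN that
    satisfied the condition, or raises TimeoutError.
    """
    deadline = clock() + timeout_s
    while True:
        indexed = header_lsn(run_query(), MAX_INDEXED_LSN_HEADER)
        if indexed >= target:
            return indexed
        if clock() >= deadline:
            raise TimeoutError(f"indexed LSN {indexed} is still below {target}")
        sleep(interval_s)
```

The point of the sketch is that the loop only exits once the reported max indexed LSN has reached the largest upsert LSN, so by the time the filtered queries start, the index claims to have caught up.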

Particularly when a lot of vectors were uploaded, the first few queries return no results. We don’t use a top_k, so even if nothing was close to being a good match, it should have returned something. But the first few queries get nothing. Then suddenly, with no further calls to Pinecone on our end, and with the same LSN still present, we get results after a little while, presumably once it has actually caught up on freshness.

So, it seems like there’s an issue with freshness. No matter what, we should have gotten something back from Pinecone, but we get nothing. Then results suddenly appear, which makes me think the freshness LSN is inaccurate in some way. Any help would be appreciated. This is a considerable problem on our end, though we can mitigate it to an extent for now. 🙂

Other notes:

  • We use a serverless index.
  • We embed with OpenAI.
  • Paid account, no issues otherwise, no errors.

This sounds like it may have something to do with asynchronous requests getting processed later in your application?

Could you open a support ticket so we can look into this for you?