How long is this “slight delay” typically? Can someone please provide an estimate?
Can you provide some additional context?
@heliumtrope When I upsert to Pinecone serverless, I notice that the upserted vectors aren’t immediately retrievable. According to the docs, “there can be a slight delay before new or changed records are visible to queries”. I’d like a ballpark estimate on how long this delay is. Is it 1 second? 2 seconds? 10 seconds? Over a minute? I have no idea. It would be nice to have an estimate to build around this constraint.
Did you try checking the vector counts returned by the describe_index_stats() operation to see if they have been updated?
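Something like this rough sketch is what I mean (Node client, where describeIndexStats() is the equivalent of describe_index_stats(); the index handle and the response field names are assumptions and may differ by client version):

// Rough sketch: poll describeIndexStats() until the total count reflects the upsert.
// "index" is assumed to be a Pinecone Index handle; note the stats themselves can lag a little too.
async function waitForCount(index, expectedTotal, delayMs = 1000) {
  let stats = await index.describeIndexStats()
  // Field name may vary by client version (e.g. totalRecordCount vs an older vector count field).
  while ((stats.totalRecordCount ?? 0) < expectedTotal) {
    await new Promise((resolve) => setTimeout(resolve, delayMs))
    stats = await index.describeIndexStats()
  }
}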
I’m sure I can write a script to measure the delay myself, but I’d like an official estimate. Would be helpful to have this estimate noted in the docs in addition to vague language like “slight delay”
@drainedouticedout Right now it’s tricky to answer this question, as Serverless is in public preview and we are making improvements in this area daily.
To give you a more direct answer: it depends on the workload. If you’re doing a bulk load into the system, the delay will be larger than when updating a few records.
We have recently updated the documentation to include Serverless Architecture, which I think you will find interesting.
If this is having an impact on your application please let us know!
@patrick1 My application, like many others, frequently runs into the use-case where files are uploaded and “used” shortly afterwards. For example, take a summarization use-case. People upload files and want to summarize them based on k-means clustered vectors. This was not a problem pre-serverless. But now it is. I have added a manual timeout in my application to wait 5 seconds before attempting to retrieve. This is far from ideal.
Typically when working with a database, you expect that when you upsert something into the database and get an OK response back from the DB server, that your records are available and ready to be queried. You don’t expect that you have to wait or even worse, do some kind of while loop to “describe” the index to verify your records are available. This is not the case with Pinecone Serverless. It’s a shame because it used to work fine before I migrated to serverless.
@drainedouticedout this is actually really common with eventually consistent databases like Pinecone and MongoDB. But as Patrick said, serverless is still in public preview, so you should expect slight hiccups like this while we finalize some things. We’re working to make data freshness as close to immediate as we can.
Keep in mind, though, that barring a future release which adds transactions and locks, you should continue to expect some slight delay in freshness, even if it’s measured in single-digit milliseconds.
This is still happening and is quite frustrating. The forced migration of certain plans to serverless is compounding the problem. With pods, upserted data is guaranteed (at least from what I’ve seen) to show up in a query right after the upsert; I’ve never seen my automated tests fail.
Right now, my automated tests fail randomly, to say nothing of the production cases mentioned above where a user doesn’t see what they’ve just uploaded. I’ve added arbitrary sleep() calls, but this is far from ideal.
Mandatory migration should be delayed until this issue is addressed. Or provide a mechanism for us to check that the “transaction” is complete and has fully flowed through the system.
Yes, this was not a problem pre-serverless
Hello @Cory_Pinecone and Team,
Can you guys please provide an estimated time after which I can actually be sure that the vectors are indexed? This is causing real pain. In our use case, users upload their docs and can query them immediately afterwards. We have a sleep of 15 seconds, and vector retrieval still sometimes fails.
We did not face this issue previously with pod-based indexes. Your serverless architecture documentation mentions that a freshness layer is specifically maintained to solve this problem, but it doesn’t seem to be working properly.
In the meantime, can you guys please let us know of a foolproof solution to this problem, as it’s impacting our business?
@arnab Can you please create a ticket and include the name of the index on which you’ve observed this behavior?
We’ve been hitting a lot of problems with this delay. It’s different behaviour to the non-serverless version, which was always immediately available.
We’ve implemented polling to wait for the data to be present. For this we use an ID search. The time this takes is anywhere from 100 ms to 20 seconds, but we suspect we haven’t seen the longest duration yet.
Unfortunately, we’re still seeing the occasional vector search fail with zero results 20 seconds after we’ve confirmed that the ID search works, which makes us wonder whether we now need to do vector search polling as well (a lot of API calls for each insert) or whether to look at an alternative solution.
This is our current process for indexing and ensuring a record is actually in the index…
1. Add to the index
2. Check if it’s available using search by ID
3. If not present, exponential back-off of the above search by ID
4. Check if it’s available using search by metadata + vector
5. If not present, exponential back-off of the above search by metadata + vector
6. Use vector search
This was only recently rolled out, but so far…
Steps 2-3 can take anywhere up to 30 seconds
Steps 4-5 can take up to an additional 10 seconds
Worth noting that prior to using serverless, we only had steps 1 (insert) and 6 (use).
If you’re looking for a naive/simple wait-for-indexing, I’d use at least 60 seconds, or 120 seconds to be safe. A rough sketch of our polling flow is below.
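In case it’s useful, here’s a rough sketch of the flow above using the Node client. The index handle, record ID, query vector, and back-off schedule are all placeholders rather than our actual production code:

// Rough sketch of steps 2-5 above (run after step 1, the upsert).
// "index" is assumed to be a Pinecone Index handle from @pinecone-database/pinecone,
// and the back-off schedule (250 ms doubling, 8 attempts) is just an example.
async function waitForRecord(index, namespaceName, recordId, queryVector, maxAttempts = 8) {
  const ns = index.namespace(namespaceName)
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

  // Steps 2-3: search by ID, with exponential back-off between attempts.
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const byId = await ns.query({ topK: 1, id: recordId })
    if (byId.matches.length > 0) break
    await sleep(250 * 2 ** attempt)
  }

  // Steps 4-5: search by vector (optionally with a metadata filter), with exponential back-off.
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const byVector = await ns.query({ topK: 1, vector: queryVector })
    if (byVector.matches.some((m) => m.id === recordId)) return true
    await sleep(250 * 2 ** attempt)
  }

  // Only proceed to step 6 (actually using vector search) once this returns true.
  return false
}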
Any updates on this? Due to this delay, I sometimes have problems with “LLM hallucinations”, because I’m trying to query something that’s not yet possible to query.
Currently I have a small timeout which fixes the problem in most cases, but I’d love to make sure that after something is added to the index, it’s immediately available.
Getting 200 OK and then not being able to query is counter-intuitive to me.
EDIT: I “fixed” it by querying the namespace by ID every 1 s until it returns a match.
// Assumes a LangChain PineconeStore (vectorStore) and a Pinecone Index handle (pineconeIndex)
// from @pinecone-database/pinecone.
const docIds = await vectorStore.addDocuments(texts, {
  namespace: pNamespace,
})

const searchNamespace = pineconeIndex.namespace(pNamespace)

// Poll by ID until the first newly added record shows up in query results.
let results
do {
  results = await searchNamespace.query({
    topK: 1,
    id: docIds[0],
  })
  if (results.matches.length === 0) {
    // Not visible yet; wait 1 second and try again.
    await new Promise((resolve) => setTimeout(resolve, 1000))
  }
} while (results.matches.length === 0)
Now you can query it immediately after and be sure that new records are there.
Hi @dom1. Welcome to the Pinecone forum!
Pinecone is an eventually consistent database, so there’s always going to be some amount of time between when you add records and when those records can be queried. Check out our Serverless architecture documentation to learn more about what’s happening during that delay. In general, a new record should be available within seconds due to the freshness layer, which ensures that data is available before it has been fully indexed. How long are the delays you’re seeing?
If it’s essential for your use case to lock reads (queries) until you can ensure data is available, we recommend either fetching by the ID of the record or comparing the LSN of the write to the LSN of a query. Every operation gets an LSN (log sequence number), and they are monotonically increasing, so as soon as the LSN of the query is higher than the LSN of the write, your data is available.
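For example, a wait-on-fetch loop with the Node client could look roughly like the sketch below. The function name, namespace handling, and polling interval are just illustrative, and the exact response field names can vary between client versions:

// Rough sketch: block until a newly upserted record is fetchable by ID.
// "index" is assumed to be a Pinecone Index handle from @pinecone-database/pinecone.
async function waitUntilFetchable(index, namespaceName, recordId, delayMs = 500) {
  const ns = index.namespace(namespaceName)
  while (true) {
    const res = await ns.fetch([recordId])
    // Recent Node client versions key fetched vectors under "records".
    if (res.records && res.records[recordId]) return
    await new Promise((resolve) => setTimeout(resolve, delayMs))
  }
}

In your example above, you could call something like waitUntilFetchable(pineconeIndex, pNamespace, docIds[0]) right after addDocuments, before issuing the first query.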
I hope that helps. Please let me know.
Best wishes,
Jesse
Hey Jesse,
Thanks for the quick reply and explanation! I’m doing exactly that: checking by ID to make sure it’s fully indexed.
It would be nice to have that option natively, e.g. waitToBeIndexed: true as a prop.
Cheers,
Dom
Hi @dom1,
It looks like you’re querying by ID and waiting for some matches. That’s fine, but it’s probably more efficient to fetch by ID. It should also incur fewer read units.
I know we’re always working on reducing the freshness delay, but I’ll also pass your suggestion to the team.
Best,
Jesse