Failed to update metadata in Pinecone (504)

josh · June 10, 2025, 5:23pm

Hi. Every once in a while we receive a 504 Error from Pinecone on the index host.
Here is the error:
Failed to update metadata in Pinecone: (504)
Reason: Gateway Time-out
HTTP response headers: HTTPHeaderDict({'Server': 'awselb/2.0', 'Date': 'Thu, 05 Jun 2025 17:58:21 GMT', 'Content-Type': 'text/html', 'Content-Length': '132', 'Connection': 'keep-alive'})

The way we have it set up, we are establishing a local connection with the Pinecone index and then upserting and updating objects to the Pinecone DB. Please let us know how we can fix this as soon as possible! Thank you!!

milen · June 11, 2025, 4:20pm

Hi @josh, and welcome to the community forum.

Please specify what you mean by “once in a while”. For example, once every few requests, once daily, weekly, etc. Also, providing a complete log of the networking communication will be very helpful in investigating this issue.

Generally speaking, time-outs can occur in any network communication. As there is no 100% guarantee that they will never happen, the best practice is to have the client account for it and act according to the domain rules. That may be retrying (reconnecting if needed), switching to another instance, persistent queuing and processing at a later time, etc. Any such case should be recorded together with relevant logs. That way, if the frequency of such failures suggests the remote service (the index in this case) is unreliable, you’ll have a solid base for your claim and valuable tracing data that may significantly reduce the investigation time.
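As an illustration of the retry option mentioned above, here is a minimal sketch of a retry wrapper with exponential backoff and jitter. The names `TransientError` and `call_with_retries` are hypothetical and not part of any Pinecone SDK; in real code the `except` clause would catch whatever exception your client version raises for a 504, which is something to confirm against the SDK you use.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. an HTTP 504)."""


def call_with_retries(operation, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Run operation(); on TransientError, wait and retry with growing, jittered delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller queue or log the failure
            # Exponential backoff capped at max_delay, with full jitter so that
            # many workers do not retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

The `sleep` parameter is only there so the wait can be replaced in tests. When the last attempt fails, the exception propagates, which is the point where persistent queuing for later processing could take over.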

josh · June 11, 2025, 5:11pm

Hi Milen, thank you for your response. Let me provide more context about our setup and the error frequency:

Error Frequency:

- We observe this 504 Gateway Timeout error approximately 3-5 times per day
- The errors don’t follow any specific pattern and can occur at any time during our operations
- The errors happen on both upserts and updates of metadata

Implementation Details:

Connection Setup:
- We initialize a Pinecone client at the application level
- This client is used for all Pinecone operations (upserts, metadata updates, and deletions)

Data Structure Being Updated:
When these 504 errors occur, we’re attempting to update metadata in Pinecone with this structure:

from uuid import UUID

# Schema sketch: each value is the Python type of the field, not real data.
metadata = {
    # Text fields
    "field1": str,  # Primary text identifier
    "field2": str,  # ISO format datetime
    "field3": int,  # Epoch timestamp

    # Reference fields
    "field4": UUID,  # Foreign key reference
    "field5": UUID,  # Foreign key reference
    "field6": UUID,  # Foreign key reference

    # State fields
    "field7": str,  # Enum/status value
    "field8": list[str],  # Array of text tags
}
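
The structure above lists UUID among the value types. As far as I know, Pinecone metadata values are limited to strings, numbers, booleans, and lists of strings, so UUIDs typically need converting to strings before the update is sent. A minimal sketch of such a conversion step follows; the helper name `to_pinecone_metadata` is hypothetical, and dropping `None` values is an assumption rather than something stated in this thread. This is unrelated to the 504 itself, but it keeps the payload predictable when comparing failed and successful requests.

```python
from datetime import datetime
from uuid import UUID


def to_pinecone_metadata(record):
    """Return a copy of record with values coerced to plain JSON-friendly types."""
    clean = {}
    for key, value in record.items():
        if value is None:
            continue  # assumption: omit missing values instead of sending null
        if isinstance(value, UUID):
            clean[key] = str(value)
        elif isinstance(value, datetime):
            clean[key] = value.isoformat()
        elif isinstance(value, (list, tuple)):
            clean[key] = [str(item) for item in value]
        elif isinstance(value, (str, int, float, bool)):
            clean[key] = value
        else:
            raise TypeError(
                f"unsupported metadata type for {key!r}: {type(value).__name__}"
            )
    return clean
```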

Error Handling & Retry Mechanism:
- Our operations are wrapped in a huey cron task with built-in retry logic
- Upon failure, the task retries 4 times with a 60-second backoff interval
- We maintain logs of each failure and retry attempt

Data Synchronization:
- We maintain a PostgreSQL database as our source of truth
- When these 504 errors occur, it creates inconsistencies between our PostgreSQL data and Pinecone metadata
- The retry mechanism will sometimes not resolve these inconsistencies, so we are concerned about the timeouts.

Please let me know if I can provide more information.

milen · June 12, 2025, 7:58am

Thanks for the detailed information @josh. It seems you have things in good shape on your side.

I checked with our engineering team, and it seems there’s an issue in the AWS infrastructure that is occasionally causing timeouts. They are investigating it now. I’ll update you as soon as I know more.

josh wrote: “The retry mechanism will sometimes not resolve these inconsistencies …”

Can you please elaborate on that? Generally speaking, a successful retry should be able to put things back in order. Do you mean there are cases when none of the four retries succeed? Or is it related to how synchronization works in your case?

Side note: If applicable in your situation, you may consider applying the Saga pattern with compensating actions to avoid inconsistencies in the first place.
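
To make the Saga idea concrete, here is a minimal sketch under simplifying assumptions: each step is a pair of plain callables (an action and its compensating action), and everything runs in one process. The name `run_saga` is hypothetical. In a setup like the one described in this thread, one step could write to PostgreSQL and the next could update Pinecone; if the second step times out, the first is undone so the two stores do not diverge.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        # Roll back in reverse order so the most recent step is undone first.
        for compensation in reversed(completed):
            compensation()
        raise
```

This sketch does not handle a compensating action that itself fails. A production version would need to persist the saga state so that a rollback can be resumed after a crash.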

josh · June 27, 2025, 11:38pm

Hi Milen, apologies for the late response. I just wanted to follow up to see if the issue was resolved. We are still seeing errors on our end occasionally.

For context, sometimes when we try the retry mechanism to send the object to the Pinecone DB, it does indeed fail during the four retries. We are utilizing the djhuey package to make these queues. Please let me know if I could provide more insight. Thank you

milen · June 30, 2025, 10:35am

Hi Josh,

A quick look at the internal repo suggests the issue is still being worked on. I’m checking with the team to see if they have an ETA. However, given the intermittent nature of the problem and its dependency on third-party infrastructure, it may be a while.

josh wrote: “For context, sometimes when we try the retry mechanism to send the object to the Pinecone DB, it does indeed fail during the four retries. We are utilizing the djhuey package to make these queues. Please let me know if I could provide more insight.”

Having several retries in a row fail is concerning. I’m unaware of other users having such an issue. If you can provide some logs that indicate how far apart those retries are, what data is being sent, how long each request takes, what the exact response is, and, in general, anything that would help “debug” the communication, that would be helpful. Feel free to DM me or email me (milen at pinecone.io) if posting such logs publicly is an issue.
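
One possible way to capture the details listed above (attempt number, payload size, duration, and the exact error) is to wrap each call in a small logging helper. This is only a sketch: `timed_call` and the logger name `pinecone_sync` are hypothetical, and `operation` stands in for whatever function performs the actual upsert or update.

```python
import json
import logging
import time

logger = logging.getLogger("pinecone_sync")


def timed_call(operation, payload, attempt):
    """Run operation(payload) and log attempt number, payload size, duration, and outcome."""
    started = time.monotonic()
    record = {
        "attempt": attempt,
        # default=str lets UUIDs and datetimes be measured without failing
        "payload_bytes": len(json.dumps(payload, default=str)),
    }
    try:
        result = operation(payload)
    except Exception as exc:
        record.update(outcome="error", error=repr(exc))
        raise
    else:
        record.update(outcome="ok")
        return result
    finally:
        # Runs on both success and failure, so every attempt leaves one log line.
        record["duration_s"] = round(time.monotonic() - started, 3)
        logger.info("pinecone_call %s", json.dumps(record))
```

The helper logs at INFO level, so the logger needs to be configured to emit that level (for example via `logging.basicConfig(level=logging.INFO)`). Timestamps added by the logging formatter would then show how far apart the retries are.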