Error using pinecone-client on AWS Lambda

ruly.altamirano · October 20, 2023, 5:47pm

I'm getting an error that seems to originate from the creation of a ThreadPool inside the pinecone library. Specifically, it occurs when I try to add data to my Pinecone index using from_documents. It looks like this relies on multiprocessing features that AWS Lambda does not support, since Lambda restricts certain multiprocessing primitives.

Does Pinecone have any setting that lets you turn off the ThreadPool or change how it handles concurrency?

tim · October 20, 2023, 5:59pm

Are you using the actual Pinecone Python library or LangChain's wrapper? I ask because I'm not familiar with a from_documents method in the Pinecone library, but I am with the one in LangChain.

By default, upserts are synchronous (source) and can be made asynchronous using the pool_threads kwarg.

If you're using LangChain, the default value of the pool_threads kwarg is 4 (source code).

So you may need to explicitly pass pool_threads=1, or even None, here. Alternatively, you can do the upsert yourself using the Pinecone library natively. If you are going through LangChain, that default of 4 is why the thread pool ends up with more than one thread.
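To illustrate the "do the upsert yourself" option, here is a minimal sketch of a plain sequential upsert loop with no thread pool involved. The helper names `batched` and `upsert_sequentially` and the batch size of 100 are my own choices for illustration, not part of either library. The only thing it assumes about the index object is an `upsert(vectors=..., namespace=...)` method, which is how the native Pinecone Index is called.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple

# One vector as (id, values, metadata), the tuple form accepted by upsert().
Vector = Tuple[str, List[float], Dict[str, Any]]


def batched(items: Iterable[Vector], size: int) -> Iterator[List[Vector]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def upsert_sequentially(
    index: Any,
    vectors: Iterable[Vector],
    batch_size: int = 100,  # hypothetical default, tune for your payload size
    namespace: Optional[str] = None,
) -> int:
    """Upsert batches one after another with blocking calls.

    `index` is assumed to be a Pinecone Index (or anything exposing the
    same upsert(vectors=..., namespace=...) method). No async_req and no
    thread pool are requested here.
    """
    kwargs: Dict[str, Any] = {}
    if namespace is not None:
        kwargs["namespace"] = namespace
    total = 0
    for batch in batched(vectors, batch_size):
        index.upsert(vectors=batch, **kwargs)
        total += len(batch)
    return total
```

In the Lambda handler you would compute the embeddings for your documents yourself, build the `(id, values, metadata)` tuples, and pass them to `upsert_sequentially` together with your Pinecone index object. It is slower than parallel upserts, but it avoids asking for concurrency at all.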

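Lastly, I think the issue on the AWS Lambda side that prevents you from multiprocessing is that shared memory doesn't exist in the Lambda environment, so multiprocessing.Queue (and the pools built on top of it) won't work. Instead of multiprocessing.Queue you need to use multiprocessing.Pipe.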
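If you do need parallelism in your own Lambda code, the Pipe approach can be sketched roughly like this. This is only an illustration under a few assumptions: `parallel_map` and `_worker` are hypothetical names, it starts one process per item (so keep the list short), and it uses the "fork" start method, which is available on Linux. It does not change how the Pinecone library behaves internally.

```python
import multiprocessing as mp
from typing import Any, Callable, List, Sequence

# Assumption: a Linux runtime where the "fork" start method is available.
_ctx = mp.get_context("fork")


def _worker(func: Callable[[Any], Any], item: Any, conn) -> None:
    """Run func(item) in the child and send the outcome back over the pipe."""
    try:
        conn.send(("ok", func(item)))
    except Exception as exc:  # report the failure instead of dying silently
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def parallel_map(func: Callable[[Any], Any], items: Sequence[Any]) -> List[Any]:
    """Apply func to every item, one process per item, using Process + Pipe.

    No multiprocessing.Queue or Pool is created.
    """
    jobs = []
    for item in items:
        # duplex=False gives (receive-only end, send-only end).
        recv_conn, send_conn = _ctx.Pipe(duplex=False)
        proc = _ctx.Process(target=_worker, args=(func, item, send_conn))
        proc.start()
        send_conn.close()  # the parent only reads
        jobs.append((proc, recv_conn))

    outcomes = []
    for proc, recv_conn in jobs:
        outcomes.append(recv_conn.recv())  # read before join to avoid blocking
        proc.join()
        recv_conn.close()

    errors = [payload for status, payload in outcomes if status == "error"]
    if errors:
        raise RuntimeError(f"{len(errors)} worker(s) failed: {errors[0]}")
    return [payload for _, payload in outcomes]
```

For example, `parallel_map(abs, [-1, 2, -3])` runs three child processes and collects their results in input order. Results come back in order because each item has its own pipe, so there is no shared queue to drain.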

system · November 3, 2023, 5:59pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.