Hi everyone, I have a question about thread usage when embedding data into Pinecone. When I call the add_documents
method, seven new threads are spawned, which seems expected. However, after the operation completes, those threads remain alive, and each subsequent request creates additional threads that are also kept alive. The result is a steady accumulation of threads and growing RAM usage, which eventually crashes my service in production. I've already ensured that both the Pinecone vector store and the embedding instances follow a singleton pattern. Any suggestions or insights would be greatly appreciated!
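For context, here is a minimal, self-contained sketch (no Pinecone or LangChain involved) of the general failure mode I suspect: a thread pool created per request keeps its worker threads alive until it is explicitly shut down, so repeated calls accumulate idle threads exactly like my logs show. The `spawn_without_shutdown` helper is hypothetical, just for illustration:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def spawn_without_shutdown(n):
    # Simulate n requests that each create a fresh pool and never
    # shut it down; each pool's worker thread stays alive, parked
    # on its work queue, until shutdown() or process exit.
    pools = []
    for _ in range(n):
        ex = ThreadPoolExecutor(max_workers=7)
        ex.submit(lambda: None).result()  # force creation of one worker
        pools.append(ex)
    return pools

before = threading.active_count()
pools = spawn_without_shutdown(3)
leaked = threading.active_count() - before   # one leaked worker per "request"
for ex in pools:
    ex.shutdown(wait=True)                   # joins the workers
after = threading.active_count()             # back to the baseline
```

If the client library builds a pool like this per call and never shuts it down, that would explain the accumulation.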
import threading  # needed for the thread inspection below

print(f"Active threads before: {[t.name for t in threading.enumerate()]}")
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
print(f"Active threads 1: {[t.name for t in threading.enumerate()]}")
pages = text_splitter.create_documents([text_input])
print(f"Active threads 2: {[t.name for t in threading.enumerate()]}")

combined_training_text = ""
for doc in pages:
    combined_training_text += doc.page_content + " "  # space between chunks
characters_count = count_characters(combined_training_text)
print(f"Active threads 3: {[t.name for t in threading.enumerate()]}")

pinecone_index = select_pinecone_manager(llm)
print(f"Active threads 4: {[t.name for t in threading.enumerate()]}")
vectorstore = pinecone_index.get_vectorstore(namespace_id)
print(f"Active threads 5: {[t.name for t in threading.enumerate()]}")
vector_ids = vectorstore.add_documents(documents=pages)  # this is where the new threads appear
print(f"Active threads 6: {[t.name for t in threading.enumerate()]}")

return vector_ids, characters_count, combined_training_text[:1000]
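In case it helps anyone reproduce this, the repeated prints above could be replaced with a small snapshot-diff helper that reports exactly which threads appeared between two points. This is a generic sketch (not tied to Pinecone); the demo thread at the bottom is only there to show the usage:

```python
import threading

def thread_snapshot():
    # Names of all currently live threads.
    return {t.name for t in threading.enumerate()}

def new_threads(before):
    # Threads that have appeared since an earlier snapshot.
    return sorted(thread_snapshot() - before)

# Usage sketch: take a snapshot, run the suspect call (here a demo
# thread held open by an Event), then diff.
before = thread_snapshot()
evt = threading.Event()
t = threading.Thread(target=evt.wait, name="demo-worker")
t.start()
appeared = new_threads(before)  # the demo thread shows up here
evt.set()
t.join()
```

Wrapping `add_documents` between `thread_snapshot()` and `new_threads()` would show whether the lingering threads belong to the Pinecone client, the embedding backend, or something else, based on their names.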