I cannot insert new vectors; they end up replacing the old ones. Is this a payment issue?
Hi @evansmumba14, and welcome to the Pinecone community forums!
Thank you for your question.
Could you please share all your relevant code, being careful not to include secrets such as your Pinecone API key? It’s difficult to debug what’s going wrong without being able to see your code.
Based on your description, it sounds like you’re using an embedding model to convert some data to vectors, and then you’re upserting those vectors to your Pinecone index, correct?
Every record you upsert has an ID. If the IDs you provide aren't unique for each vector, an upsert overwrites the existing record that has the same ID instead of adding a new one, which would look exactly like new vectors replacing old ones.
See our guide on updating data.
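To make the ID behavior concrete, here is a minimal sketch that uses a plain dictionary as a stand-in for an index. The `FakeIndex` type and this `upsert` function are hypothetical and are not the Pinecone client; they only model the "same ID overwrites" rule described above.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an index: maps record ID -> vector values.
# This is NOT the Pinecone client, only a model of upsert-by-ID behavior.
FakeIndex = Dict[str, List[float]]


def upsert(index: FakeIndex, records: List[Tuple[str, List[float]]]) -> None:
    """Insert each record, overwriting any existing record with the same ID."""
    for record_id, values in records:
        index[record_id] = values


# Two batches that both number their IDs from zero.
index: FakeIndex = {}
upsert(index, [(f"id_{i}", [0.1 * i]) for i in range(3)])
upsert(index, [(f"id_{i}", [0.9 * i]) for i in range(3)])
reused_count = len(index)  # the second batch replaced the first

# The same batches, with IDs offset by each batch's starting position.
index_unique: FakeIndex = {}
upsert(index_unique, [(f"id_{0 + i}", [0.1 * i]) for i in range(3)])
upsert(index_unique, [(f"id_{3 + i}", [0.9 * i]) for i in range(3)])
unique_count = len(index_unique)

print(reused_count, unique_count)
```

With reused IDs the record count stays at the size of one batch, however many batches are sent; with offset IDs it grows with every batch.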
Hope this helps!
Best,
Zack
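import concurrent.futures
from typing import List

from tqdm import tqdm

# Assumes `pc` (a Pinecone client), `OpenAIEmbeddings`, and `load_and_split_pdf`
# are defined or imported elsewhere, as in the original post.


def process_batch(batch: List[str], start: int, embedding_model, index_name: str, metadata: dict) -> int:
    try:
        embeddings = embedding_model.embed_documents(batch)
        vectors = [
            # Offset each ID by the batch's starting position so IDs are unique
            # across batches. With f"id_{i}" alone, every batch reuses
            # id_0..id_99 and overwrites the records from the previous batch.
            (f"id_{start + i}", embedding, {**metadata, "text": text})
            for i, (text, embedding) in enumerate(zip(batch, embeddings))
        ]
        pc.Index(index_name).upsert(vectors)
    except Exception as e:
        print(f"Error processing batch: {e}")
    return len(batch)


def main(pdf_path: str, index_name: str):
    embedding_model = OpenAIEmbeddings()
    all_splits = load_and_split_pdf(pdf_path)
    batch_size = 100
    metadata = {
        "book": "..",
        "author": "..",
        "subject": "...",
        "keywords": "...",  # value was missing in the original post
    }
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = []
        with tqdm(total=len(all_splits), desc="Uploading to Pinecone") as pbar:
            for i in range(0, len(all_splits), batch_size):
                batch = all_splits[i:i + batch_size]
                # Pass the batch's starting offset so IDs continue across batches.
                future = executor.submit(process_batch, batch, i, embedding_model, index_name, metadata)
                futures.append(future)
            for future in concurrent.futures.as_completed(futures):
                # Advance by the actual batch size; the last batch may be short.
                pbar.update(future.result())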
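For IDs that stay unique across batches, and also across several PDFs uploaded to the same index, one option is to derive each ID from the source name plus the chunk's position in the whole document. The sketch below assumes that scheme; `make_chunk_id` and `batched_with_ids` are hypothetical helper names, not part of Pinecone or LangChain, and the file name is made up for illustration.

```python
from typing import Iterator, List, Tuple


def make_chunk_id(source: str, position: int) -> str:
    """Build an ID from the source name and the chunk's global position."""
    return f"{source}#chunk-{position}"


def batched_with_ids(
    chunks: List[str], source: str, batch_size: int
) -> Iterator[List[Tuple[str, str]]]:
    """Yield batches of (id, text) pairs whose IDs never repeat across batches."""
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        # start + i is the chunk's index in the full list, not within the batch.
        yield [(make_chunk_id(source, start + i), text) for i, text in enumerate(batch)]


# Hypothetical input: five chunks from one file, uploaded two at a time.
chunks = [f"chunk text {n}" for n in range(5)]
batches = list(batched_with_ids(chunks, "my-book.pdf", batch_size=2))
all_ids = [chunk_id for batch in batches for chunk_id, _ in batch]
print(all_ids)
```

Because the IDs are deterministic, re-running the upload for the same file overwrites that file's own records rather than creating duplicates, while a different file gets a separate ID range.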