Upserting to Pinecone Index

Hi, please help me. I am losing my mind. I am trying to upload data into my Pinecone db but have no idea what I’m doing. When I run the script, this is the error I get:

  File "C:\Users\lee__\AppData\Local\Programs\Python\Python311\Lib\ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\lee__\PycharmProjects\pinecone-tutorial\ConvertCSVtoJSON.py", line 52, in <module>
    index.upsert(vectors=batch)
  File "C:\Users\lee__\PycharmProjects\pinecone-tutorial\venv\Lib\site-packages\pinecone\core\utils\error_handling.py", line 25, in inner_func
    raise PineconeProtocolError(f'Failed to connect; did you specify the correct index name?') from e
pinecone.core.exceptions.PineconeProtocolError: Failed to connect; did you specify the correct index name?

Process finished with exit code 1

Can someone please help me?

This is my code:

import pinecone
import json
import numpy as np

# Load the JSON file with 'utf-8' encoding
with open('dataset_pinecone-new_2023-06-13_06-51-26-875.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

# Assume you have a function to convert text into a high-dimensional vector
def convert_to_vector(text):
    # your code to convert text into vector
    # for example, using a transformer model
    vector = np.random.randn(1536).tolist()  # generate a random vector as a list
    return vector

# Initialize Pinecone client
pinecone.init(api_key='xxx')

# Define an index
index_name = "xxx-lib"

# Instantiate index
index = pinecone.Index(index_name=index_name)

# Generate the vectors
vectors = []
for item in data:
    vector = convert_to_vector(item['text'])
    if isinstance(vector, list):
        vector_object = {
            'id': str(item['url']),
            'values': vector,
            'metadata': {'text': item['text']}
        }
        vectors.append(vector_object)
    else:
        print(f"Skipping item {item['url']} due to invalid vector {vector}")

# Print the first 5 items to verify data format
print(vectors[:5])  # print the first 5 items

# Break data into smaller chunks and upload
batch_size = 100
for i in range(0, len(vectors), batch_size):
    batch = vectors[i:i + batch_size]
    # Print the type of the batch and the type of a sample vector
    print(f'Type of batch: {type(batch)}')
    sample_vector = batch[0]
    print(f'Type of vector for id "{sample_vector["id"]}": {type(sample_vector["values"])}')
    print(batch)  # print the batch to verify the format
    index.upsert(vectors=batch)

# Deinitialize the client when you’re done
pinecone.deinit()
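
For reference, the batching step in the script can be pulled out into a small helper and checked without contacting Pinecone at all, which helps rule out the data format as the cause of the connection error. This is only a sketch: `chunked` is a hypothetical helper name, not part of the pinecone client, and the stand-in records use 3 dimensions instead of 1536.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Stand-in records with the same keys as the vector objects in the script
# (assumption: 3 dimensions instead of 1536 to keep the example short).
records = [{"id": f"doc-{n}", "values": [0.0, 0.0, 0.0]} for n in range(250)]

batch_sizes = [len(batch) for batch in chunked(records, 100)]
print(batch_sizes)  # [100, 100, 50]
```

In the script, each batch produced this way would be passed to `index.upsert(vectors=batch)` exactly as before.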

You have to provide the environment your project is in, too. Pinecone doesn’t offer automatic API key-based routing today.
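
A minimal sketch of what passing the environment could look like, assuming the 2.x Python client that appears in the traceback above (where `pinecone.init` takes both `api_key` and `environment`). The variable names `PINECONE_API_KEY` and `PINECONE_ENVIRONMENT` and the helper names `load_pinecone_settings` and `connect` are assumptions for illustration; the environment value (something like `us-west1-gcp`) should be the one shown for your project in the Pinecone console.

```python
import os


def load_pinecone_settings(env=None):
    """Collect the two values pinecone.init needs and fail early if one is missing.

    `env` is any mapping; it defaults to the process environment.
    The variable names below are assumptions, not something the client requires.
    """
    env = os.environ if env is None else env
    settings = {
        "api_key": env.get("PINECONE_API_KEY"),
        "environment": env.get("PINECONE_ENVIRONMENT"),
    }
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError(
            "Missing Pinecone settings: "
            + ", ".join(f"PINECONE_{name.upper()}" for name in missing)
        )
    return settings


def connect(index_name):
    """Initialise the client with both api_key and environment, then open the index."""
    import pinecone  # imported here so the settings check above runs without the package

    pinecone.init(**load_pinecone_settings())
    return pinecone.Index(index_name)
```

With both variables set, `index = connect("xxx-lib")` would replace the `pinecone.init(api_key='xxx')` and `pinecone.Index(...)` lines in the script.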

Thanks Cory. Do you mean environmental variables or the server in Pinecone?

Hi Lee, were you able to solve this issue? I am getting the same error even when I add the environment and project name, like this:

pinecone.init(project_name=project_name, api_key=api_key, environment=environment)

@ankur.gupta I don’t see the project_name parameter in the init docs. Maybe try removing that and see if you’re still getting the error?

@ankur.gupta Solved this in another thread. Looks like everyone was having this issue today!
