Hi, please help me, I am losing my mind. I am trying to upload data into my Pinecone db but have no idea what I'm doing. When I run the script, this is the error I get:
  File "C:\Users\lee__\AppData\Local\Programs\Python\Python311\Lib\ssl.py", line 1346, in do_handshake
    self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\lee__\PycharmProjects\pinecone-tutorial\ConvertCSVtoJSON.py", line 52, in <module>
    index.upsert(vectors=batch)
  File "C:\Users\lee__\PycharmProjects\pinecone-tutorial\venv\Lib\site-packages\pinecone\core\utils\error_handling.py", line 25, in inner_func
    raise PineconeProtocolError(f'Failed to connect; did you specify the correct index name?') from e
pinecone.core.exceptions.PineconeProtocolError: Failed to connect; did you specify the correct index name?

Process finished with exit code 1
Can someone please help me?
This is my code:
import pinecone
import json
import numpy as np

# Load the JSON file with 'utf-8' encoding
with open('dataset_pinecone-new_2023-06-13_06-51-26-875.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

# Assume you have a function to convert text into a high-dimensional vector
def convert_to_vector(text):
    # your code to convert text into vector
    # for example, using a transformer model
    vector = np.random.randn(1536).tolist()  # generate a random vector as a list
    return vector

# Initialize Pinecone client
pinecone.init(api_key='xxx')

# Define an index
index_name = "xxx-lib"

# Instantiate index
index = pinecone.Index(index_name=index_name)

# Generate the vectors
vectors = []
for item in data:
    vector = convert_to_vector(item['text'])
    if isinstance(vector, list):
        vector_object = {
            'id': str(item['url']),
            'values': vector,
            'metadata': {'text': item['text']}
        }
        vectors.append(vector_object)
    else:
        print(f"Skipping item {item['url']} due to invalid vector {vector}")

# Print the first 5 items to verify data format
print(vectors[:5])  # print the first 5 items

# Break data into smaller chunks and upload
batch_size = 100
for i in range(0, len(vectors), batch_size):
    batch = vectors[i:i + batch_size]
    # Print the type of the batch and the type of a sample vector
    print(f'Type of batch: {type(batch)}')
    sample_vector = batch[0]
    print(f'Type of vector for id "{sample_vector["id"]}": {type(sample_vector["values"])}')
    print(batch)  # print the batch to verify the format
    index.upsert(vectors=batch)

# Deinitialize the client when you're done
pinecone.deinit()
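
In case it helps anyone reproduce this, here is a minimal sketch of just the batching step from my script, with no Pinecone calls at all, so it can be run on its own. The helper names `chunked` and `approx_json_bytes` are made up for illustration and are not part of the Pinecone client, and the sample records are dummy values shaped like mine. I do not know whether request size has anything to do with the error; this only shows how many vectors and roughly how many bytes each batch would contain.

```python
import json
from typing import Iterator, List


def chunked(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def approx_json_bytes(batch: List[dict]) -> int:
    """Rough size of the batch when serialized as UTF-8 JSON (an estimate only)."""
    return len(json.dumps(batch).encode("utf-8"))


if __name__ == "__main__":
    # Dummy records shaped like the ones in my script (made-up values).
    sample = [
        {"id": f"doc-{n}", "values": [0.0] * 4, "metadata": {"text": "hello"}}
        for n in range(250)
    ]
    for number, batch in enumerate(chunked(sample, 100)):
        # Batch index, number of vectors, approximate serialized size in bytes
        print(number, len(batch), approx_json_bytes(batch))
```

Running it against the real `vectors` list instead of `sample` would show whether some batches are much larger than others, for example because of long `text` values in the metadata.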