Hi @saugatthapa344 and welcome to the Pinecone community forums.
Thanks for your question and for providing a screenshot.
It's a little difficult to read your screenshot - could you please paste all your relevant code here?
The most common issues preventing upsert when using LangChain are an improperly formatted vectors variable or mismatched dimensionality (for example, your embedding model outputs vectors of 1536 floating point numbers, but your Pinecone index is set to a dimension of 384).
From squinting at your screenshot, it appears that the serialization error is happening within the PineconeVectorStore class and it's unable to serialize what it expects to be a properly formatted object - so this makes me suspect the data you're passing in is incorrectly formatted.
I'd recommend:
- Adding print statements after every line of code you have so you can verify the format of your data structures
- Pasting all your relevant code here for us to review - being careful not to include any secrets like your Pinecone API key
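As a rough sketch of the kind of check I mean, something like the following could be run on your embeddings before they reach the vector store. The helper name `check_vectors`, the `expected_dim` parameter, and the sample data are hypothetical and only for illustration, and it assumes the client wants each vector as a plain Python list of numbers whose length matches the index dimension.

```python
from numbers import Real


def check_vectors(vectors, expected_dim):
    """Return a list of human-readable problems; an empty list means no issues were found."""
    problems = []
    for i, vec in enumerate(vectors):
        # Assumption: each vector should be a plain Python list, not another sequence type.
        if not isinstance(vec, list):
            problems.append(f"vector {i}: expected list, got {type(vec).__name__}")
            continue
        # The length must match the dimension the index was created with.
        if len(vec) != expected_dim:
            problems.append(f"vector {i}: expected {expected_dim} values, got {len(vec)}")
        # Every entry should be a real number (bool is excluded on purpose).
        if not all(isinstance(x, Real) and not isinstance(x, bool) for x in vec):
            problems.append(f"vector {i}: contains non-numeric values")
    return problems


# Hypothetical sample data standing in for the output of your embedding model.
sample = [[0.1, 0.2, 0.3], (0.1, 0.2, 0.3), [0.1, 0.2]]
for problem in check_vectors(sample, expected_dim=3):
    print(problem)
```

In your own code you would pass the vectors returned by your embedding model and the dimension configured on your index. If the function reports a type other than `list`, that would point to the same kind of serialization problem shown in your screenshot.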
Hope that helps, and looking forward to your response!
Best,
Zack
Hello @ZacharyProser sir, here is the code. Please help.
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
loader = PyPDFLoader("/content/Research_of_YOLO_Architecture_Models_in_Book_Detec.pdf")
pdf_pages = loader.load()
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size=500,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)
text_chunks = text_splitter.split_documents(pdf_pages)
import os
from getpass import getpass

# Get Google API key from environment variable or set it if not present
api_key = os.environ.get("GOOGLE_API_KEY")
if not api_key:
    api_key = getpass("Provide your Google API key here: ")
    os.environ["GOOGLE_API_KEY"] = api_key

# Print to verify the API key (for debugging purposes only, remove in production)
print(f"Google API key set: {os.environ['GOOGLE_API_KEY']}")
from pinecone import Pinecone
pc = Pinecone(api_key="XXXXX")
index = pc.Index("chatbot")
import pinecone
index = pinecone.Index(index, host="https://chatbot-658rjfl.svc.aped-4627-b74a.pinecone.io")
from langchain_pinecone import PineconeVectorStore
from langchain.vectorstores import Pinecone
from langchain_community.document_loaders import TextLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
os.environ["PINECONE_API_KEY"] = "your api key"
index_name = "chatbot"
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
docsearch = PineconeVectorStore.from_texts(
    [t.page_content for t in text_chunks],
    index_name=index_name,
    embedding=embeddings
)
docsearch
docsearch.as_retriever()
query = "YOLOv7 outperforms which models?"
docs = docsearch.similarity_search(query)
usr/local/lib/python3.10/dist-packages/pinecone/core/client/api_client.py in sanitize_for_serialization(cls, obj)
    286         if isinstance(obj, dict):
    287             return {key: cls.sanitize_for_serialization(val) for key, val in obj.items()}
--> 288         raise PineconeApiValueError('Unable to prepare type {} for serialization'.format(obj.__class__.__name__))
    289
    290     def deserialize(self, response, response_type, _check_type):
PineconeApiValueError: Unable to prepare type Repeated for serialization
@saugatthapa344 as a concerned community member, please do not paste your Pinecone API key in plain text on the forum. I very much recommend this post is removed.
@tjensen thank you for the reminder, sir, and sorry about that. Can you please help me with the code?
I have edited the post to remove the API key. @saugatthapa344 Please delete that API key and create a new one.
I deleted the API key, @ZacharyProser. Can you please help me with the error?