Hi there! This is a great use case, and I hope I can offer a few options here.
In your code snippet, you're creating a Document object with a pandas DataFrame as its content. However, Pinecone's serialization doesn't natively support DataFrame objects, hence the error. The two steps below should help.
Convert DataFrame to a Serializable Format:
Before storing the DataFrame in Pinecone, convert it to a format that Pinecone can understand. Common serializable formats include JSON or string representations.
You can convert a DataFrame to a JSON string using the to_json() method, or to a simple string using the to_string() method provided by pandas.
json_string = current_df.to_json(orient="split")
# or
string_representation = current_df.to_string()
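If you later need the DataFrame back, the "split" JSON can be rebuilt after retrieval. Here is a minimal sketch, assuming a small made-up current_df (the column names and values are placeholders, not taken from your data):

```python
import json

import pandas as pd

# Placeholder data standing in for your real current_df (assumption).
current_df = pd.DataFrame({"name": ["alice", "bob"], "score": [90, 85]})

# Serialize to a plain JSON string that can be stored as document content.
json_string = current_df.to_json(orient="split")

# orient="split" yields the keys "columns", "index" and "data",
# which match the DataFrame constructor's keyword arguments.
restored_df = pd.DataFrame(**json.loads(json_string))
```

Note that to_string() is easier to read but is not meant to be parsed back, so JSON is likely the safer choice if you need the table again.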
Store String Representation:
Update the Document initialization to use the string or JSON representation of the DataFrame instead of the DataFrame object itself.
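As a rough sketch of that change, the fields for the Document could be prepared as below. build_document_fields, the meta keys, and "example.csv" are hypothetical names for illustration, and the commented-out Document call assumes your Haystack version accepts content and meta keyword arguments:

```python
import pandas as pd


def build_document_fields(df: pd.DataFrame, source: str) -> dict:
    """Hypothetical helper: return Document keyword arguments with string content."""
    return {
        "content": df.to_json(orient="split"),  # a str, not a DataFrame
        "meta": {"source": source, "content_format": "json-split"},
    }


# Placeholder data standing in for your real current_df (assumption).
current_df = pd.DataFrame({"name": ["alice", "bob"], "score": [90, 85]})
doc_fields = build_document_fields(current_df, source="example.csv")

# With Haystack installed, the fields would then be passed along like this:
# from haystack import Document
# doc = Document(**doc_fields)
```

Recording the format in meta is optional, but it makes it easier to know how to decode the content when the document comes back from a query.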
Finally, make sure Haystack is able to digest this format. You may need to adjust how you upsert and query the documents to account for the JSON representation. I hope this helps you get up and running!