I'm looking forward to test out similarity search on pinecone.

jdranpariya · November 28, 2022, 2:01pm

I would like to know more about how can I Import sql table column/ spark DataFrame column as input and which further I want to vectorize it(only way I know of now is tf-idf let me know if you know how to do word embeddings at 2 or 3 grams level). And than get the closest match for the query I perform.

Regards,
Jaydeep

kbutler · November 28, 2022, 4:45pm

Hello jdranpariya, welcome to the community. Here is a specific example that works with pandas dataframes and imports flat file data. https://www.pinecone.io/docs/examples/semantic-text-search/
There are many ways to process data and create embeddings. There are even third party services that can help you do this such as Cohere.ai and HuggingFace. Best of luck and let us know how it goes!