I want to have a SQL table and want to convert it into Pinecone and allow user ask questions.
I am gonna using Pinecone and Langchain, OpenAI for this project.
My table looks like this :
company name, members, annual income
A, 39, 4000
B, 45, 5000
…
And I want to allow user ask these kind of question and get answer.
Sample question :
Which company have most members ?
I can convert SQL table into text and can ingest it and save it into Pinecone vector database.
But I am not sure if I can get correct reply using Pinecone similarity search.
In my experience, by using Pinecone similarity search we can get correct answer for this kind of questions :
What is annual income of company A ?
Because Pinecone similarity search can get correct chunk about this question based on similarity search and we can get correct answer by sending this question to OpenAI.
Please help me.
Thanks.
Hi @moongates5000, thanks for the post. This should be possible, though, as you note, you must convert the SQL table into text. Moreover, you would likely be better off converting the information into natural language. For example, instead of “Company XYZ, 500 employees, $80k”, you should embed the text, “Company XYZ has 500 employees, and the average annual salary is $80k.”
You also should explore using metadata and passing the retrieved metadata as context to the LLM after you perform your similarity search.
This said you may want to explore using a tabular database if you are interested in comparing metrics, such as asking what company has the most members. I encourage you to check out this post from the Open AI forums. The participants discuss some of the natural limitations of using embeddings with tabular data and how data that follows a schema, such as what you describe, is often better suited for a more traditional database.