Dear Colleagues,
I read in a number of documents from a hotel describing its services. I created the chunks and used OpenAIEmbeddings. Then I initialized Pinecone:
# initialize pinecone
import pinecone
from langchain.vectorstores import Pinecone

pinecone.init(
    api_key=PINECONE_API_KEY,  # find at app.pinecone.io
    environment=PINECONE_API_ENV,
)

index_name = "hotel"
docsearch = Pinecone.from_texts(
    [t.page_content for t in texts], embeddings, index_name=index_name
)
My query now looks like this:
import os

from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

llm = OpenAI(temperature=0, openai_api_key=os.environ["OPENAI_API_KEY"])
chain = load_qa_chain(llm, chain_type="stuff")

query = "What are the room categories of the hotel?"
docs = docsearch.similarity_search(query, include_metadata=True)
chain.run(input_documents=docs, question=query)
and the results are something like "The hotel offers Komfort Doppelzimmer, Turmstudio, and Wohnstudio rooms."
My questions:
- How can I choose the model that generates the results? I would like to use GPT-4.
- Am I right that I am currently rebuilding the index every time I run this? If I want to deploy it and only search in Pinecone without rebuilding the index, how would I do that?
- Finally, I would like a simple Streamlit app. What is needed so that I don't rebuild the index, and instead only embed the user's input, search for similar vectors, and get better results using GPT-4?
Many questions… I know… Thank you!