Query engine returns 'Empty Response' from Pinecone

I have been searching and scratching my head for more than a week.

Scenario:

I have markdown and text files stored in Pinecone, and I would like to read those vectors back from Pinecone later. I am using LlamaIndex.

Issues:

The markdown and text files are vectorized and stored in Pinecone successfully, but when I query them back, I get an empty response.

Code:

Setting up the connection:

pc = Pinecone(api_key=PINECONE_API_KEY)

pc.create_index(
    name='index-data',
    dimension=1536,
    metric='cosine',
    spec=ServerlessSpec(
        cloud='aws',
        region=AWS_REGION_NAME,
    )
)

pc_index = pc.Index('index-data')

There are multiple files, each stored in its own namespace.

Settings.llm = llm_model
Settings.embed_model = embed_model
Settings.chunk_size = 512

This code upserts to Pinecone:

namespace = file_name.rsplit('.', 1)[0]
reader = SimpleDirectoryReader(input_files=[file_path])
documents = reader.load_data()
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)
vector_store = PineconeVectorStore(pinecone_index=pc_index, namespace=namespace)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(nodes, service_context=Settings, storage_context=storage_context)

This code loads vectors from Pinecone:

vector_store = PineconeVectorStore(pinecone_index=pc_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_vector_store(vector_store, service_context = Settings, storage_context=storage_context, embedding_model=Settings.embed_model)

The index stats show that all files are vectorized and stored in Pinecone:

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'A': {'vector_count': 193},
                'B': {'vector_count': 42},
                'C': {'vector_count': 82},},
 'total_vector_count': 361}

This is the query retriever, which returns an empty response:

user_prompt = "what is acoustic"
query_engine = index.as_query_engine()
init_response = query_engine.query(user_prompt)
result_str = str(init_response)
print (result_str)

→ Empty Response

Just for reference, I am able to get query results back if I use LlamaIndex in-memory storage.

if not os.path.exists(PERSIS_DIR):
    #create the new index
    reader = SimpleDirectoryReader(input_dir=input_directory, recursive=True)
    documents = reader.load_data()
    parser = SimpleNodeParser()
    nodes = parser.get_nodes_from_documents(documents)
    Settings.llm = llm_model
    Settings.embed_model = embed_model
    Settings.chunk_size = 512
    storage_context = StorageContext.from_defaults()
    index = VectorStoreIndex(nodes, service_context = Settings, storage_context=storage_context)
    index.storage_context.persist(persist_dir=PERSIS_DIR)
else:
    Settings.llm = llm_model
    Settings.embed_model = embed_model
    Settings.chunk_size = 512
    storage_context = StorageContext.from_defaults(persist_dir=PERSIS_DIR)
    index = load_index_from_storage(service_context = Settings, storage_context=storage_context)

Questions:

How can I vectorize a file and store it in Pinecone?
How can I retrieve vectors from Pinecone?
What is the best way to write a prompt query?

Thanks

Hi @kprosera, sounds like you've been pretty persistent working through options to sort this out. Looking at your specific issues:

  • If you’re using namespaces for each file, make sure you’re specifying the correct namespace when querying. This is key when initializing your PineconeVectorStore: your loading code creates PineconeVectorStore(pinecone_index=pc_index) without a namespace, so queries run against the empty default namespace, which is exactly what produces "Empty Response" (see the sketch after this list).
  • Check that your index is correctly populated by reviewing the index stats after you upload your data.
  • For best results with prompts, retrieve relevant context using the retriever first. Then, use that context to craft a more informed query.
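
For instance, a minimal sketch of querying one namespace (using 'A' from your stats output; pc_index is your existing Pinecone index handle):

python

from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Point the vector store at the SAME namespace the file was upserted into;
# the loading code omitted it, so queries hit the empty default namespace.
vector_store = PineconeVectorStore(pinecone_index=pc_index, namespace='A')
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

query_engine = index.as_query_engine()
print(query_engine.query("what is acoustic"))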

To properly use LlamaIndex with Pinecone:

1.) Initialize Pinecone and create an index:

python

from pinecone.grpc import PineconeGRPC as Pinecone
from pinecone import ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

pc.create_index(
    name="example-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)
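
After upserting, you can sanity-check what actually landed in the index (a small sketch assuming the index name above):

python

# Connect to the existing index and inspect per-namespace vector counts.
pc_index = pc.Index("example-index")
print(pc_index.describe_index_stats())

# If all vectors sit in named namespaces (A, B, C) and the default
# namespace is empty, querying without a namespace returns nothing.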

2.) Query data using LlamaIndex and Pinecone:

python

from llama_index.core import VectorStoreIndex
from llama_index.core.retrievers import VectorIndexRetriever

# Create VectorStoreIndex from your vector_store
vector_index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

# Configure retriever with number of results
retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=5)

# Query vector DB
answer = retriever.retrieve('your query here')
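
Before involving the LLM, it helps to print the retrieved nodes and confirm something is actually coming back; a small sketch using the retriever above:

python

# retrieve() returns NodeWithScore objects; inspect score and text.
for n in answer:
    print(n.score, n.node.get_content()[:200])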
3.) For querying, you can create a query engine using the RetrieverQueryEngine:

python

from llama_index.core.query_engine import RetrieverQueryEngine

query_engine = RetrieverQueryEngine(retriever=retriever)
response = query_engine.query('your query here')
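
On writing the prompt itself: as mentioned in the list above, one pattern is to retrieve context first and then build the final prompt explicitly. A sketch under that assumption (the prompt wording here is just an illustration):

python

from llama_index.core import Settings

# Retrieve context, then place it into an explicit prompt for the LLM.
nodes = retriever.retrieve("what is acoustic")
context = "\n\n".join(n.node.get_content() for n in nodes)

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    "Question: what is acoustic"
)
print(Settings.llm.complete(prompt).text)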