Relevant query to a single-document Index returns zero results

Hello Friends:

I’m a newbie to Pinecone, so your help is greatly appreciated.

Here are the details of my issue:

  1. Via the WebUI, I manually created the following index, specifying the text-embedding-3-large model: my-test-index

  2. Then, using the first Python script below, I successfully uploaded a ./FAQ.txt document to it. It is the only document in the index, and its contents appear at the very bottom of this post.

  3. Finally, using the second Python script below, I queried the index with variations of the text: What classes are offered at your facility?

The issue I’m facing is that I always receive this response:

{'matches': [{'id': 'DOC0001', 'score': 0.576828599, 'values': []}],
 'namespace': 'namespace01',
 'usage': {'read_units': 6}}

Naturally, I’m missing something and would appreciate your guidance. Thank you in advance. :blush:

(1) Document upsert Python script:

#! /usr/bin/env python3
import openai
from pinecone import Pinecone, ServerlessSpec

# ----------------------------------------------------------------------------------
openai.api_key = 'sk-proj-yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy'
PINECONE_API_KEY = 'pcsk_xxxxxx_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
(DOC_PATH, DOC_UNIQUE_ID, DOC_PINECONE_NAMESPACE) = ('./FAQ.txt', 'DOC0001', 'namespace01') 
(PINECONE_INDEX, PINECONE_INDEX_MODEL, PINECONE_ENVIRONMENT) = ('my-test-index',
                                                                'text-embedding-3-large',
                                                                'us-east-1')
# ----------------------------------------------------------------------------------

def get_openai_embeddings(text):
    response = openai.embeddings.create(input=text, model=PINECONE_INDEX_MODEL)
    return response.data[0].embedding

pc = Pinecone(api_key = PINECONE_API_KEY)
if not pc.has_index(PINECONE_INDEX):
    pc.create_index(name=PINECONE_INDEX,
                    dimension=3072,
                    metric="cosine",
                    deletion_protection="disabled",
                    spec=ServerlessSpec(cloud="aws", region="us-east-1"))

with open(DOC_PATH, "r", encoding="utf-8") as f:
    text = f.read()
embedding = get_openai_embeddings(text)
index = pc.Index(PINECONE_INDEX)
index.upsert([(DOC_UNIQUE_ID, embedding),], namespace = DOC_PINECONE_NAMESPACE)
print(f"Document with ID '{DOC_UNIQUE_ID}' has been upserted into the Pinecone index.")

(2) Query Python script:

from pinecone import Pinecone, ServerlessSpec
import openai

OPENAI_API_KEY = 'sk-proj-yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy'
OPENAI_EMBEDDING_MODEL = 'text-embedding-3-large'
PINECONE_API_KEY = 'pcsk_xxxxxx_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
PINECONE_INDEX = 'my-test-index'
PINECONE_ENVIRONMENT = 'us-east-1'
PINECONE_NAMESPACE = "namespace01"
openai.api_key = OPENAI_API_KEY

pc = Pinecone(api_key = PINECONE_API_KEY)
index = pc.Index(PINECONE_INDEX)

def get_embedding(text: str, model: str = OPENAI_EMBEDDING_MODEL) -> list:
    response = openai.embeddings.create(input=text, model=model)
    return response.data[0].embedding

def query_pinecone(embedding: list, top_k: int = 5) -> list:
    query_response = index.query(
        namespace=PINECONE_NAMESPACE,
        top_k=top_k,
        vector=embedding,
        include_metadata=True
    )
    return [f"{query_response['namespace']}: {match}" for match in query_response['matches']]

def query_with_natural_language(query: str, top_k: int = 5) -> list:
    embedding = get_embedding(query, model=OPENAI_EMBEDDING_MODEL)
    responses = query_pinecone(embedding, top_k=top_k)
    return responses

if __name__ == "__main__":
    user_query = "What classes are offered at your facility?"
    responses = query_with_natural_language(user_query, top_k=5)

    print("\nTop Responses:")
    for i, response in enumerate(responses, start=1):
        print(f"{i}. {response}")

(3) Document contents:

Synergy Fitness Gym FAQ.
 
Q1: What are your hours of operation?
A1: Our gym is open Monday to Thursday from 5am to 10pm, Friday from 5am to 9pm, Saturday from 8am to 7pm, and Sunday from 9am to 6pm.


Q2: Do I need a membership to use the facilities?
A2: No, we offer day passes for out-of-town visitors or those who prefer not to commit to a membership. Day passes are available online or at the front desk.


Q3: What types of cardio equipment do you have?
A3: Our gym features treadmills, elliptical machines, stationary bikes, and rowing machines.


Q4: Can I take group fitness classes if I'm not a member?
A4: Yes, many of our group fitness classes are open to non-members. Please check the schedule at the front desk or online for availability.


Q5: How do I sign up for personal training sessions?
A5: To schedule a personal training session, please visit our website and fill out the online form. Our staff will contact you to arrange a time that suits your schedule.


Q6: Do you offer childcare services?
A6: Unfortunately, we do not offer on-site childcare services. However, there are several nearby daycare centers and family-friendly activities available in the area.


Q7: Can I bring my dog to the gym?
A7: No, dogs are not allowed inside the gym with the exception of service animals.


Q8: What is your policy on food and beverages?
A8: We have a small selection of healthy snacks and beverages for sale at our cafe. Outside food and drinks are permitted in designated areas only.


Q9: How do I cancel or modify my membership?
A9: To cancel or modify your membership, please contact our customer service team via phone or email.


Q10: Do you have showers and locker rooms available for members?
A10: Yes, we have clean and well-maintained showers and locker rooms available for use by all members.


Q11: Can I reserve a spot on a machine or equipment?
A11: Yes, we offer a reservation system for high-demand equipment such as treadmills and stationary bikes. Please check our website or mobile app for availability.


Q12: Do you offer discounts for students, seniors, or military personnel?
A12: Yes, we offer discounted membership rates for students, seniors, and military personnel with valid ID.


Q13: How do I track my workouts and progress?
A13: We provide access to our online fitness tracking platform, where you can log your workouts, view your progress, and set goals.


Q14: Can I rent equipment or machines if mine is not in working order?
A14: Yes, we offer a rental program for equipment that is not available for purchase. Please contact our staff to arrange a rental.


Q15: Do you have any special events or activities planned throughout the year?
A15: Yes, we regularly host workshops, seminars, and fitness events. Check our website or social media accounts for upcoming events and schedule updates.

Q16: How much does membership cost?
A16: There is a month-to-month membership costing $25/month and an annual membership costing $275/year.

Q17: What Gym classes do you offer?
A17: We currently offer five (qty. 5) classes. They are: (1) Beginners Class, (2) Intermediate Class, (3) Cycling, (4) Jazzercise, and (5) Zumba.

Hi @nmvega, welcome to the forum!

The response object you’re getting back is what I would expect (see docs), so I think there might be something going on with how the data was indexed. The empty values in the response suggest that no vector data is being stored or retrieved during the query. Let’s start by ruling out any issues with the query script. Can you try fetching the vector data by ID:

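from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("my-test-index")  # open a handle to the index

# fetch() expects a list of IDs
print(index.fetch(ids=["DOC0001"], namespace="namespace01"))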

This should return the vector record and values. If the values are empty on the fetch, you can start to troubleshoot the embedding/upsert script. If there are indeed values in the vector record, we can start to troubleshoot the second query script.
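If it helps, here is a minimal sketch of that check as a reusable helper. It assumes the fetch response exposes a `vectors` mapping from ID to a record with a `values` list, which is how the Python SDK behaves as far as I know. The name `stored_dimension` is hypothetical and not part of the Pinecone library.

```python
def stored_dimension(index, doc_id, namespace):
    """Return the length of the vector stored under doc_id, or 0 if none is found.

    `index` is assumed to be a Pinecone index handle, e.g. pc.Index("my-test-index").
    """
    response = index.fetch(ids=[doc_id], namespace=namespace)
    # Assumption: response.vectors is a dict keyed by record ID.
    record = response.vectors.get(doc_id)
    if record is None or not record.values:
        return 0
    return len(record.values)
```

For an index built for text-embedding-3-large with dimension=3072, `stored_dimension(index, "DOC0001", "namespace01")` should come back as 3072 if the upsert worked, and 0 if the record is missing or empty.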


@nmvega - Actually, if you try adding include_values=True to your query, the values should print. Try this first!
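For reference, a rough sketch of the adjusted query. Note that the original response already contained a match (DOC0001 with a score of about 0.577); only the values list was empty, because values are not returned unless requested. The helper name `summarize_matches` is hypothetical, and the dict-style access mirrors what the query script in this thread already does, so treat it as an assumption about the response object rather than a guarantee.

```python
def summarize_matches(index, embedding, namespace, top_k=5):
    """Query the index and return (id, score, number_of_values) for each match.

    `index` is assumed to be a Pinecone index handle and `embedding` a list of floats.
    """
    response = index.query(
        namespace=namespace,
        top_k=top_k,
        vector=embedding,
        include_values=True,    # ask for the stored vector with each match
        include_metadata=True,  # also return metadata, if any was upserted
    )
    return [
        (match["id"], match["score"], len(match["values"]))
        for match in response["matches"]
    ]
```

With `include_values=True`, the third element of each tuple should equal the index dimension (3072 here) instead of 0.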


Hi @lauren.s. Indeed, your suggestion to add include_values=True remedied my issue (I’m still familiarizing myself with the library and API).

Now I can turn my attention back to my n8n instance and resume troubleshooting why the same query isn’t returning anything there, having confirmed proper behavior under the hood with Python.

Thank you. :blush: I marked your answer as the solution.
