The presence of metadata affects vector retrieval negatively

I created two indexes in a new Pinecone project. They’re identical except that one has metadata and the other doesn’t. Here’s the code I ended up using. Please excuse any naiveté: I’m a long-time developer but new to vectors and LLMs.

from keys import PINECONE_API_KEY, PINECONE_ENV
import openai, pinecone, json

pinecone.init(api_key=PINECONE_API_KEY, environment=PINECONE_ENV)

# Embed the question once so both indexes are queried with the same vector.
query = openai.Embedding.create(input="What is BC's ESG rating?",
    model="text-embedding-ada-002")["data"][0]["embedding"]
indexes = ['test-with-metadata', 'test-no-metadata']

# Run the identical query against each index and print the top 4 matches.
for index_name in indexes:
    index = pinecone.Index(index_name)
    xc = index.query(query, top_k=4, include_metadata=True)
    for result in xc['matches']:
        print(f"{'='*80}\nIndex: {index_name}")
        print(f"{round(result['score'], 2)} (id: {result['id']}):")
        print(json.dumps(result['metadata'], indent=4))

Here’s a link to the output of that code. Note that the “text” field repeats across the matches from the index with metadata, while the matches from the index without metadata show variety.
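To put a number on that repetition instead of eyeballing the output, here is a minimal sketch that counts duplicate "text" values among a list of matches. It assumes each match is a plain dict shaped like the entries of xc['matches'] above, with the text stored under metadata["text"]. The function name text_repetition and the sample data are hypothetical and only there for illustration; they are not real query results.

```python
from collections import Counter


def text_repetition(matches):
    """Summarize how often each "text" value appears among query matches.

    Assumes each match is a plain dict with an optional "metadata" dict
    that may hold a "text" key (an assumption based on the output above).
    """
    texts = [(m.get("metadata") or {}).get("text", "") for m in matches]
    counts = Counter(texts)
    return {
        "total": len(texts),
        "unique": len(counts),
        # Keep only the texts that were returned more than once.
        "repeated": {t: n for t, n in counts.items() if n > 1},
    }


# Made-up matches for illustration only, not actual Pinecone output.
sample = [
    {"id": "a", "score": 0.81, "metadata": {"text": "alpha"}},
    {"id": "b", "score": 0.80, "metadata": {"text": "alpha"}},
    {"id": "c", "score": 0.79, "metadata": {"text": "alpha"}},
    {"id": "d", "score": 0.78, "metadata": {"text": "beta"}},
]

summary = text_repetition(sample)
print(summary)
```

Running this on the matches from each index would give a like-for-like comparison: a low "unique" count relative to "total" means the same chunk of text is coming back under several ids.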