Getting wrong cosine query results

Just wondering how to track down why these two sets of results differ so much:

  1. Generated an embedding for each of 1500 text documents via text-embedding-ada-002.
  2. Inserted the 1500 embeddings with metadata into a Pinecone index (cosine metric).
  3. Generated an embedding for a question via text-embedding-ada-002.
  4. Used the OpenAI cosine similarity method to obtain the top 3 results, which had high scores and were meaningful.
  5. However, when I sent the same question embedding in a query to Pinecone, the top 3 results were totally wrong and meaningless.
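
One way to narrow this down might be to compute the top 3 locally and compare the IDs against what the index returns. Below is a minimal sketch in plain Python, not anyone's actual setup: `cosine_similarity`, `local_top_k`, `overlap`, the tiny 3-dimensional vectors, and the `remote_ids` list are all hypothetical stand-ins (real ada-002 embeddings have 1536 dimensions, and the remote IDs would come from the Pinecone query response).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def local_top_k(query, docs, k=3):
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def overlap(local_ids, remote_ids):
    """IDs that appear in both the local baseline and the remote result."""
    return sorted(set(local_ids) & set(remote_ids))


# Hypothetical toy data standing in for the 1500 document embeddings.
docs = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
    "doc-d": [0.0, 0.0, 1.0],
}
query = [1.0, 0.05, 0.0]

baseline = local_top_k(query, docs, k=3)
baseline_ids = [doc_id for doc_id, _ in baseline]

# Placeholder: replace with the IDs returned by the vector index for the same query.
remote_ids = ["doc-a", "doc-d", "doc-c"]

print("local top 3:", baseline_ids)
print("shared with remote:", overlap(baseline_ids, remote_ids))
```

If the two ID lists barely overlap for the same query vector, that would suggest the stored vectors, IDs, or metadata differ from what was used locally, rather than a problem with the question embedding itself.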

Same here. It's even worse when there are similar documents, like:

"A is the capital of B", "C is the capital of D", and so on. Searching for the capital of D returns the wrong results.