Querying by ID returns the wrong result when the index type is cosine or dotproduct. It only returns the correct result when the index type is euclidean.

Let’s imagine I have these 5 vectors:

index.upsert([
    ("Andrew", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], {"genre": "comedy", "year": 2020}),
    ("Lulu", [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2], {"genre": "comedy", "year": 2021}),
    ("Love", [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3], {"genre": "comedy", "year": 2020}),
    ("Dog", [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4], {"genre": "comedy", "year": 1999}),
    ("Children", [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], {"genre": "horror" })
], namespace="movies")

When I query Pinecone by the ID “Andrew” with a top_k of 1, I get back a different vector.


In fact, I need to set my top_k to 5 to get it back at all!

Similarly, if I query by a vector of [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], I would expect to get back the “Andrew” vector. But I do not.

Anyone know why? It works the way I expect when I use euclidean, but “Andrew” appears to be the least related result to my search when I use cosine or dotproduct.

Hi @sudomonikers

I think the problem you are facing is a math one.

In cosine terms, all your vectors have a similarity score of 1 with each other! (They are all basically the same.) And if you think about it, they are: the only difference is magnitude, the direction is the same.

If you upsert vectors that point in different directions, the query works as intended.

Hope this helps

You can quickly try the math here: Cosine Similarity Calculator and Dot Product Calculator
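If you would rather check the math locally, here is a minimal sketch in plain Python using the five vectors from the question. The helper names (`dot`, `cosine`, `euclidean`) are just illustrative and have nothing to do with the Pinecone client. It shows that cosine ties every vector at roughly 1, that dot product rewards larger magnitude (so “Andrew” scores lowest), and that only euclidean distance puts “Andrew” first.

```python
import math

# The five vectors from the question: eight identical components each.
vectors = {
    "Andrew": [0.1] * 8,
    "Lulu": [0.2] * 8,
    "Love": [0.3] * 8,
    "Dog": [0.4] * 8,
    "Children": [0.5] * 8,
}

def dot(a, b):
    """Dot product: grows with the magnitude of both vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: depends only on direction, not magnitude."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    """Euclidean distance: 0 means identical, larger means further apart."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = vectors["Andrew"]
scores = {
    name: {
        "cosine": cosine(query, values),
        "dot": dot(query, values),
        "euclidean": euclidean(query, values),
    }
    for name, values in vectors.items()
}

for name, s in scores.items():
    print(f"{name:9s} cosine={s['cosine']:.4f} dot={s['dot']:.2f} "
          f"euclidean={s['euclidean']:.4f}")
```

With cosine, every candidate is effectively tied, so which one comes back for top_k=1 depends on how ties happen to be broken. With dot product, “Andrew” has the smallest score of the five, which would explain needing top_k=5 to see it.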

OK, I figured it was a math problem since it works when I changed the index type. And that all makes sense if I was querying by vector. But why does it give a different result when I query by ID? I would expect that, no matter the index type, if I query by ID it gives me back the vector with that ID.

And this Omni Calculator thing is awesome! Thanks for sharing.


I think query by ID works the same as query by vector. It probably just fetches the vector by ID and then uses its values for the query. If you want to get back the exact vector you specified, use Fetch instead.
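To make the difference concrete, here is a toy in-memory model of that guess. `ToyIndex`, `query_by_id` and the scoring logic are hypothetical stand-ins written for this post, not the Pinecone client or its actual implementation. The sketch assumes that query by ID is "look up the stored values, then run a normal similarity search", while fetch is a plain lookup by key.

```python
import math

class ToyIndex:
    """Hypothetical in-memory stand-in for a vector index (not Pinecone)."""

    def __init__(self, metric):
        self.metric = metric  # "cosine", "dotproduct" or "euclidean"
        self.vectors = {}

    def upsert(self, items):
        for vec_id, values in items:
            self.vectors[vec_id] = values

    def fetch(self, vec_id):
        # Plain lookup by key: no similarity ranking is involved.
        return self.vectors[vec_id]

    def _score(self, a, b):
        # Higher score means "more similar" for every metric.
        d = sum(x * y for x, y in zip(a, b))
        if self.metric == "dotproduct":
            return d
        if self.metric == "cosine":
            return d / (math.hypot(*a) * math.hypot(*b))
        if self.metric == "euclidean":
            return -math.dist(a, b)  # negate so smaller distance ranks first
        raise ValueError(f"unknown metric: {self.metric}")

    def query_by_id(self, vec_id, top_k):
        # Assumed behaviour: reuse the stored values as the query vector,
        # then rank every stored vector against it.
        query = self.fetch(vec_id)
        ranked = sorted(
            self.vectors,
            key=lambda k: self._score(query, self.vectors[k]),
            reverse=True,
        )
        return ranked[:top_k]

items = [
    ("Andrew", [0.1] * 8),
    ("Lulu", [0.2] * 8),
    ("Love", [0.3] * 8),
    ("Dog", [0.4] * 8),
    ("Children", [0.5] * 8),
]

dot_index = ToyIndex("dotproduct")
dot_index.upsert(items)
euclid_index = ToyIndex("euclidean")
euclid_index.upsert(items)

print(dot_index.query_by_id("Andrew", top_k=1))     # ranked by dot product
print(euclid_index.query_by_id("Andrew", top_k=1))  # ranked by distance
print(dot_index.fetch("Andrew"))                    # direct lookup
```

In this model the dot product index ranks “Andrew” last for its own ID, the euclidean index ranks it first, and `fetch` returns the stored values regardless of metric. In the real Python client, the equivalent direct lookup should be something like `index.fetch(ids=["Andrew"], namespace="movies")`.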