Wrong results scoring

I’m indexing two simple documents: one contains embeddings for “kids bikes” and the other contains embeddings for “women bikes”. Both were generated with the OpenAI text-embedding-3-large model.

Now I’m querying for “adult bikes” by generating the query’s embeddings (again with the same model).

I’m finding that “kids bikes” gets a higher score than “women bikes”.

How could it be?

Hi @falkor, I’m a developer working on inference at Pinecone.

It’s a little hard to say exactly what might be happening with text-embedding-3-large in the case of your specific sample data/query, particularly for such a small input, which doesn’t carry a ton of semantic meaning.

However I did want to step back a bit, and note that your question in general is actually a great example of the need and value of using a reranking model to perform the final ordering of query results.

At Pinecone, we’re in the process of releasing a Reranking API endpoint, and we will initially support the bge-reranker-v2-m3 open source model. Later this week we are planning to make it available in Public Preview.

When I run your sample query through our Rerank API, it results in the ordering you are expecting. Here’s an example.

from pinecone import Pinecone
pc = Pinecone(api_key='<your api key>')

query = "adult bikes"
result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query=query,
    documents=[
        "kids bikes",
        "women bikes"
    ],
    return_documents=True,
)
print(result)

output:

RerankResult(
  model='bge-reranker-v2-m3',
  data=[
    { index=1, score=0.2373506,
      document={text="women bikes"} },
    { index=0, score=0.007539987,
      document={text="kids bikes"} }
  ],
  usage={'rerank_units': 1}
)

Note that currently to use the Rerank API you’ll need to install a pre-release version of the pinecone-plugin-inference plugin:

# Install the base python SDK
pip install pinecone-client

# Install a pre-release version of "pinecone-plugin-inference"
pip install --upgrade --pre "pinecone-plugin-inference==1.1.0.*"

We’d love for you to give this a try in your use case, and let us know any feedback!

Cheers,
Silas

Hi Silas

Thanks for your quick reply.

The reranking model sounds interesting. Can you please explain the logic behind the scenes? Why does Pinecone need to rerank or reorganize the original scoring? Is the model you mentioned Pinecone’s own model? Who owns it?

And my last question: does this new API also exist in your JS library? Unfortunately I’m using Node.js.

In general, could you please share some docs on these reranking models?

Thanks

We will be adding this to the JavaScript SDK as soon as we can, but in the meantime the HTTP endpoint is available, so you could use any JS HTTP client like fetch, axios, etc. Unfortunately, as we are in the middle of releasing this API, we don’t yet have the docs published for it; they should be ready later this week. In the meantime here’s a sample request:

curl 'https://api.pinecone.io/rerank' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'api-key: <your api key>' \
-H 'x-pinecone-api-version: 2024-10' \
-d '{
    "model": "bge-reranker-v2-m3",
    "query": "adult bikes",
    "return_documents": true,
    "top_n": 2,
    "documents": [
        {"text": "kids bikes"},
        {"text": "women bikes"}
    ],
    "rank_fields": ["text"]
}'

The model we currently support is bge-reranker-v2-m3, an open source model, and we will be adding more over time. In general, rerank models are trained to accept a query and a set of documents as inputs, and to score how closely each document matches the query.

For example, usage of a Rerank model in a retrieval flow would typically look like this, where you cast a wider net in the vector search, and then use a rerank model to refine:

  1. Embed the query.
  2. Run the vector search for the query with a somewhat larger top_k than usual. (For example, if your current top_k is 10, maybe now use top_k=25, or top_k=50, etc.)
  3. Pass the query and all of the retrieved documents (25 in this example) to a reranking model.
  4. Set top_n on the rerank request to the number of results you actually want, 10 in this example.
  5. Results come back with the 10 most closely matching documents, in order.
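
The steps above can be sketched in Python roughly as follows. This is a minimal sketch of the flow, not Pinecone's implementation: `retrieve_then_rerank`, `embed`, `vector_search`, and `rerank` are hypothetical names. In practice you would back them with your embedding model, your index query, and the rerank endpoint. The toy stand-ins exist only so the flow runs end to end without any network calls.

```python
from typing import Callable, List, Sequence, Tuple

Scored = Tuple[str, float]


def retrieve_then_rerank(
    query: str,
    embed: Callable[[str], Sequence[float]],
    vector_search: Callable[[Sequence[float], int], List[str]],
    rerank: Callable[[str, List[str]], List[Scored]],
    top_k: int = 25,
    top_n: int = 10,
) -> List[Scored]:
    """Cast a wide net with vector search, then let a reranker refine it."""
    query_vector = embed(query)                      # step 1
    candidates = vector_search(query_vector, top_k)  # step 2
    scored = rerank(query, candidates)               # step 3
    # Steps 4 and 5: order by rerank score and keep the best top_n.
    scored = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# --- Toy stand-ins (assumptions for illustration only) ---
corpus = ["kids bikes", "adult bikes", "adult helmets", "kids scooters"]


def toy_embed(text: str) -> List[float]:
    # Placeholder vector; a real embedding model goes here.
    return [float(len(text))]


def toy_search(vector: Sequence[float], top_k: int) -> List[str]:
    # Ignores the vector and returns the first top_k documents.
    return corpus[:top_k]


def toy_rerank(query: str, docs: List[str]) -> List[Scored]:
    # Scores each document by the fraction of query words it contains.
    words = set(query.split())
    return [(doc, len(words & set(doc.split())) / len(words)) for doc in docs]


results = retrieve_then_rerank(
    "adult bikes", toy_embed, toy_search, toy_rerank, top_k=4, top_n=2
)
print(results)
```

The point of the design is that the vector search only has to get the right documents somewhere into the candidate set; the final ordering is left to the reranker.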

Hope this helps!

It’s definitely helping; I’ll check it today.

A few more questions, please:

  1. As far as I understand, this API is still in alpha/beta. When do you think it will be officially released?
  2. Will invoking this API incur additional charges?
  3. Can I supply a minimum (threshold) score, so that the re-rank returns only results above it? Here’s an example with the bikes. Assume we have the same “corpus” of “women bikes” and “kids bikes”, and I’d like to query for “mountain bikes”. In this case the re-rank API returns these two documents:

  * Kids bikes: score of 0.12890334
  * Women bikes: score of 0.11952913

We can see these scores are pretty low, so I would like the re-rank API to return only the ones with a score above 0.5 (as an example).

Of course this can all be done on the application side; I just wonder if it’s something on your backlog.

Thanks

Hi @falkor, great questions!

  1. We’re targeting a Public Preview release later this week. What that means is that it will be available and you can use it, but we do not yet recommend using it for production workloads. We are targeting GA or “General Availability” sometime this fall. You can read more about our feature availability labels here: Feature availability - Pinecone Docs
  2. This will be a paid offering, but is currently available in the free tier until the end of August. We’re still finalizing the pricing.
  3. We don’t currently support a score threshold, but we can consider that for a future enhancement. In the meantime you would need to do score filtering in your application.
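
A minimal sketch of that application-side filtering, assuming you have already copied the rerank results into plain dictionaries with `score` and `text` keys. That shape and the `filter_by_score` helper are assumptions for illustration, so adapt the field access to whatever your client actually returns.

```python
from typing import Dict, List


def filter_by_score(results: List[Dict], min_score: float) -> List[Dict]:
    """Keep only reranked items whose score is at least min_score."""
    return [item for item in results if item["score"] >= min_score]


# Scores copied from the "mountain bikes" example earlier in this thread.
reranked = [
    {"text": "kids bikes", "score": 0.12890334},
    {"text": "women bikes", "score": 0.11952913},
]

# With a 0.5 threshold neither document qualifies, so the list is empty.
print(filter_by_score(reranked, min_score=0.5))

# A lower threshold of 0.12 keeps only "kids bikes".
print(filter_by_score(reranked, min_score=0.12))
```

Because the input order is preserved, the filtered list stays in rerank order.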

I’ll update this thread when we publish docs and other API information. Thanks for giving it a look!

Cheers!
Silas

Rerank officially launches into Public Preview today!

Thanks @silas, highly appreciated.

Do you have an estimate of when the JS SDK for rerank will be released?