Hi @falkor, I’m a developer working on inference at Pinecone.
It’s hard to say exactly what’s happening with text-embedding-3-large for your specific sample data and query, especially with such short inputs, which don’t carry much semantic meaning.
However, I did want to step back a bit and note that your question is a great example of the value of using a reranking model to perform the final ordering of query results.
At Pinecone, we’re in the process of releasing a Rerank API endpoint, which will initially support the open-source bge-reranker-v2-m3 model. We’re planning to make it available in Public Preview later this week.
When I run your sample query through our Rerank API, it produces the ordering you expect. Here’s an example:
from pinecone import Pinecone

pc = Pinecone(api_key='<your api key>')

query = "adult bikes"

result = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query=query,
    documents=[
        "kids bikes",
        "women bikes",
    ],
    return_documents=True,
)

print(result)
output:

RerankResult(
    model='bge-reranker-v2-m3',
    data=[
        { index=1, score=0.2373506,
          document={text="women bikes"} },
        { index=0, score=0.007539987,
          document={text="kids bikes"} }
    ],
    usage={'rerank_units': 1}
)
Note that, to use the Rerank API, you’ll currently need to install a pre-release version of the pinecone-plugin-inference plugin:
# Install the base python SDK
pip install pinecone-client
# Install a pre-release version of "pinecone-plugin-inference"
pip install --upgrade --pre pinecone-plugin-inference==1.1.0.*
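To put the rerank step in context, here’s a minimal two-stage retrieval sketch in plain Python. The first_stage_retrieve and mock_rerank_scores functions below are illustrative stand-ins (not Pinecone APIs): the first stands in for your vector-index query, and the second for a cross-encoder reranker like the bge-reranker-v2-m3 call shown above. Only the shape of the flow is the point.

```python
# Two-stage retrieval sketch: a fast first stage produces candidate
# documents, then a reranker produces the final ordering.
# Both helper functions are hypothetical stand-ins, not real Pinecone APIs.

def first_stage_retrieve(query, corpus, top_k=10):
    # Stand-in for a vector-index query: naive token-overlap score.
    q_tokens = set(query.lower().split())
    def overlap(doc):
        return len(q_tokens & set(doc.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:top_k]

def mock_rerank_scores(query, documents):
    # Stand-in for the Rerank API call; a real cross-encoder scores
    # each (query, document) pair jointly. Scores here mirror the
    # example output above.
    scores = {"women bikes": 0.2373506, "kids bikes": 0.007539987}
    return [scores.get(d, 0.0) for d in documents]

def rerank(query, documents):
    # Reorder the first-stage candidates by reranker score, descending.
    scores = mock_rerank_scores(query, documents)
    order = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return [documents[i] for i in order]

corpus = ["kids bikes", "women bikes", "bike helmets"]
candidates = first_stage_retrieve("adult bikes", corpus, top_k=2)
final = rerank("adult bikes", candidates)
print(final)  # the reranker puts "women bikes" ahead of "kids bikes"
```

In a real pipeline, you’d replace mock_rerank_scores with the pc.inference.rerank call from the snippet above, passing the texts returned by your index query as documents.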
We’d love for you to give this a try in your use case and share any feedback!
Cheers,
Silas