The Pinecone doc "Vector Similarity Explained" says:
The basic rule of thumb in selecting the best similarity metric for your Pinecone index is to match it to the one used to train your embedding model.
My understanding is that the following two facts hold:
- OpenAI models are trained on cosine similarity.
- Hybrid indexes must use the dotproduct metric.
Should I switch to a different embedding model that was trained with dotproduct? Are there any such models? The score value I've seen for reasonably good matches is 30 or 40, which is a weird scale.
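
To make the relationship between the two metrics concrete, here is a minimal sketch in plain Python. The vectors are made-up toy values, not real embeddings, and the helper names are hypothetical. It only illustrates that on unit-length vectors the dot product equals the cosine similarity, while on unnormalized vectors the dot product is not bounded to [-1, 1]. Whether that is what produces scores like 30 or 40 in a hybrid index is an assumption I have not confirmed.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean (L2) length of a vector."""
    return math.sqrt(dot(a, a))

def cosine(a, b):
    """Cosine similarity: dot product divided by both lengths."""
    return dot(a, b) / (norm(a) * norm(b))

def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

# Hypothetical toy vectors standing in for a query and a document embedding.
query = [3.0, 4.0, 0.0]
doc = [6.0, 8.0, 5.0]

raw_dot = dot(query, doc)                         # unbounded, depends on vector lengths
cos = cosine(query, doc)                          # always within [-1, 1]
unit_dot = dot(normalize(query), normalize(doc))  # dot product after normalizing

print(raw_dot, cos, unit_dot)
```

If the dense embeddings are already unit length, `unit_dot` and `cos` agree up to floating-point error, so a dotproduct index would rank them the same way a cosine index would. A large `raw_dot` only shows up when at least one side is not normalized.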