I have multiple binary vectors stored in Pinecone. Given a binary query vector, I want to return the stored vectors ordered by closest Hamming distance, computed only over the indices where the query vector is set to 1. As a simple example:
Query vector: [0, 1, 1, 0, 1]
DB vectors: [[0, 1, 0, 1, 0], [0, 1, 1, 1, 0], [1, 1, 1, 0, 1]]
These DB vectors should be returned in the following order (indices are counted from 0):
- [1, 1, 1, 0, 1] (has 1s at indices 1, 2, and 4, matching every 1 in the query vector)
- [0, 1, 1, 1, 0] (matches the 1s at indices 1 and 2, but not at index 4)
- [0, 1, 0, 1, 0] (only matches the 1 at index 1)
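To make the intended ordering concrete, here is a minimal sketch in plain Python of the distance I have in mind, computed locally rather than in Pinecone. The name `masked_hamming` is my own and is not a Pinecone function. For binary vectors this distance works out to the number of 1s in the query minus the dot product of the query with the stored vector, so ranking by highest dot product gives the same order in this example.

```python
def masked_hamming(query, vector):
    """Count positions where the query is 1 but the stored vector is not.

    Hypothetical helper for illustration only; not part of any Pinecone API.
    """
    return sum(1 for q, v in zip(query, vector) if q == 1 and v != 1)


def dot(query, vector):
    """Plain dot product of two equal-length vectors."""
    return sum(q * v for q, v in zip(query, vector))


query = [0, 1, 1, 0, 1]
db_vectors = [[0, 1, 0, 1, 0], [0, 1, 1, 1, 0], [1, 1, 1, 0, 1]]

# Smallest masked distance first.
ranked = sorted(db_vectors, key=lambda v: masked_hamming(query, v))

# Largest dot product first; for binary vectors this is the same ordering,
# because masked_hamming == sum(query) - dot(query, vector).
ranked_by_dot = sorted(db_vectors, key=lambda v: dot(query, v), reverse=True)

for v in ranked:
    print(v, masked_hamming(query, v), dot(query, v))
```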
Can I use one of Pinecone's existing metrics to accomplish this? If not, is it possible to define a custom metric in Pinecone?