Quaternary System

I need a way to represent a quaternary system (i.e. a sequence with four options in each spot). I was thinking of representing the four options as (1,1), (0,1), (1,0) and (0,0) respectively. Another option I was considering is to represent these tuples as pairs of vectors, but I would need a way to link two vectors. Does Pinecone have a way to do so?


One way to link vectors would be to use metadata. You can have a field which could be used as an identifier for vectors that you want to couple together. Btw, I am curious if you just want to store vectors or search through them too?


Thank you Rajat. Could you point me in the right direction with regard to the syntax for linking vectors using metadata?

And I am hoping to search through the vectors too. Would this be possible using linked vectors?


You can insert a vector with metadata following the syntax here.
So your linked vectors could have metadata something like: {'field': '123'}

Whenever you would want to fetch vectors with the same ‘field’ values, you could do a dummy query with metadata filter like:

index.query(queries=[[0.1]*vector_dim], filter={'field': {'$eq': '123'}}, include_values=True, include_metadata=True)

This will return the vectors whose field value is '123', which in your case are the linked vectors.

As for searching, you can search across all vectors without filters, within a specific field value as above, or within a list of field values using the $in filter, e.g. filter={'field': {'$in': ['123', '456', '789']}}
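To make the linking idea concrete, here is a minimal sketch in plain Python. It only builds the records and the filter dictionaries and does not call Pinecone. The helper names (make_linked_records, eq_filter, in_filter) and the id scheme are made up for illustration, and I am assuming the client accepts (id, values, metadata) tuples when upserting.

```python
# Hypothetical helpers for illustration; none of these names come from Pinecone.

def make_linked_records(group_id, vectors):
    """Build (id, values, metadata) tuples that all share the same 'field' value."""
    return [
        (f"{group_id}-{i}", list(values), {"field": group_id})
        for i, values in enumerate(vectors)
    ]

def eq_filter(group_id):
    """Metadata filter matching a single group of linked vectors."""
    return {"field": {"$eq": group_id}}

def in_filter(group_ids):
    """Metadata filter matching any of several groups."""
    return {"field": {"$in": list(group_ids)}}

# Two vectors linked through the shared metadata value '123'.
records = make_linked_records("123", [[1.0, 0.0], [0.0, 1.0]])

# Assumption: the records could then be sent with index.upsert(vectors=records),
# and the filters passed as the filter= argument of index.query(...).
```

The point of the design is that the link lives entirely in the metadata, so the vectors themselves stay independent and can still be searched normally.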


I’m not sure what you are trying to achieve with this encoding. Care to elaborate?

My answer assumes you want to encode the results of, for example, a multiple choice questionnaire where each question has four possible answers. In that case you should consider one-hot encoding (see One-hot on Wikipedia), which would be:
a → [1, 0, 0, 0]
b → [0, 1, 0, 0]
c → [0, 0, 1, 0]
d → [0, 0, 0, 1]

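This has the desired property that the dot product between two answer sequences is equal to the number of questions answered in the same way. Note that this does not hold for the mapping
a → (1,1), b → (0,1), c → (1,0), and d → (0,0)
For example, a and b are different answers but their dot product is 1, while two matching d answers give a dot product of 0.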
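Here is a small plain-Python sketch of that property, in case it helps. The names (ONE_HOT, TWO_BIT, encode, dot) and the sample answer strings are just for illustration. Each answer sequence is encoded by concatenating the per-question vectors.

```python
# Illustrative only: compare one-hot encoding with the two-bit mapping.

ONE_HOT = {"a": [1, 0, 0, 0], "b": [0, 1, 0, 0], "c": [0, 0, 1, 0], "d": [0, 0, 0, 1]}
TWO_BIT = {"a": [1, 1], "b": [0, 1], "c": [1, 0], "d": [0, 0]}

def encode(answers, table):
    """Concatenate the per-question vectors into one flat vector."""
    vec = []
    for ans in answers:
        vec.extend(table[ans])
    return vec

def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))

x = "abcd"  # answers of one respondent
y = "bacd"  # answers of another; they agree on questions 3 and 4 only

matches = sum(p == q for p, q in zip(x, y))
one_hot_score = dot(encode(x, ONE_HOT), encode(y, ONE_HOT))
two_bit_score = dot(encode(x, TWO_BIT), encode(y, TWO_BIT))

print(matches, one_hot_score, two_bit_score)
```

With these two sequences the one-hot score equals the number of matching answers, while the two-bit score does not: it counts the a/b mismatches and misses the d/d match.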

Btw, one hot encoding is still fairly naive… In real life, four options are usually not “equally different” from one another. You might want to encode that into your vectors. If you are interested in that let me know.


Thank you very much Edo. The one-hot encoding option might work for my project. Do you know how I might implement this in Pinecone?