Pinecone Clarifications

Hey @sprabakar01 great to hear that you’re considering Pinecone!

  1. For any object you’d like to search by similarity, you will need to convert that object into a dense vector. There are different embedding models for doing this, depending on the use case: CNNs are usually used for images, while PDFs are typically broken down into small “chunks” of text (roughly a paragraph long in many cases) and then embedded using sentence transformers. As for the metadata, you can upload it into Pinecone alongside your vectors and later use it for metadata filtering. You do not need to embed metadata fields.
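
For example, a minimal sketch of that flow might look like this (the index name "transactions", the metadata fields, and the all-MiniLM-L6-v2 model are just placeholders; any sentence-transformers model of the right dimensionality works):

import pinecone
from sentence_transformers import SentenceTransformer

# Placeholder credentials and index name -- replace with your own.
pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
index = pinecone.Index("transactions")

# Embed one text "chunk" (e.g., a paragraph extracted from a PDF).
model = SentenceTransformer("all-MiniLM-L6-v2")
chunk = "John paid $42.50 at a pizzeria on 2022-01-15."
vector = model.encode(chunk).tolist()

# Upsert the vector together with its metadata. The metadata itself is not embedded.
index.upsert(vectors=[
    ("txn-001", vector, {"firstName": "John", "amount": 42.5, "category": "restaurants"}),
])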

  2. Use the metadata filter, like so:

query_response = index.query(
    queries=[
        (vector, {"firstName": {"$eq": "John"}}),  # Replace vector with any vector embedding.
    ],
    top_k=10,
    include_values=True,  # Optional. Indicates whether vector values are included in the response.
    include_metadata=True  # Optional. Indicates whether metadata is included in the response along with the ids.
)

The query() method does require a query vector. In this case, since you only care about the metadata filter, you can use any vector you want, or even a dummy value like [0, 0, 0, …]. Just be sure the length of that array (i.e., the dimensionality of the query vector) exactly matches the dimension of the other vectors in the index.
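
For example, assuming a 768-dimensional index and the same index object as above:

dummy_vector = [0.0] * 768  # must match the index's dimension exactly

query_response = index.query(
    queries=[(dummy_vector, {"firstName": {"$eq": "John"}})],
    top_k=10,
    include_metadata=True
)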

See the query() API reference for details.

  3. Yes, Pinecone is a fully managed service. We keep things running smoothly and securely so you don’t have to worry about the infrastructure. You do, however, need to run a sufficient number of replicas. On the Standard plan, Pinecone uses anti-affinity so that replica pods are spread across availability zones. In the event of a failure, the remaining replicas will pick up the traffic, so they must have sufficient capacity to handle your throughput. Customers who require an SLA for availability should consider the Dedicated plan and contact us to discuss their requirements.
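
As a rough sketch, replicas can be set when creating the index or adjusted afterwards; the exact parameters depend on your client version, so treat this as illustrative:

# Assumes pinecone.init(...) has already been called.
# Scale an existing index (here named "transactions") to 2 replicas.
pinecone.configure_index("transactions", replicas=2)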

  4. For uploading/indexing large amounts of data, we recommend using the gRPC client and parallel upserts. The Java client is not available yet. However, our API follows the OpenAPI standard, so anyone can build clients on top of it.
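
For the Python gRPC client, a parallel-upsert sketch might look like the following (the batch size, index name, and dummy data are placeholders, and async_req is assumed to be available in your client version):

import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
index = pinecone.GRPCIndex("transactions")

# Dummy (id, values, metadata) tuples standing in for your real data.
vectors = [(f"id-{i}", [0.1] * 768, {"firstName": "John"}) for i in range(10_000)]

def chunks(items, size=100):
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Send batches asynchronously, then wait for all of them to complete.
futures = [index.upsert(vectors=batch, async_req=True) for batch in chunks(vectors)]
results = [future.result() for future in futures]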

  5. Yes, semantic similarity models like those from the sentence-transformers library map semantically similar words and phrases to nearby points in the vector space. So when searching for “pizza”, similar items like “Pizza Hut”, “Domino’s”, “pizza restaurant”, or “pizzeria” will return higher similarity scores with a well-trained model. You can then use metadata filtering to show only transactions for that customer, or within a certain timeframe, or under a certain amount, and so on…
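
You can see this directly with the sentence-transformers library (the model choice here is just an example):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
query = model.encode("pizza")
candidates = model.encode(["Pizza Hut", "pizzeria", "pizza restaurant", "hardware store"])

# Cosine similarity: the pizza-related phrases should score noticeably higher.
print(util.cos_sim(query, candidates))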

  6. We monitor and take care of index health. If you try to add more data than an index can hold (roughly 1M 768-dimensional vectors on p1 pods and 5M on s1 pods, as of this writing), you will get an error. Just create an index with enough pods to hold your data, and that’s it.
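
For example, sizing at creation time might look like this (pod counts are placeholders; size them for your own data):

# Assumes pinecone.init(...) has already been called.
# ~5M 768-dimensional vectors: roughly 5 p1 pods, or 1 s1 pod.
pinecone.create_index("transactions", dimension=768, pods=5, pod_type="p1")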

  7. Index management is quite simple: you can create, delete, and describe an index. More management and monitoring options are coming soon. As a managed service, we take care of things like monitoring, fault tolerance, failure recovery, availability (see the note about replicas above), security, and so on.
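
For reference, those operations in the Python client look roughly like this:

# Assumes pinecone.init(...) has already been called.
pinecone.list_indexes()                  # names of your indexes
pinecone.describe_index("transactions")  # dimension, pod type, status, etc.
pinecone.delete_index("transactions")    # remove the index entirely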

These are great questions!

Feb 8: Edited to say p1 and s1 pods hold 1M and 5M vectors, respectively, and not 1GB and 5GB of vectors.
