Pinecone Java client is slower than OkHttpClient (REST)

Upsert operations through the Pinecone Java client are close to 3 times slower than the same upserts sent with OkHttpClient (REST-based).
Has anyone else faced this issue?
What approach does Pinecone recommend, and what are the suggested ways to improve upsert performance?

Thanks.

Hi, thank you for posting. Would you please help me with the following:

  1. Since the Java SDK doesn’t offer REST endpoints for upsert operations, would you please confirm which REST endpoints you used for the comparison?
  2. In your experiments:
    1. Is it stream or batch processing?
    2. How many vectors did you upsert?
    3. What’s the structure of the data? I.e. what fields does it contain?
    4. What’s the size of the data?
    5. What is the index type?
    6. What is the SDK version?
    7. And lastly, which Java version did you use?

Also, since the Java SDK only offers gRPC endpoints for upsert, that is the only possible and recommended approach with the SDK 🙂.

Thanks!

Hi Rohan,

Please find my answers inline:

  1. Since the Java SDK doesn’t offer REST endpoints for upsert operations, would you please confirm which REST endpoints you used for the comparison?
    [Answer]
    (1) We used a generic HTTP client (OkHttp) and sent the upsert requests to our Pinecone index endpoint: https://pinecone-poc-4wo31au.svc.aped-4627-b74a.pinecone.io/vectors/upsert

    (2) We compared the performance of the above against the Java client provided by Pinecone (pinecone-io/pinecone-java-client on GitHub, the official Java client for the Pinecone vector database), connected to the same index.
    In our tests, the Pinecone Java client performed 3 times slower than generic OkHttp.

  2. In your experiments:
    1. Is it stream or batch processing?
      [Answer] Batch processing.
    2. How many vectors did you upsert?
      [Answer] 166,000
    3. What’s the structure of the data? I.e. what fields does it contain?
      [Answer] Each vector contains 1024 dimensions (simple float values), plus simple metadata with three string fields (100 bytes of metadata in total per vector).
    4. What’s the size of the data?
      [Answer] 166,000 vectors (1024 dimensions per vector) + 100 bytes of metadata per vector.
    5. What is the index type?
      [Answer] Serverless
    6. What is the SDK version?
      [Answer] 1.2.2
    7. And lastly, which Java version did you use?
      [Answer] 1.8
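
For a sense of scale, the numbers above can be turned into a rough payload estimate. This is only a sketch under stated assumptions: `UpsertSizing` and its methods are hypothetical helpers (not part of the Pinecone SDK), each value is assumed to be a 4-byte float, ids and JSON/protobuf encoding overhead are ignored, and the batch size of 500 is an arbitrary example rather than a recommended or documented limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Pinecone Java client.
class UpsertSizing {

    // Raw payload estimate: 4 bytes per float value plus metadata bytes per vector.
    // Ignores vector ids and any JSON or protobuf encoding overhead.
    static long rawBytes(long vectorCount, int dimensions, int metadataBytesPerVector) {
        return vectorCount * (dimensions * 4L + metadataBytesPerVector);
    }

    // Number of upsert requests needed when sending batchSize vectors per request.
    static long batchCount(long vectorCount, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (vectorCount + batchSize - 1) / batchSize; // ceiling division
    }

    // Splits a list into consecutive batches of at most batchSize items.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // 166,000 vectors, 1024 dimensions, 100 bytes of metadata each (from the answers above).
        System.out.println("raw payload bytes: " + rawBytes(166_000, 1024, 100));
        // Assumed example batch size of 500 vectors per request.
        System.out.println("requests at 500 per batch: " + batchCount(166_000, 500));
    }
}
```

Under these assumptions the raw data comes to roughly 700 MB spread over a few hundred requests. Feeding identical batches (for example, the output of `partition`) to both the OkHttp path and the Java client path would help keep the two timings comparable.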

Thanks.