Timeout in serverless query

I migrated to serverless two days ago and it works great :star_struck: except that I am seeing timeouts on queries. They are fairly random, and repeating the same query a few minutes later generally works. This morning I was experimenting with changing "includeMetadata": false to "includeMetadata": true. I ran that and it failed. I got back this:

network timeout at: https://--sanitized--zmb21lt.svc.apw5-4e34-81fa.pinecone.io/query

Here is an example:

{
  "url": "https://--sanitized--zmb21lt.svc.apw5-4e34-81fa.pinecone.io/query",
  "method": "POST",
  "body": "{\"vector\":[[-0.0123128835,...,-0.042882804]],\"topK\":20,\"includeMetadata\":false,\"includeValues\":false,\"namespace\":null}",
  "headers": {
    "User-Agent": "Retool/2.0 (+https://docs.tryretool.com/docs/apis)",
    "Content-Type": "application/json",
    "Api-Key": "--sanitized--",
    "Accept": "application/json",
    "ot-baggage-requestId": "undefined",
    "x-datadog-trace-id": "6945336282296118749",
    "x-datadog-parent-id": "5098154364569735983",
    "x-datadog-sampling-priority": "-1",
    "traceparent": "00-00000000000000006062ca8cffe46ddd-46c048622649532f-00",
    "tracestate": "dd=s:-1",
    "X-Retool-Forwarded-For": "72.132.39.62"
  }
}

The above example will work and fail intermittently. Here is what changed in the past few days:

  1. The URL was pointing to the non-serverless version of Pinecone. I switched to serverless.

  2. In the non-serverless version there was a namespace; in the serverless setup I did not choose a namespace (I left it as Default). Passing “null” seems to work.

  3. The non-serverless version of Pinecone was hosted in GCP and the serverless one is in AWS.

  4. My end is Retool, which is hosted in GCP. So now there is a little more network between me and Pinecone, and the error says “network timeout”. However, I find it hard to believe that is really the source of the error.
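Since repeating the same query later generally works, one client-side mitigation is to retry on timeout with a growing delay. This is only a minimal sketch and not a Retool or Pinecone feature: `retry_on_timeout` and its parameters are hypothetical names, and it assumes the underlying call raises Python's built-in `TimeoutError` when the request times out.

```python
import time


def retry_on_timeout(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run call(); on TimeoutError wait and retry, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last timeout to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1.0 s, 2.0 s, ...


if __name__ == "__main__":
    outcomes = iter([TimeoutError("network timeout"), {"matches": []}])

    def flaky_query():
        # Stand-in for the real POST to /query: times out once, then succeeds.
        result = next(outcomes)
        if isinstance(result, Exception):
            raise result
        return result

    # The no-op sleep keeps the demo instant; drop it to get real delays.
    print(retry_on_timeout(flaky_query, sleep=lambda seconds: None))
```

Retrying hides the symptom rather than explaining it, so it is worth logging each retry to see how often it actually fires.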

In the time it has taken to write this post everything is now working OK again. Here is an example of a successful transaction:

{
  "request": {
    "url": "https://--sanitized--zmb21lt.svc.apw5-4e34-81fa.pinecone.io/query",
    "method": "POST",
    "body": "{\"vector\":[[-0.0028248946,...,-0.016487952,0.0041766744,-0.0065760403]],\"topK\":20,\"includeMetadata\":true,\"includeValues\":false,\"namespace\":null}",
    "headers": {
      "User-Agent": "Retool/2.0 (+https://docs.tryretool.com/docs/apis)",
      "Content-Type": "application/json",
      "Api-Key": "--sanitized--",
      "Accept": "application/json",
      "ot-baggage-requestId": "undefined",
      "x-datadog-trace-id": "7497291204214570081",
      "x-datadog-parent-id": "5771550218606446442",
      "x-datadog-sampling-priority": "-1",
      "traceparent": "00-0000000000000000680bbaa0bcdebc61-5018aa58475d876a-00",
      "tracestate": "dd=s:-1",
      "X-Retool-Forwarded-For": "72.132.39.62"
    }
  },
  "response": {
    "data": {
      "results": [],
      "matches": [
        {
          "id": "20d5f585-5c35-48b6-bbdb-05fedec21917",
          "score": 0.864393771,
          "values": [],
          "metadata": {
            "model": "text-embedding-ada-002",
            "node": "f541b2a4-a7f5-4de6-9dc5-8f2592d3a6b9",
            "source": "api.openai.com",
            "tenant-guid": "00000000-0000-0000-0000-000000000001",
            "total-tokens": 1,
            "ts": "2023-08-22T22:14:09.978Z"
          }
        }, ...
        {
          "id": "57c8832d-dc68-47eb-b193-49f8fedb14c2",
          "score": 0.821951,
          "values": [],
          "metadata": {
            "model": "text-embedding-ada-002",
            "node": "f621facd-b6fa-4c11-8e7a-5039a736c1b2",
            "source": "api.openai.com",
            "tenant-guid": "00000000-0000-0000-0000-000000000001",
            "total-tokens": 6,
            "ts": "2023-04-13T20:16:04.646Z"
          }
        }
      ],
      "namespace": "",
      "usage": {
        "readUnits": 7
      }
    },
    "headers": {
      "date": [
        "Mon, 22 Jan 2024 18:53:33 GMT"
      ],
      "content-type": [
        "application/json"
      ],
      "content-length": [
        "6017"
      ],
      "connection": [
        "keep-alive"
      ],
      "x-pinecone-max-indexed-lsn": [
        "30450"
      ],
      "x-pinecone-request-latency-ms": [
        "63"
      ],
      "x-envoy-upstream-service-time": [
        "64"
      ],
      "grpc-status": [
        "0"
      ],
      "server": [
        "envoy"
      ]
    },
    "status": 200,
    "statusText": "OK"
  }
}
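The successful response reports `x-pinecone-request-latency-ms` of 63, so on a good request the server side is fast. One way to check whether slow requests are slow inside Pinecone or somewhere in between is to subtract that header from the time the client measured. A rough sketch, assuming the headers arrive in the list-valued shape Retool logs above; `network_overhead_ms` and the 480 ms client timing are hypothetical.

```python
def network_overhead_ms(headers, client_elapsed_ms):
    """Return client-observed time minus Pinecone's self-reported latency."""
    # Retool logs each header value as a one-element list of strings.
    raw = headers.get("x-pinecone-request-latency-ms")
    if not raw:
        return None  # header missing, e.g. no response was received at all
    server_ms = float(raw[0] if isinstance(raw, list) else raw)
    return client_elapsed_ms - server_ms


if __name__ == "__main__":
    logged_headers = {"x-pinecone-request-latency-ms": ["63"]}
    # 480.0 is a made-up client-side measurement, used only for illustration.
    print(network_overhead_ms(logged_headers, client_elapsed_ms=480.0))
```

A timed-out request has no response headers, so this only helps on slow-but-successful requests; a large gap on those would point at the path between Retool and Pinecone rather than at the index.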

Hey Roland - excited to hear that you’re trying out Pinecone serverless! One characteristic of Pinecone serverless that differs from pods is the concept of cold starts. The first query against a serverless index has to load your index data from S3 object storage (which is why serverless is so much cheaper), and that results in longer latency. Your index data is then cached on SSD, so subsequent queries will be significantly faster.

Do you know what your network timeout is set to be?
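To take the unknown client timeout out of the picture, the same query can be sent from outside Retool with an explicit timeout. Below is a sketch using only Python's standard library; the body fields mirror the logged request, while `QUERY_URL`, `build_query_request`, `run_query`, and the 30-second default are hypothetical placeholders rather than recommended settings.

```python
import json
import urllib.request

# Hypothetical placeholder; substitute the index host from the Pinecone console.
QUERY_URL = "https://example.invalid/query"


def build_query_request(vector, api_key, top_k=20, include_metadata=True, url=QUERY_URL):
    """Build a POST request shaped like the one in the log above."""
    body = {
        "vector": vector,
        "topK": top_k,
        "includeMetadata": include_metadata,
        "includeValues": False,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Api-Key": api_key,
        },
        method="POST",
    )


def run_query(request, timeout_s=30.0):
    """Send the request; timeout_s bounds blocking operations such as connecting."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

If the query succeeds with a generous timeout here but fails in Retool, the client-side timeout is the likelier culprit.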

I’m going to ask Retool what the timeout is set to. However, I doubt it is a cold start problem. I did suspect that, especially when it fails after a long idle period, but the failure is equally likely to occur on the 2nd or 3rd try after a success. Then again, my 2nd or 3rd try is generally some OTHER query, so if the index is being sharded, maybe I am hitting a cold start several times in a row?
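One way to test the cold-start theory instead of guessing is to log, for each query, how long the index had been idle and whether the query succeeded, then compare failure rates. A small sketch; `failure_rate_by_idle`, the 300-second idle threshold, and the sample records are all made up for illustration and say nothing about how long Pinecone actually keeps data cached.

```python
def failure_rate_by_idle(records, idle_threshold_s=300.0):
    """records: iterable of (idle_seconds_before_query, succeeded) pairs."""
    counts = {"after_idle": [0, 0], "warm": [0, 0]}  # [failures, total]
    for idle_s, succeeded in records:
        bucket = "after_idle" if idle_s >= idle_threshold_s else "warm"
        counts[bucket][1] += 1
        if not succeeded:
            counts[bucket][0] += 1
    # None means no queries landed in that bucket, so no rate can be computed.
    return {
        name: (failures / total if total else None)
        for name, (failures, total) in counts.items()
    }


if __name__ == "__main__":
    # Invented sample log: (seconds idle before the query, did it succeed?)
    sample = [(900, False), (5, True), (8, False), (1200, True), (3, True), (4, True)]
    print(failure_rate_by_idle(sample))
```

If the two rates come out similar over a reasonable number of queries, cold starts alone are unlikely to explain the timeouts.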
