Prometheus and request latency

Hello Pinecone users,

Has anyone experimented with the new Prometheus support? I’m getting data into Grafana, but when I look at the pinecone_request_latency_seconds metric (for queries), the numbers do not match what I see on the client side (I’m using Artillery to issue queries at various rates). I think the problem may be the size of the rolling window used to calculate the quantile values, but that’s just a guess.

Hey @sjg

Prometheus reports backend latencies, while Artillery measures round-trip latencies, so you may well see different numbers.

Also, regarding Artillery, I recommend thinking about how you expect your load to behave in real life. Artillery can use either a uniform or a Poisson-process arrival distribution, and it can add jitter. Our system would behave differently in the two cases.
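
To illustrate why the arrival distribution matters, here is a minimal Python sketch that compares evenly spaced arrivals with Poisson-process arrivals at the same average rate. It is independent of Artillery and does not use its configuration; the function names, rates, and window size are hypothetical choices made for illustration.

```python
import random


def uniform_arrivals(rate_per_sec, duration_sec):
    """Evenly spaced arrival times: one request every 1/rate seconds."""
    interval = 1.0 / rate_per_sec
    count = int(rate_per_sec * duration_sec)
    return [i * interval for i in range(count)]


def poisson_arrivals(rate_per_sec, duration_sec, seed=0):
    """Poisson-process arrival times: exponentially distributed gaps
    whose mean is 1/rate seconds."""
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate_per_sec)
        if t >= duration_sec:
            break
        arrivals.append(t)
    return arrivals


def max_burst(arrivals, window_sec):
    """Largest number of arrivals whose timestamps span at most window_sec."""
    best = 0
    start = 0
    for end in range(len(arrivals)):
        # Shrink the window from the left until it spans at most window_sec.
        while arrivals[end] - arrivals[start] > window_sec:
            start += 1
        best = max(best, end - start + 1)
    return best


if __name__ == "__main__":
    # Hypothetical load: 50 requests per second for 60 seconds.
    uniform = uniform_arrivals(50, 60)
    poisson = poisson_arrivals(50, 60)
    print("uniform max burst in 250 ms:", max_burst(uniform, 0.25))
    print("poisson max burst in 250 ms:", max_burst(poisson, 0.25))
```

Both schedules have the same mean rate, but the Poisson schedule clusters requests into short bursts, and bursts are typically what drive tail latency.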

Hi @rajat, thanks for your reply and suggestions. I should have given a bit more information in the original post. Here are the odd things I notice:

  1. Artillery actually shows shorter p95 times than Prometheus, rather than longer.
  2. After Artillery stops sending requests, the Prometheus values do not drop but seem stuck at the last values, even after several hours.

Here’s a Grafana screenshot showing latencies and request rates. The panel queries are:
top panel: pinecone_request_latency_seconds{… request_type="query"}
bottom panel: rate(pinecone_request_count_total{… request_type="query"}[30s])

@sjg

Thanks for the details. We will look into what could be causing this. Btw, do you see similar behavior in the console graphs?

@rajat

Yes - something similar. Here’s a screenshot:

@sjg

I just checked our code to be certain. The graph you are seeing is the expected behavior from Prometheus histograms (we record request latencies as histograms).

Also, regarding the differences from Artillery: we use 10s windows when calculating the metrics. Is there a mismatch there?
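
One possible contributor to the gap is the way quantiles are estimated from histogram buckets. The Python sketch below mimics the linear interpolation that Prometheus's histogram_quantile() applies within a bucket. The bucket bounds and request counts are hypothetical, not Pinecone's actual configuration, and the sketch leaves out the +Inf bucket for simplicity.

```python
def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound_seconds, cumulative_count) pairs
    sorted by upper bound. Observations are assumed to be spread evenly
    inside each bucket, so the result is a linear interpolation.
    """
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for upper, cumulative in buckets:
        if cumulative >= rank:
            in_bucket = cumulative - prev_count
            if in_bucket == 0:
                return prev_bound
            fraction = (rank - prev_count) / in_bucket
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, cumulative
    return buckets[-1][0]


if __name__ == "__main__":
    # Hypothetical case: 100 requests that each took 0.11 s.
    # Every one of them falls into the (0.1, 0.25] bucket.
    buckets = [(0.05, 0), (0.1, 0), (0.25, 100), (0.5, 100)]
    print("estimated p95:", estimate_quantile(0.95, buckets))
```

In this made-up case the true p95 is 0.11 s, but the estimate lands near 0.24 s because the histogram only knows that the requests fell somewhere between 0.1 s and 0.25 s. A client that records exact timings, as Artillery does, would report the lower number.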

Hi @sjg,

Just checking to see if you still need a discussion or deep-dive on this. We are happy to jump on a call if needed. You can reach out to us via support@pinecone.io.

Thanks,
Vinay

Hi. Jitter is not documented. Could you give me an example of how to use it? I want to use it too.
Thanks