My company is using Pinecone at the end of an API Gateway Lambda function call to query for nearby vectors and return their metadata.
When tested on a local PC, the same query takes between 400 and 700 milliseconds.
When that same code runs in the Lambda function, the operation takes anywhere from 3,000 to 11,000 milliseconds, regardless of whether the invocation is a cold start.
For timing, we have only been logging the duration of the await index.query({}) call itself.
Is there something we are missing about using Pinecone from AWS Lambda that would circumvent this bottleneck?
We are using a Lambda layer to import the Pinecone client library, and the client has to be established on each invocation, but I can't see how that would specifically affect the .query call.
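For context, the sketch below shows what I mean by per-invocation setup versus reusing the client: the client and index handle are created at module scope so a warm container would skip re-initialization. The handler name, environment variables, and event shape are illustrative assumptions, not our exact wiring.

import { Pinecone } from "@pinecone-database/pinecone";

// Hypothetical module-scope setup: created once per container and
// reused across warm invocations instead of being rebuilt per call.
const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const index = pc.index(process.env.PINECONE_INDEX_NAME);

// Hypothetical handler; assumes an API Gateway proxy event whose
// JSON body contains a "vector" field.
export const handler = async (event) => {
  const start = Date.now();
  const result = await index.query({
    topK: 10,
    vector: JSON.parse(event.body).vector,
    includeMetadata: true,
  });
  console.log(`Query took: ${Date.now() - start} milliseconds`);
  return { statusCode: 200, body: JSON.stringify(result.matches) };
};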
The code used in both environments is as follows:
import { Pinecone } from "@pinecone-database/pinecone";
const data = {
vectors: [
[
/* Some Vector Data */
],
],
};
const API_KEY = "A valid API Key";
const INDEX_NAME = "A valid Index Name";
async function main() {
const pc = new Pinecone({
apiKey: API_KEY,
});
const index = pc.index(INDEX_NAME);
let now = Date.now();
const returnedData = await index.query({
topK: 10,
vector: data.vectors[0],
includeMetadata: true,
});
  console.log(`Query took: ${Date.now() - now} milliseconds`);
console.log(returnedData.matches);
}

main();
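In case it helps narrow things down, here is a variant of the same script (same API_KEY, INDEX_NAME, and data as above) that times each step separately rather than only the query, so client construction and the query itself can be compared in the CloudWatch logs. The per-step timing variables are purely illustrative.

// Illustrative: same flow as main(), but with per-step timing.
async function mainWithTimings() {
  let t = Date.now();
  const pc = new Pinecone({
    apiKey: API_KEY,
  });
  console.log(`Client construction took: ${Date.now() - t} milliseconds`);

  t = Date.now();
  const index = pc.index(INDEX_NAME);
  console.log(`Index handle took: ${Date.now() - t} milliseconds`);

  t = Date.now();
  const returnedData = await index.query({
    topK: 10,
    vector: data.vectors[0],
    includeMetadata: true,
  });
  console.log(`Query took: ${Date.now() - t} milliseconds`);
  console.log(returnedData.matches);
}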