Help pushing vectors to Pinecone DB in Next.js

I need help pushing vectors into the DB. Something is wrong with the awaited upsert call here, and I am getting an error.

Pastebin link for my file that uploads to Pinecone: import { Index, Pinecone, PineconeRecord, RecordMetadata,} from "@ - Pastebin.com

This is the error I’m getting right now:
Object literal may only specify known properties, and 'vectors' does not exist in type 'PineconeRecord'.

Code if pastebin doesn’t work:
import {
  Index,
  Pinecone,
  PineconeRecord,
  RecordMetadata,
} from "@pinecone-database/pinecone";
import { downloadFromS3 } from "./s3.server";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import {
  Document,
  RecursiveCharacterTextSplitter,
} from "@pinecone-database/doc-splitter";
import { getEmbeddings } from "./embeddings";
import { Vector } from "@pinecone-database/pinecone/dist/pinecone-generated-ts-fetch";
import md5 from "md5";
import { convertToAscii } from "./utils";

let pinecone: Pinecone | null = null;
const api = process.env.PINECONE_API_KEY || "";

export const getPineconeClient = () => {
  if (!pinecone) {
    pinecone = new Pinecone({
      apiKey: api,
    });
  }
  return pinecone;
};

type PDFPage = {
  pageContent: string;
  metadata: {
    loc: { pageNumber: number };
  };
};

export async function loadS3IntoPinecone(fileKey: string) {
  try {
    // 1. Obtain the PDF
    console.log("Downloading PDF from S3...");
    const file_name = await downloadFromS3(fileKey);

    if (!file_name) {
      throw new Error("File not found");
    }

    // 2. Download and read PDF
    console.log("Reading PDF...");
    const loader = new PDFLoader(file_name as string);
    const pages = (await loader.load()) as PDFPage[];

    // 3. Split and segment the PDF
    console.log("Splitting PDF...");
    const documents = await Promise.all(pages.map(prepareDocument));

    // 4. Vectorize and embed individual docs
    console.log("Embedding documents...");
    const vectors = await Promise.all(documents.flat().map(embedDocument));

    // 5. Upload to Pinecone
    const client = await getPineconeClient();
    const pineconeIndex = client.Index("teachtalk");

    console.log("inserting vectors into pinecone");

    const namespace = convertToAscii(fileKey);

    // Push vectors to Pinecone index
    await pineconeIndex.upsert({
      vectors: vectors as PineconeRecord<RecordMetadata>[],
      namespace: namespace,
    });

    console.log("Upload complete");
  } catch (error) {
    console.error("Error in loadS3IntoPinecone", error);
  }
}

async function embedDocument(doc: Document): Promise<PineconeRecord> {
  try {
    const embeddings = await getEmbeddings(doc.pageContent);
    const hash = md5(doc.pageContent);

    return {
      id: hash,
      values: embeddings,
      metadata: {
        text: doc.metadata.text,
        pageNumber: doc.metadata.pageNumber,
      },
    } as PineconeRecord;
  } catch (error) {
    console.error("Error in embedding document", error);
    throw error; // Ensure errors are propagated
  }
}

// Converts to bytes then to a string
export const truncateStringByBytes = (str: string, bytes: number) => {
  const enc = new TextEncoder();
  return new TextDecoder("utf-8").decode(enc.encode(str).slice(0, bytes));
};

async function prepareDocument(page: PDFPage): Promise<Document[]> {
  let { pageContent, metadata } = page;

  // Remove newlines
  pageContent = pageContent.replace(/\n/g, "");

  // Split the docs
  const splitter = new RecursiveCharacterTextSplitter();
  const docs = await splitter.splitDocuments([
    new Document({
      pageContent,
      metadata: {
        pageNumber: metadata.loc.pageNumber,
        text: truncateStringByBytes(pageContent, 36000),
      },
    }),
  ]);

  return docs;
}

Hi @dattasumit2019, and welcome to the Pinecone community forums!

Thanks for your question.

I think the issue relates to the typing of your embedDocument method:

Try updating it like this:

async function embedDocument(doc: Document): Promise<PineconeRecord<RecordMetadata>> {
  try {
    const embeddings = await getEmbeddings(doc.pageContent);
    const hash = md5(doc.pageContent);

    return {
      id: hash,
      values: embeddings,
      metadata: {
        text: doc.metadata.text,
        pageNumber: doc.metadata.pageNumber,
      },
    };
  } catch (error) {
    console.error("Error in embedding document", error);
    throw error;
  }
}
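
As a side note on why the explicit return type can help: with a return annotation, TypeScript checks the returned object literal field by field, whereas an `as` cast can hide a missing field. The following is only a minimal sketch of that difference. `RecordShape` and `buildRecord` are hypothetical stand-ins defined locally, not the SDK's `PineconeRecord` type or any function from this thread.

```typescript
// Local stand-in for the record type (hypothetical; not imported from the SDK).
type RecordShape = {
  id: string;
  values: number[];
  metadata?: Record<string, string | number>;
};

// With a return annotation, the object literal is checked against RecordShape.
function buildRecord(
  id: string,
  values: number[],
  text: string,
  pageNumber: number
): RecordShape {
  return {
    id,
    values,
    metadata: { text, pageNumber },
  };
}

// A cast compiles even though `values` is missing, so the mistake
// would only show up at runtime.
const unchecked = { id: "abc" } as RecordShape;

const checked = buildRecord("abc", [0.1, 0.2, 0.3], "hello", 1);
console.log(checked.values.length, unchecked.values === undefined);
```

The annotated version fails to compile if a required field is dropped, which is usually the earlier and cheaper place to catch it.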

Now, in your loadS3IntoPinecone function, make sure your upsert call looks like this:

await pineconeIndex.upsert({
  vectors: vectors,
  namespace: namespace,
});

Give that a shot and let me know how it goes. If you’re still encountering an error after making these changes, please let us know:

  • The exact error message you’re seeing (if it’s different from the original one).
  • The version of the Pinecone SDK you’re using.
  • Any other relevant parts of your code that might be interacting with these functions.

Hope this helps!

Best,
Zack

tysmmm for the help!

OK, so I tried your solution and it gave the same error :(, but that’s OK because I tweaked a few things to get this code:
Embed Document

async function embedDocument(doc: Document) {
  try {
    console.log(
      `Embedding document with page content: ${doc.pageContent.substring(
        0,
        100
      )}...`
    );
    const embeddings = await getEmbeddings(doc.pageContent);
    const hash = md5(doc.pageContent);

    return {
      id: hash,
      values: embeddings,
      metadata: {
        text: doc.metadata.text,
        pageNumber: doc.metadata.pageNumber,
      },
    } as PineconeRecord;
  } catch (error) {
    console.log("error embedding document", error);
    throw error;
  }
}

Upserting Code:

// 4. upload to pinecone
const client = await getPineconeClient();
const pineconeIndex = await client.index("teachtalk");
const namespace = pineconeIndex.namespace(convertToAscii(fileKey));

console.log("inserting vectors into pinecone");
try {
  await namespace.upsert(vectors);
  console.log("vectors: " + vectors.length);
  return documents[0];
} catch (error) {
  console.log("error inserting vectors into pinecone", error);
  throw error;
}
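
A likely reason this version type-checks: in the SDK version this thread appears to be using, `upsert` seems to accept the array of records directly, and the namespace is selected by calling `namespace(...)` on the index instead of being passed as a property. The sketch below imitates that call shape with local stand-ins so it runs without a Pinecone account. `RecordLike`, `FakeNamespace`, `FakeIndex`, and `uploadRecords` are all hypothetical names for illustration, not the real SDK classes.

```typescript
// Stand-in types that mimic the call shape used in the code above.
// These are hypothetical and are NOT the real Pinecone SDK classes.
type RecordLike = {
  id: string;
  values: number[];
  metadata?: Record<string, string | number>;
};

class FakeNamespace {
  stored: RecordLike[] = [];

  // Takes the records array directly, not an object like { vectors, namespace }.
  async upsert(records: RecordLike[]): Promise<void> {
    this.stored.push(...records);
  }
}

class FakeIndex {
  namespaces = new Map<string, FakeNamespace>();

  // Selecting a namespace returns a handle scoped to that namespace.
  namespace(name: string): FakeNamespace {
    let ns = this.namespaces.get(name);
    if (!ns) {
      ns = new FakeNamespace();
      this.namespaces.set(name, ns);
    }
    return ns;
  }
}

// Upserts the records into the given namespace and returns how many are stored there.
async function uploadRecords(
  index: FakeIndex,
  namespaceName: string,
  records: RecordLike[]
): Promise<number> {
  const ns = index.namespace(namespaceName);
  await ns.upsert(records);
  return ns.stored.length;
}
```

Called as `uploadRecords(new FakeIndex(), "my-file-key", records)`, this mirrors the `pineconeIndex.namespace(...).upsert(vectors)` pattern without touching the network.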

So far it shows no errors in the Problems tab of VS Code, but when I run the application, the terminal prints a weird error:

Number of pages loaded from PDF: 0
Number of vectors created: 0
inserting vectors into pinecone
error inserting vectors into pinecone PineconeBadRequestError: No vectors provided for upsert request
    at mapHttpStatusError (webpack-internal:///(rsc)/./node_modules/@pinecone-database/pinecone/dist/errors/http.js:179:20)
    at eval (webpack-internal:///(rsc)/./node_modules/@pinecone-database/pinecone/dist/errors/handling.js:65:69)
    at step (webpack-internal:///(rsc)/./node_modules/@pinecone-database/pinecone/dist/errors/handling.js:33:23)
    at Object.eval [as next] (webpack-internal:///(rsc)/./node_modules/@pinecone-database/pinecone/dist/errors/handling.js:14:53)
    at fulfilled (webpack-internal:///(rsc)/./node_modules/@pinecone-database/pinecone/dist/errors/handling.js:5:58)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: undefined
}

(I got this from my debugging statements.)
So some part of the code isn’t working; the rest of the code is unchanged.

Hi @dattasumit2019 and thanks for your reply!

Could you modify your console.log statement in the catch block of your try/catch to print out what vectors looks like? I suspect there could still be a formatting issue…

console.log(`error inserting vectors: %o into pinecone`, vectors, error);

It’s also curious to me that your other log statements suggest no content was loaded from a target PDF file - could you also share your filesystem layout (where are the docs) and the code you’re using to load and chunk your documents as well?
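
The "No vectors provided for upsert request" message is consistent with those logs: with zero pages and zero vectors, the upsert would be called with an empty array. One way to surface that earlier is a small guard that fails with a descriptive message as soon as a stage produces nothing. This is a minimal sketch with stand-in data; `assertNonEmpty` is a hypothetical helper, not part of the Pinecone SDK or LangChain.

```typescript
// Hypothetical helper: throw a descriptive error if a pipeline stage produced nothing.
function assertNonEmpty<T>(items: T[], stage: string): T[] {
  if (items.length === 0) {
    throw new Error(`No ${stage} produced; check the previous step before upserting`);
  }
  return items;
}

// Example with stand-in data in place of the real PDF pages and vectors.
const pages: string[] = ["page one text"];
const checkedPages = assertNonEmpty(pages, "pages");
console.log(`Number of pages loaded from PDF: ${checkedPages.length}`);

try {
  const vectors: number[][] = []; // simulates the empty result from the logs
  assertNonEmpty(vectors, "vectors");
} catch (error) {
  // Fails here, before any request would be sent.
  console.error((error as Error).message);
}
```

Placing a check like this right after the PDF is loaded and again after embedding would point at the stage that returned nothing, instead of failing at the upsert call.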

Best,
Zack