Failed to upload PDF file in Pinecone Assistant

nerstuff24 · October 7, 2025, 3:18pm

Hello community, I’m new here. I want to ask about uploading my PDF file. Why does it keep failing? My PDF has been compressed to 3 MB. Can anyone help? Thank you.

jocelyn · October 13, 2025, 11:37pm

Hey @nerstuff24, I can try to help! But I should first clarify that Pinecone doesn’t directly handle PDF file uploads. It’s a vector database that stores and searches vector embeddings.

→ To use PDF content with Pinecone, you need to:

Extract and chunk your PDF content into smaller text segments
Convert the text to vector embeddings using an embedding model
Upsert the vectors into your Pinecone index
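To make the chunking step concrete, here is a minimal sketch using only the Python standard library. The function name `chunk_text` and the `max_chars` / `overlap` defaults are my own assumptions for illustration, not anything Pinecone prescribes. It takes text you have already extracted from the PDF and splits it into overlapping character windows:

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be >= 0 and smaller than max_chars")

    # Collapse runs of whitespace, which PDF extraction often leaves behind.
    text = " ".join(text.split())

    chunks = []
    step = max_chars - overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current window already reaches the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks
```

Splitting on sentences or paragraphs usually gives better search results than fixed character windows, so treat this as a starting point only.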

→ Upsert Limits to Consider

When working with data in Pinecone, there are specific limits for upsert operations:

Max batch size: 2 MB or 1,000 records when upserting vectors, 96 records when upserting text
Max metadata size per record: 40 KB
Max length for a record ID: 512 characters
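One way to stay under these limits is to group records into batches before upserting. The sketch below is an assumption-heavy illustration: `batch_records` is a hypothetical helper, it treats 2 MB as 2,000,000 bytes, and it estimates size from the JSON encoding of each record, which may not match exactly how Pinecone measures a request. It also assumes each record keeps its ID under an `_id` key.

```python
import json
from typing import Iterable, Iterator

MAX_BATCH_BYTES = 2_000_000  # assumed conservative reading of the 2 MB limit
MAX_ID_CHARS = 512           # record ID length limit quoted above


def batch_records(
    records: Iterable[dict],
    max_records: int = 96,              # use 1000 when upserting vectors
    max_bytes: int = MAX_BATCH_BYTES,
) -> Iterator[list[dict]]:
    """Yield lists of records that respect the count and size limits."""
    batch: list[dict] = []
    batch_bytes = 0
    for record in records:
        if len(str(record.get("_id", ""))) > MAX_ID_CHARS:
            raise ValueError("record ID is longer than 512 characters")

        # Approximate the payload size by the record's JSON encoding.
        size = len(json.dumps(record).encode("utf-8"))
        if size > max_bytes:
            raise ValueError("a single record exceeds the batch size limit")

        # Start a new batch if adding this record would break either limit.
        if batch and (len(batch) >= max_records or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0

        batch.append(record)
        batch_bytes += size

    if batch:
        yield batch
```

Each yielded batch can then be passed to whatever upsert call you are using.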

→ Example Data Structure

Here’s how you would structure your PDF content for Pinecone:

```python
{
    "_id": "document1#chunk1",
    "chunk_text": "First chunk of the document content...",  # Text to convert to a vector.
    "document_id": "document1",  # This and subsequent fields are stored as metadata.
    "document_title": "Introduction to Vector Databases",
    "chunk_number": 1,
    "document_url": "https://example.com/docs/document1",
    "created_at": "2024-01-15",
    "document_type": "tutorial"
}
```
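If it helps, records in that shape can be generated from a list of text chunks. This is only a sketch: `build_records` and its parameters are hypothetical names I made up, and the field names simply mirror the example above.

```python
from datetime import date
from typing import Optional


def build_records(
    document_id: str,
    document_title: str,
    chunks: list[str],
    document_url: str,
    document_type: str,
    created_at: Optional[str] = None,
) -> list[dict]:
    """Turn text chunks into records shaped like the example above."""
    # Default to today's date in YYYY-MM-DD form if none is given.
    created = created_at or date.today().isoformat()
    records = []
    for number, chunk in enumerate(chunks, start=1):
        records.append({
            "_id": f"{document_id}#chunk{number}",  # unique per chunk
            "chunk_text": chunk,                    # text to convert to a vector
            "document_id": document_id,             # remaining fields are metadata
            "document_title": document_title,
            "chunk_number": number,
            "document_url": document_url,
            "created_at": created,
            "document_type": document_type,
        })
    return records
```

Keeping the document ID as a prefix of each record ID makes it easier to find or delete all chunks of one document later.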

For PDF processing with Pinecone, you’ll need to:

Use a PDF parsing library (like PyPDF2, pdfplumber, etc.) to extract text
Chunk the text into manageable segments
Convert chunks to embeddings
Upsert to Pinecone in batches under the 2 MB limit

Could you provide more details about what specific error you’re encountering and what tool or method you’re using to upload your PDF?