AttributeError: type object 'Pinecone' has no attribute 'from_documents'

Hi everyone! I’ve been running a RAG model on my local machine using Chroma and am now trying to transition over to Pinecone. I’ve found some starter code floating around where people use LangChain’s “.from_texts” and “.from_documents” methods to create the vector DB from a given chunk of text.

However, I’m getting the attribute error: “type object ‘Pinecone’ has no attribute ‘from_documents’”. Did this function get deprecated somewhere and I’m just not seeing it? And if so, what is the alternative way to take a document, convert it into vector format, and then insert it into a Pinecone instance? My code and imports are below:

import os
import time
from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma, Pinecone
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain.chains.question_answering import load_qa_chain
from langchain.chains import RetrievalQA, RetrievalQAWithSourcesChain
import chromadb
from pinecone import Pinecone, ServerlessSpec

pdf_folder_path = "./pdf_folder"
documents = []
for file in os.listdir(pdf_folder_path):
    if file.endswith(".pdf"):  # find all PDFs in the pdf folder path
        pdf_path = os.path.join(pdf_folder_path, file)  # build the file path as prep for the PyPDF loader
        print("loading PDF into loader...")
        loader = PyPDFLoader(pdf_path)
        documents.extend(loader.load())  # keep appending the loaded pages

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=10)
chunked_documents = text_splitter.split_documents(documents)  # split the pages into individual chunks

vector_db = Pinecone.from_documents(
    documents=chunked_documents,
    embedding=OpenAIEmbeddings(),
    index_name=index
)

Ah! Okay, thanks to @jamin.thalaivaa for pointing this out on a thread about the same error when trying to use the ‘from_texts’ function in LangChain: apparently there is a namespace clash between LangChain and Pinecone, which you can read about here.
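
In case it helps anyone else hitting this: the last import, from pinecone import Pinecone, overwrites the LangChain Pinecone vector store imported from langchain_community.vectorstores, so the name ends up pointing at the Pinecone client class, which has no from_documents. Here is a minimal sketch of the import style that avoids the clash, assuming the langchain-pinecone package is installed, the API key is in the PINECONE_API_KEY env var, chunked_documents comes from the snippet above, and "my-index" is a placeholder index name (double-check the class names against your installed versions):

import os
from pinecone import Pinecone as PineconeClient  # client aliased so it can't shadow the vector store
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

pc = PineconeClient(api_key=os.environ["PINECONE_API_KEY"])  # raw Pinecone client, used only for index management

vector_db = PineconeVectorStore.from_documents(
    documents=chunked_documents,
    embedding=OpenAIEmbeddings(),
    index_name="my-index",  # placeholder: the string name of an existing index in the project
)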

That fixed my issue and I can now call the from_documents function. However, I’m now running into a ValueError where it thinks the index object I pass in is not found: ValueError: Index '<pinecone.data.index.Index object at...
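
If I’m reading that ValueError right, the index object itself is being interpolated into the message, which suggests index_name expects the plain string name of the index rather than the object returned by pc.Index(...). For reference, the call I believe it wants looks more like this (a sketch; "my-index" is a placeholder name):

index_name = "my-index"  # placeholder: the string name of the Pinecone index
vector_db = PineconeVectorStore.from_documents(
    documents=chunked_documents,
    embedding=OpenAIEmbeddings(),
    index_name=index_name,  # pass the name as a string, not the pc.Index(...) object
)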
