ErrorWithoutStackTrace: PineconeClient: Error calling upsert: ErrorWithoutStackTrace: Vector dimension 1536 does not match the dimension of the index 2048

absiemon · April 17, 2023, 4:44am

Below is the code. I am using langchain DirectoryLoader to load the data directory in which three txt files are there. I think rest of the code is very simple. Please help me because i have tried creating different dimensions of indices but doesn’t worked.

const model = new OpenAI({
        openAIApiKey: apiKey,
        temperature: 0.9, 
    });
    
    const embeddings = new OpenAIEmbeddings();
    const loader = new DirectoryLoader(
        './data',
        {
            ".txt": (path) => new TextLoader(path),
        }
    );
    
    loader.load().then( async (docs)=>{
        const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
        const texts = await textSplitter.splitDocuments(docs)
        console.log(texts.length)        
        const pinecone = new PineconeClient();
        await pinecone.init({
            apiKey: process.env.PINECONE_API_KEY,
            environment: process.env.PINECONE_ENVIRONMENT,
        })
        const pineconeIndex = pinecone.Index(process.env.PINECONE_INDEX_NAME);
        const newTexts = texts.map((text)=>{
            return text.pageContent
        })
        const vectorStore = await PineconeStore.fromDocuments(texts, embeddings, { pineconeIndex:pineconeIndex});
        console.log(vectorStore);
    }).catch(err => {
        throw err;
    })

Jasper · April 17, 2023, 7:18am

Hi!

it seems like you created an index with vector size 2048, but you are creating GPT embeddings that are of different dimension (1536). Check your Pinecone dashboard and remove the PINECONE_INDEX_NAME index, then create a new one with the right dimensions. You can do that over the dashboard or in code.

If you need any more help, ask away!

Hope this helps.

absiemon · April 17, 2023, 2:03pm

But this is just a trial i will have to another directory where multiple text file could be so do i need to make changes to pinecone index dimension every time or is there any way to change dimension of GPT embeddings to a desired value before storing to Pinecone?. Please explain and thanks for reply

Jasper · April 17, 2023, 3:45pm

OpenAi embeddings are of size 1536 and should not change (for the time being at least). You may do some stuff to them to change the dimension (add padding of zeroes for example), but I do not recommend it. Is there a reason you set your index vector dimenstion to 2048?

No, you will not have to change the dimension if you will keep the embeddings model. So if you will use the OpenAi for embeddings then you only need to create the index once, set the dimension to 1536 and thats it You can delete the index via dashboard with a few clicks.

absiemon · April 17, 2023, 4:49pm

No, wasn’t any reason to set 2048, just wanted to set higher dimension bcz i didn’t know what exactly will be the dimension of embeddings. Thanks man.

muneebwaqas416 · January 16, 2024, 10:47pm

Worked for me thanks for helping