Building a Multi-User Chatbot with LangChain and Pinecone in Next.js

I’ve gotten zero response on this, which ultimately led me to explore other solutions like Weaviate. If someone from Pinecone can chime in on how to make this work, that would be great. I would really like to hop on the Pinecone train, but it seems I can’t even get past the turnstile. When I hit progress-stopping errors like this with code copied and pasted directly from a partner website, and get no response from their team, it makes me question how stable things are and whether I should invest more time. Maybe it’s just early, though, and things will get smoothed out with time. I’m staying hopeful for now.

Is anyone here interested in collaborating on a fork of Azure-Samples/azure-search-openai-demo, the sample app Microsoft released for the Retrieval-Augmented Generation pattern (it uses Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences, and it’s a pretty good chatbot UX), to adapt it to use Pinecone instead of Azure Cognitive Search?
I’ve got the data chunked and loaded into Pinecone, but I’m not getting the best query results (Azure Cognitive Search is way better). I’d like to better understand the best course of action for either word or sentence embeddings:

  1. For the Azure demo’s PDFs, should I follow their approach (inserting the text page by page), or should I use smaller chunks so that less is lost to all-MiniLM-L6-v2’s 256-token limit?
  2. Should I consider using another model? Perhaps even switch to word vectors such as Word2Vec, at the cost of semantics?
  3. What is the best model for capturing OOV (out-of-vocabulary) data?
  4. Do SentenceTransformers models keep the OOV words, or do they discard them?
  5. Would it make sense to override all-MiniLM-L6-v2’s max_seq_length to 512, so that 300-400 words are almost guaranteed not to lose information to model truncation?
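
For reference, the smaller-chunk option in question 1 could be sketched roughly like this. It is only an illustration under stated assumptions: chunk_words is a hypothetical helper, not part of the Azure demo or of Pinecone, and the 1.3 tokens-per-word ratio is a rough guess for English text rather than a measured value (counting with the model's real tokenizer would be more accurate).

```python
from typing import List


def chunk_words(text: str, max_tokens: int = 256,
                tokens_per_word: float = 1.3,
                overlap_words: int = 20) -> List[str]:
    """Split text into overlapping word windows sized to fit a token budget.

    tokens_per_word is an assumed average, not a property of any model.
    """
    words = text.split()
    # Convert the token budget into a word budget using the assumed ratio.
    window = max(1, int(max_tokens / tokens_per_word))
    if overlap_words >= window:
        raise ValueError("overlap_words must be smaller than the window size")
    step = window - overlap_words
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        # Stop once this window has reached the end of the text.
        if start + window >= len(words):
            break
    return chunks


if __name__ == "__main__":
    page_text = "example sentence from a PDF page " * 100
    for i, chunk in enumerate(chunk_words(page_text)):
        print(i, len(chunk.split()), "words")
```

Each chunk would then be embedded and upserted separately, ideally with the page number kept as metadata so answers can still cite their source. The overlap is there so a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks.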

Let me know!

Thanks,

Sean

Do you still plan on working on this?

Hello! Thank you for sharing the code! Where in the code do you specify the list of URLs to chat with?

I had the same issue, but found that I was using an access token instead of an API token. Once I created an Ably App, I was given an API token and that resolved the error.