Indexing and querying resources in mixed languages

I am developing a RAG solution. For a certain topic in healthcare, I have a large set of digital books, some written in Dutch and some in English. I have ingested these books into a Pinecone vector database. When I query the vector database in Dutch, I only get references to Dutch docs, and when I query in English, I only get references to English docs.
In a previous version I used FAISS instead of Pinecone, and the query results contained a mix of both languages.
Is Pinecone indexing language sensitive/dependent?


Results between Pinecone and FAISS should be very similar. Did anything change such as the distance metric or some pre-processing steps?
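
To see why the distance metric is worth checking, here is a minimal sketch in plain Python. It does not call Pinecone or FAISS; the vectors and the names `doc_a` and `doc_b` are made up for illustration. It only shows that when embeddings are not normalized to unit length, dot-product ranking can disagree with cosine or Euclidean ranking for the same query.

```python
import math

def dot(a, b):
    # Inner product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity: inner product divided by both vector lengths.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def neg_l2(a, b):
    # Negated Euclidean distance, so that a larger score means "closer",
    # matching the direction of the other two scores.
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank(query, docs, score):
    # Return document names ordered from best to worst under `score`.
    return sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)

# Hypothetical 2-D "embeddings" (real ones have hundreds of dimensions).
query = [1.0, 0.0]
docs = {
    "doc_a": [0.9, 0.1],  # short vector, nearly the same direction as the query
    "doc_b": [3.0, 2.0],  # long vector, pointing in a different direction
}

by_cosine = rank(query, docs, cosine)
by_dot = rank(query, docs, dot)
by_l2 = rank(query, docs, neg_l2)

print("cosine:", by_cosine)
print("dot:   ", by_dot)
print("l2:    ", by_l2)
```

With these toy vectors, cosine and Euclidean put `doc_a` first, while dot product puts `doc_b` first because of its larger magnitude. If the two setups used different metrics, or only one of them normalized the embeddings, that alone could change which chunks come back.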

Apart from running a new chunking and indexing process using Pinecone on an extended dataset, nothing else changed.
However, after reading through some articles on the internet, my guess is that FAISS translates chunked documents into English before embedding and ingesting them.