Question sent to us: Does Pinecone support Japanese content?

sophiem · February 17, 2022, 4:36pm

Question we received: I'd like to know whether your search service supports Japanese content, that is, whether it can retrieve text by meaning in the natural language processing sense.

Amir · February 18, 2022, 8:14pm

Pinecone works with any dense vector embeddings, regardless of the format or language of the original data. Pinecone stores and searches the vectors, so how well the meaning of Japanese text is captured depends on the embedding model, and you can choose any model that fits your application. For Japanese, here are two models that you could use for sentence embedding:
colorfulscoop/sbert-base-ja · Hugging Face
sonoisa/sentence-bert-base-ja-mean-tokens · Hugging Face
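
As a rough sketch of how this could look in Python: the snippet below assumes the `sentence-transformers` package is installed and can load the first model above by name. The helper names `embed_japanese` and `to_records` are made up for illustration. The record layout (`id`, `values`, `metadata`) follows the dictionary format accepted by recent Pinecone Python clients, so check it against the client version you use.

```python
from typing import Dict, List, Sequence


def embed_japanese(
    texts: Sequence[str],
    model_name: str = "colorfulscoop/sbert-base-ja",
) -> List[List[float]]:
    """Encode Japanese sentences into dense vectors (hypothetical helper)."""
    # Imported lazily so to_records() can be used without the package.
    # Assumption: the model loads through sentence-transformers by name.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # encode() returns a NumPy array by default; convert to plain lists.
    return model.encode(list(texts)).tolist()


def to_records(
    ids: Sequence[str],
    texts: Sequence[str],
    vectors: Sequence[Sequence[float]],
) -> List[Dict]:
    """Pair each vector with an id and keep the source text as metadata."""
    if not (len(ids) == len(texts) == len(vectors)):
        raise ValueError("ids, texts and vectors must have the same length")
    return [
        {"id": i, "values": [float(x) for x in v], "metadata": {"text": t}}
        for i, t, v in zip(ids, texts, vectors)
    ]


# Usage sketch (needs network access and an existing Pinecone index whose
# dimension matches the model's output size; `index` is assumed to be a
# Pinecone Index object you have already created):
#   texts = ["猫が好きです", "今日は雨が降っています"]
#   vectors = embed_japanese(texts)
#   index.upsert(vectors=to_records(["doc-1", "doc-2"], texts, vectors))
#   query = embed_japanese(["ペットについて"])[0]
#   hits = index.query(vector=query, top_k=2, include_metadata=True)
```

Keeping the original sentence in `metadata` is a convenience choice so that query results can be read back as text. Nothing on the Pinecone side is specific to Japanese.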

Amir · February 18, 2022, 8:15pm

Another option is to use multilingual embeddings. We have a detailed post on this here:
https://www.pinecone.io/learn/multilingual-transformers/
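
To illustrate the multilingual route, here is a minimal sketch. It assumes the `sentence-transformers` package and uses `paraphrase-multilingual-MiniLM-L12-v2` as an example model, which is my pick for illustration and not necessarily the one covered in the post. The helper names `embed_multilingual` and `cosine_similarity` are hypothetical. The idea is that a Japanese sentence and its English counterpart should land close together in the same vector space, so one index can serve queries in either language.

```python
import math
from typing import List, Sequence


def embed_multilingual(
    texts: Sequence[str],
    model_name: str = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
) -> List[List[float]]:
    """Encode sentences in mixed languages into one shared vector space."""
    # Imported lazily so cosine_similarity() works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(list(texts)).tolist()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Usage sketch (downloads the model on first run):
#   ja, en, other = embed_multilingual([
#       "私は猫が好きです",            # "I like cats"
#       "I like cats",
#       "The stock market fell today",
#   ])
#   print(cosine_similarity(ja, en))     # expected to be the higher score
#   print(cosine_similarity(ja, other))  # expected to be the lower score
```

If the index is created with the cosine metric, Pinecone ranks matches by this same similarity measure, so the local check above is a quick way to sanity-test a model before indexing.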