Question we received: I just want to know whether your search service supports Japanese content, so that results are retrieved by meaning from a natural language processing perspective.
Pinecone works with any dense vector embeddings, regardless of the format or language of the original data. You can choose any embedding model that fits your application. For Japanese, here are two models that you could use for sentence embedding:
colorfulscoop/sbert-base-ja · Hugging Face
sonoisa/sentence-bert-base-ja-mean-tokens · Hugging Face
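To make this concrete, here is a minimal sketch of embedding Japanese sentences and ranking them by meaning. It assumes the `sentence-transformers` package is installed and that `colorfulscoop/sbert-base-ja` loads through `SentenceTransformer`; if it does not in your environment, check the model card for the recommended loading code. The helper names (`embed_japanese`, `rank_by_similarity`, `demo`) and the sample sentences are made up for illustration. Ranking is done locally with cosine similarity only to show the idea; in practice you would upsert the same vectors into a Pinecone index and query it.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def embed_japanese(
    sentences: Sequence[str], model_name: str = "colorfulscoop/sbert-base-ja"
) -> List[List[float]]:
    """Embed sentences with a Hugging Face sentence-embedding model.

    Assumption: the model can be loaded by sentence-transformers.
    The import is local so the ranking helpers work without the package.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(list(sentences)).tolist()


def demo() -> None:
    """Illustrative run; downloads the model on first use."""
    docs = [
        "外をランニングするのが好きです",  # "I like running outside"
        "海外旅行に行くのが趣味です",      # "Travelling abroad is my hobby"
        "今日の東京の天気は晴れです",      # "The weather in Tokyo is sunny today"
    ]
    query = "ジョギングが趣味です"  # "Jogging is my hobby"
    doc_vecs = embed_japanese(docs)
    query_vec = embed_japanese([query])[0]
    for index, score in rank_by_similarity(query_vec, doc_vecs):
        print(f"{score:.3f}  {docs[index]}")
```

Calling `demo()` prints the sample sentences ordered by similarity to the query. Any other sentence-embedding model name can be passed as `model_name`, as long as the Pinecone index is created with the dimension that model outputs.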
Another option is to use multilingual embeddings, where a single model covers Japanese alongside other languages. We have a detailed post on this here: