In an application where users constantly upload documents, generating embeddings would be one of the biggest costs if we used a paid API such as OpenAI's embeddings. Open-source embeddings would be free, but I'm wondering about the best way to deploy Hugging Face embeddings without degrading the user experience too much. Any recommendations are appreciated.
A smaller model that can run on a CPU may be an option. SentenceTransformers maintains a list of models including their inference speed, and intfloat/e5-small-v2 on Hugging Face is a newer small model. For deployment, one option is HF's Inference Endpoints.
Hope this helps!