I asked this question in the Pinecone/Hugging Face joint presentation a while back:
When using the sentence-transformers models on Hugging Face, the task defaults to “sentence-similarity” rather than the “feature-extraction” you’d need to get the vectors to use in Pinecone. I got around it by cloning the models and modifying the metadata in README.md. Is there a better way?
While more specific to Hugging Face, it might be of general interest here.
Thanks for your question. Yes, when you query the inference API, you can specify which task you want (‘sentence-similarity’ or ‘feature-extraction’). So even when the model defines ‘sentence-similarity’, you can still get the embeddings from the inference API.
Can you give an example of how you pass feature-extraction to the Hugging Face API?
I’m not aware of a way to pass feature-extraction directly to the API. Generally I would load the models using the sentence-transformers library and create the embeddings with model.encode. Otherwise, with Hugging Face transformers directly, there is the long way (pulled from the SBERT model card):
from transformers import AutoTokenizer, AutoModel
import torch

# Mean pooling: average the token embeddings, using the attention mask to ignore padding
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

sentences = ['This is an example sentence', 'Each sentence is converted']

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/bert-base-nli-mean-tokens')
model = AutoModel.from_pretrained('sentence-transformers/bert-base-nli-mean-tokens')

# Tokenize the sentences into a padded batch of PyTorch tensors
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings; no gradients are needed for inference
with torch.no_grad():
    model_output = model(**encoded_input)

# Pool the token embeddings into one vector per sentence
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
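In case it helps to see what the mean_pooling step is doing, here is a rough sketch of the same arithmetic in plain Python on made-up toy numbers. The function name masked_mean_pool and the data are purely illustrative, not part of transformers or sentence-transformers; it only assumes the inputs are nested lists shaped like [sentence][token][dimension], with a 0/1 mask per token.

```python
def masked_mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, skipping positions where the mask is 0."""
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        dim = len(vectors[0])
        total = [0.0] * dim
        for vec, m in zip(vectors, mask):
            for i in range(dim):
                total[i] += vec[i] * m  # padded tokens (m == 0) contribute nothing
        # Same guard as torch.clamp(..., min=1e-9): avoid dividing by zero
        count = max(sum(mask), 1e-9)
        pooled.append([t / count for t in total])
    return pooled

# Toy batch (hypothetical values): 2 sentences, 3 token positions, 2-dim embeddings
token_embeddings = [
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # last position is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
]
attention_mask = [
    [1, 1, 0],
    [1, 1, 1],
]

sentence_vectors = masked_mean_pool(token_embeddings, attention_mask)
print(sentence_vectors)  # [[2.0, 3.0], [4.0, 2.0]]
```

The padded token in the first sentence is masked out, so its large values do not leak into the average. That is why the pooling uses the attention mask rather than a plain mean over all positions.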