Unsupervised Training for Sentence Transformers

Language represents a way for us to communicate abstract ideas and concepts. It has evolved as a human-only form of interaction for the best part of the past 100,000 years. Translating that into something a machine can understand is (unsurprisingly) difficult.

Modern(ish) computers appeared during and around WW2. The first application of natural language processing (NLP) came soon after with the Georgetown machine translation (MT) experiment in 1954. In the first decade of research, many expected MT to be solvable within a few short years [1] — they were slightly too optimistic.


This is a companion discussion topic for the original entry at https://www.pinecone.io/learn/unsupervised-training-sentence-transformers/

Thank you so much for this tutorial! It really made me understand how to leverage my unlabeled dataset to improve my model (which is non-English and in a very specific domain, haha).
There is one thing I'm still having trouble understanding, though. I see that you fine-tuned the model for only one epoch. How is this enough for the loss function to actually fine-tune the embeddings? Does TSDAE do several noise-decode passes for each sentence during one epoch to optimize the encoder? Or is this just an example, and should we set a higher number of epochs for training?
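For reference, the single-epoch setup I'm asking about looks roughly like this (a minimal sketch assuming the sentence-transformers DenoisingAutoEncoderDataset / DenoisingAutoEncoderLoss API used in the tutorial; the model name, sentences, and hyperparameters are placeholders, not the tutorial's exact values):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, datasets, losses

# Unlabeled, domain-specific sentences (placeholder data)
sentences = ["first unlabeled sentence", "second unlabeled sentence"]

# Build the encoder: transformer + CLS pooling, as TSDAE uses
model_name = "bert-base-uncased"  # placeholder; I'd swap in a model for my language/domain
word_embedding = models.Transformer(model_name)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[word_embedding, pooling])

# The dataset wrapper adds noise (token deletion) to each sentence on the fly,
# so each sentence is noised and reconstructed once per epoch
train_data = datasets.DenoisingAutoEncoderDataset(sentences)
loader = DataLoader(train_data, batch_size=8, shuffle=True, drop_last=True)

# Denoising autoencoder loss: a decoder reconstructs the original sentence
# from the pooled sentence embedding; encoder and decoder weights are tied
loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path=model_name, tie_encoder_decoder=True
)

# The part my question is about: a single pass over the data
model.fit(
    train_objectives=[(loader, loss)],
    epochs=1,
    weight_decay=0,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    show_progress_bar=True,
)
```

Is one such pass really enough in practice, or would you bump `epochs` up for a small dataset like mine?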