Domain Transfer with BERT

When building language models, we can spend months optimizing training and model parameters, but that effort is wasted if we don’t have the right data.

The success of our language models relies first and foremost on data. We previously covered a partial solution to this problem: applying the Augmented SBERT training strategy to in-domain problems. That is, given a small labeled dataset, we can artificially enlarge it to produce more training data and improve model performance.
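To make the idea of enlarging a small dataset concrete, here is a minimal sketch of the pair-recombination step, written in plain Python. It is an illustration under stated assumptions, not the article's implementation: the example sentences, the `augment` helper, and the `placeholder_label` function are all hypothetical. In Augmented SBERT the new pairs would be scored by a cross-encoder trained on the original labeled data; a simple word-overlap score stands in for it here so the example runs without any model.

```python
import itertools
import random

# Hypothetical "gold" (human-labeled) sentence pairs with similarity scores in [0, 1].
gold_pairs = [
    ("a man is playing guitar", "a person plays a guitar", 0.9),
    ("a dog runs in the park", "a cat sleeps on the sofa", 0.1),
    ("a woman is cooking dinner", "a woman prepares a meal", 0.8),
]


def placeholder_label(a: str, b: str) -> float:
    """Stand-in for a trained cross-encoder: Jaccard overlap of the word sets."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def augment(gold, labeler, max_new=5, seed=0):
    """Recombine gold sentences into unseen pairs and label them ("silver" data)."""
    seen = {frozenset((a, b)) for a, b, _ in gold}
    sentences = sorted({s for a, b, _ in gold for s in (a, b)})
    # Every sentence combination that does not already appear in the gold set.
    candidates = [
        (a, b)
        for a, b in itertools.combinations(sentences, 2)
        if frozenset((a, b)) not in seen
    ]
    # Random sampling is the simplest way to pick which new pairs to keep.
    random.Random(seed).shuffle(candidates)
    return [(a, b, labeler(a, b)) for a, b in candidates[:max_new]]


silver_pairs = augment(gold_pairs, placeholder_label)
train_pairs = gold_pairs + silver_pairs  # the enlarged training set
print(len(gold_pairs), len(silver_pairs), len(train_pairs))
```

The combined `train_pairs` list is what a bi-encoder (sentence transformer) would then be fine-tuned on. The quality of the silver labels depends entirely on the labeler, which is why a real setup would replace the overlap heuristic with a properly trained cross-encoder.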

This is a companion discussion topic for the original entry at