Embedding dimensionality

Does the embedding size influence the quality of the results?

It can. Generally speaking, the larger the embedding size, the more information you can encode into your embeddings, but that doesn't necessarily mean bigger is better. There are excellent SOTA semantic search models that embed vectors at a dimensionality of 768, and others (e.g. OpenAI's Davinci) that use 12288. Both work, and there can be an accuracy benefit to larger dimensionality, but not always. Another point to consider is the trade-off between accuracy (larger dims) and speed plus storage (smaller dims); your use case may require you to prioritize one or the other.
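
To get a feel for the speed and storage side of that trade-off, here's a rough sketch (the corpus size and the use of random stand-in vectors are just assumptions for illustration) that compares the memory footprint and brute-force cosine-search time for 768-dim vs 12288-dim float32 embeddings:

```python
# Rough illustration of the storage/speed trade-off between embedding sizes.
# The corpus size and random vectors are placeholder assumptions, not benchmarks.
import time
import numpy as np

n_vectors = 10_000  # assumed corpus size for illustration

for dim in (768, 12288):
    # Storage: each float32 vector takes dim * 4 bytes
    mb = n_vectors * dim * 4 / 1e6

    # Build a random, unit-normalized stand-in corpus and query
    corpus = np.random.rand(n_vectors, dim).astype(np.float32)
    corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
    query = np.random.rand(dim).astype(np.float32)
    query /= np.linalg.norm(query)

    # Brute-force cosine similarity search (dot product of unit vectors)
    start = time.perf_counter()
    scores = corpus @ query
    top5 = np.argsort(-scores)[:5]
    elapsed_ms = (time.perf_counter() - start) * 1000

    print(f"dim={dim:>5}: ~{mb:,.0f} MB storage, query in {elapsed_ms:.1f} ms")
```

In a real deployment you'd likely put the vectors behind an approximate-nearest-neighbour index rather than brute-force search, but the same pattern holds: the higher-dimensional embeddings cost proportionally more to store and to scan.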
