Transformers Are All You Need

A quick tour through the most popular neural network architecture

Have you ever thought about what happens when you read a book? Unless you have a unique ability, you don’t process and memorize every single word and special character, right? What happens is that we represent events and characters to build our understanding of the story. We do this with a selective memorization that allows us to keep the most relevant pieces of information without needing to accumulate each minor detail.

This is a companion discussion topic for the original entry at

Awesome Read! Is there any article by Pinecone that delves deeper into any particular architecture(s) used predominantly for semantic search?


Thank you for the article! Noob here 🙂

Stupid question but what is the difference between bidirectional and unidirectional encoders? What are the two directions of bidirectional encoders?

BERT, or Bidirectional Encoder Representations from Transformers

Unlike BERT, GPT models are unidirectional


Glad you enjoyed it, Suraj 🙂
Bidirectional language models attend to the text that both precedes and follows a target token, whereas unidirectional models attend only to the text that precedes it. So the two "directions" are left-to-right context and right-to-left context: BERT sees both when building a token's representation, while GPT sees only what came before. Find here an example of what this means.
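A minimal sketch of the difference, not taken from the article: in attention terms, the two styles differ only in the mask applied to the attention scores. A bidirectional (BERT-style) encoder uses an all-ones mask, while a unidirectional (GPT-style) model uses a lower-triangular causal mask so position i can only attend to positions j ≤ i. The variable names here are illustrative.

```python
import numpy as np

# Toy sequence of 5 token positions.
seq_len = 5

# Bidirectional (BERT-style): every position may attend to every
# other position, so the attention mask is all ones.
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)

# Unidirectional / causal (GPT-style): position i may only attend
# to positions j <= i, giving a lower-triangular mask.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

print(causal_mask)
```

Row i of `causal_mask` shows which earlier positions token i can see; the zeros above the diagonal are exactly the "future" tokens a unidirectional model is forbidden to look at.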