The Imagenet Large Scale Visual Recognition Challenge (ILSVRC)[1] was a world-changing competition hosted annually from 2010 until 2017. During this time, the competition acted as the catalyst for the explosion of deep learning[2] and was the place to find state-of-the-art image classification, object localization, and object detection.
Researchers fine-tuned better-performance computer vision (CV) models to achieve ever more impressive results year-after-year. But there was an unquestioned assumption causing problems.
This is a companion discussion topic for the original entry at https://www.pinecone.io/learn/zero-shot-object-detection-clip/