Fixing YouTube Search with OpenAI's Whisper

OpenAI’s Whisper is a new state-of-the-art (SotA) model in speech-to-text. It is able to almost flawlessly transcribe speech across dozens of languages and even handle poor audio quality or excessive background noise.

The domain of spoken word has always been somewhat out of reach for ML applications. Whisper changes that for speech-centric use cases. We will demonstrate the power of Whisper alongside other technologies like transformers and vector search by building a new and improved YouTube search.


This is a companion discussion topic for the original entry at https://www.pinecone.io/learn/openai-whisper/

Hi James,
I need to build metadata for my own channel and a couple of other channels. Could you give some pointers as to how you built that? Is it possible to do this via pytube?
Thank you!

I’m pretty sure i_end is excluded from the sublist when you slice it (data[i:i_end] stops at data[i_end - 1]), so it should not be used as an index. For example:

'end': data[i_end]['end'],
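
To illustrate the point, here is a minimal sketch of the slicing behaviour. It assumes toy segment dicts with 'text', 'start', and 'end' keys; the data and the helper name merge_window are hypothetical and not taken from the article. It treats i_end as an exclusive bound and reads the end time from the last segment that is actually in the slice.

```python
# Toy transcript segments (hypothetical data): text plus start/end times in seconds.
data = [
    {"text": "seg0", "start": 0.0, "end": 2.0},
    {"text": "seg1", "start": 2.0, "end": 4.0},
    {"text": "seg2", "start": 4.0, "end": 6.0},
    {"text": "seg3", "start": 6.0, "end": 8.0},
    {"text": "seg4", "start": 8.0, "end": 10.0},
]


def merge_window(data, i, window):
    """Merge data[i:i_end] into one chunk (assumes 0 <= i < len(data))."""
    # i_end is an exclusive upper bound, exactly like the stop value of a slice.
    i_end = min(len(data), i + window)
    chunk = data[i:i_end]  # data[i_end] itself is NOT part of the chunk
    return {
        "text": " ".join(seg["text"] for seg in chunk),
        "start": chunk[0]["start"],
        # End time of the last merged segment, i.e. data[i_end - 1]["end"].
        "end": chunk[-1]["end"],
    }


merged = merge_window(data, 0, 3)
print(merged)
```

With a window of 3 starting at index 0, the merged text covers seg0 to seg2, so the matching end time is that of seg2. Indexing with data[i_end]['end'] would instead pick up the end time of seg3, which is not in the merged text.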