It would probably be better to send the transcripts in paragraphs to avoid pushing too much data in one go. I'm guessing you'd be creating embeddings through OpenAI or a similar provider? Those usually have token limits too, and breaking the text into smaller chunks also tends to give you more accurate context at retrieval time.
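A minimal sketch of that paragraph-based chunking, assuming plain blank-line-separated transcripts; it uses a character budget as a rough stand-in for tokens (for real token counts you'd use a tokenizer like tiktoken):

```python
def chunk_transcript(text, max_chars=1000):
    """Split a transcript into paragraph-based chunks, merging short
    paragraphs together until a rough size budget is reached."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would blow the budget
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

transcript = "Para one.\n\nPara two.\n\nPara three."
chunks = chunk_transcript(transcript, max_chars=20)
# chunks → ["Para one.\n\nPara two.", "Para three."]
```

Each chunk would then be sent to the embeddings endpoint separately instead of the whole transcript at once.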
You can use metadata to store the start and end time of each chunk and the creator of each document, then query/filter on those fields as needed. Pinecone has some pretty good docs!
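To make the metadata idea concrete, here's a sketch of what those records might look like, with a tiny local stand-in for Pinecone-style `$eq`/`$gte`/`$lte` filters so the behavior is visible without a live index. The field names (`start_time`, `end_time`, `created_by`) and the IDs are illustrative, not required by Pinecone:

```python
# Hypothetical record shape for upserting transcript chunks with metadata.
records = [
    {
        "id": "doc1-chunk0",
        "values": [0.1, 0.2, 0.3],  # embedding vector (real ones are much longer)
        "metadata": {"start_time": 0.0, "end_time": 42.5, "created_by": "alice"},
    },
    {
        "id": "doc1-chunk1",
        "values": [0.4, 0.5, 0.6],
        "metadata": {"start_time": 42.5, "end_time": 90.0, "created_by": "bob"},
    },
]

def matches(meta, flt):
    """Local stand-in for Pinecone-style metadata filters ($eq, $gte, $lte)."""
    for field, cond in flt.items():
        for op, val in cond.items():
            if op == "$eq" and meta.get(field) != val:
                return False
            if op == "$gte" and not meta.get(field, float("-inf")) >= val:
                return False
            if op == "$lte" and not meta.get(field, float("inf")) <= val:
                return False
    return True

# e.g. "chunks created by alice that start in the first 10 seconds"
flt = {"created_by": {"$eq": "alice"}, "start_time": {"$lte": 10.0}}
hits = [r["id"] for r in records if matches(r["metadata"], flt)]
# hits → ["doc1-chunk0"]
```

With the real client, a filter dict with this shape is what you'd pass as the `filter` argument when querying the index.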