Best practices for constructing queries around sentiment

I’m curious about any resources or guidance for “sentiment” semantic search, i.e., queries that return context vectors generally matching a descriptor like “funny”, “wise”, “animated”, etc.

My goal is to extract interesting quotes from podcast transcripts. I’ve segmented a bunch of transcripts and uploaded them to an index. I’ve naively tried queries like “most interesting topics” and “most relevant to a marketer”. These returned results of fair quality, but I’m not sure which direction to take to improve them.

I’m imagining this as part of a pipeline where I can take the most “interesting” segments returned from Pinecone and use an LLM to trim them down to the best quotes.

Are there any best practices for constructing these types of queries to produce relevant results? I’m calling them “sentiment” queries in my head, but I’m unsure whether there is a more widely used term I should be using.

Appreciate any thoughts or input, thank you so much!

You could try extracting these sentiment labels with a classifier model (you could even use an LLM to create the classes for you), then add them to the metadata of your vectors in Pinecone and filter on that metadata at query time.
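
As a rough illustration of that idea, here is a minimal Python sketch. Everything named in it is an assumption for the example: the label set, the `toy_classifier` keyword matcher (a stand-in for a real classifier model or an LLM call), the `labels` metadata field, and the segment ID. The Pinecone calls are left as comments because they need a live index; the record shape and the `$in` filter follow Pinecone's metadata filtering syntax as I understand it, so check them against the current docs.

```python
from typing import Callable

# Hypothetical label set; a classifier model or an LLM would normally
# define and assign these.
LABELS = ("funny", "wise", "animated")


def toy_classifier(text: str) -> list[str]:
    """Stand-in for a real classifier: naive keyword matching, illustration only."""
    keywords = {
        "funny": ("joke", "laugh", "hilarious"),
        "wise": ("lesson", "learned", "advice"),
        "animated": ("amazing", "incredible", "!"),
    }
    lowered = text.lower()
    return [label for label in LABELS if any(k in lowered for k in keywords[label])]


def build_record(
    segment_id: str,
    embedding: list[float],
    text: str,
    classify: Callable[[str], list[str]] = toy_classifier,
) -> dict:
    """Attach the predicted labels to the vector's metadata before upserting."""
    return {
        "id": segment_id,
        "values": embedding,
        "metadata": {"text": text, "labels": classify(text)},
    }


def label_filter(*labels: str) -> dict:
    """Build a metadata filter matching segments tagged with any of the given labels."""
    return {"labels": {"$in": list(labels)}}


record = build_record(
    "ep1-seg3",  # hypothetical segment ID
    [0.1, 0.2, 0.3],  # placeholder embedding; use your real embedding model
    "The biggest lesson I learned was to ship early.",
)
print(record["metadata"]["labels"])
print(label_filter("wise", "funny"))

# With a Pinecone index object (assumed to exist, not created here):
# index.upsert(vectors=[record])
# index.query(vector=query_embedding, top_k=10,
#             filter=label_filter("wise"), include_metadata=True)
```

The design choice here is to do the "is this funny / wise / animated" judgment once at ingestion time, so the vector query only has to handle topical relevance and the metadata filter narrows results to the tone you want.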

We don’t have any good resources describing this exact use case, but we did something similar with entity extraction here