Product Search: Hybrid search vs. vector search with keywords

roman · April 24, 2024, 3:44pm

Hi Community,

I’m developing a product search feature for an ecommerce site and am wondering about the best approach. Specifically, I want users to be able to run semantic searches over product images and product metadata (think “Elvis Presley style pants”).

My idea is to use GPT-4V to generate a single text description per product that combines the image content with the product metadata, then embed that description and search over the embeddings.

On the other hand, I’ve seen the hybrid search offering from Pinecone, which combines dense vectors (image info) with sparse vectors (text info).

Which of the two would be the better approach? Any thoughts?

zeke · April 29, 2024, 3:00pm

Hi @roman, thanks for the post. Both approaches present viable options, so the choice depends on your key performance goals.

If your focus is purely on semantic search, then dense vectors alone should be enough. You can also store any essential metadata with the vectors and apply metadata filtering at query time as necessary.
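
To make the dense-only option concrete, here is a minimal sketch in plain Python. It is not Pinecone client code: the catalog, the three-dimensional vectors, and the `dense_search` helper are all made up for illustration, and a real setup would get its vectors from an embedding model and let the vector database do the ranking.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense_search(query_vec, items, top_k=3, metadata_filter=None):
    """Rank items by cosine similarity, keeping only items whose metadata
    matches every key/value pair in metadata_filter (hypothetical helper)."""
    metadata_filter = metadata_filter or {}
    candidates = [
        item for item in items
        if all(item["metadata"].get(k) == v for k, v in metadata_filter.items())
    ]
    scored = [(cosine(query_vec, item["vector"]), item["id"]) for item in candidates]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:top_k]]


# Toy catalog: vectors and metadata are invented, not real embeddings.
CATALOG = [
    {"id": "flared-pants", "vector": [0.9, 0.1, 0.0], "metadata": {"category": "pants"}},
    {"id": "slim-jeans", "vector": [0.2, 0.9, 0.1], "metadata": {"category": "pants"}},
    {"id": "rhinestone-jacket", "vector": [0.8, 0.2, 0.1], "metadata": {"category": "jackets"}},
]

QUERY = [1.0, 0.0, 0.0]  # stand-in for the embedded user query

print(dense_search(QUERY, CATALOG, top_k=2))
print(dense_search(QUERY, CATALOG, top_k=2, metadata_filter={"category": "pants"}))
```

The filter narrows the candidate set and the similarity score only orders what is left, so a jacket that is semantically close to the query drops out once the search is restricted to pants.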

If you need to incorporate elements of keyword search, then you should explore the hybrid search offering.
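
As a rough illustration of what combining the two signals can look like, the sketch below scores each item with a weighted sum of a dense dot product and a sparse (keyword) dot product. The `alpha` weight, the `hybrid_score` and `rank` helpers, and all vectors are assumptions made up for this example; it is a generic pattern, not the hybrid search API itself.

```python
def dot_dense(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


def dot_sparse(a, b):
    """Dot product of two sparse vectors stored as {token: weight} dicts."""
    return sum(weight * b[token] for token, weight in a.items() if token in b)


def hybrid_score(query, item, alpha=0.5):
    """Weighted sum of both scores: alpha=1.0 is purely dense (semantic),
    alpha=0.0 is purely sparse (keyword). Hypothetical helper."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    dense = dot_dense(query["dense"], item["dense"])
    sparse = dot_sparse(query["sparse"], item["sparse"])
    return alpha * dense + (1.0 - alpha) * sparse


def rank(query, items, alpha=0.5):
    """Return item ids ordered from best to worst hybrid score."""
    ordered = sorted(items, key=lambda item: hybrid_score(query, item, alpha), reverse=True)
    return [item["id"] for item in ordered]


# Toy data: all weights are invented for illustration.
QUERY = {"dense": [1.0, 0.0], "sparse": {"elvis": 1.0, "presley": 1.0, "pants": 0.5}}
ITEMS = [
    {"id": "flared-pants", "dense": [0.9, 0.1], "sparse": {"pants": 1.0, "flared": 1.0}},
    {"id": "elvis-poster", "dense": [0.1, 0.9], "sparse": {"elvis": 1.0, "presley": 1.0}},
]

print(rank(QUERY, ITEMS, alpha=1.0))  # dense only
print(rank(QUERY, ITEMS, alpha=0.0))  # sparse only
```

Dense and sparse scores usually live on different scales, so the weight typically needs tuning on your own queries rather than being left at a default.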

If you’re unsure, as always, we recommend testing both options at a smaller scale and comparing the performance in relation to your use-case-specific goals!
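
One simple way to run that comparison is to hand-label relevant products for a small set of test queries and compute recall@k for each approach. The sketch below assumes exactly that; the queries, product ids, and ranked lists are placeholders invented to show the mechanics, not measured results.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def mean_recall_at_k(runs, judgments, k):
    """Average recall@k over all judged queries."""
    return sum(recall_at_k(runs[q], judgments[q], k) for q in judgments) / len(judgments)


# Hand-labeled relevant products per test query (placeholder data).
JUDGMENTS = {
    "elvis presley style pants": {"p1", "p2"},
    "red running shoes": {"p7"},
}

# Ranked results each approach returned for the same queries (placeholder data).
DENSE_RUNS = {
    "elvis presley style pants": ["p1", "p3", "p2"],
    "red running shoes": ["p9", "p7", "p8"],
}
HYBRID_RUNS = {
    "elvis presley style pants": ["p1", "p2", "p3"],
    "red running shoes": ["p7", "p9", "p8"],
}

for name, runs in [("dense", DENSE_RUNS), ("hybrid", HYBRID_RUNS)]:
    print(name, mean_recall_at_k(runs, JUDGMENTS, k=2))
```

A few dozen real queries drawn from your search logs will say more about which option suits your catalog than any toy example.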