I’m currently using the following format:
- Store all my data as a blob on cloud storage
- Store embeddings and id of the blob on Pinecone
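The ingestion flow above can be sketched as follows. This is an illustrative stand-in, not your actual code: `blob_store` and `vector_index` are plain dicts standing in for a cloud storage bucket and a Pinecone index, and `upload_document` is a hypothetical helper name; the comments note where the real client calls would go.

```python
# Sketch of the ingestion flow: store raw data as a blob, index its
# embedding under the same id. In-memory stand-ins for the real clients.
import uuid

blob_store = {}    # stands in for a cloud storage bucket (e.g. S3 via boto3)
vector_index = {}  # stands in for a Pinecone index: id -> embedding

def upload_document(data: bytes, embedding: list[float]) -> str:
    """Store the raw data as a blob and index its embedding under one shared id."""
    blob_id = str(uuid.uuid4())
    blob_store[blob_id] = data          # real version: upload blob keyed by blob_id
    vector_index[blob_id] = embedding   # real version: upsert (blob_id, embedding) to Pinecone
    return blob_id
```

The key property is that the blob and the vector share one id, so a search hit maps directly to a storage key.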
When someone searches:
- I find the most relevant ids
- Locate the blob on cloud storage
- Filter the blob for the relevant info (based on the vector search results)
- Return to client
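The query flow can be sketched the same way. Again this is a hedged stand-in: the dicts mimic the bucket and the index, the dot-product ranking stands in for Pinecone's server-side similarity search, and `search` is an illustrative name.

```python
# Sketch of the query flow: rank ids by similarity, then fetch the
# matching blobs. In-memory stand-ins for Pinecone and cloud storage.
blob_store = {"doc-1": b"first blob", "doc-2": b"second blob"}
vector_index = {"doc-1": [1.0, 0.0], "doc-2": [0.0, 1.0]}

def search(query_embedding: list[float], top_k: int = 1) -> list[bytes]:
    """Return the blobs whose embeddings score highest against the query."""
    def score(vec: list[float]) -> float:
        # plain dot product; Pinecone does this ranking server-side
        return sum(q * v for q, v in zip(query_embedding, vec))

    ranked = sorted(vector_index, key=lambda bid: score(vector_index[bid]),
                    reverse=True)
    # real version: one blob-storage GET per hit, which is where the
    # extra round-trip latency in this design comes from
    return [blob_store[bid] for bid in ranked[:top_k]]
```

The extra latency relative to metadata-only storage is the per-hit blob fetch after the vector query, which can be partly hidden by fetching the top-k blobs concurrently.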
Does this flow make sense? The other option is to store the data itself in Pinecone “metadata”, but I tried that and hit the limit on maximum metadata size. I’m also wondering whether my current approach introduces a lot of latency.