Blob storage + vector search

I’m currently using the following setup (rough sketch after the list):

  • Store all of my data as a blob in cloud storage
  • Store the embeddings and the blob’s id in Pinecone
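
To make that concrete, here is a rough sketch of the indexing side. It assumes S3 for the blob store (any object store works the same way), the v3+ `pinecone` client, and a placeholder `embed()` for whatever embedding model is in use; the bucket and index names are made up:

```python
import json

import boto3                   # assuming S3 here, but any object store works
from pinecone import Pinecone  # assuming the v3+ pinecone client

BUCKET = "my-doc-blobs"  # hypothetical bucket name
s3 = boto3.client("s3")
index = Pinecone(api_key="YOUR_API_KEY").Index("my-index")  # hypothetical index name


def embed(text: str) -> list[float]:
    """Placeholder for whatever embedding model is in use."""
    raise NotImplementedError


def ingest(doc_id: str, chunks: list[str]) -> None:
    # Store the full document (all chunks) as a single blob in object storage.
    s3.put_object(Bucket=BUCKET, Key=f"{doc_id}.json", Body=json.dumps(chunks))

    # Store one embedding per chunk in Pinecone; metadata only holds pointers
    # (blob key + chunk index), so it stays well under the metadata size limit.
    index.upsert(vectors=[
        {
            "id": f"{doc_id}#{i}",
            "values": embed(chunk),
            "metadata": {"blob_key": f"{doc_id}.json", "chunk": i},
        }
        for i, chunk in enumerate(chunks)
    ])
```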

When someone searches (sketched below):

  1. Find the most relevant ids in Pinecone
  2. Locate the corresponding blob(s) in cloud storage
  3. Filter each blob down to the relevant chunks (the ones the vector search matched)
  4. Return the result to the client
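
The query path, reusing the `index`, `s3`, `BUCKET`, and `embed()` placeholders from the sketch above (again, just illustrative):

```python
def search(query: str, top_k: int = 5) -> list[str]:
    # 1. Find the most relevant ids (plus their blob pointers) via vector search.
    res = index.query(vector=embed(query), top_k=top_k, include_metadata=True)

    # 2./3. Locate each blob in object storage and keep only the matched chunks.
    hits = []
    for match in res.matches:
        meta = match.metadata
        chunks = json.loads(
            s3.get_object(Bucket=BUCKET, Key=meta["blob_key"])["Body"].read()
        )
        hits.append(chunks[int(meta["chunk"])])

    # 4. Return to the client.
    return hits
```

Compared to keeping everything in metadata, the only extra work is the object-store GET per matched blob in step 2/3, so that round trip is where any added latency would come from.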

Does this flow make sense? The other option is to store the data directly in Pinecone’s “metadata”, but I tried that and hit the metadata size limit. I’m also wondering whether my current approach introduces a lot of latency.