When you say you want to cluster your vectors, what problem are you trying to solve? If you can share more about what you’re doing, we can come up with a more vector-friendly way of doing it.
You shouldn’t need to query for all vectors in the database. If you’re looking for clusters of matching vectors you should be using queries in the index to do that for you.
The goal is to label reviews. I don’t know what the labels are going to be ahead of time.
My first approach was to run DBSCAN on the vectors, choose the top 50 clusters, and then analyze what was clustered together and label each cluster appropriately: “slow to load”, “used daily”, “crashing”, “too expensive”, etc.
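For reference, here is a minimal sketch of that approach, written in plain Python so it runs without dependencies. Several things in it are assumptions, not details from the post: cosine distance as the metric, the `eps` and `min_samples` values, "top" meaning largest by member count, and the toy 2-D `embeddings` standing in for real review vectors (which are assumed to be non-zero). In practice a library implementation such as scikit-learn's `sklearn.cluster.DBSCAN` would likely replace the hand-written `dbscan` function.

```python
import math
from collections import Counter

NOISE = -1  # label for points that belong to no cluster


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dbscan(vectors, eps, min_samples):
    """Return one cluster label per vector; NOISE marks outliers."""
    n = len(vectors)
    # Brute-force neighbourhoods (O(n^2)); a point counts as its own neighbour.
    neighbours = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        labels[i] = cluster_id
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: keep expanding
        cluster_id += 1
    return labels


def top_clusters(labels, k):
    """Ids of the k largest clusters, ignoring noise."""
    counts = Counter(label for label in labels if label != NOISE)
    return [label for label, _ in counts.most_common(k)]


# Hypothetical toy data: six reviews with made-up 2-D embeddings.
reviews = [
    "takes forever to load",
    "loading is so slow",
    "slow startup every time",
    "crashes when I open settings",
    "keeps crashing on launch",
    "I like the colour scheme",
]
embeddings = [
    [1.0, 0.05], [0.98, 0.1], [0.99, 0.0],
    [0.05, 1.0], [0.1, 0.97],
    [-1.0, -1.0],
]

labels = dbscan(embeddings, eps=0.05, min_samples=2)
for cid in top_clusters(labels, k=50):
    members = [text for text, label in zip(reviews, labels) if label == cid]
    print(cid, members)  # read the members, then pick a label by hand
```

Reading the printed members of each cluster is the manual labelling step. Note that this sketch needs every vector in memory at once, and that `eps` usually has to be tuned per embedding model.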