Hey all,
I am trying to implement Generative Pseudo-Labeling (GPL) with a Sentence Transformer model, using a Pinecone index in the negative mining step, and I am running into this error:
```
ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Thu, 09 Mar 2023 19:25:15 GMT', 'x-envoy-upstream-service-time': '1', 'content-length': '110', 'server': 'envoy'})
HTTP response body: {"code":3,"message":"Query vector dimension 38400 does not match the dimension of the index 384","details":[]}
```
It seems the part of the process that is generating the query vector is producing a vector with 38,400 dimensions, while it looks like you want it to be 384 instead. The maximum vector dimension for Pinecone is 20K (which is huge).
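For what it's worth, 38,400 is exactly 100 × 384, which is what you would get if a batch of 100 embeddings were flattened into a single vector. A rough sketch of the difference (assuming a 384-dimensional model and a batch of 100 queries):

```python
import numpy as np

# hypothetical batch: 100 query embeddings, 384 dimensions each
query_embs = np.random.rand(100, 384)

nested = query_embs.tolist()             # list of 100 lists, each with 384 floats
flat = query_embs.flatten().tolist()     # one list of 38400 floats

print(len(nested), len(nested[0]))       # 100 384
print(len(flat))                         # 38400 -- the dimension in the error message
```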
Hey There,
Any solutions to this problem? I ran into the same issue. After some testing I'm quite sure that the problem is that index.query does not handle the list input in this line:
```python
res = index.query(query_embs.tolist(), top_k=10)
```
It seems to join all the entries in the list into one large vector, so the vector ends up 100 times as big as the original because the list contains 100 vectors of the original size. I found that this workaround theoretically works in the GPL code, but it is super slow and does not work in the Colab version (I guess because it queries one by one, but I'm not sure):
```python
res = []
for vector in query_embs_list:
    result = index.query(vector, top_k=10)
    res.append(result)
```
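One idea I have been toying with (not tested end to end, so treat it as a sketch) is to keep the single-vector queries but send them concurrently, which might bring the runtime back down. It assumes `index` and `query_embs_list` from the snippet above and that the Pinecone client copes with a handful of parallel requests:

```python
from concurrent.futures import ThreadPoolExecutor

def query_one(vector):
    # each call searches the index for the 10 nearest neighbours of one vector
    return index.query(vector, top_k=10)

# fire the single-vector queries concurrently instead of strictly one by one
with ThreadPoolExecutor(max_workers=8) as pool:
    res = list(pool.map(query_one, query_embs_list))
```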
Anybody got an idea how to query with the tolist() statement?
Thanks a lot!
As per the documentation, querying only works with one vector as a parameter (Query data).
If I understand correctly, you are inputting 100 vectors into the query method and it flattens them into one long vector? Your second solution is correct. Each query is a single vector search, searching for the closest vectors in vector space.
If you have any more problems, you can explain why you would be inputting 100 vectors at once and we can see what can be done.
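To make the one-vector-per-query point concrete, a minimal sketch (assuming `query_embs` is the batch of encoded queries from your snippet):

```python
# a single query: one flat list of floats whose length matches the index dimension
single_query = query_embs[0].tolist()   # e.g. 384 floats for a 384-dim index
res = index.query(single_query, top_k=10)
print(res['matches'][:3])               # the closest matches for this one query
```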
Hi @Jasper,
Thanks for your quick response :)
I'm trying to apply the GPL adaptation from this notebook. There is also a version in the Pinecone Learn series, which uses roughly the same implementation as far as the dimensions problem is concerned. Both of those official GPL implementations use the `res = index.query(query_embs.tolist(), top_k=10)` line, so I'm a little confused about why it isn't working for me.
Here is the relevant part of the original code (both official implementations use the index.query(query_embs.tolist()) part, which throws the dimension error):
```python
batch_size = 100
triplets = []
for i in tqdm.tqdm(range(0, len(query_doc_pairs), batch_size)):
    # embed queries and query pinecone in batches to minimize network latency
    i_end = min(i+batch_size, len(query_doc_pairs))
    queries = [pair[0] for pair in query_doc_pairs[i:i_end]]
    pos_docs = [pair[1] for pair in query_doc_pairs[i:i_end]]
    query_embs = org_model.encode(queries, convert_to_tensor=True, show_progress_bar=False)
    res = index.query(query_embs.tolist(), top_k=10)
    # iterate through queries and find negatives
    for query, pos_doc, query_res in zip(queries, pos_docs, res['results']):
        top_results = query_res['matches']
        random.shuffle(top_results)
        for hit in top_results:
            neg_doc = corpus[int(hit['id'])]
            if neg_doc != pos_doc:
                triplets.append([query, pos_doc, neg_doc])
                break
```
The rest of the variables should be in the same format as in the official implementations I adopted.
The problem with my workaround is that it is too slow (I'm working in Google Colab and it shuts down tasks that take too long).
So I thought that my change of searching one vector at a time could be the reason for the poor performance compared to searching in a batch of 100, which is what the tolist() code seems to do, so I wanted to make the original version work. Maybe for context: my index holds 27,000 vectors with 768 dimensions, and I have 270,000 queries, so the full set of query embeddings is 270,000 long as well.
Thanks a lot for your time, I really appreciate your help!
Hmm, I'm not familiar with the GPL approach, but I checked the provided links and code.
From the article: "The vector database is set up for us to begin negative mining. We loop through each query, returning 10 of the most similar passages by setting top_k=10." (mind the "each query" part).
After checking, I think your code has an error somewhere. The query you encode should be a single query, and query_embs should have the same dimension as the vectors you originally inserted into Pinecone. Debug what happens and look at what queries actually contains; if it is anything other than one query, check how your pairs are created and added to the list.
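Something along these lines might help with the debugging (a sketch only, reusing the names from your snippet; the expected dimension depends on which model you encode with):

```python
# quick sanity checks right before the index.query call
print(type(queries), len(queries))    # expect: a list of query strings, length <= batch_size
print(query_embs.shape)               # expect: (number_of_queries, embedding_dimension)

nested = query_embs.tolist()
print(len(nested), len(nested[0]))    # number of vectors, and the dimension of each one
# the second number here must match the dimension of your Pinecone index
```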
.tolist() is used to create a list from a numpy array, not so that you can query multiple vectors at once. I think this may be the misunderstanding.
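If the goal is to keep the batched encoding but stay within what the query endpoint supports, something like this might work (just a sketch, reusing your variable names, with one Pinecone query per vector):

```python
batch_size = 100
triplets = []
for i in tqdm.tqdm(range(0, len(query_doc_pairs), batch_size)):
    i_end = min(i + batch_size, len(query_doc_pairs))
    queries = [pair[0] for pair in query_doc_pairs[i:i_end]]
    pos_docs = [pair[1] for pair in query_doc_pairs[i:i_end]]
    # encode the whole batch at once, but query Pinecone one vector at a time
    query_embs = org_model.encode(queries, convert_to_tensor=False, show_progress_bar=False)
    for query, pos_doc, emb in zip(queries, pos_docs, query_embs):
        query_res = index.query(emb.tolist(), top_k=10)
        top_results = query_res['matches']
        random.shuffle(top_results)
        for hit in top_results:
            neg_doc = corpus[int(hit['id'])]
            if neg_doc != pos_doc:
                triplets.append([query, pos_doc, neg_doc])
                break
```

It trades the single batched call for 100 smaller calls per batch, so each batch will be slower over the network, but every query vector has the right dimension.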
I am getting a similar error:
```
ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'text/plain; charset=utf-8', 'Content-Length': '140', 'date': 'Sun, 19 Nov 2023 10:32:27 GMT', 'x-envoy-upstream-service-time': '3', 'server': 'envoy', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
HTTP response body: Capacity Reached. Starter Projects support a single index. Create a new project to add more. Your Starter Project remains free post-upgrade.
```
Can someone please help?
I think you have reached the capacity of the free index; the error message says "Capacity Reached". Check how many vectors you already have in your index, and check the limitations of the free environment here: gcp-starter environment.
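If it helps, you can inspect what is already in your project before creating anything new; a minimal sketch using the 2023-era client (the API key and index name are placeholders):

```python
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="gcp-starter")

# the starter project supports a single index, so list what already exists
print(pinecone.list_indexes())

# check how many vectors that index holds and its dimension
index = pinecone.Index("your-index-name")   # placeholder index name
print(index.describe_index_stats())
```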