Error on Query endpoint: ApiException: (400)

Hey all,
I am trying to implement Generative Pseudo-Labeling (GPL) for a Sentence Transformer model, using a Pinecone index in the negative mining step.

The output of index.describe_index_stats() is

{'dimension': 384,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 15025}},
 'total_vector_count': 15025}

When I try to query the index using

res = index.query(query_embs.tolist(), top_k=10)

the query endpoint throws a 400 Bad Request error

ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Thu, 09 Mar 2023 19:25:15 GMT', 'x-envoy-upstream-service-time': '1', 'content-length': '110', 'server': 'envoy'})
HTTP response body: {"code":3,"message":"Query vector dimension 38400 does not match the dimension of the index 384","details":[]}

It seems the part of the process that generates the query vector is producing a vector with 38400 dimensions, while it looks like you want it to be 384 instead. The maximum vector dimension for Pinecone is 20K (which is huge).
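For context, 38400 is exactly 100 × 384, which suggests a whole batch of 100 embeddings is being flattened into a single query vector. A minimal sketch of how that shape arises (the model name and dummy batch below are placeholder assumptions, not from your code):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')  # produces 384-dim embeddings
queries = ['example query'] * 100                # hypothetical batch of 100
query_embs = model.encode(queries)

print(query_embs.shape)          # (100, 384)
print(len(query_embs.tolist()))  # 100 nested lists, one 384-dim vector each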


Hey there,
Any solutions to this problem? I've run into the same issue. After some testing, I'm quite sure the problem is that index.query does not handle the list input in this line

res = index.query(query_embs.tolist(), top_k=10)

(it seems to concatenate all entries of the list into one large vector, so the vector is 100 times the original size because the list contains 100 vectors). I found that this workaround theoretically works in the GPL code, but it is super slow and does not work in the Colab version (I guess because it queries one by one, but I'm not sure):

res = []
for vector in query_embs_list:
    result = index.query(vector, top_k=10)
    res.append(result)
Does anybody have an idea how to make the query work with the tolist() output?
Thanks a lot!

Hi @nassferdinand

as per the documentation, querying only works with one vector as a parameter (Query data).

If I understand correctly, you are inputting 100 vectors into the query method and it flattens them out into one long vector? Your second solution is correct. Each query is a single-vector search, looking for the closest vectors in vector space.

If you have any more problems, you can explain why you want to input 100 vectors at once and we can see what can be done :smiley:
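For reference, a single-vector call looks like this (a minimal sketch; depending on your client version the embedding may need to be passed via the vector= keyword):

emb = query_embs[0].tolist()             # one embedding as a flat list of floats
res = index.query(vector=emb, top_k=10)  # a single nearest-neighbour search
print(len(res['matches']))               # at most 10 matches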

Hope this helps

Hi @Jasper,
Thanks for your quick response :)
I'm trying to apply the GPL adaptation of this notebook. There is a version in the Pinecone learn series too, which uses more or less the same implementation with respect to the dimension problem. Both of these official GPL implementations use the "res = index.query(query_embs.tolist(), top_k=10)" line, so I'm a little confused why it isn't working for me.

Here is the relevant part of the original code (both official implementations use the index.query(query_embs.tolist()) call, which throws the dimension error):

import random
import tqdm

batch_size = 100
triplets = []

for i in tqdm.tqdm(range(0, len(query_doc_pairs), batch_size)):
    # embed queries and query pinecone in batches to minimize network latency
    i_end = min(i+batch_size, len(query_doc_pairs))
    queries = [pair[0] for pair in query_doc_pairs[i:i_end]]
    pos_docs = [pair[1] for pair in query_doc_pairs[i:i_end]]
    query_embs = org_model.encode(queries, convert_to_tensor=True, show_progress_bar=False)
    res = index.query(query_embs.tolist(), top_k=10)
    # iterate through queries and find negatives
    for query, pos_doc, query_res in zip(queries, pos_docs, res['results']):
        top_results = query_res['matches']
        random.shuffle(top_results)
        for hit in top_results:
            neg_doc = corpus[int(hit['id'])]
            if neg_doc != pos_doc:
                triplets.append([query, pos_doc, neg_doc])
                break

The rest of the variables should be in the same format as in the official implementations I adapted.
The problem with my workaround is that it is too slow (I'm working in Google Colab and it shuts down tasks that take too long).
So I thought that my change of searching one vector at a time might be the reason for the poor performance compared to searching in a batch of 100, which the tolist() code seems to do, so I wanted to make the original version work. For context, my index holds 27,000 vectors with 768 dimensions, and I have 270,000 queries, so query_embs is 270,000 entries long overall.

Thanks a lot for your time, I really appreciate your help! :slight_smile:

Hmm, I'm not familiar with the GPL approach, but I checked the provided links and code.

From the article:
"The vector database is set up for us to begin negative mining. We loop through each query, returning 10 of the most similar passages by setting top_k=10." (mind the "each query" part)

After checking, I think your code has an error somewhere. The queries you encode should each be a single query, and query_embs should have the same dimension as the vectors you inserted into Pinecone. Debug it and look at what queries contains. If it is anything other than single queries, check how your pairs are created and added to the list.

.tolist() is used to create a list from a numpy array, not so you can query multiple vectors at once. I think this may be the misunderstanding.
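To illustrate (a small sketch with assumed shapes, not code from the notebook):

import numpy as np

one_query = np.random.rand(384)     # a single 384-dim embedding
batch = np.random.rand(100, 384)    # a batch of 100 embeddings

print(type(one_query.tolist()[0]))  # float -> a flat list, i.e. one query vector
print(type(batch.tolist()[0]))      # list  -> a nested list of 100 vectors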

Also, this video by @jamesbriggs seems to be the one you are following? https://youtu.be/uEbCXwInnPs?feature=shared&t=1995

I’ll run the code myself, but as of now I think the code is good and you have a bug :wink:

Good luck!

Edit: I am not sure, but there is a possibility that query once supported this feature (querying multiple vectors at once).
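If batched queries were indeed removed at some point, the loop in the quoted code could be rewritten to query one embedding at a time while keeping the res['results'] shape the rest of the loop expects (an untested sketch under that assumption):

res = {'results': []}
for emb in query_embs.tolist():
    # each single-vector query returns a response containing a 'matches' list
    r = index.query(vector=emb, top_k=10)
    res['results'].append(r)

If one-by-one querying is too slow in Colab, issuing these single-vector queries concurrently, e.g. with concurrent.futures.ThreadPoolExecutor, might win back some of the batch throughput.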

I am getting a similar error:
ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'text/plain; charset=utf-8', 'Content-Length': '140', 'date': 'Sun, 19 Nov 2023 10:32:27 GMT', 'x-envoy-upstream-service-time': '3', 'server': 'envoy', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'})
HTTP response body: Capacity Reached. Starter Projects support a single index. Create a new project to add more. Your Starter Project remains free post-upgrade.
Can someone please help?

Hi @nik

I think you have reached the capacity of the free tier. The error message says Capacity Reached: Starter Projects support a single index, so check how many indexes your project already has, and check the limitations of the free environment here: gcp-starter environment
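A quick way to check (a sketch using the classic pinecone client, with placeholder credentials):

import pinecone

pinecone.init(api_key='YOUR_API_KEY', environment='gcp-starter')  # placeholders
print(pinecone.list_indexes())  # gcp-starter projects allow a single index
index = pinecone.Index(pinecone.list_indexes()[0])
print(index.describe_index_stats())  # current vector counts vs. the limit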

Hope this helps


Hello everyone,
I am getting this error:

pinecone.core.client.exceptions.PineconeApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'Date': 'Thu, 25 Jan 2024 14:23:50 GMT', 'Content-Type': 'text/plain', 'Content-Length': '90', 'Connection': 'keep-alive', 'server': 'envoy'})   
HTTP response body: queries[358]: invalid value -0.062515430152416229 for type type.googleapis.com/QueryVector

I even tried using different embeddings, but I still get the same error. Can anyone help me with this, please?

Thank you in advance.
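One possible cause, guessing from the message (queries[358] is a bare float where a QueryVector is expected): a flat list of floats passed to a parameter that expects a list of vectors. A quick shape check can rule this out (a sketch; query_embs stands in for your own embeddings variable):

import numpy as np

embs = np.asarray(query_embs)  # your embeddings, name assumed
print(embs.ndim)               # should be 1 when passing a single query vector
res = index.query(vector=embs.tolist(), top_k=10)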


I am getting the same error while following:

When I get the query embeddings and try to run the query in the Pinecone console, it works fine.

That is an old version of both the OpenAI and Pinecone code. I've created a PR to update it; in the meantime, you can find the updated code here.

This topic was automatically closed 24 hours after the last reply. New replies are no longer allowed.