Message too large

ServiceException: (500)
Reason: Internal Server Error
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'date': 'Mon, 16 May 2022 15:57:03 GMT', 'x-envoy-upstream-service-time': '293', 'content-length': '115', 'server': 'envoy'})
HTTP response body: {"code":13,"message":"Message production error: MessageSizeTooLarge (Broker: Message size too large)","details":[]}

Vector dimensions are 7680, I'm trying to upsert in batches of 1000, and the metadata is a single key with a short string value.

Okay, I bumped the batch size down to 10 and it's working fine, I think. I just thought it'd be able to handle more in one go.

Hey,
There is a 2 MB limit on the maximum request size, as documented here. Anyway, thanks for the input! We'll look into ways to increase that limit.

Roei
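
To put rough numbers on that 2 MB cap for this case, here is a back-of-the-envelope sketch. The 4-bytes-per-float and 2x JSON-encoding overhead figures are assumptions for illustration, not exact Pinecone serialization sizes:

# Rough estimate of upsert request size against the 2 MB limit
DIM = 7680
BYTES_PER_FLOAT = 4              # assumption: float32 before text encoding
JSON_OVERHEAD = 2.0              # assumption: JSON text roughly doubles the byte count
LIMIT_BYTES = 2 * 1024 * 1024

per_vector = DIM * BYTES_PER_FLOAT * JSON_OVERHEAD        # ~60 KB per 7680-dim vector
print(per_vector * 1000 / 1e6)   # batch of 1000 -> ~61 MB, far over the limit
print(per_vector * 10 / 1e6)     # batch of 10   -> ~0.6 MB, comfortably under
print(int(LIMIT_BYTES // per_vector))  # ~34 vectors per request at this dimension, before metadata

Under those assumptions, batches somewhere in the tens are about the most a single request can carry at this dimension, which matches what the original poster saw.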

I think the example below from the docs is what I'm after.

import itertools
import pinecone

def chunks(iterable, batch_size=100):
    # Break an iterable of (id, vector) tuples into batch_size-sized chunks
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

# Upsert data with 100 vectors per upsert request asynchronously
# - Create pinecone.Index with pool_threads=30 (limits to 30 simultaneous requests)
# - Pass async_req=True to index.upsert()
with pinecone.Index('example-index', pool_threads=30) as index:
    # Send requests in parallel
    async_results = [
        index.upsert(vectors=ids_vectors_chunk, async_req=True)
        for ids_vectors_chunk in chunks(example_data_generator, batch_size=100)
    ]
    # Wait for and retrieve responses (this raises in case of error)
    [async_result.get() for async_result in async_results]

That resolves my actual problem, which was just upload speed.
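
For anyone combining the two fixes (staying under the 2 MB request cap while keeping throughput up with parallel requests), a sketch of how that might look end to end. The generate_vectors helper, the index name, and the batch size of 30 are illustrative assumptions, not part of the docs example:

import itertools
import random
import pinecone

def generate_vectors(n, dim=7680):
    # Hypothetical stand-in for a real data source: yields (id, values) tuples
    for i in range(n):
        yield (f'vec-{i}', [random.random() for _ in range(dim)])

def chunks(iterable, batch_size):
    # Same batching helper as in the docs snippet above
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

# Assumes pinecone.init(api_key=..., environment=...) has already been called.
# Small batches keep each request under the 2 MB limit; async_req + pool_threads
# keeps overall upload speed up by sending many small requests in parallel.
with pinecone.Index('example-index', pool_threads=30) as index:
    async_results = [
        index.upsert(vectors=batch, async_req=True)
        for batch in chunks(generate_vectors(10_000), batch_size=30)
    ]
    for result in async_results:
        result.get()  # raises if any individual request failed

With 7680-dim vectors, a batch size around 30 keeps each request near but under the cap (see the estimate earlier in the thread), and the thread pool does the rest for speed.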