I’ve just signed up for Pinecone and started using it, but I immediately ran into a strange error. I insert items into my serverless index using the /vectors/upsert endpoint, and I get a successful response whose “upsertedCount” field matches the number of records I sent.
For the first 1381 records everything was fine. After that, however, new records simply failed to appear in the index. They are not in the Pinecone console, and they are not returned by the /vectors/list endpoint. Even though the upsert command returns success, the records go into some black hole. The number of records in the index remains 1381 no matter how many new records I insert.
Since Pinecone is “eventually consistent”, I thought I would wait a while for them to appear. But I waited 8 hours and the records are still not there. So, per the documentation, I checked the x-pinecone-request-lsn header of the last upsert response, and it does match the x-pinecone-max-indexed-lsn header of my query response, which would mean that the index is consistent. But the upserted records do not appear.
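The header check described above can be sketched as follows. This is a minimal illustration, assuming the two header names quoted in the post and assuming that a write counts as indexed once the max indexed LSN has reached the request LSN; the function names are hypothetical and the headers are passed in as plain dictionaries rather than fetched from the API.

```python
from typing import Mapping

# Header names as quoted in the post.
REQUEST_LSN = "x-pinecone-request-lsn"
MAX_INDEXED_LSN = "x-pinecone-max-indexed-lsn"


def get_lsn(headers: Mapping[str, str], name: str) -> int:
    """Read an LSN header as an integer (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return int(lowered[name])


def write_is_indexed(upsert_headers: Mapping[str, str],
                     query_headers: Mapping[str, str]) -> bool:
    """Return True if the query saw an index state at least as new as the upsert.

    Assumption: LSNs increase monotonically, so ">=" means the write is included.
    """
    return get_lsn(query_headers, MAX_INDEXED_LSN) >= get_lsn(upsert_headers, REQUEST_LSN)
```

With any HTTP client, the two arguments would be the response headers of the last upsert call and of a later query call.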
I thought that maybe there was something wrong with the index, so I created another one. Unfortunately, it has the same problem, but instead of 1381 records, it is now capped at 600 records. Any further inserts fail to appear.
This is such a strange error that I have no idea what to do. Any suggestions?
Edit: Created yet another index, and the number of items now is capped at 299. So every time, the problem gets worse.
Edit 2: I upgraded to a paid plan, just in case I was hitting some limit on the free plan. Now all the inserted items are visible in the console, BUT they are still not visible when retrieving via the /vectors/list endpoint.
Edit 3: When I pass the full record ID as the prefix to /vectors/list, the record is always returned. So it seems it is in the database after all. However, when I browse all the record IDs via /vectors/list, only the first 600 are returned and then the pagination token is empty. This is a problem for me, because I want to do a differential synchronization of the database, which means that I need to know all the IDs that are currently in the index.
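For context, the differential synchronization described above can be sketched roughly like this. It is an illustration under stated assumptions, not a fix for the truncated listing: `fetch_page` is a hypothetical callable that performs one /vectors/list request for a given pagination token and returns the parsed JSON, and the response is assumed to carry IDs under `vectors[].id` and the next token under `pagination.next`.

```python
from typing import Callable, Iterable, Optional, Set, Tuple


def list_all_ids(fetch_page: Callable[[Optional[str]], dict]) -> Set[str]:
    """Collect every ID by following pagination tokens until none is returned.

    fetch_page(token) is a hypothetical wrapper around one /vectors/list call;
    token is None for the first page.
    """
    ids: Set[str] = set()
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        ids.update(vector["id"] for vector in page.get("vectors", []))
        # Assumed response shape: {"pagination": {"next": "<token>"}}.
        token = (page.get("pagination") or {}).get("next")
        if not token:
            return ids


def diff_ids(local_ids: Iterable[str],
             remote_ids: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (IDs to upsert, IDs to delete) so the index matches the local set."""
    local = set(local_ids)
    return local - remote_ids, remote_ids - local
```

If the listing stops early, as it does in this case, `diff_ids` will report the unlisted records as missing, so it may be worth comparing the size of the listed set against the record count the index reports before acting on the result.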