
I have data which pymongo fails to upload, and I cannot understand why.

Data available here

main2.py:

from pymongo import MongoClient
import json

client = MongoClient(host='localhost')
db = client.get_database('journal')
col = db.get_collection('entries')

file = '1758480758.json'

with open(file) as f:
    obj = json.load(f)

# not printing since obj is large

print(col.insert_one(obj))

The last print statement never executes. The program waits a long time (more than 15-20 minutes), after which it reports a connection-related error. However, if a certain property is removed, the problem does not occur and the insert succeeds.
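Not part of the original post, but as a debugging aid: MongoClient accepts timeout options (in milliseconds), so a stuck operation fails fast instead of hanging for 15-20 minutes before surfacing the connection error. The values below are illustrative choices, not pymongo defaults.

    from pymongo import MongoClient

    # Shorter timeouts so a stuck insert errors out quickly; all values are
    # in milliseconds and chosen here only for illustration.
    client = MongoClient(
        host='localhost',
        connectTimeoutMS=5000,          # max wait for the TCP connection
        serverSelectionTimeoutMS=5000,  # max wait to find a reachable server
        socketTimeoutMS=10000,          # max wait on an individual read/write
    )

With these set, the failing insert_one should raise within seconds, which makes it much easier to see the actual server-side error.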

main3.py:

from pymongo import MongoClient
import json

client = MongoClient(host='localhost', port=27017)
db = client.get_database('journal')
col = db.get_collection('entries')

file = '1758480758.json'

with open(file) as f:
    obj = json.load(f)

del obj['bundle'][0]['data']
print(obj)
print(col.insert_one(obj))

In the case above, the program works as expected. It so happens that data in this context is a base64-encoded string. Is this behavior by design?

The record being inserted is not greater than 16 MB in size either...
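To back up the "not greater than 16 MB" claim, the document size can be measured before inserting. With pymongo installed, bson.encode(obj) gives the exact BSON size; the stdlib-only sketch below (my own helper, not from the post) approximates it from the JSON text, which is close enough to compare against the 16 MB document limit.

    import json

    def approx_doc_size_bytes(obj):
        """Rough size estimate from the serialized JSON text. BSON size
        differs somewhat, but this is enough to sanity-check a document
        against MongoDB's 16 MB limit."""
        return len(json.dumps(obj).encode('utf-8'))

    # Hypothetical document shaped like the one in the question: one bundle
    # entry whose 'data' field is a 917_280-character base64 string.
    doc = {'bundle': [{'data': 'A' * 917_280}]}
    print(approx_doc_size_bytes(doc) < 16 * 1024 * 1024)

If this prints True but the insert still hangs, the 16 MB document limit is probably not the culprit.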

Edit (after 11 hours): Stumped. Tried all of the above from Linux (specifically Ubuntu in WSL). On pure Windows this worked!

Edit (after 20 hours): Stumped again. A restart helped.

Comment: "not greater than 16mb" — I count 917_280 b64 characters. But still, there could be a length restriction involved somewhere in the stack. I recommend you store pdf = obj['bundle'][0]['data'], assign n = 1, and then iterate, storing the first pdf[:n] characters, then n *= 2; print(n), and keep going until it fails. Then you'll know more about how small that limit is. // Let us know how it goes. SO welcomes "self answers" from the OP. Commented Sep 21 at 21:12
