I have data which pymongo fails to insert, and I cannot understand why.
Data available here
main2.py:
from pymongo import MongoClient
import json
client = MongoClient(host='localhost')
db = client.get_database('journal')
col = db.get_collection('entries')
file = '1758480758.json'
with open(file) as f:
    obj = json.load(f)
# not printing since obj is large
print(col.insert_one(obj))
The last print statement never happens. The program waits a long time (more than 15-20 minutes) and then reports a connection-related error. However, if a certain property is removed, the problem does not occur and the insert succeeds.
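To avoid waiting 15-20 minutes per attempt, the client timeouts can be lowered so the failure surfaces quickly. A minimal sketch (serverSelectionTimeoutMS and socketTimeoutMS are standard MongoClient options; the 10-second values below are arbitrary):

from pymongo import MongoClient
import json

# Bound how long pymongo waits to pick a server and how long any
# single socket read/write may block, so a failure surfaces in
# seconds instead of minutes.
client = MongoClient(host='localhost',
                     serverSelectionTimeoutMS=10_000,
                     socketTimeoutMS=10_000)
col = client.get_database('journal').get_collection('entries')

with open('1758480758.json') as f:
    obj = json.load(f)

try:
    print(col.insert_one(obj))
except Exception as exc:
    print(type(exc).__name__, exc)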
main3.py:
from pymongo import MongoClient
import json
client = MongoClient(host='localhost', port=27017)
db = client.get_database('journal')
col = db.get_collection('entries')
file = '1758480758.json'
with open(file) as f:
    obj = json.load(f)
del obj['bundle'][0]['data']
print(obj)
print(col.insert_one(obj))
In the case above, the program works as expected. It so happens that data in this context is a base64-encoded string. Is this behavior by design? The record being inserted is not greater than 16 MB in size either...
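To double-check that size claim, the document can be serialized to BSON locally before inserting. A minimal sketch, assuming pymongo 3.9+ (which provides bson.encode):

import json
import bson  # ships with pymongo

with open('1758480758.json') as f:
    obj = json.load(f)

# bson.encode serializes the dict the same way insert_one would;
# MongoDB's hard per-document limit is 16 MiB (16 * 1024 * 1024 bytes).
size = len(bson.encode(obj))
print(size, 'bytes', '(over 16 MiB)' if size > 16 * 1024 * 1024 else '(under 16 MiB)')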
Edit (after 11 hours): Stumped. Tried all of the above from Linux (specifically Ubuntu in WSL). On pure Windows it worked!
Edit (after 20 hours): Stumped again. A restart helped.
917_280 b64 characters. But still, there could be a length restriction involved somewhere in the stack. I recommend you store pdf = obj['bundle'][0]['data'], assign n = 1, and then iterate: store the first pdf[:n] characters, then n *= 2; print(n), and keep going till it fails. Then you'll know more about how small that limit is. // Let us know how it goes. SO welcomes "self answers" from the OP.