
I have a function that I apply to JSON data. It works when the input looks like this:

import json

def myfunction(dictionary):
    # does things
    return new_dictionary
                

data = """{
"_id": {
"$oid": "5e7511c45cb29ef48b8cfcff"
},
"description": "some text",
"startDate": {
"$date": "5e7511c45cb29ef48b8cfcff"
},
"completionDate": {
"$date": "2021-01-05T14:59:58.046Z"
},
"videos":[{"$oid":"5ecf6cc19ad2a4dfea993fed"}]
}"""

info = json.loads(data)
refined = myfunction(info)
new_data = json.dumps(refined)
print(new_data)

However, I need to apply it to a whole file, and the input looks like this (there are multiple elements, not separated by commas, one per line):

{"_id":{"$oid":"5f06cb272cfede51800b6b53"},"company":{"$oid":"5cdac819b6d0092cd6fb69d3"},"name":"SomeName","videos":[{"$oid":"5ecf6cc19ad2a4dfea993fed"}]}
{"_id":{"$oid":"5ddb781fb4a9862c5fbd298c"},"company":{"$oid":"5d22cf72262f0301ecacd706"},"name":"SomeName2","videos":[{"$oid":"5dd3f09727658a1b9b4fb5fd"},{"$oid":"5d78b5a536e59001a4357f4c"},{"$oid":"5de0b85e129ef7026f27ad47"}]}

How could I do this? I tried opening and reading the file, using load and dump instead of loads and dumps, and it still doesn't work. Do I need to read, or iterate over every line?

1 Answer


You are dealing with the NDJSON (newline-delimited JSON) data format.

Read the file contents as a string, split it by lines, and parse each line as its own JSON document, which gives a list of dicts:

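import json


def parse_ndjson(data):
    # Parse each non-empty line as its own JSON document.
    return [json.loads(line) for line in data.splitlines() if line.strip()]


with open('C:\\Users\\test.json', 'r', encoding="utf8") as handle:
    data = handle.read()

dicts = parse_ndjson(data)

for d in dicts:
    new_d = myfunction(d)  # myfunction is the function defined in the question
    print("New dict", new_d)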
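
To the question of whether you need to read the file or iterate over every line: both work. Below is a minimal sketch of the line-by-line variant, which also writes the transformed records back out as NDJSON. The names `iter_ndjson` and `flatten_oid` are hypothetical; `flatten_oid` only stands in for the function from the question, whose real behavior is not shown there, and `io.StringIO` replaces the real file so the example runs on its own.

```python
import io
import json


def iter_ndjson(lines):
    """Yield one parsed object per non-blank line of an NDJSON text stream."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


def flatten_oid(value):
    """Hypothetical stand-in for the asker's function: replace {"$oid": x} with x."""
    if isinstance(value, dict):
        if set(value) == {"$oid"}:
            return value["$oid"]
        return {k: flatten_oid(v) for k, v in value.items()}
    if isinstance(value, list):
        return [flatten_oid(v) for v in value]
    return value


# io.StringIO stands in for open(path, encoding="utf8") so the sketch is self-contained.
source = io.StringIO(
    '{"_id":{"$oid":"5f06cb272cfede51800b6b53"},"name":"SomeName"}\n'
    '{"_id":{"$oid":"5ddb781fb4a9862c5fbd298c"},"name":"SomeName2"}\n'
)
target = io.StringIO()

for record in iter_ndjson(source):
    # Write one JSON document per line so the output is NDJSON as well.
    target.write(json.dumps(flatten_oid(record)) + "\n")

result = target.getvalue()
print(result, end="")
```

Iterating over the handle keeps only one record in memory at a time, which may matter for large exports. With a real file, pass the open handle to `iter_ndjson` in place of `source`.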

1 Comment

I wrote your function and then tried opening the file I mentioned above:

with open('C:\\Users\\test.json', 'r', encoding="utf8") as handle:
    info = [json.loads(line) for line in handle]
parse_ndjson(info)
print(info)

But it says AttributeError: 'list' object has no attribute 'splitlines'.
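
The error in this comment most likely comes from mixing two approaches. `parse_ndjson` expects the raw file text as a single string, but the list comprehension over `handle` has already parsed every line, so `info` is a list of dicts by then. The sketch below contrasts the two options; it uses `io.StringIO` and a made-up two-line sample in place of the real file, and adds a blank-line check that is not in the original answer.

```python
import io
import json


def parse_ndjson(data):
    # Expects the raw file contents as a single str.
    return [json.loads(line) for line in data.splitlines() if line.strip()]


raw = '{"name": "SomeName"}\n{"name": "SomeName2"}\n'

# Option 1: read the whole text, then let parse_ndjson split and parse it.
from_text = parse_ndjson(io.StringIO(raw).read())

# Option 2: iterate over the handle; each line is parsed right here,
# so the result is already a list of dicts and parse_ndjson is not needed.
from_lines = [json.loads(line) for line in io.StringIO(raw) if line.strip()]

# Passing the already-parsed list reproduces the reported error,
# because a list has no splitlines() method.
try:
    parse_ndjson(from_lines)
except AttributeError as exc:
    error_message = str(exc)

print(from_text == from_lines)
print(error_message)
```

Either option on its own gives the same list of dicts; use one or the other, not both.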
