23

I have a JSON file as follows. It's a list of dicts.

[{"city": "ab", "trips": 4, "date": "2014-01-25", "value": 4.7, "price": 1.1, "request_date": "2014-06-17", "medium": "iPhone", "%price": 15.4, "type": true, "Weekly_pct": 46.2, "avg_dist": 3.67, "avg_price": 5.0}, {"city": "bc", "trips": 0, "date": "2014-01-29", "value": 5.0, "price": 1.0, "request_date": "2014-05-05", "medium": "Android", "%price": 0.0, "type": false, "weekly_pct": 50.0, "avg_dist": 8.26, "avg_price": 5.0}.....]

When I read it like this:

data=pd.read_json('dataset.json')

I get the following error:

ValueError: Expected object or value

I tried this too:

from ast import literal_eval

with open('dataset.json') as f:
    data = literal_eval(f.read())

df = pd.DataFrame(data)

It gives the following error:

ValueError: malformed string

Edit:

Even json.loads doesn't work. I tried this:

import json
data=json.loads('dataset.json')

ValueError: No JSON object could be decoded

The JSON file is 13.5 MB, but it seems to contain a huge amount of data.

3
  • Does json.loads work? (import json first...) Commented Apr 25, 2016 at 10:24
  • No. I tried that and it gave me this error. See the edit. Commented Apr 25, 2016 at 10:25
  • I think you might benefit from this answer: stackoverflow.com/a/20644150/5276797 Commented Apr 25, 2016 at 10:27

6 Answers

17

I think you can use the json module to read file.json and then pass the result to the DataFrame constructor:

import pandas as pd
import json

with open('file.json') as f:
    data = json.load(f)
print(data)
[{u'city': u'ab', u'medium': u'iPhone', u'request_date': u'2014-06-17', u'price': 1.1, u'Weekly_pct': 46.2, u'value': 4.7, u'%price': 15.4, u'avg_price': 5.0, u'date': u'2014-01-25', u'avg_dist': 3.67, u'type': True, u'trips': 4}, {u'city': u'bc', u'medium': u'Android', u'request_date': u'2014-05-05', u'price': 1.0, u'weekly_pct': 50.0, u'value': 5.0, u'%price': 0.0, u'avg_price': 5.0, u'date': u'2014-01-29', u'avg_dist': 8.26, u'type': False, u'trips': 0}]

print(pd.DataFrame(data))

   %price  Weekly_pct  avg_dist  avg_price city        date   medium  price  \
0    15.4        46.2      3.67        5.0   ab  2014-01-25   iPhone    1.1   
1     0.0         NaN      8.26        5.0   bc  2014-01-29  Android    1.0   

  request_date  trips   type  value  weekly_pct  
0   2014-06-17      4   True    4.7         NaN  
1   2014-05-05      0  False    5.0        50.0  
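
If the column order matters (the keys come out sorted alphabetically in the output above), one option is to pass an explicit `columns` list to the constructor. This is a minimal sketch: the records are inlined and abbreviated for illustration rather than read from file.json, and the `wanted` list is an assumed ordering.

```python
import pandas as pd

# Two abbreviated records standing in for the parsed contents of file.json.
data = [
    {"city": "ab", "trips": 4, "date": "2014-01-25", "value": 4.7},
    {"city": "bc", "trips": 0, "date": "2014-01-29", "value": 5.0},
]

# An explicit column list fixes the order of the resulting columns.
wanted = ["city", "trips", "date", "value"]
df = pd.DataFrame(data, columns=wanted)

print(list(df.columns))
```

An existing frame can be reordered the same way with `df[wanted]`.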

7 Comments

I think the example the OP gave works, and that the error is buried somewhere in the large file...
Hmm, I get the first error (ValueError: Expected object or value) and the second error (ValueError: malformed string) with the sample too. But my solution works very well.
OK. I just did what @jezrael suggested, and it worked. However, my column order is different. For example, city should be the first column, but it's coming out in a different order, just as in his output. Any idea how to get the same order of column names?
@jezrael any idea why read_json would fail? And why your solution works? Even json.loads (with an s) fails...
I think it fails because the JSON file contains a list of dictionaries. It is valid JSON, but it seems read_json doesn't support this type of JSON.
10

I had the same error. Turns out it couldn't find the file. I modified the path and pd.read_json worked fine. As for json.loads, this might be helpful.
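
On the json.loads point: `json.loads` parses a string of JSON text, while `json.load` reads from an open file object, so passing a file name to `json.loads` fails because the name itself is not valid JSON. The sketch below writes a small sample file to a temporary directory (the records and the temporary path are made up for illustration) and shows both calls.

```python
import json
import os
import tempfile

# Write a small sample file so the sketch is self-contained.
records = [{"city": "ab", "trips": 4}, {"city": "bc", "trips": 0}]
path = os.path.join(tempfile.mkdtemp(), "dataset.json")
with open(path, "w") as f:
    json.dump(records, f)

# json.loads treats its argument as JSON text, so a file name is rejected.
try:
    json.loads("dataset.json")
    parsed_filename = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    parsed_filename = False

# json.load reads and parses the contents of an open file object.
with open(path) as f:
    data = json.load(f)

print(parsed_filename, data)
```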

1 Comment

I had the same error because I moved the Jupyter notebook and Jupyter did not adapt the file path. Pandas returns the worst possible error message here.
8

You need to indicate to Pandas that "records" formatting (where the JSON appears like a list of dictionaries) is used in dataset.json.

res = pd.read_json('input/dataset.json', orient='records')

print(res.iloc[:, :5])
   %price  Weekly_pct  avg_dist  avg_price city
0    15.4        46.2      3.67          5   ab
1     0.0         NaN      8.26          5   bc
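
For a version that can be run without the file, here is a minimal sketch of the same call on an in-memory string. The two abbreviated records are made up for illustration, and `StringIO` is used only so that nothing has to exist on disk.

```python
from io import StringIO

import pandas as pd

# A records-oriented JSON document: a list of row dictionaries.
json_text = (
    '[{"city": "ab", "trips": 4, "value": 4.7},'
    ' {"city": "bc", "trips": 0, "value": 5.0}]'
)

# orient="records" tells pandas that each list element is one row.
res = pd.read_json(StringIO(json_text), orient="records")

print(res)
```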


5

The following worked for me when pd.read_json failed: open file, load with normal json.load, then load into a pandas dataframe.

    import pandas as pd
    import json

    # The with block closes the file even if json.load raises.
    with open('file.json') as openfile:
        jsondata = json.load(openfile)
    df = pd.DataFrame(jsondata)

    print(df)


0

For me it was a problem with the path. The path I had to use depended on the directory from which I ran the Python file. Try to 'cd' into the directory of your Python file; then data=pd.read_json('dataset.json') should work.
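
One way to make a path problem visible is to check for the file before handing it to pandas. The helper below is a hypothetical sketch (`read_json_checked` is not a pandas function): it resolves the path against the current working directory and raises `FileNotFoundError` with the full path when the file is missing.

```python
from pathlib import Path

import pandas as pd


def read_json_checked(path):
    """Hypothetical helper: fail with a clear message if the file is missing."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(
            f"No JSON file at {resolved} (current directory: {Path.cwd()})"
        )
    return pd.read_json(str(resolved))


# In a script, the path can also be built relative to the script itself
# rather than the directory the script was started from, for example:
#     data_path = Path(__file__).resolve().parent / "dataset.json"
#     df = read_json_checked(data_path)
```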


0

I had to add the parameter lines=True to make it work, e.g.:

pd.read_json("dataset.json", lines=True)

Alternatively you could do it like this:

import json
import pandas as pd

with open("dataset.json") as f:
    # Parse each line of the file as its own JSON object.
    df = pd.DataFrame([json.loads(line) for line in f])
print(df)  # Shows data frame as expected 
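
Note that `lines=True` is meant for files with one JSON object per line (often called JSON Lines), which is a different layout from the single bracketed list shown in the question. A small sketch of the two layouts, using made-up in-memory strings instead of dataset.json:

```python
from io import StringIO

import pandas as pd

# JSON Lines: one complete JSON object per line, no enclosing brackets.
jsonl_text = '{"city": "ab", "trips": 4}\n{"city": "bc", "trips": 0}\n'
df_lines = pd.read_json(StringIO(jsonl_text), lines=True)

# A regular JSON array of objects is read without lines=True.
array_text = '[{"city": "ab", "trips": 4}, {"city": "bc", "trips": 0}]'
df_array = pd.read_json(StringIO(array_text))

print(df_lines)
print(df_array)
```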
