
I am trying to parse JSON and load the results into a pandas DataFrame.

My JSON looks like this:

{'result': {'data': [{'dimensions': [{'id': '219876173',
      'name': 'Our great product'},
     {'id': '2021-03-01', 'name': ''}],
    'metrics': [41, 4945]},
   {'dimensions': [{'id': '219876173',
      'name': 'Our great product'},
     {'id': '2021-03-02', 'name': ''}],
    'metrics': [31, 2645]},
   {'dimensions': [{'id': '219876166',
      'name': 'Our awesome product'},
     {'id': '2021-03-01', 'name': ''}], ....

So far, I've managed to get to this point:

[{'dimensions': [{'id': '219876173',
    'name': 'Our great product'},
   {'id': '2021-03-01', 'name': ''}],
  'metrics': [41, 4945]},
 {'dimensions': [{'id': '219876173',
    'name': 'Our great product'},
   {'id': '2021-03-02', 'name': ''}],
  'metrics': [31, 2645]},

However, when I load it into pandas I get:

   dimensions                                                                metrics
0   [{'id': '219876173', 'name': 'Our great product...   [41, 4945]
1   [{'id': '219876173', 'name': 'Our great product...   [31, 2645]
2   [{'id': '219876166', 'name': 'Our awesome product...   [27, 2475]

I can manually split the results into columns using some lambdas:

# pd.io.json.json_normalize is deprecated since pandas 1.0; pd.json_normalize replaces it
df = pd.json_normalize(r.json().get('result').get('data'))
df['delivered_units'] = df['metrics'].apply(lambda x: x[0])
df['revenue'] = df['metrics'].apply(lambda x: x[1])
df['name'] = df['dimensions'].apply(lambda x: x[0])
df['sku'] = df['name'].apply(lambda x: x['name'])

Is there a better way to parse the JSON directly, without lambdas?

  • Kindly post your expected output. Data not pics please Commented May 28, 2021 at 13:30

1 Answer

Look into flatten_json:

data = {'result': {'data': [{'dimensions': [{'id': '219876173',
      'name': 'Our great product'},
     {'id': '2021-03-01', 'name': ''}],
    'metrics': [41, 4945]},
   {'dimensions': [{'id': '219876173',
      'name': 'Our great product'},
     {'id': '2021-03-02', 'name': ''}],
    'metrics': [31, 2645]},
   {'dimensions': [{'id': '219876166',
      'name': 'Our awesome product'},
     {'id': '2021-03-01', 'name': ''}]}]}}

import pandas as pd
from flatten_json import flatten

# Flatten each record into a single-level dict, joining nested keys with '.'
dic_flattened = (flatten(d, '.') for d in data['result']['data'])
df = pd.DataFrame(dic_flattened)

  dimensions.0.id    dimensions.0.name dimensions.1.id dimensions.1.name  metrics.0  metrics.1
0       219876173    Our great product      2021-03-01                         41.0     4945.0
1       219876173    Our great product      2021-03-02                         31.0     2645.0
2       219876166  Our awesome product      2021-03-01                          NaN        NaN
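
If you would rather avoid an extra dependency and get named columns right away, a plain list comprehension over the records may be enough. This is only a sketch under a few assumptions: every record has exactly two dimensions (product first, date second), metrics come as a pair when present, and the helper name to_row and the column names sku_id and date are made up for illustration (delivered_units, revenue and sku are taken from the question).

```python
import pandas as pd

# Sample records shaped like data['result']['data'] above.
records = [
    {"dimensions": [{"id": "219876173", "name": "Our great product"},
                    {"id": "2021-03-01", "name": ""}],
     "metrics": [41, 4945]},
    {"dimensions": [{"id": "219876173", "name": "Our great product"},
                    {"id": "2021-03-02", "name": ""}],
     "metrics": [31, 2645]},
    {"dimensions": [{"id": "219876166", "name": "Our awesome product"},
                    {"id": "2021-03-01", "name": ""}]},  # no 'metrics' key
]


def to_row(record):
    """Turn one nested record into a flat dict (hypothetical helper)."""
    # Assumption: exactly two dimensions, product first and date second.
    product, day = record["dimensions"]
    # Assumption: metrics is a pair when present; use None when it is missing.
    units, revenue = (record.get("metrics") or [None, None])[:2]
    return {
        "sku_id": product["id"],
        "sku": product["name"],
        "date": day["id"],
        "delivered_units": units,
        "revenue": revenue,
    }


rows = [to_row(r) for r in records]
df = pd.DataFrame(rows)

# The date arrives as a string; convert it so it can be sorted and resampled.
df["date"] = pd.to_datetime(df["date"])
```

Records without metrics still produce a row, with missing values in delivered_units and revenue, which matches the NaN row in the flatten_json output above.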