parse specific key in json to dataframe column

Question

How do you parse a specific key to multiple rows and column from a response?

I am getting a response:

response = requests.post('https://api.kewords.com/v1/get_keyword_data', data=my_data, headers=my_headers)

which returns:

{
"data": {
    "heart disease": {
        "vol": 18100,
        "cpc": {
            "currency": "$",
            "value": "2.01"
        },
        "keyword": "keyword planner",
        "competition": 0.21,
        "trend": [
            {
                "month": "January",
                "year": 2022,
                "value": 18100
            },
            {
                "month": "February",
                "year": 2022,

However, when I normalize it before I save it to a .csv, it places everything into a single row:

How do you parse a specific field, like data.heart disease.trend in my example, with separate columns for year and month based on value to a pandas dataframe? Every method from normalizing it to dealing with it as a dict has failed.

my code:

my_data = {
    'country': 'us',
    'currency': 'USD',
    'dataSource': 'cli',
    
    f'kw[{keyword}]': ["keywords tool", "keyword planner"]
}
my_headers = {
    'Accept': 'application/json',
    'Authorization': ''
}
response = requests.post('https://api.keywords.com/v1/get_keyword_data', data=my_data, headers=my_headers)

#write to json
with open('output.json', 'wb') as outf:
    outf.write(response.content)
    
if response.status_code == 200:
    print('success\n\n', response.content.decode('utf-8'))
else:
    print("An error occurred\n\n", response.content.decode('utf-8'))

post the (full or meaningful/completed part) testable response dict — RomanPerekhrest
– RomanPerekhrest, Commented Jan 30, 2023 at 20:19

Jason Baker · Accepted Answer · 2023-01-30 20:39:05Z

1

Using json_normalize:

data = json.loads(response.text)

df = pd.json_normalize(
    data=data["data"]["heart disease"],
    meta=["vol", "keyword", "competition", "cpc"],
    record_path="trend"
)
df = pd.concat([df.drop(["cpc"], axis=1), df["cpc"].apply(pd.Series)], axis=1)
print(df)

Output:

      month  year  value    vol          keyword competition currency value
0   January  2022  18100  18100  keyword planner        0.21        $  2.01
1  February  2022  20000  18100  keyword planner        0.21        $  2.01

answered Jan 30, 2023 at 20:39

Jason Baker

3,7262 gold badges14 silver badges18 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Doug Over a year ago

Thanks Jason! I see, so it was a few things but the key is to really set the record_path and then concat.

Jason Baker Over a year ago

The last line is to handle the dict that the "cpc" holds and to concat it to the rest of the df. Please also mark the issue resolved when you have a chance.

Collectives™ on Stack Overflow

parse specific key in json to dataframe column

1 Answer 1

2 Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Related