1

I need to extract a value from JSON data stored in a pandas column and assign it to another column, but only for the rows where that column is null.

import numpy as np
import pandas as pd

# Codes are stored as strings: a bare 06010 is not a valid integer literal in Python 3
df = pd.DataFrame({'col1': ['06010', np.nan, '06020', np.nan],
                   'json_col': [{'Id': '060',
                                 'Date': '20210908',
                                 'value': {'Id': '060',
                                           'Code': '06037'}
                                 },
                                {'Id': '061',
                                 'Date': '20210908',
                                 'value': {'Id': '060',
                                           'Code': '06038'}
                                 },
                                {'Id': '062',
                                 'Date': '20210908',
                                 'value': {'Id': '060',
                                           'Code': '06039'}
                                 },
                                {'Id': '063',
                                 'Date': '20210908',
                                 'value': {'Id': '060',
                                           'Code': '06040'}
                                 }],
                   })

# Check for null condition and extract Code from json string

df['col1'] = df[df['col1'].isnull()].apply(lambda x : [x['json_col'][i]['value']['Code'] for i in x])

Expected result:

col1

06010
06038
06020
06040
1
  • For performance reasons, it is better to preprocess the JSON data into a flat form before creating the DataFrame. Commented Sep 12, 2021 at 4:48
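
A minimal sketch of that suggestion, assuming the raw records are still available as a list of dicts before the DataFrame is built (the names `records` and `flat` are made up for illustration). `pd.json_normalize` flattens the nested `value` dict into dotted column names, so the code becomes an ordinary column:

```python
import numpy as np
import pandas as pd

# Assumed raw input: the same nested records used in the question
records = [
    {"Id": "060", "Date": "20210908", "value": {"Id": "060", "Code": "06037"}},
    {"Id": "061", "Date": "20210908", "value": {"Id": "060", "Code": "06038"}},
    {"Id": "062", "Date": "20210908", "value": {"Id": "060", "Code": "06039"}},
    {"Id": "063", "Date": "20210908", "value": {"Id": "060", "Code": "06040"}},
]

# Flatten nested dicts: columns become Id, Date, value.Id, value.Code
flat = pd.json_normalize(records)

# Attach the partially filled column, then fill its gaps from the flat code column
flat["col1"] = ["06010", np.nan, "06020", np.nan]
flat["col1"] = flat["col1"].fillna(flat["value.Code"])

print(flat["col1"].tolist())  # ['06010', '06038', '06020', '06040']
```

With the data flattened up front, no per-row dictionary lookup is needed when filling the nulls.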

2 Answers

1

To extract a field from a column of dictionaries, you can use the .str accessor. For instance, to extract json_col -> value -> Code, use df.json_col.str['value'].str['Code']. Then use fillna to replace the NaN values in col1:

df.col1.fillna(df.json_col.str['value'].str['Code'])

0    06010
1    06038
2    06020
3    06040
Name: col1, dtype: object
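
For reference, here is the same approach as a self-contained sketch, assuming col1 holds the codes as strings and json_col holds nested dicts as in the question (the intermediate name `codes` is only for illustration). Note that fillna returns a new Series rather than modifying the DataFrame, so the result has to be assigned back:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": ["06010", np.nan, "06020", np.nan],
    "json_col": [
        {"Id": "060", "Date": "20210908", "value": {"Id": "060", "Code": "06037"}},
        {"Id": "061", "Date": "20210908", "value": {"Id": "060", "Code": "06038"}},
        {"Id": "062", "Date": "20210908", "value": {"Id": "060", "Code": "06039"}},
        {"Id": "063", "Date": "20210908", "value": {"Id": "060", "Code": "06040"}},
    ],
})

# .str[...] looks up a key in each dict, so chaining walks value -> Code
codes = df["json_col"].str["value"].str["Code"]

# fillna only touches rows where col1 is NaN; assign back to keep the result
df["col1"] = df["col1"].fillna(codes)

print(df["col1"].tolist())  # ['06010', '06038', '06020', '06040']
```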

2 Comments

For some reason, this isn't working for me. It doesn't throw any errors, but it isn't assigning values either.
You can just assign it to a column with: df['col1'] = df.col1.fillna(df.json_col.str['value'].str['Code'])
0

Try this:

>>> df['col1'].fillna(df['json_col'].map(lambda x: x['value']['Code']))
0    06010
1    06038
2    06020
3    06040
Name: col1, dtype: object
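
If some rows might lack the nested keys, a variation along these lines may be safer. It is only a sketch: `extract_code` is a hypothetical helper, not part of pandas, and the boolean mask restricts the lookup to the rows where col1 is null, which is what the question asks for:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": ["06010", np.nan, "06020", np.nan],
    "json_col": [
        {"Id": "060", "Date": "20210908", "value": {"Id": "060", "Code": "06037"}},
        {"Id": "061", "Date": "20210908", "value": {"Id": "060", "Code": "06038"}},
        {"Id": "062", "Date": "20210908", "value": {"Id": "060", "Code": "06039"}},
        {"Id": "063", "Date": "20210908", "value": {"Id": "060", "Code": "06040"}},
    ],
})


def extract_code(record):
    """Hypothetical helper: return record['value']['Code'], or NaN if unavailable."""
    if not isinstance(record, dict):
        return np.nan
    value = record.get("value")
    if not isinstance(value, dict):
        return np.nan
    return value.get("Code", np.nan)


# Only rows where col1 is null are looked up and overwritten
mask = df["col1"].isna()
df.loc[mask, "col1"] = df.loc[mask, "json_col"].map(extract_code)

print(df["col1"].tolist())  # ['06010', '06038', '06020', '06040']
```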
