I have a DataFrame (df) with the following structure.
Input:
ID data
1 {"customerinfo":{"zip":834247}}
2 {"score":535,"customerinfo":{"zip":834244}}
I want to parse the JSON data into new columns, so as to get the following:
Expected Output:
ID zipcode score
1 834247 NULL
2 834244 535
Original Solution:
df['bureauScore'] = df['data'].transform(lambda x: json.loads(x)['score'])
df['zipcode'] = df['data'].transform(lambda x: json.loads(x)['customerinfo']['zip'])
Problem:
If a field is missing, the code fails. I tried to add a get function as shown below, but that fails with the following error.
Attempted Solution:
df['bureauScore'] = df['data'].transform(lambda x: json.loads(get(x))['score'])
Error:
NameError: name 'get' is not defined
Any help would be appreciated.
P.S.: I know about json_normalize, but I have multiple fields, so I would prefer this approach.
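
For reference, a minimal sketch of what the get attempt above was presumably aiming for. get is a method of dict, not a standalone function, so it has to be called on the result of json.loads rather than wrapped around x. This sketch assumes the keys in the data column are quoted (so the strings are valid JSON) and that the nested key is zip, as in the sample input. The parsed variable and the output column names are just illustrative choices.

```python
import json

import pandas as pd

# Sample input mirroring the question. Assumption: keys are quoted,
# otherwise json.loads rejects the strings.
df = pd.DataFrame({
    "ID": [1, 2],
    "data": [
        '{"customerinfo": {"zip": 834247}}',
        '{"score": 535, "customerinfo": {"zip": 834244}}',
    ],
})

# Parse each JSON string once instead of once per extracted field.
parsed = df["data"].map(json.loads)

# dict.get returns None for a missing key instead of raising KeyError.
df["score"] = parsed.map(lambda d: d.get("score"))

# For nested keys, fall back to an empty dict when "customerinfo" is
# missing or null, so the second .get is still safe.
df["zipcode"] = parsed.map(lambda d: (d.get("customerinfo") or {}).get("zip"))

print(df[["ID", "zipcode", "score"]])
```

Rows without a score end up with a missing value in that column rather than raising an error. The same pattern extends to further fields by adding one line per field.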