In Python, I am trying to create a new column (degree) in a DataFrame and set its value with if logic based on two other columns, depending on whether each row's value in one or both of those columns is null. For each row, the new column should take the value of one of these two columns, depending on which of them contains a null.

I have tried the code below, which raises the following error:

KeyError: 'degree'

The code is:

for i in basicdataframe.index:
    if pd.isnull(basicdataframe['section_degree'][i]) and pd.isnull(basicdataframe['model_degree'][i]):
        basicdataframe['degree'][i] = basicdataframe['model_degree'][i]
    elif pd.notnull(basicdataframe['section_degree'][i]) and pd.isnull(basicdataframe['model_degree'][i]):
        basicdataframe['degree'][i] = basicdataframe['section_degree'][i]
    elif pd.isnull(basicdataframe['section_degree'][i]) and pd.notnull(basicdataframe['model_degree'][i]):
        basicdataframe['degree'][i] = basicdataframe['model_degree'][i]
    elif pd.notnull(basicdataframe['section_degree'][i]) and pd.notnull(basicdataframe['model_degree'][i]):
        basicdataframe['degree'][i] = basicdataframe['model_degree'][i]

Does anybody know how to achieve this?

2 Answers

1

Let's say you have a pandas DataFrame like this:

import pandas as pd
import numpy as np

df = pd.DataFrame(data={
    "section_degree": [1, 2, np.nan, np.nan], 
    "model_degree": [np.nan, np.nan, np.nan, 3]
})

You can define a function that will be applied to each row of the DataFrame:

def define_degree(x):
    if pd.isnull(x["section_degree"]) and pd.isnull(x["model_degree"]):
        return x["model_degree"]
    elif pd.notnull(x['section_degree']) and pd.isnull(x['model_degree']):
        return x["section_degree"]
    elif pd.isnull(x['section_degree']) and pd.notnull(x['model_degree']):
        return x["model_degree"]
    elif pd.notnull(x['section_degree']) and pd.notnull(x['model_degree']):
        return x["model_degree"]

# axis=1 applies define_degree to each row
df["degree"] = df.apply(define_degree, axis=1)

df

# output

    section_degree  model_degree    degree
0   1.0             NaN             1.0
1   2.0             NaN             2.0
2   NaN             NaN             NaN
3   NaN             3.0             3.0
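
Since every branch above returns model_degree except the one where only section_degree is present, the same result can probably be had without a row-wise function. The following is a minimal sketch, assuming the same sample DataFrame as above; it swaps apply for Series.fillna and is not part of the original answer:

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame(data={
    "section_degree": [1, 2, np.nan, np.nan],
    "model_degree": [np.nan, np.nan, np.nan, 3],
})

# Prefer model_degree; where it is null, fall back to section_degree.
# Rows where both are null stay NaN.
df["degree"] = df["model_degree"].fillna(df["section_degree"])
```

On larger DataFrames this form tends to be faster than apply with axis=1, because the fallback is computed on whole columns at once.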

0

The error occurs because you are trying to assign values to a column that does not exist yet.

Since degree is a new column, it makes sense to add the column first with some default value.

basicdataframe['degree'] = ''

This sets the degree column to an empty string for every row of the DataFrame.

After that, you can set the values.

P.S. Your code is likely to trigger a SettingWithCopyWarning ("A value is trying to be set on a copy of a slice from a DataFrame"), because basicdataframe['degree'][i] = ... is a chained assignment.

To fix that, see https://stackoverflow.com/a/20627316/1388513
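
Putting both points together, the loop from the question could look roughly like the sketch below. It assumes a small sample basicdataframe (the real data is not shown in the question), uses NaN rather than '' as the default so the column stays numeric, and writes through .loc[row, column] instead of chained indexing:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame
basicdataframe = pd.DataFrame({
    "section_degree": [1, 2, np.nan, np.nan],
    "model_degree": [np.nan, np.nan, np.nan, 3],
})

# Create the column before assigning into it (avoids KeyError: 'degree').
# NaN is an assumption here; the answer above uses '' as the default.
basicdataframe["degree"] = np.nan

for i in basicdataframe.index:
    section = basicdataframe.loc[i, "section_degree"]
    model = basicdataframe.loc[i, "model_degree"]
    # Only one case in the question's logic uses section_degree:
    # section_degree present and model_degree null.
    if pd.notnull(section) and pd.isnull(model):
        basicdataframe.loc[i, "degree"] = section
    else:
        basicdataframe.loc[i, "degree"] = model
```

The four branches in the question collapse to two here because three of them assign model_degree.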
