
I want to create a new column that stores a boolean value: True when two columns (One and Two) hold the same value and a third column (Three) is True.

If column Three == True AND column Two == column One ---> column Four = True

If column Three == False ---> column Four = NA

If column Three == True AND column Two != column One ---> column Four = False

Example dataframe:

import pandas as pd

data = [['True', 0, 0], ['True', 0, 1], ['False', 0, 1]]
df = pd.DataFrame(data, columns=['One', 'Two', 'Three'])

One   Two Three
True  0   0
True  0   1
False 0   1

Desired output:

One   Two Three Four
True  0   0     True
True  0   1     False
False 0   1     NA
  • @HenryYik, I would agree with you if the OP hadn't selected the np.where answer as the accepted one. Here, np.select is the right choice for multiple conditions (even though the problem can in fact be reduced to a simple binary condition). Commented Aug 18, 2021 at 15:58
  • Also, for your given dataframe, your expected output is wrong. Commented Aug 18, 2021 at 16:04
  • Why, @AnuragDabas? Commented Aug 19, 2021 at 6:37

2 Answers

1

Use np.select:

Input data:

>>> df
   One  Two  Three
0    0    0   True
1    0    1   True
2    0    1  False

df['Four'] = np.select([df['Three'] & df['One'].eq(df['Two']),
                        df['Three'] & df['One'].ne(df['Two'])],
                       choicelist=[True, False],
                       default=pd.NA)

Output result:

>>> df
   One  Two  Three   Four
0    0    0   True   True
1    0    1   True  False
2    0    1  False   <NA>

You can cast the column Four to boolean dtype:

>>> df.astype({'Four': 'boolean'}).info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   One     3 non-null      int64
 1   Two     3 non-null      int64
 2   Three   3 non-null      bool
 3   Four    2 non-null      boolean  # <- HERE
dtypes: bool(1), boolean(1), int64(2)
memory usage: 185.0 bytes
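
As a comment on the question points out, the problem can also be reduced to a single binary condition. Here is a minimal sketch of that idea, assuming the same three-row frame as above and that the nullable `boolean` dtype is acceptable for the result: compare `One` with `Two`, then use `Series.where` to blank out the rows where `Three` is False.

```python
import pandas as pd

# Same input as above: Three holds the booleans, One and Two the integers.
df = pd.DataFrame({'One': [0, 0, 0],
                   'Two': [0, 1, 1],
                   'Three': [True, True, False]})

# Element-wise comparison, cast to the nullable boolean dtype so that
# missing values can be stored next to True/False.
same = df['One'].eq(df['Two']).astype('boolean')

# Keep the comparison where Three is True; every other row becomes <NA>.
df['Four'] = same.where(df['Three'])
print(df)
```

This avoids the object-dtype intermediate, because the column is created directly with the `boolean` dtype.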

0

You can try a custom function and adapt its conditions to whatever you actually need. This is just a walk-through of the approach.

Function:

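def check_df(row):
  # `row` is a single row of the DataFrame, passed in by df.apply(..., axis=1)
  if row['Three'] and row['One'] == row['Two']:
    return True
  elif row['Three'] and row['One'] != row['Two']:
    return False
  else:
    # Three is falsy, so there is no value for this row
    return np.nan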

DataFrame Sample:

print(df)
     One  Two  Three
0   True    0      0
1   True    0      1
2  False    0      1

Now use df.apply to apply the function along axis 1 (row by row).

df['newcolumn'] = df.apply(check_df, axis=1)
print(df)
     One  Two  Three newcolumn
0   True    0      0       NaN
1   True    0      1     False
2  False    0      1      True
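
The sample frame above has its booleans in `One` rather than in `Three`, which is why the result does not match the rules in the question. As a rough sketch, assuming the layout those rules describe (`Three` holds the booleans, `One` and `Two` hold the integers), the same row-wise function gives True, False and NaN. The parameter name `row` is only an illustrative choice.

```python
import numpy as np
import pandas as pd

def check_df(row):
    # Called once per row by df.apply(..., axis=1).
    if row['Three'] and row['One'] == row['Two']:
        return True
    elif row['Three'] and row['One'] != row['Two']:
        return False
    else:
        # Three is False: leave the new column empty for this row.
        return np.nan

# Assumed layout: Three is the boolean column.
df = pd.DataFrame({'One': [0, 0, 0],
                   'Two': [0, 1, 1],
                   'Three': [True, True, False]})

df['newcolumn'] = df.apply(check_df, axis=1)
print(df)
```

Because `apply` with `axis=1` calls a Python function for every row, it is usually slower than a vectorized approach on large frames.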

2 Comments

This is not the output I need
@JorgeAlbertoPalacios, that's what I said: you can change the condition to whatever you want it to be.
