
I am trying to fill a new column 'result' using a formula that depends on the condition in column 'C'. If C is 'add', the result should be X + Y. If C is 'mult', the result should be X * Y.

import pandas as pd

df = pd.DataFrame({'X': [0, 1, 2, 3, 4],
                   'Y': [5, 6, 7, 8, 9],
                   'C': ['add', 'add', 'mult', 'mult', 'mult']})
# Start with the product for every row
df['result'] = df['X'] * df['Y']

# Overwrite the rows where C is 'add' with the sum
df.loc[df['C'] == 'add', 'result'] = df.loc[df['C'] == 'add', 'X'] \
                                     + df.loc[df['C'] == 'add', 'Y']
df

The result I get is:

      C  X  Y  result
0   add  0  5       5
1   add  1  6       5
2  mult  2  7      14
3  mult  3  8      24
4  mult  4  9      36

What I need is for 'result' in row 1 to be 7:

      C  X  Y  result
0   add  0  5       5
1   add  1  6       7
2  mult  2  7      14
3  mult  3  8      24
4  mult  4  9      36
  • Using your code I get the right result on my machine with pandas 0.24.2. Commented May 17, 2019 at 9:22
  • Your code is working fine; please cross-check. Commented May 17, 2019 at 9:22
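
As the comments say, the assignment in the question already produces 7 in row 1. For what it's worth, the same idea can be written a bit shorter. This is only a sketch: the name `is_add` is made up for illustration, and it relies on pandas aligning the right-hand Series on the index when assigning through `.loc`, so the mask does not need to be repeated on the right side.

```python
import pandas as pd

df = pd.DataFrame({'X': [0, 1, 2, 3, 4],
                   'Y': [5, 6, 7, 8, 9],
                   'C': ['add', 'add', 'mult', 'mult', 'mult']})

# Default every row to the product, then overwrite the 'add' rows.
df['result'] = df['X'] * df['Y']

is_add = df['C'] == 'add'  # boolean mask, computed once (illustrative name)
# The right-hand side is a full-length Series; pandas aligns it on the
# index, so only the rows selected by the mask are written.
df.loc[is_add, 'result'] = df['X'] + df['Y']

print(df)
```

Computing the mask once also avoids evaluating the same comparison three times, as the original assignment does.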

2 Answers


Your code gives the right result, but if you want a more direct way:

df['result'] = df.apply(lambda row: row['X'] + row['Y'] if row['C'] == 'add' else row['X'] * row['Y'], axis=1)

Output:

   X  Y     C  result
0  0  5   add       5
1  1  6   add       7
2  2  7  mult      14
3  3  8  mult      24
4  4  9  mult      36

1 Comment

Like every other function, it has its upsides and its downsides. If you are talking about performance, keep in mind that it depends on the size of your dataset, so if you have a small dataset just use it. Thank you anyway.
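
To make the trade-off in this comment a little more concrete, here is a rough sketch that runs the row-wise `apply` and a vectorized `np.where` on a larger frame and checks that they agree. The frame `big` and the repeat count `n` are made up for illustration. No timings are claimed here; wrapping each call with the standard `timeit` module would show how they compare on a given machine and dataset size.

```python
import numpy as np
import pandas as pd

# Hypothetical larger frame: the five original rows repeated n times.
n = 2_000
big = pd.DataFrame({'X': [0, 1, 2, 3, 4] * n,
                    'Y': [5, 6, 7, 8, 9] * n,
                    'C': ['add', 'add', 'mult', 'mult', 'mult'] * n})

# Row-wise: the lambda is called once per row, in Python.
by_apply = big.apply(
    lambda row: row['X'] + row['Y'] if row['C'] == 'add' else row['X'] * row['Y'],
    axis=1)

# Vectorized: both branches are computed for whole columns, then one
# value per row is picked according to the condition.
by_where = np.where(big['C'] == 'add', big['X'] + big['Y'], big['X'] * big['Y'])

# Both approaches should give identical values.
same = bool((by_apply.to_numpy() == by_where).all())
print(same)
```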

Your solution works nicely. It is also possible to use this alternative with numpy.where:

import numpy as np

mask = df['C'] == 'add'
df['result'] = np.where(mask, df['X'] + df['Y'], df['X'] * df['Y'])
print(df)

   X  Y     C  result
0  0  5   add       5
1  1  6   add       7
2  2  7  mult      14
3  3  8  mult      24
4  4  9  mult      36

If there are more conditions, it is possible to use numpy.select (the output below assumes the last row's C has been changed to 'div'):

m1 = df['C'] == 'add'
m2 = df['C'] == 'mult'
m3 = df['C'] == 'div'
v1 = df['X'] + df['Y']
v2 = df['X'] * df['Y']
v3 = df['X'] / df['Y']

df['result'] = np.select([m1, m2, m3], [v1, v2, v3])
print(df)

   X  Y     C     result
0  0  5   add   5.000000
1  1  6   add   7.000000
2  2  7  mult  14.000000
3  3  8  mult  24.000000
4  4  9   div   0.444444
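
One thing worth knowing about numpy.select: rows that match none of the conditions get the `default` value, which is 0 unless specified. A minimal sketch of making such rows stand out as NaN instead is below. The 'pow' value and the modified sample data are made up for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

# 'pow' is a made-up condition value with no matching rule below.
df = pd.DataFrame({'X': [0, 1, 2, 3, 4],
                   'Y': [5, 6, 7, 8, 9],
                   'C': ['add', 'add', 'mult', 'div', 'pow']})

conditions = [df['C'] == 'add', df['C'] == 'mult', df['C'] == 'div']
choices = [df['X'] + df['Y'], df['X'] * df['Y'], df['X'] / df['Y']]

# Rows matching no condition receive NaN rather than the default 0.
df['result'] = np.select(conditions, choices, default=np.nan)
print(df)
```

With the default of 0, an unrecognized value in C would be indistinguishable from a genuine result of 0 (such as 0 * 5), which is why an explicit default can be safer.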
