
I am new to writing functions with def, and I am trying to get the logic right in a function with multiple if conditions. I want x, y, and z to be flexible parameters so that I can change which values they refer to, but I can't get the desired output. Can anyone help?

df =

    date      comp  mark    value   score   test1
0   2022-01-01  a      1       10     100   
1   2022-01-02  b      2       20     200   
2   2022-01-03  c      3       30     300   
3   2022-01-04  d      4       40     400   
4   2022-01-05  e      5       50     500   

Desired output =

        date    comp    mark    value   score   test1
0   2022-01-01  a          1    10       100    200
1   2022-01-02  b          2    20       200    400
2   2022-01-03  c          3    30       300    600
3   2022-01-04  d          4    40       400    4000
4   2022-01-05  e          5    50       500    5000

I can get the result using:

    def frml(df):
        if (df['mark'] > 3) and (df['value'] > 30):
            return df['score'] * 10
        else:
            return df['score'] * 2

    df['test1'] = df.apply(frml, axis=1)

But I can't get the result using this. Isn't the logic the same?

    x = df['mark']
    y = df['value']
    z = df['score']

    def frml(df):
        if (x > 3) and (y > 30):
            return z * 10
        else:
            return z * 2

    df['test1'] = df.apply(frml, axis=1)
  • No, the logic is not the same. Within the function, df represents a single row of the dataframe, because that's what the apply operation gives you. Outside the function, df['mark'] refers to an entire COLUMN. Why can't you use the first format, which is correct? Commented Dec 3, 2022 at 4:45
  • Does this help? Another resource here. TL;DR: apply is not good pandas. Use mask or numpy.where instead. Commented Dec 3, 2022 at 5:52
  • Sorry for the late reply, I was doing a year-end thesis. Yes, cotton tail, thanks for the reference link. It is really helpful. Thanks for your advice. Commented Dec 12, 2022 at 12:11
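
To make the row-versus-column point from the first comment concrete, here is a minimal sketch. It assumes the sample frame from the question; the keyword parameters x, y, z, x_min and y_min are names made up for illustration, not anything pandas requires. Inside apply(..., axis=1) the function receives one row at a time, so passing column names and looking them up on that row keeps the parameters flexible. The try block shows why the second version in the question fails: a comparison on a whole column yields a Series, and pandas refuses to reduce that to a single True or False.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03",
                            "2022-01-04", "2022-01-05"]),
    "comp": list("abcde"),
    "mark": [1, 2, 3, 4, 5],
    "value": [10, 20, 30, 40, 50],
    "score": [100, 200, 300, 400, 500],
})

# x, y, z are column NAMES (hypothetical parameters for this sketch).
# row[x] is a scalar because `row` is a single row of the frame.
def frml(row, x="mark", y="value", z="score", x_min=3, y_min=30):
    if (row[x] > x_min) and (row[y] > y_min):
        return row[z] * 10
    return row[z] * 2

# Extra keyword arguments given to apply are forwarded to frml.
df["test1"] = df.apply(frml, axis=1, x="mark", y="value", z="score")

# Why the global x = df['mark'] version breaks: `x > 3` is a whole
# boolean Series, and `if` / `and` need exactly one truth value.
try:
    bool(df["mark"] > 3)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(df)
print(error_message)
```

Changing the column names or thresholds in the apply call is then enough to reuse the same function for other columns.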

1 Answer


You can use mask instead of apply:

    cond1 = (df['mark'] > 3) & (df['value'] > 30)
    df['score'].mul(2).mask(cond1, df['score'].mul(10))

Output:

0     200
1     400
2     600
3    4000
4    5000
Name: score, dtype: int64

To put the output in the test1 column (assign returns a new DataFrame, so store the result back in df if you want to keep it):

    df.assign(test1=df['score'].mul(2).mask(cond1, df['score'].mul(10)))

Result:

    date        comp    mark    value   score   test1
0   2022-01-01  a       1       10      100     200
1   2022-01-02  b       2       20      200     400
2   2022-01-03  c       3       30      300     600
3   2022-01-04  d       4       40      400     4000
4   2022-01-05  e       5       50      500     5000



It's possible to explain why your second function doesn't work, but it's complicated.

Also, producing your output doesn't need apply with a def function, so I am showing you another way.

Use mask, np.where, or np.select instead of apply with a def function.
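
As a rough sketch of the np.where route mentioned above, the whole rule can be wrapped in one vectorized helper. The function name add_scaled and all of its parameters are hypothetical, chosen only for this example; the data is the sample frame from the question. The condition is built with & on whole columns, and np.where picks between the two scaled columns element by element.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03",
                            "2022-01-04", "2022-01-05"]),
    "comp": list("abcde"),
    "mark": [1, 2, 3, 4, 5],
    "value": [10, 20, 30, 40, 50],
    "score": [100, 200, 300, 400, 500],
})

# Hypothetical helper: x, y, z are column names, the rest are the
# thresholds and multipliers taken from the question's rule.
def add_scaled(frame, x="mark", y="value", z="score",
               x_min=3, y_min=30, high=10, low=2, out="test1"):
    # Element-wise condition over whole columns: use &, not `and`.
    cond = (frame[x] > x_min) & (frame[y] > y_min)
    # Where cond is True take z * high, otherwise z * low.
    new_values = np.where(cond, frame[z] * high, frame[z] * low)
    # assign returns a new DataFrame and leaves `frame` untouched.
    return frame.assign(**{out: new_values})

result = add_scaled(df)
print(result)
```

Because the column names are plain arguments, the same helper covers the "flexible x, y, z" requirement without a row-by-row apply.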
