0

Let's say I have the following dataframe:

index A
1 3
2 5
3 20
4 8
5 7
6 13
7 33
8 2

I want to create a new column that assigns each row to a group depending on its value in column A.

import pandas as pd
df = pd.DataFrame([['1', 3],['2', 5],['3', 20],['4',8],['5',7],['6',13],['7',33],['8',2]], columns=['index', 'A'])
df['groups'] = df['A'].apply(lambda x: 'high' if x>15 else 'medium' if 15>=x>10 else 'low')

How could I do the same using assign?

df = df\
.assign(groups = ?)
1
  • pd.cut(df['A'], bins=(-np.inf, 10, 15, np.inf), labels=['low','med','high']) Commented Apr 7, 2021 at 15:22
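
The pd.cut suggestion in the comment above can be combined with assign. The following is a minimal sketch, not the commenter's exact code: it rebuilds column A from the question and uses the label 'medium' (instead of the comment's 'med') so the result matches the question's group names. Note that pd.cut returns a categorical column rather than plain strings.

```python
import numpy as np
import pandas as pd

# Column A from the question
df = pd.DataFrame({"A": [3, 5, 20, 8, 7, 13, 33, 2]})

# Bins are right-inclusive by default: (-inf, 10], (10, 15], (15, inf)
df = df.assign(
    groups=pd.cut(
        df["A"],
        bins=[-np.inf, 10, 15, np.inf],
        labels=["low", "medium", "high"],
    )
)
print(df)
```

If plain strings are preferred over a categorical dtype, the column can be converted afterwards with astype(str).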

2 Answers

2

You could use np.select here. It is vectorized, so it is much more efficient than apply when there are multiple conditions. assign is just a method for adding a new column, which could also be done with pandas' indexing methods.

import numpy as np
df.assign(groups=np.select([df.A>15, (df.A<=15)&(df.A>10)],
         ['high','medium'], 'low'))

  index   A  groups
0     1   3     low
1     2   5     low
2     3  20    high
3     4   8     low
4     5   7     low
5     6  13  medium
6     7  33    high
7     8   2     low
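
Because np.select uses the first condition that matches for each element, the second condition does not need to repeat the upper bound. A minimal sketch of that simplification, assuming the same column A as in the question (the variable name out is only for illustration):

```python
import numpy as np
import pandas as pd

# Column A from the question
df = pd.DataFrame({"A": [3, 5, 20, 8, 7, 13, 33, 2]})

out = df.assign(
    groups=np.select(
        # Conditions are checked in order; the first match wins,
        # so "A > 10" only applies to rows that failed "A > 15".
        [df["A"] > 15, df["A"] > 10],
        ["high", "medium"],
        default="low",
    )
)
print(out)
```

The order of the conditions matters here: swapping them would label every value above 10 as 'medium'.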

1

Use the assign() method:

df = df.assign(groups=df['A'].apply(lambda x: 'high' if x>15 else 'medium' if 15>=x>10 else 'low'))

Or compute the values first and pass them in:

value = df['A'].apply(lambda x: 'high' if x>15 else 'medium' if 15>=x>10 else 'low')
df = df.assign(groups=value)

Now if you print your df you will get your desired output:

#output

    index   A   groups
0   1       3   low
1   2       5   low
2   3       20  high
3   4       8   low
4   5       7   low
5   6       13  medium
6   7       33  high
7   8       2   low

Edit: you can also do this in a single expression when reading the data from a CSV (here file is the path to your file):

result=pd.read_csv(file).assign(groups=pd.read_csv(file)['A'].apply(lambda x: 'high' if x>15 else 'medium' if 15>=x>10 else 'low'))
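
The one-liner above reads the file twice. assign also accepts a callable, which receives the DataFrame being assigned to, so the file only needs to be read once. A minimal sketch under some assumptions: csv_text is a hypothetical in-memory stand-in for the CSV file, and label is a helper name introduced only for this example.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of the CSV file
csv_text = "index,A\n1,3\n2,5\n3,20\n4,8\n5,7\n6,13\n7,33\n8,2\n"


def label(x):
    """Map a value of column A to its group name."""
    return "high" if x > 15 else "medium" if x > 10 else "low"


# The lambda receives the freshly read DataFrame, so it is read only once
result = pd.read_csv(io.StringIO(csv_text)).assign(
    groups=lambda d: d["A"].apply(label)
)
print(result)
```

The callable form is also what makes assign usable in the middle of a longer method chain, where no variable refers to the intermediate DataFrame.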
