Insert value based on row index number in a pandas dataframe

Question

I need to insert a value into a column based on the row index of a pandas DataFrame.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(11, 4)), columns=list('ABCD'))
df['ticker'] = 'na'
df

In the sample DataFrame above, the ticker column must hold "$" for the first 25% of the records, "$$" for the next 25%, and so on.

I tried to get the length of the DataFrame, calculate the 25%, 50% and 75% row positions from it, and then access one row at a time and assign a value to "ticker" based on the row index.

total_row_count=len(df)
row_25 = int(total_row_count * .25)
row_50 = int(total_row_count * .5)
row_75=int(total_row_count*.75)

if ((row.index >=0) and (row.index<=row_25)):
    return"$"
elif ((row.index > row_25) and (row.index<=row_50)):
    return"$$"
elif ((row.index > row_50) and (row.index<=row_75)):
    return"$$$"
elif (row.index > row_75):
    return"$$$$"

But I'm not able to get the row index. Please let me know if there is a different way to assign these values.

sacuL · Accepted Answer · 2018-03-07 23:02:17Z

1

I like to use np.select for this kind of task, because I find the syntax intuitive and readable:

# row_25, row_50 and row_75 are the thresholds computed in the question
# Set up your conditions:
conds = [(df.index >= 0) & (df.index <= row_25),
         (df.index > row_25) & (df.index <= row_50),
         (df.index > row_50) & (df.index <= row_75),
         (df.index > row_75)]

# Set up your target values (in the same order as your conditions)
choices = ['$', '$$', '$$$', '$$$$']

# Assign df['ticker']; default='na' keeps the placeholder where no condition matches
df['ticker'] = np.select(conds, choices, default='na')

returns this:

>>> df
     A   B   C   D ticker
0   92  97  25  79      $
1   76   4  26  94      $
2   49  65  19  91      $
3   76   3  83  45     $$
4   83  16   0  16     $$
5    1  56  97  44     $$
6   78  17  18  86    $$$
7   55  56  83  91    $$$
8   76  16  52  33    $$$
9   55  35  80  95   $$$$
10  90  29  41  87   $$$$
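
The snippet above relies on `np` and on the thresholds `row_25`, `row_50` and `row_75` from the question, so it fails with a NameError if they are not defined first. A minimal self-contained sketch, assuming the default integer index (0, 1, 2, ...), might look like this:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(11, 4)), columns=list('ABCD'))

# Thresholds as computed in the question
total_row_count = len(df)
row_25 = int(total_row_count * .25)
row_50 = int(total_row_count * .5)
row_75 = int(total_row_count * .75)

# One boolean array per label, in the same order as `choices`
conds = [df.index <= row_25,
         (df.index > row_25) & (df.index <= row_50),
         (df.index > row_50) & (df.index <= row_75),
         df.index > row_75]
choices = ['$', '$$', '$$$', '$$$$']

# default='na' is used for any row that matches none of the conditions
df['ticker'] = np.select(conds, choices, default='na')
```

Passing a string `default` keeps the result a plain string column, and any leftover 'na' value points to a row that no condition covered.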

edited Mar 7, 2018 at 23:02

answered Mar 7, 2018 at 22:51

sacuL


4 Comments

sdp Over a year ago

The "$$$$" won't populate in the last 2 records. Any idea why it won't populate?

sacuL Over a year ago

Try df['ticker'] = np.select(conds, choices, default='test'). If the last 2 records are filled with the value test, it means that none of the conditions provided were satisfied in those rows. Otherwise, I'm not sure...

sdp Over a year ago

Your solution worked. I'm not sure why it doesn't show in my df. When I saved it as a csv, I was able to see '$$$$'. Thanks sacul

Mactilda Over a year ago

I tried this because I thought it would solve my problem, but I get an error saying that 'row_6' is not defined (the equivalent of row_25 in this example). Would you happen to know how to troubleshoot this?

BENY · Accepted Answer · 2018-03-07 23:08:53Z

1

I think pd.cut can solve this problem:

df['ticker']=pd.cut(np.arange(len(df))/len(df), [-np.inf,0.25,0.5,0.75,1], labels=["$","$$",'$$$','$$$$'],right=True)
df
Out[35]: 
     A   B   C   D ticker
0   63  51  19  33      $
1   12  80  57   1      $
2   53  27  62  26      $
3   97  43  31  80     $$
4   91  22  92  11     $$
5   39  70  82  26     $$
6   32  62  17  75    $$$
7    5  59  79  72    $$$
8   75   4  47   4    $$$
9   43   5  45  66   $$$$
10  29   9  74  94   $$$$
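
The one-liner above depends on `/` performing true division. Under Python 2 without `from __future__ import division`, dividing an integer array by an integer floors every value to 0, which would put every row in the first bin and may explain a result of "$" in all rows. A sketch that forces float division with `np.true_divide` (the variable name `frac` is only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(11, 4)), columns=list('ABCD'))

# Fractional position of each row in [0, 1); true_divide always returns floats
frac = np.true_divide(np.arange(len(df)), len(df))

# Bin the fractions into quarters; each bin is closed on the right
df['ticker'] = pd.cut(frac,
                      bins=[-np.inf, 0.25, 0.5, 0.75, 1],
                      labels=['$', '$$', '$$$', '$$$$'],
                      right=True)
```

Because the bins are built from row positions rather than index labels, this version does not depend on the DataFrame having the default integer index. The resulting column has a categorical dtype.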

edited Mar 7, 2018 at 23:08

answered Mar 7, 2018 at 22:50

BENY


3 Comments

sdp Over a year ago

I'm not sure what I'm missing, but when I ran the code it returned "$" for all the rows in the ticker column.

BENY Over a year ago

@sow it works fine on my side. Would you mind pasting the code you are using here?

sdp Over a year ago

import pandas as pd import numpy as np df=pd.DataFrame(np.random.randint(0,100,size=(11, 4)), columns=list('ABCD')) df['ticker']=pd.cut(np.arange(len(df))/len(df), [-np.inf,0.25,0.5,0.75,1], labels=["$","$$",'$$$','$$$$'],right=True) df

Connor John · Accepted Answer · 2018-03-07 22:30:55Z

0

You can set up a few np.where statements to handle this. Try something like the following:

import numpy as np
...
# Combine range checks with & (a chained comparison on an index raises ValueError)
df['ticker'] = np.where(df.index < row_25, "$", df['ticker'])
df['ticker'] = np.where((row_25 <= df.index) & (df.index < row_50), "$$", df['ticker'])
df['ticker'] = np.where((row_50 <= df.index) & (df.index < row_75), "$$$", df['ticker'])
df['ticker'] = np.where(row_75 <= df.index, "$$$$", df['ticker'])
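
Range conditions on an index have to be combined element-wise with `&`. A chained comparison such as `row_25 <= df.index < row_50` raises a ValueError because Python tries to reduce the intermediate boolean array to a single truth value. As an alternative sketch, nested `np.where` calls avoid the range checks entirely, since each level only needs an upper bound; `pos` is an illustrative name for the row positions:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(11, 4)), columns=list('ABCD'))

n = len(df)
row_25, row_50, row_75 = int(n * .25), int(n * .5), int(n * .75)

# Row positions 0..n-1, independent of the index labels
pos = np.arange(n)

# Each nested np.where handles the rows not claimed by the previous level
df['ticker'] = np.where(pos < row_25, '$',
               np.where(pos < row_50, '$$',
               np.where(pos < row_75, '$$$', '$$$$')))
```

With strict `<` boundaries the 11 sample rows split into groups of 2, 3, 3 and 3, whereas the `<=` boundaries used in the other answers give 3, 3, 3 and 2. Which split is wanted is a matter of how "the first 25%" should round.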

answered Mar 7, 2018 at 22:30

Connor John


jpp · Accepted Answer · 2018-03-07 22:58:57Z

0

This is one explicit solution using the .loc accessor.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(11, 4)), columns=list('ABCD'))
n = len(df.index)

df['ticker'] = 'na'
df.loc[df.index <= n/4, 'ticker'] = '$'
df.loc[(n/4 < df.index) & (df.index <= n/2), 'ticker'] = '$$'
df.loc[(n/2 < df.index) & (df.index <= n*3/4), 'ticker'] = '$$$'
df.loc[df.index > n*3/4, 'ticker'] = '$$$$'

#      A   B   C   D ticker
# 0   47  64   7  46      $
# 1   53  55  75   3      $
# 2   93  95  28  47      $
# 3   35  88  16   7     $$
# 4   99  66  88  84     $$
# 5   75   2  72  90     $$
# 6    6  53  36  92    $$$
# 7   83  58  54  67    $$$
# 8   49  83  46  54    $$$
# 9   69   9  96  73   $$$$
# 10  84  42  11  83   $$$$
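
The masks above compare index labels with fractions of `n`, so they assume the default 0..n-1 integer index. A sketch of the same idea that works from row positions instead, shown with a hypothetical string index that is not part of the original question:

```python
import numpy as np
import pandas as pd

# Hypothetical non-numeric index, used only to show that labels are not involved
df = pd.DataFrame(np.random.randint(0, 100, size=(11, 4)),
                  columns=list('ABCD'),
                  index=list('abcdefghijk'))

n = len(df)
pos = np.arange(n)  # row positions 0..n-1

df['ticker'] = 'na'
# .loc accepts a boolean NumPy array of the same length as the frame
df.loc[pos <= n / 4, 'ticker'] = '$'
df.loc[(n / 4 < pos) & (pos <= n / 2), 'ticker'] = '$$'
df.loc[(n / 2 < pos) & (pos <= n * 3 / 4), 'ticker'] = '$$$'
df.loc[pos > n * 3 / 4, 'ticker'] = '$$$$'
```

The same masks also apply after sorting or filtering, where the index labels no longer run from 0 to n-1.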

answered Mar 7, 2018 at 22:58

jpp


4 Comments

sdp Over a year ago

The "$$$$" won't populate. Any idea what I'm missing?

jpp Over a year ago

That's strange, when I try print(df) I see output as per my post.

sdp Over a year ago

Your solution worked. I'm not sure why it doesn't show in my df. When I saved it as a csv, I was able to see '$$$$'. Thanks @jpp

jpp Over a year ago

@sow, no problem. Feel free to accept (tick on left) if it solved your problem.
