Adding new column to DataFrame with values dependent on index ref

Question

I want to add a new column to this pandas DataFrame that assigns a StoreID, moving on to the next ID each time the index restarts:

It currently looks like this:

   Unnamed: 12  Store  
0          NaN      1  
1          NaN      1  
2          NaN      1  

0          NaN      1  
1          NaN      1  
2          NaN      1  

0          NaN      1  
1          NaN      1  
2          NaN      1  

0          NaN      1  
1          NaN      1  
2          NaN      1

I want it to look like this:

   Unnamed: 12  Store  StoreID
0          NaN      1        1
1          NaN      1        1
2          NaN      1        1
0          NaN      1        2
1          NaN      1        2
2          NaN      1        2
0          NaN      1        5
1          NaN      1        5
2          NaN      1        5
0          NaN      1       11
1          NaN      1       11
2          NaN      1       11

The StoreID changes whenever the index resets to 0. The report will have a variable number of rows per store, in most cases hundreds of thousands of records.

I can create a new column easily but I can't seem to work out how to do this! Any help much appreciated - I'm just starting out with Python.

The StoreIDs are just a list of references from stores and follow no logic. I could map the 0, 1, 2 sequence to the customer sequence (0=1, 1=2, 2=5, 3=11), but is there a simpler way that doesn't require another operation?
– lmonty, Commented Aug 1, 2018 at 18:39
Okay, then I think one of the three solutions below answers your question.
– Scott Boston, Commented Aug 1, 2018 at 18:46

rafaelc · Accepted Answer · 2018-07-31 23:12:11Z

1

You can also take the cumulative sum of the points where the index decreases (where its diff is negative):

df['g'] = (df.index.to_series().diff() < 0).cumsum()

0    0
1    0
2    0
0    1
1    1
2    1
0    2
1    2
2    2
0    3
1    3
2    3
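
To illustrate why the diff-based version also copes with blocks of different lengths, here is a minimal sketch. The sample frame is made up for illustration, and `.to_numpy()` is my addition to make the assignment positional rather than relying on alignment over a duplicated index.

```python
import pandas as pd

# Hypothetical data: three blocks of different lengths, each restarting at 0.
df = pd.DataFrame({"Store": [1] * 7}, index=[0, 1, 2, 0, 1, 0, 1])

# diff() is NaN on the first row and negative wherever the index drops,
# so the comparison is True on the first row of every block after the first.
block_start = df.index.to_series().diff() < 0

# The running total of those flags is the block number (0-based).
df["g"] = block_start.cumsum().to_numpy()

print(df["g"].tolist())  # [0, 0, 0, 1, 1, 2, 2]
```

Because it only looks for a drop in the index, this variant should not care whether a block restarts at 0 or at some other lower value.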

jpp · Accepted Answer · 2018-07-31 23:35:34Z

1

Using np.ndarray.cumsum:

df['g'] = (df.index == 0).cumsum() - 1

print(df)

   col  Store  g
0  NaN      1  0
1  NaN      1  0
2  NaN      1  0
0  NaN      1  1
1  NaN      1  1
2  NaN      1  1
0  NaN      1  2
1  NaN      1  2
2  NaN      1  2
0  NaN      1  3
1  NaN      1  3
2  NaN      1  3
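
Breaking the one-liner into its intermediate steps may make it easier to follow. This is only a sketch on a small made-up frame; the names `is_start` and `running` are mine.

```python
import pandas as pd

# Hypothetical data: two blocks whose index restarts at 0.
df = pd.DataFrame({"Store": [1] * 6}, index=[0, 1, 2, 0, 1, 2])

# Comparing the index with 0 yields a NumPy boolean array,
# True on the first row of each block.
is_start = df.index == 0

# cumsum() treats True as 1, so the total goes up by one at each block start.
running = is_start.cumsum()

# Subtract 1 so that the first block is numbered 0.
df["g"] = running - 1

print(is_start.tolist())  # [True, False, False, True, False, False]
print(df["g"].tolist())   # [0, 0, 0, 1, 1, 1]
```

Note the assumption that every block begins with index 0; a block starting at any other value would be merged into the previous one.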

3 Comments

BENY Over a year ago

I like the idea of getting the result directly from the index.

lmonty Over a year ago

These are good suggestions but ideally I want the new column to roll through a custom number or text sequence (i.e. 1, 2, 5, 11) as opposed to (0, 1, 2, 3...). Any thoughts on how I could achieve this?

jpp Over a year ago

@user10011212, So, to be clear, you have an additional input specifying the "custom sequence", e.g. we can use L = [1, 2, 5, 11] as an input? Can you update your question accordingly?

BENY · Accepted Answer · 2018-07-31 23:53:26Z

1

If I understand correctly, try cumcount:

df.groupby(df.index).cumcount()
Out[11]: 
0    0
1    0
2    0
0    1
1    1
2    1
0    2
1    2
2    2
0    3
1    3
2    3
dtype: int64
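
One caveat worth checking against your data: cumcount numbers each index value by how many times it has appeared so far, so it seems to match the block number only when every block contains the same index values. A small sketch with made-up, unequal blocks shows where the two approaches diverge:

```python
import pandas as pd

# Hypothetical data with unequal blocks: [0, 1], [0, 1, 2], [0, 1].
df = pd.DataFrame({"Store": [1] * 7}, index=[0, 1, 0, 1, 2, 0, 1])

# Occurrence count per index value.
by_cumcount = df.groupby(df.index).cumcount().tolist()

# Block number derived from the positions where the index equals 0.
by_starts = ((df.index == 0).cumsum() - 1).tolist()

print(by_cumcount)  # [0, 0, 1, 1, 0, 2, 2]
print(by_starts)    # [0, 0, 1, 1, 1, 2, 2]
```

The row with index 2 is the first occurrence of that value, so cumcount gives it 0 even though it sits in the second block.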

lmonty · Accepted Answer · 2018-08-01 20:31:16Z

Thanks, everyone, for the replies. I ended up solving the problem with:

table['STORE_ID'] = (table.index == 0).cumsum() - 1

then adding some logic to look up the store ID based on the sequence:

table.loc[table['STORE_ID'] == 3, 'STORE_ID'] = 11
table.loc[table['STORE_ID'] == 2, 'STORE_ID'] = 5
table.loc[table['STORE_ID'] == 1, 'STORE_ID'] = 2
table.loc[table['STORE_ID'] == 0, 'STORE_ID'] = 1

I imagine there's a simpler solution to get to the Store_ID sequence quicker but this gets the job done for now.
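
One possibly simpler route, sketched here under assumptions, is to use the block number as a position into an array of IDs, which replaces the chain of `.loc` assignments with a single lookup. The `store_ids` array and the sample `table` are illustrative only; the IDs follow the 1, 2, 5, 11 sequence from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four blocks of three rows, index restarting at 0.
table = pd.DataFrame({"Store": [1] * 12}, index=[0, 1, 2] * 4)

# One ID per block, in the order the blocks appear (assumed input).
store_ids = np.array([1, 2, 5, 11])

# 0-based block number for every row.
block = (table.index == 0).cumsum() - 1

# NumPy integer indexing picks the ID for each row's block in one step.
table["STORE_ID"] = store_ids[block]

print(table["STORE_ID"].tolist())
# [1, 1, 1, 2, 2, 2, 5, 5, 5, 11, 11, 11]
```

This assumes the first row has index 0 and that `store_ids` has at least as many entries as there are blocks. If the first row is not 0, the block number starts at -1 and silently picks the last ID; too few IDs raises an IndexError.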
