Iterating over groups (Python pandas dataframe)

Question

I want to iterate over groups that are grouped by strings or dates.

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': ['me', 'you', 'me'] * 2,
                   'C': [5, 2, 3, 4, 6, 9]})
groups = df.groupby('A')

For example, in this code the groups are named 'foo' and 'bar', and I can loop over them using:

for name, group in groups:
    print(name)

My problem is that I need to run another loop inside this loop, and every time I need to call a different set of groups, like this (assume groups has size n):

for name, group in groups:
    for name1 in range(name, name + 9):  # + 9 to get first 9 groups for every iteration

Since name is a string, I am unable to do that. In short, I just want a way to access groups by number so that I can easily call the required groups for computation. Something like:

groups = df.group('A')
for i in range(0,n):
    print group(i)[] + group(i+1)[]

So if I have the groups [g1, g2, g3, g4, g5], I want to iterate over them in pairs, like [g1, g2], [g2, g3], [g3, g4], ..., and take the intersection of the two groups of series each time. I am looking for a way to call the groups [g1, g2, ..., g5] by index or some number so that I can use them in loop operations. Currently the only way I know to call groups is by their names, such as 'foo' and 'bar' in the example above. I want to be able to do operations such as:

for name,group in groups-1:
   print gb.get_group(name)
   print gb.get_group(name+1)

I know this might be a simple problem, but I have been struggling with this part for a while. I would appreciate any kind of help.

What are you actually trying to do? At the moment this sounds like an XY problem...
– Andy Hayden, Commented Apr 15, 2015 at 6:32
It isn't clear what you need -- is it enough to get any 9 groups at a time for working, or is there some ordering on the groups? If the groups are ordered, is it because their keys are ordered?
– cphlewis, Commented Apr 15, 2015 at 6:34
So if I have the groups [g1, g2, g3, g4, g5], I want to iterate over them in pairs, like [g1, g2], [g2, g3], [g3, g4], ..., and take the intersection of the two groups of series each time. I am looking for a way to call the groups [g1, g2, ..., g5] by index or some number so that I can use them in loop operations. Currently the only way I know to call groups is by their names, such as 'foo' and 'bar' in the example above.
– Bunny, Commented Apr 15, 2015 at 12:36
Please provide some data (even random data is sufficient) and the expected output; this will make the question easier to understand.
– Zero, Commented Apr 20, 2015 at 19:58

S Anand · Accepted Answer · 2015-04-21 03:05:53Z

The object returned by .groupby() has a .groups attribute: a Python dict that maps each group name to the index labels of the rows in that group. In this case:

In [26]: df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
   ....:                    'B': ['me', 'you', 'me'] * 2,
   ....:                    'C': [5, 2, 3, 4, 6, 9]})

In [27]: groups = df.groupby('A')

In [28]: groups.groups
Out[28]: {'bar': [1L, 3L, 5L], 'foo': [0L, 2L, 4L]}

You can iterate over this as follows:

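keys = list(groups.groups)  # a list, so the keys can be indexed by position in Python 3
for index in range(len(keys) - 1):
    # .loc selects rows by index label; .ix was removed in pandas 1.0
    g1 = df.loc[groups.groups[keys[index]]]
    g2 = df.loc[groups.groups[keys[index + 1]]]
    # Do something with g1, g2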

However, please remember that using for loops to iterate over Pandas objects is generally slower than vectorized operations. Depending on what you need done, and whether it needs to be fast, you may want to try other approaches.
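As a rough sketch of the pairwise iteration the question asks for, the (name, group) pairs can be collected in a list and walked two at a time with zip. The question does not say which values should be intersected, so the use of column 'B' here is only an assumption for illustration, as are the names pairs and common.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': ['me', 'you', 'me'] * 2,
                   'C': [5, 2, 3, 4, 6, 9]})

# Materialize the (name, sub-frame) pairs once; groupby sorts the keys by default.
pairs = list(df.groupby('A'))

# zip(pairs, pairs[1:]) yields consecutive pairs: (g1, g2), (g2, g3), ...
common = {}
for (name1, g1), (name2, g2) in zip(pairs, pairs[1:]):
    # Assumption: "intersection" means the values of column 'B' found in both groups.
    common[(name1, name2)] = set(g1['B']) & set(g2['B'])

print(common)
```

With only two groups there is a single pair, ('bar', 'foo'). With n groups the loop runs n - 1 times without any arithmetic on the group names.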

Hillary Murefu · Answer · 2021-08-27 22:00 (edited by William Miller, 2022-01-15 08:12:07Z)

In Python 3, dict.keys() returns a dict_keys view, which is not subscriptable, so keys[index] raises a TypeError unless keys has been converted to a list. Change:

groups.groups[keys[index]]

to

groups.groups[list(keys)[index]]
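Calling list(keys) inside the loop rebuilds the list on every iteration. One possible variation, sketched below, converts the keys once up front and fetches each group with get_group. The helper nth_group is a hypothetical name introduced only for this example, and sorting the keys is an assumption made to get a predictable order.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
                   'B': ['me', 'you', 'me'] * 2,
                   'C': [5, 2, 3, 4, 6, 9]})
groups = df.groupby('A')

# Convert the dict keys to a sorted list once, so they can be indexed by position.
keys = sorted(groups.groups)


def nth_group(i):
    """Hypothetical helper: return the sub-frame of the i-th group (0-based)."""
    return groups.get_group(keys[i])


# Walk over consecutive pairs of groups by position.
for i in range(len(keys) - 1):
    g1 = nth_group(i)
    g2 = nth_group(i + 1)
    print(keys[i], len(g1), keys[i + 1], len(g2))
```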
