How to iterate over a pandas groupby object while using an if statement in Python [duplicate]

Question

I have a pandas dataframe df which looks like this:

| source_num| source_date| text      | category    |location    | source |
+---------+------------+-------------+-------------+------------+--------+---
|  0      | 15/12/2020 | text1       | cat 1       | loc1       |source 1|
|  1      | 15/12/2020 | text2       | cat 2       | loc2       |source 2|
|  2      | 15/12/2020 | text3       | cat 3       | loc2       |source 3|
|  3      | 15/12/2020 | text4       | cat 2       | loc3       |source 2|
| ...     | ...        | ...         |             |            |        |

When I run groupby and then filter for a specific value in location, it returns the correct rows.

grouped = df.groupby(['category', 'source_num', 'source', 'location'], as_index=False).aggregate('sum')

grouped.loc[grouped["location"] == "loc2"]

My question is: how can I apply more than one filter, like this?

First filter :

grouped.loc[grouped["location"] == "loc2"]

Second filter :

grouped.loc[(grouped["location"] == "loc2") & grouped["category"].str.contains('cat1')]

Third filter: ....

I think I can apply the filters above by iterating over the groupby object with if/else statements, right?

EXPECTED RESULT after filtering based on the first and second filters:

| source_num| source_date| text      | category    |location    | source |
+---------+------------+-------------+-------------+------------+--------+---
|  0      | 15/12/2020 | text2       | cat 2       | loc2       |source 2|
|  1      | 15/12/2020 | text3       | cat 3       | loc2       |source 3|

Here the first filter is applied, and the second filter's condition is not met, so the second filter is skipped.

Does this answer your question? Python Pandas: Boolean indexing on multiple columns
– skuzzy, Commented Dec 18, 2020 at 6:17
Please read the documentation at the link below and state what you have already tried and where you currently are. pandas.pydata.org/pandas-docs/stable/user_guide/…
– skuzzy, Commented Dec 18, 2020 at 6:18
@skuzzy No, I want something like a for statement that iterates over the result of the groupby and then, based on several if/else statements, displays the final result.
– DevLeb2022, Commented Dec 18, 2020 at 6:20
@skuzzy I do not understand what this has to do with indexing. So far I am able to get the groupby object and then apply the first filter. What I want is to apply several filters and return the final result as one dataframe. Maybe I can apply each filter separately and then combine the results from each filter into one dataframe. Can this be done?
– DevLeb2022, Commented Dec 18, 2020 at 6:23
You don't need a for-loop iteration to apply conditional tests, whether one or many. The result of groupby is also a valid DataFrame and follows the same indexing rules as any other pandas DataFrame. Please refer to the links in my comments to see how boolean indexing works with one or many conditional clauses. Pandas strongly advises against iteration over a DataFrame: pandas.pydata.org/pandas-docs/stable/user_guide/…
– skuzzy, Commented Dec 18, 2020 at 6:28
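
The boolean indexing described in the comments above might look like the following minimal sketch. It assumes toy data modeled on the question; the numeric column `value` is hypothetical (not in the original) and exists only so that the sum has something to aggregate. The fallback at the end is one possible reading of the expected result, not necessarily what the asker intended.

```python
import pandas as pd

# Toy data modeled on the question; `value` is a hypothetical numeric column.
df = pd.DataFrame({
    "source_num": [0, 1, 2, 3],
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
    "source": ["source 1", "source 2", "source 3", "source 2"],
    "value": [10, 20, 30, 40],
})

grouped = df.groupby(
    ["category", "source_num", "source", "location"], as_index=False
)["value"].sum()

# Build each boolean mask separately. If the conditions are written inline,
# each comparison needs its own parentheses because & binds tighter than ==.
in_loc2 = grouped["location"] == "loc2"
is_cat1 = grouped["category"].str.contains("cat 1")

first = grouped.loc[in_loc2]             # first filter
second = grouped.loc[in_loc2 & is_cat1]  # second, stricter filter

# Fall back to the first filter when the stricter one matches nothing.
result = second if not second.empty else first
print(result)
```

With this toy data no row is both in loc2 and in "cat 1", so `result` holds the two loc2 rows from the first filter.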

Kenan · Accepted Answer · 2020-12-18 06:18:42Z


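If you want a for loop with if statements, loop through the groupby object:

# Loop over the GroupBy object itself, not the aggregated DataFrame
for name, group in df.groupby(['category', 'source_num', 'source', 'location']):
    if ...:  # your condition on `group`
        ...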


5 Comments

DevLeb2022 Over a year ago

So, based on your answer, in the if statement I put:

if (grouped.loc[grouped["location"] == "loc2"]):
    grouped.loc[grouped["location"] == "loc2"]
elif (grouped.loc[grouped["location"] == "loc2" & grouped["category"].str.contains('cat1')]):
    grouped.loc[grouped["location"] == "loc2" & grouped["category"].str.contains('cat1')]
else:
    ....

Kenan Over a year ago

Correct; however, remember that an if statement checks a single bool, not an array. You can use all() to reduce the comparison to one bool, for example (grouped["location"] == "loc2").all()
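
To illustrate the point about bools versus arrays, here is a small sketch using a made-up Series (the values are assumptions for illustration). It shows that a boolean Series cannot be used directly as an if condition, and how `all()` or `any()` reduce it to a single bool.

```python
import pandas as pd

# Made-up location values for illustration.
locations = pd.Series(["loc2", "loc2", "loc3"])
mask = locations == "loc2"  # boolean Series, one entry per row

# Using the Series directly in `if` is ambiguous and raises ValueError.
error_text = ""
try:
    if mask:
        pass
except ValueError as err:
    error_text = str(err)
    print("ValueError:", err)

every_row_matches = bool(mask.all())  # False here: one row is loc3
some_row_matches = bool(mask.any())   # True here: at least one row is loc2

if some_row_matches:
    print("at least one row is in loc2")
```

Which reduction fits depends on the intent: `all()` asks whether every row satisfies the condition, `any()` whether at least one does.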

DevLeb2022 Over a year ago

But where do I use name?

Kenan Over a year ago

It doesn't look like you need it. Print it as you're looping and see.

DevLeb2022 Over a year ago

I tried it, and it crashes with the error below:

ValueError                                Traceback (most recent call last)
<ipython-input-154-83aeadeaa384> in <module>
----> 1 for name, grouped in grouped:

ValueError: too many values to unpack (expected 2)
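
A likely cause of this error, assuming `grouped` is the aggregated DataFrame from the question rather than a GroupBy object: iterating over a DataFrame yields its column labels, so Python tries to unpack a string such as 'category' into two names. The sketch below uses toy data with a hypothetical `value` column and contrasts the two kinds of iteration.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
    "value": [10, 20, 30, 40],  # hypothetical numeric column
})

aggregated = df.groupby(["category", "location"], as_index=False)["value"].sum()

# Iterating over a DataFrame yields column labels, not (name, group) pairs.
print(list(aggregated))

# Iterating over the GroupBy object (before aggregating) yields pairs.
# A distinct loop variable avoids overwriting the object being iterated.
selected = []
for name, group in df.groupby("location"):
    if name == "loc2":
        selected.append(group)

# Combine the selected groups into a single DataFrame.
result = pd.concat(selected, ignore_index=True)
print(result)
```

For simple row selection like this, the boolean indexing suggested in the comments under the question avoids the loop entirely; the loop is shown only because the question asks for it.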
