I have a pandas dataframe df which looks like this:
| source_num | source_date | text  | category | location | source   |
+------------+-------------+-------+----------+----------+----------+
| 0          | 15/12/2020  | text1 | cat 1    | loc1     | source 1 |
| 1          | 15/12/2020  | text2 | cat 2    | loc2     | source 2 |
| 2          | 15/12/2020  | text3 | cat 3    | loc2     | source 3 |
| 3          | 15/12/2020  | text4 | cat 2    | loc3     | source 2 |
| ...        | ...         | ...   |          |          |          |
When I run groupby and then filter for a specific value in location, it returns the correct result:
grouped = df.groupby(['category','source_num',"source","location"], as_index = False).aggregate('sum')
grouped.loc[grouped["location"] == "loc2"]
My question is: how can I apply more than one filter, like this?
First filter:
grouped.loc[grouped["location"] == "loc2"]
Second filter:
grouped.loc[(grouped["location"] == "loc2") & grouped["category"].str.contains('cat1')]
Third filter: ....
I think I can apply the filters above by iterating over the groupby object with if/else statements, right?
EXPECTED RESULT after filtering based on the first and second filters:
| source_num | source_date | text  | category | location | source   |
+------------+-------------+-------+----------+----------+----------+
| 0          | 15/12/2020  | text2 | cat 2    | loc2     | source 2 |
| 1          | 15/12/2020  | text3 | cat 3    | loc2     | source 3 |
Here the first filter is applied, but the second one does not satisfy the if condition, so the second filter is skipped.
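The behaviour described here could be sketched roughly as follows, without iterating over rows. This is only a minimal sketch under some assumptions: the sample data is modeled on the table above, `apply_filters` and `filters` are hypothetical names, and "the filter does not meet the if statement" is interpreted as "the filter would match no rows". It runs on the plain df for brevity; the same function should work on the grouped result, since that is an ordinary DataFrame.

```python
import pandas as pd

# Sample data modeled on the table in the question (values are illustrative).
df = pd.DataFrame({
    "source_num": [0, 1, 2, 3],
    "source_date": ["15/12/2020"] * 4,
    "text": ["text1", "text2", "text3", "text4"],
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
    "source": ["source 1", "source 2", "source 3", "source 2"],
})


def apply_filters(frame, conditions):
    """Apply each condition in turn; skip any that would match no rows."""
    result = frame
    for condition in conditions:
        mask = condition(result)   # boolean Series aligned with `result`
        if mask.any():             # only narrow down if something matches
            result = result.loc[mask]
    return result


# Each filter is a function that takes a DataFrame and returns a boolean mask.
filters = [
    lambda d: d["location"] == "loc2",
    lambda d: d["category"].str.contains("cat1"),
]

filtered = apply_filters(df, filters)
print(filtered)
```

With this data the first filter keeps the two loc2 rows, and the second filter matches nothing among them, so it is skipped and the loc2 rows are returned unchanged.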
The result of a groupby aggregation is also a valid DataFrame and follows the same indexing rules as any other pandas DataFrame. Please refer to the links in my comments to see how boolean indexing works with one or many conditional clauses. Pandas strongly advises against iteration over a DataFrame - pandas.pydata.org/pandas-docs/stable/user_guide/…
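As a rough illustration of boolean indexing with more than one condition (a sketch on made-up data, with hypothetical variable names): each comparison yields a boolean Series, and these can be combined with & (and) or | (or). When the comparisons are written inline, each one needs its own parentheses, because & binds more tightly than ==.

```python
import pandas as pd

# Small illustrative frame; column names follow the question.
df = pd.DataFrame({
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
})

# Build each condition as its own boolean mask.
in_loc2 = df["location"] == "loc2"
is_cat2 = df["category"].str.contains("cat 2")

# Combine masks element-wise: & for "and", | for "or".
both = df.loc[in_loc2 & is_cat2]
either = df.loc[in_loc2 | is_cat2]

# Inline form: note the parentheses around the comparison.
both_inline = df.loc[(df["location"] == "loc2") & df["category"].str.contains("cat 2")]
```

Naming the masks first keeps long filters readable and avoids the precedence problem altogether.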