I have a pandas DataFrame df that looks like this:

| source_num | source_date | text  | category | location | source   |
+------------+-------------+-------+----------+----------+----------+
| 0          | 15/12/2020  | text1 | cat 1    | loc1     | source 1 |
| 1          | 15/12/2020  | text2 | cat 2    | loc2     | source 2 |
| 2          | 15/12/2020  | text3 | cat 3    | loc2     | source 3 |
| 3          | 15/12/2020  | text4 | cat 2    | loc3     | source 2 |
| ...        | ...         | ...   |          |          |          |

When I run groupby and then filter for specific values in location, it returns the correct result.

grouped = df.groupby(['category','source_num',"source","location"], as_index = False).aggregate('sum')

grouped.loc[grouped["location"] == "loc2"]

My question is: how can I apply more than one filter, like this:

First filter :

grouped.loc[grouped["location"] == "loc2"]

Second filter :

grouped.loc[(grouped["location"] == "loc2") & grouped["category"].str.contains('cat1')]

Third filter: ....
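
A note on combining conditions: in Python, `&` binds more tightly than `==`, so each comparison needs its own parentheses (or its own variable) before the masks are combined. The sketch below is only illustrative: the sample frame is rebuilt by hand from the table above, it filters `df` directly rather than `grouped` (the same indexing should apply to any DataFrame), and the `"cat 2"` pattern is an assumed value chosen so that a row matches.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question (illustrative only).
df = pd.DataFrame({
    "source_num": [0, 1, 2, 3],
    "source_date": ["15/12/2020"] * 4,
    "text": ["text1", "text2", "text3", "text4"],
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
    "source": ["source 1", "source 2", "source 3", "source 2"],
})

# Build each boolean mask separately so operator precedence cannot bite.
is_loc2 = df["location"] == "loc2"
is_cat2 = df["category"].str.contains("cat 2")  # assumed pattern, for illustration

# `&` combines the masks element-wise; only rows where both are True remain.
both = df.loc[is_loc2 & is_cat2]
print(both)
```

Naming the masks also makes it easy to reuse them in a third or fourth filter.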

I think I can apply the filters above by iterating over the groupby object with if/else statements, right?

Expected result after filtering based on the first and second filters:

| source_num | source_date | text  | category | location | source   |
+------------+-------------+-------+----------+----------+----------+
| 0          | 15/12/2020  | text2 | cat 2    | loc2     | source 2 |
| 1          | 15/12/2020  | text3 | cat 3    | loc2     | source 3 |

Here the first filter is applied, but the condition for the second filter is not met, so the second filter is skipped.
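
One possible reading of this "apply the second filter only if it matches" behaviour, sketched without a loop: build both masks, then use `.any()` to decide which one to keep. This is an assumption about the intended logic, not a confirmed solution. The sample frame is rebuilt by hand from the table above and is filtered directly instead of `grouped`.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question (illustrative only).
df = pd.DataFrame({
    "source_num": [0, 1, 2, 3],
    "source_date": ["15/12/2020"] * 4,
    "text": ["text1", "text2", "text3", "text4"],
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
    "source": ["source 1", "source 2", "source 3", "source 2"],
})

first_mask = df["location"] == "loc2"
# 'cat1' (no space) is the pattern from the question; it matches no sample row.
second_mask = first_mask & df["category"].str.contains("cat1")

# .any() collapses the boolean Series into one bool that `if` can test.
if second_mask.any():
    result = df.loc[second_mask]
else:
    result = df.loc[first_mask]  # fall back to the first filter

print(result)
```

With this sample data the second mask is empty, so `result` holds the two loc2 rows, as in the expected table.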

  • Does this answer your question? Python Pandas: Boolean indexing on multiple columns Commented Dec 18, 2020 at 6:17
  • Please read the documentation at the link below and state what you have already tried and where you currently are. pandas.pydata.org/pandas-docs/stable/user_guide/… Commented Dec 18, 2020 at 6:18
  • @skuzzy No, I want something like a for statement that iterates over the result of the groupby and then, based on several if/else statements, displays the final result. Commented Dec 18, 2020 at 6:20
  • @skuzzy I do not understand what this has to do with indexing. So far I am able to get the groupby object and then apply the first filter. What I want is to apply several filters and return the final result as one dataframe. Maybe I can apply each filter separately and then combine the results from each filter into one dataframe. Can this be done? Commented Dec 18, 2020 at 6:23
  • You don't need a for-loop iteration to apply conditional tests, whether one or many. The result of groupby is also a valid dataframe and follows the same indexing rules as any other pandas dataframe. Please refer to the links in my comments to see how boolean indexing works with one or many conditional clauses. Pandas strongly advises against iteration over a dataframe: pandas.pydata.org/pandas-docs/stable/user_guide/… Commented Dec 18, 2020 at 6:28
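
To make the comments above concrete, here is a minimal sketch of applying several filters without a loop and returning one dataframe. It shows two options: reducing a list of masks with `&` or `|`, and concatenating separately filtered frames as suggested in the 6:23 comment. The sample frame is rebuilt by hand from the question's table, and the two conditions are assumed examples, not the asker's real filters.

```python
from functools import reduce
import operator

import pandas as pd

# Sample data rebuilt by hand from the table in the question (illustrative only).
df = pd.DataFrame({
    "source_num": [0, 1, 2, 3],
    "source_date": ["15/12/2020"] * 4,
    "text": ["text1", "text2", "text3", "text4"],
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
    "source": ["source 1", "source 2", "source 3", "source 2"],
})

# Any number of conditions can go in this list (these two are assumed examples).
conditions = [
    df["location"] == "loc2",
    df["category"].str.contains("cat 2"),
]

# Rows that satisfy every condition / at least one condition.
all_match = df.loc[reduce(operator.and_, conditions)]
any_match = df.loc[reduce(operator.or_, conditions)]

# Alternative: filter separately, then combine into one dataframe.
parts = [df.loc[cond] for cond in conditions]
combined = pd.concat(parts)
# A row matching several filters appears more than once; keep the first copy.
combined = combined[~combined.index.duplicated()].sort_index()

print(all_match)
print(combined)
```

For these conditions, `combined` and `any_match` contain the same rows; the mask version avoids building the intermediate frames.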

1 Answer

If you want a for loop with if statements, loop through the grouped object:

# `grouped` here must be the GroupBy object from df.groupby(...), not an aggregated DataFrame
for name, grouped in grouped:
   if ...

5 Comments

So, based on your answer, in the if statement I put: if (grouped.loc[grouped["location"] == "loc2"]): grouped.loc[grouped["location"] == "loc2"] elif (grouped.loc[grouped["location"] == "loc2" & grouped["category"] .str.contains('cat1')]): grouped.loc[grouped["location"] == "loc2" & grouped["category"] .str.contains('cat1')] else: ....
Correct; however, remember that an if statement checks a single bool, not an array. You can use all() to get a bool, for example (grouped["location"] == "loc2").all()
But where do I use name?
It doesn't look like you need it; print it as you're looping and see.
I tried it and it crashed with the error below: ValueError Traceback (most recent call last) <ipython-input-154-83aeadeaa384> in <module> ----> 1 for name, grouped in grouped: ValueError: too many values to unpack (expected 2)
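
A likely cause of this ValueError: in the question, `grouped` is the result of `.aggregate('sum')`, which is a plain DataFrame, and iterating a DataFrame yields its column labels. Unpacking a label such as "category" into `name, grouped` then fails. Iterating the GroupBy object itself yields (key, sub-frame) pairs. The sketch below assumes a sample frame rebuilt by hand from the question's table; `keys`, `group` and `matches` are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question (illustrative only).
df = pd.DataFrame({
    "source_num": [0, 1, 2, 3],
    "source_date": ["15/12/2020"] * 4,
    "text": ["text1", "text2", "text3", "text4"],
    "category": ["cat 1", "cat 2", "cat 3", "cat 2"],
    "location": ["loc1", "loc2", "loc2", "loc3"],
    "source": ["source 1", "source 2", "source 3", "source 2"],
})

keys = ["category", "source_num", "source", "location"]

# Iterating a DataFrame gives column labels (strings), not (name, group) pairs.
labels = list(df)

# Iterating the GroupBy object gives (key tuple, sub-DataFrame) pairs.
matches = []
for name, group in df.groupby(keys):
    # .any() turns the boolean Series into a single bool for the `if`.
    if (group["location"] == "loc2").any():
        matches.append(group)

result = pd.concat(matches)
print(labels)
print(result)
```

Using a separate loop variable (`group`) also avoids overwriting `grouped` on the first pass. As the earlier comments point out, boolean indexing gives the same rows without a loop.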
