I have a pandas dataframe containing rows with numbered columns:

    1  2  3  4  5
a   0  0  0  0  1
b   1  1  2  1  9             
c   2  2  2  2  2
d   5  5  5  5  5
e   8  9  9  9  9

How can I filter out the rows where a subset of columns are all above or below a certain value?

So, for example: I want to remove every row in which the values in columns 1 to 3 are not all > 3. In the above, that would leave me with only rows d and e.

The columns I am filtering and the value I am checking against are both arguments.

I've tried a few things; this is the closest I've gotten:

df[df[range(1,3)]>3]

Any ideas?

2 Answers

I used loc and all in this function:

def filt(df, cols, thresh):
    return df.loc[(df[cols] > thresh).all(axis=1)]

filt(df, [1, 2, 3], 3)

   1  2  3  4  5
d  5  5  5  5  5
e  8  9  9  9  9
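
Since the question also mentions "above or below", here is a minimal self-contained sketch of the same idea with a direction switch. The name filt_dir and the above flag are my own additions for illustration, not part of the answer above, and the frame is rebuilt by hand from the table in the question.

```python
import pandas as pd

# Rebuild the example frame from the question (integer column labels 1..5).
df = pd.DataFrame(
    [[0, 0, 0, 0, 1],
     [1, 1, 2, 1, 9],
     [2, 2, 2, 2, 2],
     [5, 5, 5, 5, 5],
     [8, 9, 9, 9, 9]],
    index=list("abcde"),
    columns=[1, 2, 3, 4, 5],
)

def filt_dir(df, cols, thresh, above=True):
    """Keep rows where every value in `cols` is strictly above
    (or, with above=False, strictly below) `thresh`.
    Hypothetical helper: the name and the `above` flag are assumptions."""
    sub = df[cols]
    mask = (sub > thresh) if above else (sub < thresh)
    # all(axis=1) reduces across columns, giving one boolean per row.
    return df.loc[mask.all(axis=1)]

kept_above = filt_dir(df, [1, 2, 3], 3)               # rows d and e
kept_below = filt_dir(df, [1, 2, 3], 3, above=False)  # rows a, b and c
```

Both comparisons are strict, so a row containing a value exactly equal to the threshold is dropped in either direction.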

2 Comments

@piRSquared NP --- here to help
Awesome, very concise! Thanks!

You can achieve this without using apply:

In [73]:
df[(df.iloc[:, 0:3] > 3).all(axis=1)]

Out[73]:
   1  2  3  4  5
d  5  5  5  5  5
e  8  9  9  9  9

This slices the df to just the first 3 columns by position using iloc (the ix indexer used in older pandas has since been removed), compares them against the scalar 3, and then calls all(axis=1) to build a boolean Series that masks the rows.
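
If you would rather pass the column labels themselves as arguments, the same mask can be sketched with label-based slicing via loc. This is only an illustration: the variable names first, last and thresh are assumptions of mine, and it relies on the column labels being sorted integers as in the question. Note that loc slices include both endpoints.

```python
import pandas as pd

# Example frame from the question, rebuilt by hand.
df = pd.DataFrame(
    [[0, 0, 0, 0, 1],
     [1, 1, 2, 1, 9],
     [2, 2, 2, 2, 2],
     [5, 5, 5, 5, 5],
     [8, 9, 9, 9, 9]],
    index=list("abcde"),
    columns=[1, 2, 3, 4, 5],
)

# Hypothetical argument names: first/last column label and the threshold.
first, last, thresh = 1, 3, 3

# loc slices by label and includes both ends, so this selects columns 1, 2, 3.
subset = df.loc[:, first:last]

# One boolean per row: True only if every selected value exceeds thresh.
mask = (subset > thresh).all(axis=1)
result = df[mask]
```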
