I want to keep only the rows in which one or more columns are greater than a given value. My actual DataFrame has 26 columns, so I would like a solution that generalizes (for example, an iterative one) rather than spelling out every column. Below is an example with three columns.

My code:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(5, 15, (10, 3)), columns=list('abc'))
# Select the rows in which at least one column is greater than 10.
sdf = df[(df['a'] > 10) | (df['b'] > 10) | (df['c'] > 10)]

How do I apply the same solution to 26 columns? Writing all 26 columns inside [] does not seem Pythonic.

3 Answers

Try this:

df[df.gt(10).any(axis=1)]

Output:

    a   b   c
0   7   13  14
2   8   8   12
3   8   12  5
5   13  14  7
6   11  12  10
8   14  5   14
9   12  10  7
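
To make the one-liner easier to verify, here is a minimal sketch on a small fixed frame (the values are made up for illustration, not taken from the question's random data). It compares the `gt(...).any(axis=1)` mask with the explicit OR chain from the question.

```python
import pandas as pd

# Small fixed frame (hypothetical values) so the result is reproducible.
df = pd.DataFrame({"a": [7, 9, 13], "b": [13, 10, 14], "c": [14, 5, 7]})

# gt(10) compares every cell to 10 and returns a boolean frame;
# any(axis=1) reduces each row to True if at least one cell is True.
mask = df.gt(10).any(axis=1)
sdf = df[mask]

# The explicit OR chain from the question, for comparison.
explicit = df[(df["a"] > 10) | (df["b"] > 10) | (df["c"] > 10)]

print(sdf)
```

Because the mask is built from the whole frame, the same line should work unchanged for 26 columns. To test only some columns, the mask can be computed on a column subset, for example `df[["a", "b"]].gt(10).any(axis=1)`.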

You can use apply to compute the maximum of each row. A row has at least one column greater than 10 exactly when its row maximum is greater than 10.

df[df.apply(lambda x: max(x), axis=1) > 10]
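
The choice of reduction matters here: a row minimum above 10 means every column is above 10, while a row maximum above 10 means at least one column is. The sketch below, on made-up values, shows both using the built-in `DataFrame.max` and `DataFrame.min` reductions, which avoid a Python-level `apply`.

```python
import pandas as pd

# Hypothetical values chosen so that the two conditions give different results.
df = pd.DataFrame({
    "a": [7, 11, 13, 5],
    "b": [13, 12, 5, 6],
    "c": [14, 15, 7, 7],
})

# Row maximum > 10: at least one column in the row exceeds 10.
any_above = df[df.max(axis=1) > 10]

# Row minimum > 10: every column in the row exceeds 10.
all_above = df[df.min(axis=1) > 10]

print(any_above)
print(all_above)
```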


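You could mask the values with df[df>10], which turns every value that is not greater than 10 into NaN, and then drop the rows that contain NaN. Note that how='any' keeps only the rows in which every column is greater than 10; how='all' would instead keep the rows with at least one value greater than 10, with the other values left as NaN.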

>>> df[df>10].dropna(how='any')
      a     b     c
5  14.0  12.0  13.0
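
As a rough illustration of how the `how` argument changes the result, the sketch below uses a small made-up frame. The final step, which looks the surviving index up in the original frame to recover the unmasked values, is an addition for illustration and not part of the answer above.

```python
import pandas as pd

# Hypothetical values: row 0 is partly above 10, row 1 fully, row 2 not at all.
df = pd.DataFrame({"a": [7, 11, 5], "b": [13, 12, 6], "c": [14, 15, 7]})

# Boolean indexing with a frame keeps the shape and sets non-matching cells to NaN.
masked = df[df > 10]

# how="any": drop a row if any cell is NaN, so only fully matching rows remain.
every_col = masked.dropna(how="any")

# how="all": drop a row only if all cells are NaN, so rows with at least
# one value above 10 remain (their other cells stay NaN).
some_col = masked.dropna(how="all")

# Recover the original, unmasked values for the rows that survived.
restored = df.loc[some_col.index]

print(every_col)
print(restored)
```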
