
I have looked this up and I think what I have should work, but it doesn't. The first condition (>= 80) is being applied, but the second (<= 100) appears to have no effect.

I want every row in which ANY column value is between 80 and 100 inclusive, BUT if any column value in that row is greater than 100, the row should be excluded.

I should only see the rows AP-2, AP-8 and AP-9.

import pandas as pd

df = pd.DataFrame({'AP-1': [30, 32, 34, 31, 33, 35, 36, 38, 37],
                   'AP-2': [30, 32, 34, 80, 33, 35, 36, 38, 37],
                   'AP-3': [30, 32, 81, 31, 33, 101, 36, 38, 37],
                   'AP-4': [30, 32, 34, 95, 33, 35, 103, 38, 121],
                   'AP-5': [30, 32, 34, 31, 33, 144, 36, 38, 37],
                   'AP-6': [30, 32, 34, 31, 33, 35, 36, 110, 37],
                   'AP-7': [30, 87, 34, 31, 111, 35, 36, 38, 122],
                   'AP-8': [30, 32, 99, 31, 33, 35, 36, 38, 37],
                   'AP-9': [30, 32, 34, 31, 33, 99, 88, 38, 37]}, index=['1', '2', '3', '4', '5', '6', '7', '8', '9'])


df1 = df.transpose()

print(df1)
print()

df2 = df1[(df1.values >= 80).any(1) & (df1.values <= 100).any(1)]

print(df2)

df2 is coming out as:

       1   2   3   4    5    6    7    8    9
AP-2  30  32  34  80   33   35   36   38   37
AP-3  30  32  81  31   33  101   36   38   37
AP-4  30  32  34  95   33   35  103   38  121
AP-5  30  32  34  31   33  144   36   38   37
AP-6  30  32  34  31   33   35   36  110   37
AP-7  30  87  34  31  111   35   36   38  122
AP-8  30  32  99  31   33   35   36   38   37
AP-9  30  32  34  31   33   99   88   38   37

2 Answers

4

Here is another idea: build the two masks separately and combine them with &.

import pandas as pd

df = pd.DataFrame({'AP-1': [30, 32, 34, 31, 33, 35, 36, 38, 37],
                   'AP-2': [30, 32, 34, 80, 33, 35, 36, 38, 37],
                   'AP-3': [30, 32, 81, 31, 33, 101, 36, 38, 37],
                   'AP-4': [30, 32, 34, 95, 33, 35, 103, 38, 121],
                   'AP-5': [30, 32, 34, 31, 33, 144, 36, 38, 37],
                   'AP-6': [30, 32, 34, 31, 33, 35, 36, 110, 37],
                   'AP-7': [30, 87, 34, 31, 111, 35, 36, 38, 122],
                   'AP-8': [30, 32, 99, 31, 33, 35, 36, 38, 37],
                   'AP-9': [30, 32, 34, 31, 33, 99, 88, 38, 37]}, 
                   index=['1', '2', '3', '4', '5', '6', '7', '8', '9'])

# This is the actual frame you want
df = df.transpose()

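# axis is passed by keyword; recent pandas versions no longer accept .any(1)
m1 = (df >= 80).any(axis=1)    # at least one value of 80 or more
m2 = ~(df > 100).any(axis=1)   # <-- invert with ~: no value above 100 (100 itself is allowed)

df2 = df.loc[m1 & m2]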
print(df2)

Prints:

      1   2   3   4   5   6   7   8   9
AP-2  30  32  34  80  33  35  36  38  37
AP-8  30  32  99  31  33  35  36  38  37
AP-9  30  32  34  31  33  99  88  38  37
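
The same filter can also be read in terms of each row's maximum: a row has at least one value of 80 or more and nothing above 100 exactly when its largest value lies between 80 and 100. Below is a minimal sketch of that reading on the question's data. The name row_max is only illustrative, and this is an alternative formulation rather than the approach used in this answer.

```python
import pandas as pd

# Same data as in the question, transposed so that each AP-n is a row.
df = pd.DataFrame({'AP-1': [30, 32, 34, 31, 33, 35, 36, 38, 37],
                   'AP-2': [30, 32, 34, 80, 33, 35, 36, 38, 37],
                   'AP-3': [30, 32, 81, 31, 33, 101, 36, 38, 37],
                   'AP-4': [30, 32, 34, 95, 33, 35, 103, 38, 121],
                   'AP-5': [30, 32, 34, 31, 33, 144, 36, 38, 37],
                   'AP-6': [30, 32, 34, 31, 33, 35, 36, 110, 37],
                   'AP-7': [30, 87, 34, 31, 111, 35, 36, 38, 122],
                   'AP-8': [30, 32, 99, 31, 33, 35, 36, 38, 37],
                   'AP-9': [30, 32, 34, 31, 33, 99, 88, 38, 37]},
                  index=list('123456789')).transpose()

# Largest value in each row (illustrative helper name).
row_max = df.max(axis=1)

# Keep the rows whose maximum is in [80, 100]; between() is inclusive by default.
df2 = df.loc[row_max.between(80, 100)]
print(df2)
```

Series.between includes both endpoints by default, which matches the "between 80 and 100 inclusive" requirement.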

6 Comments

Thank you. I am quite new to pandas and had never seen an implementation like the one you present here. I think I like yours better.
@MarkS It is a matter of preference how you write your code. I, however, prefer more lines rather than fewer (I think it is clearer that way).
Is there a reason you went with the inverted .any() mask for m2 as opposed to m2 = (df <= 100).all(axis=1)?
@MarkS Mostly because of how you formulated the problem: if any column value is greater than 100 then exclude it.
Oh, I think I see. While equivalent in the results they produce, over a large data set yours probably runs faster since .any would terminate quicker. .all would have to check every column every time, .any would find one instance that violates my requirement and opt out right then and there.
1

Ah, I got it. I needed .all(1) instead of .any(1) for the <= 100 condition: every value in the row has to be at most 100, not just one of them.
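
For reference, here is a sketch of what that fix could look like on the question's data. It compares the DataFrame directly instead of going through .values, and passes axis by keyword, since recent pandas versions no longer accept the positional form for DataFrame.any. The names has_high and all_capped are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'AP-1': [30, 32, 34, 31, 33, 35, 36, 38, 37],
                   'AP-2': [30, 32, 34, 80, 33, 35, 36, 38, 37],
                   'AP-3': [30, 32, 81, 31, 33, 101, 36, 38, 37],
                   'AP-4': [30, 32, 34, 95, 33, 35, 103, 38, 121],
                   'AP-5': [30, 32, 34, 31, 33, 144, 36, 38, 37],
                   'AP-6': [30, 32, 34, 31, 33, 35, 36, 110, 37],
                   'AP-7': [30, 87, 34, 31, 111, 35, 36, 38, 122],
                   'AP-8': [30, 32, 99, 31, 33, 35, 36, 38, 37],
                   'AP-9': [30, 32, 34, 31, 33, 99, 88, 38, 37]},
                  index=list('123456789'))
df1 = df.transpose()

# At least one value in the row is 80 or more ...
has_high = (df1 >= 80).any(axis=1)
# ... and every value in the row is 100 or less.
all_capped = (df1 <= 100).all(axis=1)

df2 = df1[has_high & all_capped]
print(df2)

# Cross-check: "all <= 100" selects the same rows as "not any > 100".
same = all_capped.equals(~(df1 > 100).any(axis=1))
print(same)
```

The original attempt kept rows such as AP-3 because .any on the <= 100 test is satisfied by any single small value in the row, which is true for every row in this data.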
