24

I am trying to color points of a pandas dataframe depending on TWO conditions. Example:

IF value of col1 > a AND value of col2 - value of col3 < b THEN value of col4 = string
ELSE value of col4 = other string.

I have tried so many different ways now and everything I found online was only depending on one condition.

My example code always raises the Error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here's the code. Tried several variations without success.

df = pd.DataFrame()

df['A'] = range(10)
df['B'] = range(11,21,1)
df['C'] = range(20,10,-1)

borderE = 3.
ex = 0.

#print df

df['color'] = np.where(all([df.A < borderE, df.B - df.C < ex]), 'r', 'b')

Btw: I understand, what it says but not how to handle it.

4 Answers 4

34

Selection criteria uses Boolean indexing:

df['color'] = np.where(((df.A < borderE) & ((df.B - df.C) < ex)), 'r', 'b')

>>> df
   A   B   C color
0  0  11  20     r
1  1  12  19     r
2  2  13  18     r
3  3  14  17     b
4  4  15  16     b
5  5  16  15     b
6  6  17  14     b
7  7  18  13     b
8  8  19  12     b
9  9  20  11     b
Sign up to request clarification or add additional context in comments.

Comments

14

wrap the IF in a function and apply it:

def color(row):
    borderE = 3.
    ex = 0.
    if (row.A > borderE) and( row.B - row.C < ex) :
        return "somestring"
    else:
        return "otherstring"

df.loc[:, 'color'] = df.apply(color, axis = 1)

Yields:

  A   B   C        color
0  0  11  20  otherstring
1  1  12  19  otherstring
2  2  13  18  otherstring
3  3  14  17  otherstring
4  4  15  16   somestring
5  5  16  15  otherstring
6  6  17  14  otherstring
7  7  18  13  otherstring
8  8  19  12  otherstring
9  9  20  11  otherstring

1 Comment

I had to check whether a string from column A is present in a list from column B and this method came to the rescue!. My condition int the function looks like this if row['category'].upper() in eval(row['title_topo predictions']) where category is string value, title_topo predictions is a list of strings.
0

for me @Alexander 's solution didn't work in my specific case, I had to use lists to pass the two conditions, and then transpose the output, My solution also work for this case:

df['color'] = np.where([df.A < borderE] and [df.B - df.C < ex], 'r', 'b').transpose()

Yields:

    A   B   C   color
0   0   11  20  r
1   1   12  19  r
2   2   13  18  r
3   3   14  17  b
4   4   15  16  b
5   5   16  15  b
6   6   17  14  b
7   7   18  13  b
8   8   19  12  b
9   9   20  11  b

Comments

0

The error in the question occurred because OP used all() function instead of the bitwise-& operator to chain multiple comparisons together.1

Another way to chain multiple comparisons is to evaluate an expression via eval() method. The following checks if A value is less than borderE AND (B - C) value is less than ex for each row.

df.eval('A < @borderE and B - C < @ex')

Using its outcome as a numpy.where() condition, you can assign values by

df['color'] = np.where(df.eval('A < @borderE and B - C < @ex'), 'r', 'b')

If you installed numexpr (pip install numexpr) as recommended in the pandas documentation, this method should perform as well (and better if you have a lot of conditions to reduce) as chaining via &. The advantage is that (i) it's much more readable (imo) and (ii) you don't need to worry about brackets (), and/& etc. anymore because the order of precedence inside the string expression is the same as that in Python.

1: all() by design doesn't work on pandas Series (or numpy arrays) because the pandas developers felt that it's ambiguous when to return True: if any element is True or if all elements are True; in fact, it's not clear if a single bool should be returned or a boolean Series should be returned when you call all(). You can explicitly tell it that you want a single bool returned by calling items(), all() etc. on the Series object (which the error is saying) but if you do so (e.g. all([(df.A < borderE).all(), (df.B - df.C < ex).all()])), the conditions will be aggregated into a single boolean value, not the desired boolean vector.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.