How to check if a list of numpy arrays contains a given test array?

Question

I have a list of numpy arrays, say,

a = [np.random.rand(3, 3), np.random.rand(3, 3), np.random.rand(3, 3)]

and I have a test array, say

b = np.random.rand(3, 3)

I want to check whether a contains b or not. However

b in a

throws the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

What is the proper way to do what I want?

What do you mean by list comprehension? In my understanding list comprehension means something like
– Guldam Kwak, Commented Aug 1, 2018 at 10:40

Nils Werner · Accepted Answer · 2018-08-01 10:57:15Z

5

You can just make one array of shape (3, 3, 3) out of a:

a = np.asarray(a)

Then compare it with b (we're comparing floats here, so we should use isclose()):

np.all(np.isclose(a, b), axis=(1, 2))

For example:

a = [np.random.rand(3,3),np.random.rand(3,3),np.random.rand(3,3)]
a = np.asarray(a)
b = a[1, ...]       # set b to some value we know will yield True

np.all(np.isclose(a, b), axis=(1, 2))
# array([False,  True, False])
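
The comparison above yields one boolean per array; reducing it once more with .any() answers the original yes/no question. A minimal sketch, assuming every array in the list has the same shape as b (otherwise np.asarray cannot build a regular array). The helper name contains_close is hypothetical, and np.isclose is used with its default tolerances:

```python
import numpy as np

def contains_close(arrays, target):
    """Return True if any array in `arrays` is element-wise close to `target`.

    Hypothetical helper; assumes all arrays share target's shape.
    """
    stacked = np.asarray(arrays)                 # shape (n, *target.shape)
    inner_axes = tuple(range(1, stacked.ndim))   # every axis except the first
    per_array = np.isclose(stacked, target).all(axis=inner_axes)
    return bool(per_array.any())

rng = np.random.default_rng(0)
a = [rng.random((3, 3)) for _ in range(3)]
print(contains_close(a, a[1] + 1e-12))   # tiny float noise still counts as a match
print(contains_close(a, a[1] + 1.0))     # clearly different values
```

Computing the inner axes from stacked.ndim keeps the sketch usable for arrays that are not 3x3, as long as they all have the same shape.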


Omar Al Zeidi · Accepted Answer · 2018-08-01 14:48:01Z

As highlighted by @jotasi, the truth value is ambiguous because comparing arrays is done element-wise. There was a previous answer to this question here. Overall, your task can be done in various ways:

list-to-array:

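You can use the "in" operator by converting the list to a (3,3,3)-shaped array as follows. Be aware that on a NumPy array, b in a is evaluated as (a == b).any(), so it is True as soon as a single element matches after broadcasting; it does not check that a whole sub-array equals b: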

    >>> a = [np.random.rand(3, 3), np.random.rand(3, 3), np.random.rand(3, 3)]
    >>> a= np.asarray(a)
    >>> b= a[1].copy()
    >>> b in a
    True

np.all:

>>> any(np.all((b==a),axis=(1,2)))
True

list-comprehension: This is done by iterating over each array:
```
>>> any([(b == a_s).all() for a_s in a])
True
```

Below is a speed comparison of the three approaches above:

[Figure: Speed Comparison]

import numpy as np
import perfplot

perfplot.show(
    setup=lambda n: np.asarray([np.random.rand(3*3).reshape(3,3) for i in range(n)]),
    kernels=[
        lambda a: a[-1] in a,
        lambda a: any(np.all((a[-1]==a),axis=(1,2))),
        lambda a: any([(a[-1] == a_s).all() for a_s in a])
        ],
    labels=[
        'in', 'np.all', 'list_comprehension'
        ],
    n_range=[2**k for k in range(1,20)],
    xlabel='Array size',
    logx=True,
    logy=True,
    )
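
One caveat about the in variant is worth checking before relying on it: on a NumPy array, b in a behaves like (a == b).any(), so a single matching element is enough. A small sketch with hand-picked values (not taken from the answer) where b is not one of the stacked arrays but still shares one element with each of them:

```python
import numpy as np

a = np.zeros((3, 3, 3))       # three 3x3 arrays of zeros
b = np.ones((3, 3))
b[0, 0] = 0.0                 # b shares exactly one value with each array in a

loose = b in a                                   # behaves like (a == b).any()
strict = bool((a == b).all(axis=(1, 2)).any())   # a whole 3x3 block must match

print(loose)    # True
print(strict)   # False
```

The np.all and list-comprehension variants require every element of a sub-array to match, so they do not have this problem.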

FHTMitchell · Accepted Answer · 2018-08-01 10:47:41Z

Ok so in doesn't work because it's effectively doing

def in_(obj, iterable):
    for elem in iterable:
        if obj == elem:
            return True
    return False

Now, the problem is that for two ndarrays a and b, a == b is an array (try it), not a boolean, so if a == b fails. The solution is to define a new function

def array_in(arr, list_of_arr):
    for elem in list_of_arr:
        if (arr == elem).all():
            return True
    return False

a = [np.arange(5)] * 3
b = np.ones(5)

array_in(b, a) # --> False
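
One thing to keep in mind with (arr == elem).all() is broadcasting: arrays of different shapes can still compare equal element-wise. A hedged variant of the function above that checks shapes first; the name array_in_strict is made up for this sketch, and it assumes every list element is an ndarray:

```python
import numpy as np

def array_in_strict(arr, list_of_arr):
    """Like array_in, but only arrays of identical shape can match."""
    for elem in list_of_arr:
        # The shape test runs first, so == never has to broadcast.
        if elem.shape == arr.shape and (arr == elem).all():
            return True
    return False

row = np.ones(3)
grid = np.ones((3, 3))

print((row == grid).all())                        # True: row is broadcast against every row of grid
print(array_in_strict(row, [grid]))               # False: shapes (3,) and (3, 3) differ
print(array_in_strict(row, [grid, np.ones(3)]))   # True: an exact match is present
```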

samorr · Accepted Answer · 2018-08-01 10:50:58Z

This error occurs because if a and b are numpy arrays, then a == b doesn't return True or False, but an array of boolean values obtained by comparing a and b element-wise.

You can try something like this:

np.any([np.all(a_s == b) for a_s in a])

[np.all(a_s == b) for a_s in a] creates a list of boolean values by iterating through the elements of a and checking whether all entries of b and of that particular element of a are the same.
With np.any you can then check whether any element in that list is True.
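
To make the two steps visible, here is a small worked example with fixed values instead of random ones (the arrays are chosen only for illustration):

```python
import numpy as np

a = [np.zeros((2, 2)), np.ones((2, 2)), np.full((2, 2), 2.0)]
b = np.ones((2, 2))

# One flag per array in a: does every element equal the matching element of b?
flags = [bool(np.all(a_s == b)) for a_s in a]
print(flags)          # [False, True, False]

# Collapse the flags into a single answer.
found = bool(np.any(flags))
print(found)          # True
```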

jotasi · Accepted Answer · 2018-08-01 10:52:40Z

As pointed out in this answer, the documentation states that:

For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).

a[0]==b is an array, though, containing an element-wise comparison of a[0] and b. The overall truth value of this array is obviously ambiguous. Are they the same if all elements match, or if most match, or if at least one matches? Therefore, numpy forces you to be explicit about what you mean. What you want is to test whether all elements are the same. You can do that by using numpy's all method:

any((b is e) or (b == e).all() for e in a)

or put in a function:

def numpy_in(arrayToTest, listOfArrays):
    return any((arrayToTest is e) or (arrayToTest == e).all()
               for e in listOfArrays)
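
The x is e part of the quoted rule also explains why b in a sometimes appears to work on a plain list. A small sketch, assuming arrays with more than one element: membership succeeds when the identical object comes first in the list, and raises as soon as == has to be evaluated against a different array.

```python
import numpy as np

def numpy_in(array_to_test, list_of_arrays):
    # Same logic as the function above, repeated so the sketch is self-contained.
    return any((array_to_test is e) or (array_to_test == e).all()
               for e in list_of_arrays)

x = np.arange(4).reshape(2, 2)
y = x + 10
a = [x, y]

print(x in a)             # True: identity check on the first element, == never runs

try:
    y in a                # y is compared with x first, and bool(x == y) is ambiguous
    raised = False
except ValueError:
    raised = True
print(raised)             # True

print(numpy_in(y, a))           # True
print(numpy_in(y.copy(), a))    # True: equal values, different object
print(numpy_in(y + 1, a))       # False
```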

Upasana Mittal · Accepted Answer · 2018-08-01 11:35:15Z

0

Use array_equal from numpy

    import numpy as np
    a = [np.random.rand(3,3),np.random.rand(3,3),np.random.rand(3,3)]
    b = np.random.rand(3,3)

    for i in a:
        if np.array_equal(b,i):
            print("yes")
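
The loop above prints once per match; to get a single boolean instead, np.array_equal can be combined with the built-in any. A minimal sketch (the function name list_contains_array is hypothetical). np.array_equal returns False for arrays of different shapes, so a list with mixed shapes is fine here:

```python
import numpy as np

def list_contains_array(arrays, target):
    """Return True if some array in `arrays` has target's shape and values."""
    # The generator lets any() stop at the first match.
    return any(np.array_equal(target, arr) for arr in arrays)

a = [np.zeros((3, 3)), np.arange(9).reshape(3, 3), np.ones(4)]

print(list_contains_array(a, np.arange(9).reshape(3, 3)))  # True
print(list_contains_array(a, np.ones((3, 3))))             # False
```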


1 Comment

jotasi Over a year ago

Actually, (a==b).all() is not slower than np.array_equal(a, b). The main difference is that np.array_equal tests the shape of the array first.
