4

I want to count how many rows (inner arrays) of a 2-D NumPy array differ from the corresponding rows of another array. The solution should not use a list comprehension. Something along these lines (note that a and b differ only in the last row):

a = np.array( [[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]] )
b = np.array( [[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,0,0]] )
y = diff_count( a,b )
print(y)

>> 1
1
  • 2
    Why 1? Two elements are different. Commented Apr 12, 2018 at 8:04

4 Answers

2

Approach #1

Compare the arrays element-wise for inequality, reduce with ANY along the last axis to flag each row that differs, and finally count the flagged rows:

(a!=b).any(-1).sum()
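To make the three steps concrete, here is a minimal sketch that runs them one at a time on the arrays from the question. The intermediate names (neq, row_differs, count) are only illustrative and are not part of the original answer.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 0, 0]])

# Step 1: element-wise inequality, boolean array of shape (5, 3)
neq = a != b

# Step 2: ANY reduction along the last axis, one flag per row, shape (5,)
row_differs = neq.any(-1)

# Step 3: count the rows flagged as different
count = row_differs.sum()

print(row_differs.tolist())  # [False, False, False, False, True]
print(count)                 # 1
```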

Approach #2

Probably a faster one, using np.count_nonzero to count the booleans:

np.count_nonzero((a!=b).any(-1))

Approach #3

A faster one with views, where each row is viewed as a single void-typed element so that one comparison covers the whole row:

# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()

a1D,b1D = view1D(a,b)
out = np.count_nonzero(a1D!=b1D)
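As a sanity check, this is a self-contained sketch of the view-based approach on the question's arrays. It assumes both inputs are 2-D with the same dtype and the same number of columns. Since the rows are compared as raw bytes, I would only rely on it for integer data: with floats, for example, 0.0 and -0.0 have different byte patterns.

```python
import numpy as np

def view1D(a, b):
    # Make sure each row occupies one contiguous block of memory
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    # One void element spans the bytes of a whole row
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 0, 0]])

a1D, b1D = view1D(a, b)
print(a1D.shape)  # (5,)

# Each comparison now covers a full row at once
out = np.count_nonzero(a1D != b1D)
print(out)  # 1
```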

Benchmarking

In [32]: np.random.seed(0)
    ...: m,n = 10000,100
    ...: a = np.random.randint(0,9,(m,n))
    ...: b = a.copy()
    ...: 
    ...: # Let's set 10% of rows as different ones
    ...: b[np.random.choice(len(a), len(a)//10, replace=0)] = 0

In [33]: %timeit (a!=b).any(-1).sum() # app#1 from this soln
    ...: %timeit np.count_nonzero((a!=b).any(-1)) # app#2
    ...: %timeit np.any(a - b, axis=1).sum() # @Graipher's soln
1000 loops, best of 3: 1.14 ms per loop
1000 loops, best of 3: 1.08 ms per loop
100 loops, best of 3: 2.33 ms per loop

In [34]: %%timeit  # app#3
    ...: a1D,b1D = view1D(a,b)
    ...: out = np.count_nonzero(a1D!=b1D)
1000 loops, best of 3: 797 µs per loop

1

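For an element-wise comparison, you can use np.ravel():

(a.ravel()!=b.ravel()).sum()
(a-b).any(axis=0).sum()

Both lines give 2 as output for the arrays in the question. Note that the second line counts the columns that contain at least one difference, which happens to equal the number of differing elements here.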

For a row-wise comparison, you can use:

(a-b).any(axis=1).sum()

This gives 1 as output.
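The three counts are easy to mix up, so here is a small sketch that computes them side by side. The array c is a hypothetical variant of b (one extra changed element) that I added only to make the counts differ from each other.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 0, 0]])

def counts(x, y):
    diff = x != y
    elements = int(diff.sum())            # individual elements that differ
    columns = int(diff.any(axis=0).sum()) # columns with at least one difference
    rows = int(diff.any(axis=1).sum())    # rows with at least one difference
    return elements, columns, rows

print(counts(a, b))  # (2, 2, 1)

# Hypothetical variant: also change one element in the fourth row
c = b.copy()
c[3, 1] = 0
print(counts(a, c))  # (3, 2, 2)
```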

1 Comment

This outputs 2. I think the OP wants to compare on a per-row basis, not element-wise.
0

You can use numpy.any for this:

y = np.any(a - b, axis=1).sum()
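For integer arrays like the ones in the question, the subtraction form and the inequality form should agree. A small sketch comparing the two follows; the helper names diff_count_sub and diff_count_neq are made up for illustration. One edge case I am aware of is floating-point data containing inf: inf - inf is nan, which counts as truthy, so the subtraction form can report a row as different even though it is equal.

```python
import numpy as np

def diff_count_sub(a, b):
    # Subtraction form: a row differs if any difference is nonzero
    return int(np.any(a - b, axis=1).sum())

def diff_count_neq(a, b):
    # Inequality form: a row differs if any element is unequal
    return int((a != b).any(axis=1).sum())

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 0, 0]])
print(diff_count_sub(a, b), diff_count_neq(a, b))  # 1 1

# Edge case: identical float rows that contain inf
x = np.array([[np.inf, 1.0]])
with np.errstate(invalid="ignore"):  # silence the inf - inf warning
    sub = diff_count_sub(x, x)
neq = diff_count_neq(x, x)
print(sub, neq)  # 1 0
```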

0

Would this work?

y = sum((a[i] != b[i]).any() for i in range(len(a)))

Sorry that I can’t test this myself right now.
