3

I have two arrays, both loaded from text files. On inspection they look identical. However, when I test the two arrays for equality, every check fails (element-wise, shape-wise, etc.). I used the NumPy tests from the answer here.

Here are the two matrices.

import numpy as np

class TextMatrixAssertions(object):
    def assertArrayEqual(self, dataX, dataY):
        x = np.loadtxt(dataX)
        y = np.loadtxt(dataY)

        if not np.array_equal(x, y):
            raise Exception("array_equal fail.")

        if not np.array_equiv(x, y):
            raise Exception("array_equiv fail.")

        if not np.allclose(x, y):
            raise Exception("allclose fail.")

dataX = "MyMatrix.txt"
dataY = "MyMatrix2.txt"
test = TextMatrixAssertions()
test.assertArrayEqual(dataX, dataY)

I want to know whether there really is a difference between the two arrays and, if not, what is causing the failures.

3
  • Presumably printing your values makes them appear the same? Try print(repr(x)) and print(repr(y)) and see whether that makes it clearer how the values differ. repr (docs.python.org/3/library/functions.html#repr) tries to return "a string that would yield an object with the same value when passed to eval()". Commented Sep 10, 2019 at 6:45
  • 1
    You do realize that your raise statements abort the execution of your method, right? So in case array_equal() returns False, allclose() is never reached. Commented Sep 10, 2019 at 7:11
  • Yes, I comment out the earlier checks to test the others. Commented Sep 10, 2019 at 7:11

3 Answers 3

10

They are not equal: 54 elements differ.

np.sum(x!=y)

54

To find which elements differ, you can do this:

np.where(x!=y)


(array([  1,   5,   7,  11,  19,  24,  32,  48,  82,  92,  97, 111, 114,
        119, 128, 137, 138, 146, 153, 154, 162, 165, 170, 186, 188, 204,
        215, 246, 256, 276, 294, 300, 305, 316, 318, 333, 360, 361, 390,
        419, 420, 421, 423, 428, 429, 429, 439, 448, 460, 465, 467, 471,
        474, 487]),
 array([18, 18, 18, 17, 17, 16, 15, 12,  8,  6,  5,  4,  3,  3,  2,  1,  1,
        26,  0, 25, 24, 24, 24, 23, 22, 20, 20, 17, 16, 14, 11, 11, 11, 10,
        10,  9,  7,  7,  5,  1,  1,  1, 26,  1,  0, 25, 23, 21, 19, 18, 18,
        17, 17, 14]))
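
The two arrays returned by np.where hold the row indices and the column indices of the differing elements. As a minimal sketch, np.argwhere gives the same positions as (row, column) pairs, which makes it easy to print the two values side by side. The arrays below are small stand-ins invented for illustration, not the matrices from the question.

```python
import numpy as np

# Hypothetical stand-in data; replace with the arrays loaded from your files.
x = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])
y = np.array([[0.0, 1.5, 2.0],
              [3.0, 4.0, 9.0]])

# One (row, column) pair per element where x and y differ.
pairs = np.argwhere(x != y)

# Print each differing position together with both values.
for row, col in pairs:
    print(f"({row}, {col}): x={x[row, col]} y={y[row, col]}")
```

Note that an element-wise != also flags positions where both arrays hold nan, so it is worth checking the printed values before concluding that the data really differ.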

1 Comment

How do you find their indexes?
0

You should first try your code with a smaller, simpler matrix to test your function.

For example:

import numpy as np
from io import StringIO



class TextMatrixAssertions(object):
    def assertArrayEqual(self, dataX, dataY):
        x = np.loadtxt(dataX)
        y = np.loadtxt(dataY)

        if not np.array_equal(x, y):
            raise Exception("array_equal fail.")

        if not np.array_equiv(x, y):
            raise Exception("array_equiv fail.")

        if not np.allclose(x, y):
            raise Exception("allclose fail.")

        return True

a = StringIO(u"0 1\n2 3")
b = StringIO(u"0 1\n2 3")
test = TextMatrixAssertions()
test.assertArrayEqual(a,b)

Output

True

So I guess your problem is with your files, not your code. You can also try loading the same file into both x and y and see the output.

To see which elements are different, you can use np.not_equal.

Example

a = StringIO(u"0 1\n2 3")
c = StringIO(u"0 1\n2 4")
x = np.loadtxt(a)
y = np.loadtxt(c)
np.not_equal(x,y)

Output

array([[False, False],
       [False,  True]])
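
The boolean array returned by np.not_equal can also be used directly as a mask. The following sketch reuses the same two small inputs as above to pull out the differing values from each array and count them; the variable names mask and n_diff are only illustrative.

```python
import numpy as np
from io import StringIO

x = np.loadtxt(StringIO(u"0 1\n2 3"))
y = np.loadtxt(StringIO(u"0 1\n2 4"))

# True wherever the two arrays differ.
mask = np.not_equal(x, y)

# Boolean indexing returns the differing values from each array.
print(x[mask])
print(y[mask])

# Number of differing elements.
n_diff = np.count_nonzero(mask)
print(n_diff)
```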

2 Comments

Yes. I am aware that the functions themselves are not the problem, since I use them on other arrays I test. But for these specific matrices, I just can't see the differences.
Then use the not_equal function to see which elements are different.
-1

One more solution. You can print the values of the elements that are not equal. If you run the code below, you will see that the elements reported as unequal are nan values. NaN never compares equal to anything, including another NaN, so these elements are what cause the exception to be raised.

import numpy as np

class TextMatrixAssertions(object):
    def assertArrayEqual(self, dataX, dataY):
        x = np.loadtxt(dataX)
        y = np.loadtxt(dataY)

        if not np.array_equal(x, y):
            not_equal_idx = np.where(x != y)
            for idx1, idx2 in zip(not_equal_idx[0],not_equal_idx[1]):
                print(x[idx1][idx2])
                print(y[idx1][idx2])
            raise Exception("array_equal fail.")

        if not np.array_equiv(x, y):
            raise Exception("array_equiv fail.")

        if not np.allclose(x, y):
            raise Exception("allclose fail.")

dataX = "MyMatrix.txt"
dataY = "MyMatrix2.txt"
test = TextMatrixAssertions()
test.assertArrayEqual(dataX, dataY)

output:

nan
nan
nan
...
nan
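
If the nan entries are meant to count as equal when they sit in the same positions, one possible approach (a sketch on made-up data, not the OP's files) is to pass equal_nan=True to np.allclose, or to use np.testing.assert_array_equal, which does not raise when both arrays have NaNs in the same positions.

```python
import numpy as np

# Hypothetical data: y is an exact copy of x, including the nan.
x = np.array([[1.0, np.nan],
              [3.0, 4.0]])
y = x.copy()

# nan != nan, so the plain comparison reports the arrays as different.
plain_equal = np.array_equal(x, y)

# equal_nan=True treats nan values in matching positions as equal.
close_with_nan = np.allclose(x, y, equal_nan=True)

# Explicit version: same nan positions, and same values everywhere else.
same_nan_positions = np.array_equal(np.isnan(x), np.isnan(y))
values_match = np.array_equal(x[~np.isnan(x)], y[~np.isnan(y)])

# Raises AssertionError on a real mismatch; passes here despite the nan.
np.testing.assert_array_equal(x, y)

print(plain_equal, close_with_nan, same_nan_positions, values_match)
```

Whether nan should be treated as equal to nan depends on what the missing values mean in your data, so this is a design choice rather than a general fix.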

10 Comments

This is needlessly complicated.
@Nils Werner, can you explain why this is complicated? The idea is to explain to the OP why his code is not behaving as he expected. How does my code complicate the understanding of this?
Using indices to access the differing elements is not the most efficient, the for loop is unnecessary, and using x[i][j] is bad style and can have unintended consequences.
I do not understand you. Maybe you mean some corner cases where the input files are empty or have different lengths. However, for the input provided by the OP, and for the purpose of explaining to him why the code is raising an exception, my code does the job.
Yes, it's just needlessly complicated :-) idx = x != y; print(x[idx], y[idx]) does the same, but simpler and faster.