Loading .npy File Loads an Empty Array

Question

I have two TF-IDF matrices. Checking their sizes with

tr_tfidf_q1.shape, tr_tfidf_q2.shape

gives

((404288, 83766), (404288, 83766))

Now I save it using

np.save('tr_tfidf_q1.npy', tr_tfidf_q1)

When I load the file like this

f = np.load('tr_tfidf_q1.npy') 
f.shape  ## returns an empty tuple.
()

Thanks in advance.

What's the size of the file (from OS)?

– hpaulj, Commented Apr 17, 2017 at 16:58
It's around 37 MB. But I can load it now as an array as well.

– Anurag Upadhyaya, Commented Apr 17, 2017 at 17:09

hpaulj · Accepted Answer · 2017-04-17 17:23:32Z

In [172]: from scipy import sparse
In [173]: M=sparse.csr_matrix(np.eye(10))
In [174]: np.save('test.npy',M)


In [175]: f=np.load('test.npy')
In [176]: f
Out[176]: 
array(<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 10 stored elements in Compressed Sparse Row format>, dtype=object)

Note the dtype=object wrapper. The loaded array has shape (), i.e. it is 0-dimensional. A sparse matrix is not a regular array, nor an ndarray subclass, so np.save resorts to wrapping it in an object array and lets the object's own pickle method take care of the writing.

In [177]: f.item()
Out[177]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 10 stored elements in Compressed Sparse Row format>
In [178]: f.shape
Out[178]: ()
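
As a self-contained version of the session above, here is a minimal sketch of the save/load round trip. It assumes a recent NumPy, where np.load refuses to unpickle object arrays unless allow_pickle=True is passed. The temporary directory and the names `wrapper` and `recovered` are only for illustration.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.eye(10))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.npy")
    # np.save wraps the sparse matrix in a 0-d object array and pickles it.
    np.save(path, M)
    # Recent NumPy versions require allow_pickle=True to load object arrays.
    wrapper = np.load(path, allow_pickle=True)

# The wrapper is 0-d; .item() returns the sparse matrix stored inside it.
recovered = wrapper.item()
print(wrapper.shape)    # shape of the wrapper
print(recovered.shape)  # shape of the sparse matrix
```

Applied to the question, f.item().shape would be the place to look for the matrix dimensions, not f.shape.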

Using pickle directly:

In [181]: with open('test.pkl','wb') as f:
     ...:     pickle.dump(M,f)

In [182]: with open('test.pkl','rb') as f:
     ...:     M1=pickle.load(f)    
In [183]: M1
Out[183]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 10 stored elements in Compressed Sparse Row format>

Newer SciPy releases (0.19 and later) have a dedicated function for saving sparse matrices:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.save_npz.html
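
A short sketch of how save_npz and its counterpart load_npz might be used, assuming SciPy 0.19 or later is installed. The file name and temporary directory are placeholders for illustration.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.eye(10))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.npz")
    # Stores the sparse matrix's component arrays in an .npz archive (no pickling).
    sparse.save_npz(path, M)
    # Rebuilds the sparse matrix directly, with no 0-d wrapper to unwrap.
    M2 = sparse.load_npz(path)

print(type(M2).__name__, M2.shape)
```

Unlike the np.save route, the loaded object here is already a sparse matrix, so its shape attribute reports the matrix dimensions.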

Anurag Upadhyaya · Accepted Answer · 2023-05-05 11:33:54Z

I solved it myself.

f = np.load('tr_tfidf.npy')
f ## returns the below.

array(<404288x83766 sparse matrix of type '<class 'numpy.float64'>'
with 2117757 stored elements in Compressed Sparse Row format>, dtype=object)

I believe XYZ.shape works with references as well.

edited May 5, 2023 at 11:33

answered Apr 17, 2017 at 16:54


1 Comment

hpaulj Over a year ago

A csr_matrix is not a regular array, and is not saved directly by np.save. Instead np.save wraps it in a 0d object array, and the sparse matrix is pickled. So f.shape is the shape of that wrapper. f.item() should give you the sparse matrix itself.
