45

I am working with arrays of different shapes and I want to save them all with numpy.save. Suppose I have

mat1 = numpy.arange(8).reshape(4, 2)
mat2 = numpy.arange(6).reshape(2, 3)
numpy.save('mat.npy', numpy.array([mat1, mat2]))

It works. But when the two matrices have the same size along one dimension (here, the first), it does not work.

mat1 = numpy.arange(8).reshape(2, 4)
mat2 = numpy.arange(10).reshape(2, 5)
numpy.save('mat.npy', numpy.array([mat1, mat2]))

It causes
Traceback (most recent call last):
  File "<input>", line 1, in <module>
ValueError: could not broadcast input array from shape (2,4) into shape (2)

Note that the problem is caused by numpy.array([mat1, mat2]), not by numpy.save.

I know that such an array is possible:

>>> numpy.array([[[1, 2]], [[1, 2], [3, 4]]])
array([[[1, 2]], [[1, 2], [3, 4]]], dtype=object)

So all I want is to save two arrays such as mat1 and mat2 at once.

3
  • Have you considered using np.savez or pickle with a binary protocol instead? savez saves multiple arrays, save only saves a single array. Commented Feb 1, 2016 at 14:48
  • It works on my computer. What version of python are you using? Commented Feb 1, 2016 at 14:51
  • If the first dimensions of mat1 and mat2 are the same, np.array(...) produces this error. You can get around it by initializing an np.empty((2,), object) array and filling it with the element arrays. Do the same if all the dimensions are the same (to prevent concatenation into a single higher-dimensional array). Commented Feb 13, 2020 at 2:35
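
The workaround from the last comment can be sketched as follows. This is only an illustration: the name packed and the temporary file path are assumptions made here, and loading needs allow_pickle=True because object arrays are stored as pickles inside the .npy file.

```python
import os
import tempfile

import numpy as np

mat1 = np.arange(8).reshape(2, 4)
mat2 = np.arange(10).reshape(2, 5)

# Pre-allocate a 1-D object array, then assign each matrix to its own slot.
# This avoids np.array([...]) trying to merge the inputs into one block.
packed = np.empty((2,), dtype=object)
packed[0] = mat1
packed[1] = mat2

# Hypothetical location, used only so the sketch does not touch the cwd.
path = os.path.join(tempfile.mkdtemp(), "mat.npy")
np.save(path, packed)

# Recent NumPy versions refuse to unpickle unless this flag is set.
loaded = np.load(path, allow_pickle=True)
print(loaded.shape, loaded[0].shape, loaded[1].shape)
```

The result is a 1-D array whose two elements are the original matrices, so it behaves more like a list than like a 3-D array.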

2 Answers

95

If you'd like to save multiple arrays in the same format as np.save, use np.savez.

For example:

import numpy as np

arr1 = np.arange(8).reshape(2, 4)
arr2 = np.arange(10).reshape(2, 5)
np.savez('mat.npz', name1=arr1, name2=arr2)

data = np.load('mat.npz')
print(data['name1'])
print(data['name2'])

If you have several arrays, you can expand the arguments:

import numpy as np

data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]
np.savez('mat.npz', *data)

container = np.load('mat.npz')
data = [container[key] for key in container]

Note that the order is not guaranteed to be preserved when you iterate over the loaded container. If you do need to preserve order, you might consider using pickle instead.
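
If you would rather stay with np.savez, one possible way to recover the input order is to rely on the key names: arrays passed positionally are stored as arr_0, arr_1, and so on. The sketch below assumes every array was passed positionally; the names arrays, restored and the temporary path are made up for illustration.

```python
import os
import tempfile

import numpy as np

arrays = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]

# Hypothetical location, used only so the sketch does not touch the cwd.
path = os.path.join(tempfile.mkdtemp(), "mat.npz")
np.savez(path, *arrays)

with np.load(path) as container:
    # Index by the generated key names instead of trusting iteration order.
    count = len(container.files)
    restored = [container["arr_%d" % i] for i in range(count)]

print([a.shape for a in restored])
```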

If you use pickle, be sure to specify a binary protocol; otherwise you'll write an ASCII pickle (the default on Python 2), which is particularly inefficient for numpy arrays. With a binary protocol, ndarrays more or less pickle to the same format as np.save/np.savez. For example:

# cPickle exists only on Python 2; on Python 3 the plain pickle module is used.
try:
    import cPickle as pickle
except ImportError:
    import pickle
import numpy as np

data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]

with open('mat.pkl', 'wb') as outfile:
    pickle.dump(data, outfile, pickle.HIGHEST_PROTOCOL)

with open('mat.pkl', 'rb') as infile:
    result = pickle.load(infile)

In this case, result and data will have identical contents and the order of the input list of arrays will be preserved.
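
A quick way to check that claim, sketched here for Python 3. The temporary path and the variable same are illustrative assumptions, not part of the original example.

```python
import os
import pickle
import tempfile

import numpy as np

data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]

# Hypothetical location, used only so the sketch does not touch the cwd.
path = os.path.join(tempfile.mkdtemp(), "mat.pkl")

with open(path, "wb") as outfile:
    pickle.dump(data, outfile, pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as infile:
    result = pickle.load(infile)

# Compare position by position: same length, same order, same values.
same = len(result) == len(data) and all(
    np.array_equal(a, b) for a, b in zip(data, result)
)
print(same)
```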


8 Comments

Suppose I have a list of arrays and I want to save them all, and then load them all afterwards.
Also, is there a solution for the general problem I described?
@Dubon - I'm not quite sure what you're referring to by "the general problem". If you mean writing/reading arbitrary python objects to/from disk, pickle is what you're looking for.
@Dubon - You're getting an object array. It's not a "real" array in the same sense as the others. It's basically a very inefficient list. You're better off using a list instead of creating an object array. As you've noted, this particular result is a 1D array of other arrays. It won't broadcast like a 2D or 3D array because it's 1D. You also won't be able to use mathematical operations in quite the same way (or rather, you'll be hit with some nasty surprises). If you're not already very familiar with numpy, don't use object arrays.
To complement Joe's comment and see why he is right when he says it's not a "real" array in the same sense as the others, see this.
5

Small addition: if you'd like to use numpy.savez() and preserve names associated with the saved arrays (instead of arr_0, arr_1, ...) you can pass a dictionary as **kwargs using the double-star operator.

import numpy as np

d = {}
d['a'] = np.random.randint(10, size=5)
d['b'] = np.random.randint(10, size=5)
print(d)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}

np.savez("test", **d)
container = np.load("test.npz")

e = {name: container[name] for name in container}
print(e)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}
