
I am using numpy.loadtxt to generate a structured NumPy array from a CSV data file, which I would like to save to a MAT file for colleagues who are more familiar with MATLAB than Python.
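
For context, the array comes from a call along these lines (the file name and dtype here are just placeholders, not my real data):

import numpy as np

# Passing a structured dtype to loadtxt yields a 1-D structured array
# with one named field per CSV column.
mydata = np.loadtxt('data.csv', delimiter=',',
                    dtype=[('foo', 'i'), ('bar', 'f')])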

Sample case:

import numpy as np
import scipy.io

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])
scipy.io.savemat('test.mat', mydata)

When I attempt to use scipy.io.savemat on this array, the following error is thrown:

Traceback (most recent call last):
  File "C:/Project Data/General Python/test.py", line 6, in <module>
    scipy.io.savemat('test.mat', mydata)
  File "C:\python35\lib\site-packages\scipy\io\matlab\mio.py", line 210, in savemat
    MW.put_variables(mdict)
  File "C:\python35\lib\site-packages\scipy\io\matlab\mio5.py", line 831, in put_variables
    for name, var in mdict.items():
AttributeError: 'numpy.ndarray' object has no attribute 'items'

I'm a Python novice (at best), but I assume this happens because savemat expects a dict mapping variable names to arrays, and a structured array is not a dict.

I can get around this error by pulling my data into a dict:

tmp = {}
for varname in mydata.dtype.names:
    tmp[varname] = mydata[varname]

scipy.io.savemat('test.mat', tmp)

This loads into MATLAB fine:

>> mydata = load('test.mat')

mydata = 

    foo: [1 2]
    bar: [1 2]

But this seems like a very inefficient method since I'm duplicating the data in memory. Is there a smarter way to accomplish this?

1 Comment

Don't worry about potential data copies. savemat has to manipulate the data so it can write it in a MATLAB-compatible form; file writing takes more time than an array copy. Focus on the best MATLAB data structure.

1 Answer


You can do scipy.io.savemat('test.mat', {'mydata': mydata}).

This creates a struct mydata with fields foo and bar in the file.
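A minimal sketch of that call, reusing the toy data from the question:

import numpy as np
import scipy.io

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])

# savemat wants a dict mapping MATLAB variable names to values; the
# structured array's fields become the fields of a struct array.
scipy.io.savemat('test.mat', {'mydata': mydata})

In MATLAB, load('test.mat') should then give a struct array mydata with fields foo and bar, one element per row of the original array.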

Alternatively, you can collapse your loop into a dict comprehension:

tmp = {varname: mydata[varname] for varname in mydata.dtype.names}

I don't think creating a temporary dictionary duplicates the data in memory, because Python generally only stores references, and NumPy in particular tries to create views into the original data whenever possible.
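
If you want to verify that, numpy.shares_memory can check whether the per-field arrays alias the original buffer (a quick sketch, not part of the original answer):

import numpy as np

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])
tmp = {varname: mydata[varname] for varname in mydata.dtype.names}

# Indexing by field name returns a view, so the dict values share
# memory with the original structured array rather than copying it.
print(np.shares_memory(mydata, tmp['foo']))  # expected: True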


1 Comment

In quick time tests, saving tmp is faster than saving mydata. But time shouldn't be the big issue here.
