Dump a NumPy array into a csv file

Question

How do I dump a 2D NumPy array into a csv file in a human-readable format?

cs95 · Accepted Answer · 2017-08-26 05:39:44Z

1270

numpy.savetxt saves an array to a text file.

import numpy
a = numpy.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
numpy.savetxt("foo.csv", a, delimiter=",")

edited Aug 26, 2017 at 5:39

cs95

answered May 21, 2011 at 10:10

Jim Brissom


13 Comments

Ehtesh Choudhury Over a year ago

is this preferred over looping through the array by dimension? I'm guessing so.

Andrea Zonca Over a year ago

You can also change the format of each number with the fmt keyword. The default is '%.18e', which can be hard to read; with '%.3e' only 3 decimals are shown.

Dexter Over a year ago

Andrea, Yes I used %10.5f. It was pretty convenient.

Ébe Isaac Over a year ago

Your method works well for numerical data, but it throws an error for a numpy.array of strings. Could you suggest a way to save a numpy.array containing strings as csv?

Luis Over a year ago

@ÉbeIsaac You can specify the format as string as well: fmt='%s'
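To make the fmt='%s' suggestion concrete, here is a minimal sketch. The array contents are made up for illustration, and an in-memory buffer stands in for a file on disk (savetxt accepts an open file-like object as well as a filename).

```python
import io

import numpy as np

# Hypothetical string data; any array of str values behaves the same way.
names = np.array([["alice", "bob"], ["carol", "dave"]])

# A StringIO buffer stands in for a real file here.
buf = io.StringIO()
np.savetxt(buf, names, fmt="%s", delimiter=",")

text = buf.getvalue()
print(text)
```

Note that savetxt does no quoting, so this only gives valid CSV when the strings themselves contain no commas.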

Mateen Ulhaq · Accepted Answer · 2023-07-13 07:38:02Z

244

Use the pandas library's DataFrame.to_csv. It does take some extra memory, but it's very fast and easy to use.

import pandas as pd 
df = pd.DataFrame(np_array)
df.to_csv("path/to/file.csv")

If you don't want a header or index, use:

df.to_csv("path/to/file.csv", header=False, index=False)

edited Jul 13, 2023 at 7:38

Mateen Ulhaq

answered Dec 12, 2016 at 8:38

maxbellec

12 Comments

mork Over a year ago

I find it again and again that the best csv exports are when 'piped' into pandas' to_csv

Tex Over a year ago

Not good. This creates a df and consumes extra memory for nothing

thepunitsingh Over a year ago

worked like charm, it's very fast - tradeoff for extra memory usage. parameters header=None, index=None remove header row and index column.

Dave C Over a year ago

The numpy.savetxt method is great, but it puts a hash symbol at the start of the header line.

Milind R Over a year ago

@DaveC : You have to set the comments keyword argument to '', the # will be suppressed.
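A small sketch of the header behaviour discussed in these two comments. The column names x, y, z are placeholders, and the output is captured in memory rather than written to disk.

```python
import io

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Default: the header line is prefixed with the comment marker "# ".
with_hash = io.StringIO()
np.savetxt(with_hash, a, fmt="%d", delimiter=",", header="x,y,z")

# comments="" drops the prefix, leaving a plain CSV header row.
plain = io.StringIO()
np.savetxt(plain, a, fmt="%d", delimiter=",", header="x,y,z", comments="")

print(with_hash.getvalue())
print(plain.getvalue())
```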

YakovL · Accepted Answer · 2018-01-07 17:57:45Z

61

tofile is a convenient function to do this:

import numpy as np
a = np.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
a.tofile('foo.csv',sep=',',format='%10.5f')

The documentation has some useful notes:

This is a convenience function for quick storage of array data. Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.

Note: this function does not produce multi-line csv files; it writes everything on one line.
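To see the single-line behaviour next to savetxt, a quick comparison can help. This is only a sketch: the file names and the temporary directory are arbitrary choices, not part of the answer.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2], [3, 4]])

tmp_dir = tempfile.mkdtemp()
flat_path = os.path.join(tmp_dir, "flat.csv")
rows_path = os.path.join(tmp_dir, "rows.csv")

# tofile flattens the array: every element ends up on one line.
a.tofile(flat_path, sep=",", format="%d")

# savetxt keeps one line per row of the 2D array.
np.savetxt(rows_path, a, fmt="%d", delimiter=",")

with open(flat_path) as f:
    flat_text = f.read()
with open(rows_path) as f:
    rows_text = f.read()

print(repr(flat_text))
print(repr(rows_text))
```

The row structure is lost in the tofile output, so the shape has to be known separately when reading it back.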

edited Jan 7, 2018 at 17:57

YakovL

answered May 12, 2015 at 11:37

Lee

3 Comments

Peter Over a year ago

As far as I can tell, this does not produce a csv file, but puts everything on a single line.

Lee Over a year ago

@Peter, good point, thanks, I've updated the answer. For me it does save ok in csv format (albeit limited to one line). Also, it's clear that the asker's intent is to "dump it in human-readable format" - so I think the answer is relevant and useful.

eaydin Over a year ago

Actually, np.savetxt() provides the newline argument, not np.tofile()

Daksh Gupta · Accepted Answer · 2018-10-26 03:40:25Z

As already discussed, the best way to dump the array into a CSV file is the .savetxt(...) method. However, there are certain things we should know to do it properly.

For example, if you have a numpy array with dtype = np.int32 as

narr = np.array([[1, 2],
                 [3, 4],
                 [5, 6]], dtype=np.int32)

and want to save using savetxt as

np.savetxt('values.csv', narr, delimiter=",")

It will store the data in floating point exponential format as

1.000000000000000000e+00,2.000000000000000000e+00
3.000000000000000000e+00,4.000000000000000000e+00
5.000000000000000000e+00,6.000000000000000000e+00

You will have to change the formatting by using a parameter called fmt as

np.savetxt('values.csv', narr, fmt="%d", delimiter=",")

to store data in its original format
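The difference between the two calls can be checked without touching the disk. A minimal sketch, using in-memory buffers in place of values.csv:

```python
import io

import numpy as np

narr = np.array([[1, 2],
                 [3, 4],
                 [5, 6]], dtype=np.int32)

# Default fmt ('%.18e'): integers come out in exponential notation.
default_buf = io.StringIO()
np.savetxt(default_buf, narr, delimiter=",")

# fmt="%d": integers are written as plain integers.
int_buf = io.StringIO()
np.savetxt(int_buf, narr, fmt="%d", delimiter=",")

print(default_buf.getvalue())
print(int_buf.getvalue())
```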

Saving Data in Compressed gz format

Also, savetxt can store data in .gz compressed format, which might be useful when transferring data over a network.

We just need to change the file extension to .gz and numpy will take care of everything automatically

np.savetxt('values.gz', narr, fmt="%d", delimiter=",")
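As a sanity check, the compressed file can be read back both with the gzip module and with np.loadtxt, which also handles .gz names transparently. The path below (a temporary directory and the name values.csv.gz) is an arbitrary choice for this sketch.

```python
import gzip
import os
import tempfile

import numpy as np

narr = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int32)
gz_path = os.path.join(tempfile.mkdtemp(), "values.csv.gz")

# The .gz suffix is what triggers compression.
np.savetxt(gz_path, narr, fmt="%d", delimiter=",")

# The file really is gzip-compressed CSV text.
with gzip.open(gz_path, "rt") as f:
    raw_text = f.read()

# loadtxt decompresses .gz files on the fly.
restored = np.loadtxt(gz_path, delimiter=",", dtype=np.int32)

print(raw_text)
print(restored)
```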

Hope it helps

Mike T · Accepted Answer · 2021-03-25 22:03:49Z

Writing record arrays as CSV files with headers requires a bit more work.

This example reads from a CSV file (example.csv) and writes its contents to another CSV file (out.csv).

import numpy as np

# Write an example CSV file with headers on first line
with open('example.csv', 'w') as fp:
    fp.write('''\
col1,col2,col3
1,100.1,string1
2,222.2,second string
''')

# Read it as a Numpy record array
ar = np.recfromcsv('example.csv', encoding='ascii')
print(repr(ar))
# rec.array([(1, 100.1, 'string1'), (2, 222.2, 'second string')], 
#           dtype=[('col1', '<i8'), ('col2', '<f8'), ('col3', '<U13')])

# Write as a CSV file with headers on first line
with open('out.csv', 'w') as fp:
    fp.write(','.join(ar.dtype.names) + '\n')
    np.savetxt(fp, ar, '%s', ',')

Note that the above example cannot handle values which are strings with commas. To always enclose non-numeric values within quotes, use the csv built-in module:

import csv

with open('out2.csv', 'w', newline='') as fp:
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(ar.dtype.names)
    writer.writerows(ar.tolist())
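The reading step above relies on np.recfromcsv, which, as far as I know, is no longer exposed in the main namespace of recent NumPy releases (2.0 and later). A possible substitute is np.genfromtxt with names=True and dtype=None; treat the following as a sketch under that assumption rather than a drop-in replacement (it returns a plain structured array, not a rec.array). The temporary paths are arbitrary.

```python
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "example.csv")
dst = os.path.join(tmp_dir, "out.csv")

# Same example data as in the answer.
with open(src, "w") as fp:
    fp.write("col1,col2,col3\n1,100.1,string1\n2,222.2,second string\n")

# names=True takes field names from the first line;
# dtype=None infers a type per column.
ar = np.genfromtxt(src, delimiter=",", names=True, dtype=None, encoding="ascii")

# header= plus comments="" writes the field names as a plain first row.
np.savetxt(dst, ar, fmt="%s", delimiter=",",
           header=",".join(ar.dtype.names), comments="")

with open(dst) as fp:
    out_text = fp.read()
print(out_text)
```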

This is where pandas again helps. You can do: pd.DataFrame(out, columns=['col1', 'col2']), etc

Ege Kaan Gürkan · Accepted Answer · 2022-05-12 20:00:58Z

11

To store a NumPy array to a text file, import savetxt from the NumPy module

Suppose your NumPy array is named train_df:

import numpy as np
np.savetxt('train_df.txt', train_df, fmt='%s')

OR

from numpy import savetxt
savetxt('train_df.txt', train_df, fmt='%s')

edited May 12, 2022 at 20:00

Ege Kaan Gürkan

answered Jul 25, 2021 at 14:49

Hemang Dhanani

1 Comment

Atybzz Over a year ago

Since you are calling np.savetxt(..., you don't need the import call from numpy import savetxt. If you do import it, you can simply call it as savetxt(...

DrDEE · Accepted Answer · 2019-02-11 21:27:54Z

I believe you can also accomplish this quite simply as follows:

1. Convert the NumPy array into a pandas DataFrame
2. Save as CSV

e.g. #1:

    # Libraries to import
    import pandas as pd
    import numpy as np

    # N x N NumPy array (the dimensions don't matter)
    corr_mat = np.eye(3)            # placeholder; substitute your own array
    my_df = pd.DataFrame(corr_mat)  # convert it to a pandas DataFrame

e.g. #2:

    #save as csv 
    my_df.to_csv('foo.csv', index=False)   # "foo" is the name you want to give
                                           # to csv file. Make sure to add ".csv"
                                           # after whatever name like in the code

Rimjhim . · Accepted Answer · 2017-03-07 10:49:58Z

5

If you want to write one value per line (a single column):

    for x in np.nditer(a.T, order='C'):
        file.write(str(x))
        file.write("\n")

Here 'a' is the NumPy array and 'file' is a file object already opened for writing.

If you want to write all values in a single row:

    import csv

    writer = csv.writer(file, delimiter=',')
    row = []  # collect every element, then write them as one CSV row
    for x in np.nditer(a.T, order='C'):
        row.append(str(x))
    writer.writerow(row)
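To show what these two loops actually produce, here is a self-contained sketch with a small made-up array and in-memory buffers in place of 'file'. Iterating over a.T in C order visits the elements of a column by column.

```python
import csv
import io

import numpy as np

a = np.array([[1, 2], [3, 4]])

# One value per line, visiting a column by column.
col_buf = io.StringIO()
for x in np.nditer(a.T, order="C"):
    col_buf.write(str(x))
    col_buf.write("\n")

# All values collected into a single CSV row.
row_buf = io.StringIO()
writer = csv.writer(row_buf, delimiter=",")
row = [str(x) for x in np.nditer(a.T, order="C")]
writer.writerow(row)

print(repr(col_buf.getvalue()))
print(repr(row_buf.getvalue()))
```

Neither variant preserves the 2D layout, so for a row-per-line file savetxt is usually the simpler choice.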

answered Mar 7, 2017 at 10:49

Rimjhim .

Tamil Selvan S · Accepted Answer · 2018-11-08 11:48:48Z

In Python we use the csv module's csv.writer() function to write data into csv files. It is the counterpart of csv.reader().

import csv

person = [['SN', 'Person', 'DOB'],
          ['1', 'John', '18/1/1997'],
          ['2', 'Marie', '19/2/1998'],
          ['3', 'Simon', '20/3/1999'],
          ['4', 'Erik', '21/4/2000'],
          ['5', 'Ana', '22/5/2001']]

csv.register_dialect('myDialect',
                     delimiter='|',
                     quoting=csv.QUOTE_NONE,
                     skipinitialspace=True)

# newline='' lets the csv module control line endings itself;
# the with block closes the file, so no explicit f.close() is needed
with open('dob.csv', 'w', newline='') as f:
    writer = csv.writer(f, dialect='myDialect')
    for row in person:
        writer.writerow(row)

A delimiter is a string used to separate fields. The default value is a comma (,); the dialect above uses '|' instead.

This has already been suggested: stackoverflow.com/a/41009026/8881141 Please only add new approaches, don't repeat previously published suggestions.

Giorgos Myrianthous · Accepted Answer · 2022-05-21 11:48:03Z

3

The numpy.savetxt() function saves a NumPy array to a text file; by default, however, it uses scientific notation.

If you'd like to avoid this, you need to specify an appropriate format using the fmt argument. For example,

import numpy as np

np.savetxt('output.csv', arr, delimiter=',', fmt='%f')
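For comparison, this sketch renders the same array with the default format, with '%f', and with '%g' (a compact alternative not mentioned in the answer). The array values and the render helper are made up for illustration.

```python
import io

import numpy as np

arr = np.array([[1.5, 2.0], [0.001, 1234.5]])


def render(**kwargs):
    """Return the text savetxt would write for arr (hypothetical helper)."""
    buf = io.StringIO()
    np.savetxt(buf, arr, delimiter=",", **kwargs)
    return buf.getvalue()


default_text = render()         # scientific notation, 18 decimals
fixed_text = render(fmt="%f")   # fixed point, 6 decimals
compact_text = render(fmt="%g") # shortest reasonable representation

print(default_text)
print(fixed_text)
print(compact_text)
```

'%f' always writes six decimals, so very small values can be rounded to zero; '%g' or an explicit precision such as '%.10f' may suit such data better.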

answered May 21, 2022 at 11:48

Giorgos Myrianthous

Mr Poin · Accepted Answer · 2016-10-17 17:23:37Z

2

If you want to save your numpy array (e.g. your_array = np.array([[1,2],[3,4]])) to one cell, you could convert it first with your_array.tolist().

Then save it the normal way to one cell, with delimiter=';' and the cell in the csv-file will look like this [[1, 2], [3, 4]]

Then you could restore your array like this (after import ast): your_array = np.array(ast.literal_eval(cell_string))
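Put together, the round trip might look like the sketch below. The label "sample_1" and the use of csv.writer on an in-memory buffer are assumptions for illustration; the answer does not say how the cell is written.

```python
import ast
import csv
import io

import numpy as np

your_array = np.array([[1, 2], [3, 4]])

# Write a row whose second cell holds the whole array as text.
# delimiter=';' keeps the commas inside the cell from splitting it.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=";")
writer.writerow(["sample_1", str(your_array.tolist())])

# Read the row back and rebuild the array from the cell text.
buf.seek(0)
label, cell_string = next(csv.reader(buf, delimiter=";"))
restored = np.array(ast.literal_eval(cell_string))

print(cell_string)
print(restored)
```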

edited Oct 17, 2016 at 17:23

answered Oct 17, 2016 at 16:50

Mr Poin

1 Comment

PirateApp Over a year ago

well that is literally going to destroy all the memory savings for using a numpy array

Hemen Ashodia · Accepted Answer · 2017-09-29 01:39:39Z

2

You can also do it with pure python without using any modules.

# format as a block of csv text to do whatever you want
# (the i, j unpacking assumes the array has exactly two columns)
csv_rows = ["{},{}".format(i, j) for i, j in array]
csv_text = "\n".join(csv_rows)

# write it to a file
with open('file.csv', 'w') as f:
    f.write(csv_text)

edited Sep 29, 2017 at 1:39

Hemen Ashodia

answered Sep 7, 2017 at 7:05

Greg

2 Comments

remram Over a year ago

This uses a lot of memory. Prefer looping over each row, formatting and writing it as you go.

Greg Over a year ago

@remram it depends on your data, but yes if it is big it can use a lot of memory
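A row-by-row variant along the lines suggested in the comments could look like this. It is a sketch, not the answerer's code: it handles any number of columns and never builds the whole text in memory. The array and the temporary path are made up.

```python
import os
import tempfile

import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])
path = os.path.join(tempfile.mkdtemp(), "file.csv")

with open(path, "w") as f:
    for row in array:
        # Format and write one row at a time.
        f.write(",".join(str(value) for value in row) + "\n")

with open(path) as f:
    written = f.read()
print(written)
```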

cottontail · Accepted Answer · 2023-04-05 04:47:11Z

As other answers mentioned, it's important to pass the fmt= in order to save a "human-readable" file. In fact, if you pass a separate format for each column, you don't need to pass a delimiter.

arr = np.arange(9).reshape(3, 3)
np.savetxt('out.csv', arr, fmt='%f,%.2f,%.1f')

It saves a file whose contents look like:

0.000000,1.00,2.0
3.000000,4.00,5.0
6.000000,7.00,8.0

Now to read the file from csv, use np.loadtxt():

np.loadtxt('out.csv', delimiter=',')

If you want to append to an existing file (or create it if it does not exist yet), use a context manager and open the file with mode='ab'.

with open('out.csv', 'ab') as f:
    np.savetxt(f, arr, delimiter=',', fmt='%.1f')
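A short check of the append behaviour: writing the same array twice and loading the result should give a stacked array. The temporary path is an arbitrary choice for this sketch.

```python
import os
import tempfile

import numpy as np

arr = np.arange(9).reshape(3, 3)
path = os.path.join(tempfile.mkdtemp(), "out.csv")

# First write creates the file.
np.savetxt(path, arr, delimiter=",", fmt="%.1f")

# Second write appends three more rows to it.
with open(path, "ab") as f:
    np.savetxt(f, arr, delimiter=",", fmt="%.1f")

loaded = np.loadtxt(path, delimiter=",")
print(loaded.shape)
```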
