How can I append new DataFrame data to the end of an existing CSV file? The to_csv documentation doesn't mention such functionality.
4 Answers
You can append using to_csv by passing a file which is open in append mode:
with open(file_name, 'a') as f:
    df.to_csv(f, header=False)
Use header=False so that the column names are not appended again.
In fact, pandas has a wrapper to do this in to_csv using the mode argument (see Joe's answer):
df.to_csv(file_name, mode='a', header=False)
10 Comments
perigee
You also need to close the file with f.close(). Andy, you made my day. It works like a charm. I come from a C/C++ background and need to learn the Python philosophy. Any suggestions?
perigee
Andy, really appreciated :-D (cannot use @ symbol :-()
Ezekiel Kruglick
Bonus points that this closes the file after to_csv. I have some code that hits to_csv a lot and was finding the files left open on later iterations.
Andy Hayden
@EzekielKruglick Were you passing an open file to to_csv or the filename? I recall a related issue where not closing the file led to a 99% speedup of their code (IIRC they were appending to the same file tens of thousands of times).
lesolorzanov
@perigee when "with" is used the file is closed automatically always. blog.lerner.co.il/dont-use-python-close-files-answer-depends
You can also pass the file mode as an argument to the to_csv method
df.to_csv(file_name, header=False, mode = 'a')
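A minimal sketch of how this could be used when the file may or may not exist yet. The helper name append_csv and the temporary path are assumptions for illustration, not part of pandas; index=False is also my choice here so the row index is not written as an extra column.

```python
import os
import tempfile

import pandas as pd


def append_csv(df, file_name):
    """Append df to file_name, writing the header only for a new file.

    append_csv is a hypothetical helper name, not a pandas function.
    """
    write_header = not os.path.isfile(file_name)
    df.to_csv(file_name, mode="a", header=write_header, index=False)


# Demo on a temporary path (assumed for illustration only).
file_name = os.path.join(tempfile.mkdtemp(), "data.csv")

append_csv(pd.DataFrame({"a": [1, 2], "b": [3, 4]}), file_name)  # creates file with header
append_csv(pd.DataFrame({"a": [5], "b": [6]}), file_name)        # appends rows only

result = pd.read_csv(file_name)
```

After both calls the file should contain a single header row followed by the three data rows.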
A little helper function I use (based on Joe Hooper's answer) with some header checking safeguards to handle it all:
import os
import pandas as pd

def appendDFToCSV_void(df, csvFilePath, sep=","):
    if not os.path.isfile(csvFilePath):
        # New file: write the data together with its header.
        df.to_csv(csvFilePath, mode='a', index=False, sep=sep)
        return
    # Existing file: compare its header with the dataframe's columns.
    csvColumns = pd.read_csv(csvFilePath, nrows=1, sep=sep).columns
    if len(df.columns) != len(csvColumns):
        raise Exception("Columns do not match!! Dataframe has " + str(len(df.columns)) + " columns. CSV file has " + str(len(csvColumns)) + " columns.")
    elif not (df.columns == csvColumns).all():
        raise Exception("Columns and column order of dataframe and csv file do not match!!")
    else:
        df.to_csv(csvFilePath, mode='a', index=False, sep=sep, header=False)
1 Comment
floatingice
Is there an API setting for the 3rd test case, column order not matching between dataframe and csv? I want to write without headers, but have the columns be implicitly reordered.
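I don't know of a single to_csv switch for exactly this, but one possible approach is to read the header from the file and reorder the dataframe's columns to match before appending. A rough sketch, assuming the column names are the same and only the order differs; append_reordered is a hypothetical helper name and the temporary path is just for the demo.

```python
import os
import tempfile

import pandas as pd


def append_reordered(df, csv_path, sep=","):
    """Append df to an existing CSV, reordering df's columns to the file's header order.

    Hypothetical helper, not a pandas function. Raises if the column names differ.
    """
    # nrows=0 reads only the header row, which gives the file's column order.
    csv_columns = pd.read_csv(csv_path, nrows=0, sep=sep).columns
    if set(csv_columns) != set(df.columns):
        raise ValueError("Column names of dataframe and CSV file do not match")
    # Select the columns in the file's order, then append without a header.
    df[list(csv_columns)].to_csv(csv_path, mode="a", index=False, sep=sep, header=False)


# Demo: the file has columns (a, b), the new frame has them as (b, a).
csv_path = os.path.join(tempfile.mkdtemp(), "data.csv")
pd.DataFrame({"a": [1], "b": [2]}).to_csv(csv_path, index=False)
append_reordered(pd.DataFrame({"b": [20], "a": [10]}), csv_path)

result = pd.read_csv(csv_path)
```

The appended row ends up under the right headers even though the dataframe listed its columns in a different order.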
Thanks to Andy, the complete solution:
f = open(filename, 'a') # Open file in append mode
df.to_csv(f, header = False)
f.close()
1 Comment
Andy Hayden
Just to mention, this is essentially equivalent to the above, but after this you're left with a closed file (f), whereas the with statement cleans that up for you. :)
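To make the comparison concrete, here is a small sketch of both styles side by side. The dataframe contents and the temporary path are made up for the demo; newline="" is passed to open because the pandas docs suggest it when handing to_csv an open file object.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2]})
path = os.path.join(tempfile.mkdtemp(), "data.csv")

# Explicit open/close: try/finally makes sure close() runs even if to_csv raises.
f = open(path, "a", newline="")
try:
    df.to_csv(f, header=False)
finally:
    f.close()

# Context manager: the file is closed automatically when the block exits.
with open(path, "a", newline="") as g:
    df.to_csv(g, header=False)

closed_after_explicit = f.closed
closed_after_with = g.closed
```

Both file objects end up closed; the with version simply needs less bookkeeping.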