I'm trying to create a CSV file, save it, read it back later, and then add (concat) data to the bottom - and repeat this process multiple times. As an example, my setup is:
import pandas as pd
df3 = pd.DataFrame(columns=('col1','col2'))
df3.to_csv('example.csv', sep=',')
print(df3)
which generates a blank CSV file containing only column headers. Printing the dataframe gives this (this is what I want my data to look like):
Empty DataFrame
Columns: [col1, col2]
Index: []
Then I generate a new dataframe with row information (index), read the old CSV file back into df3, and use pd.concat() to add the new rows to it before saving again.
df1 = pd.DataFrame({'col1':list("abc"),'col2':list("def")})
df3 = pd.read_csv('example.csv', sep=',')
print(df3)
print(df1)
df3 = pd.concat([df3, df1], ignore_index=True)
print(df3)
df3.to_csv('example.csv', sep=',')
But when I read the example.csv file back in (df3), I actually get a dataframe that looks like this:
Empty DataFrame
Columns: [Unnamed: 0, col1, col2]
Index: []
There is now an extra column.
My actual code constrains the .read_csv/.to_csv and it throws an error because what I'm trying to read/write in isn't what I sent it (I don't think).
I've tried adding ignore_index=True to the method but that doesn't do it. I've also tried reading back exactly what I put in, but it still generates the Unnamed column.
There is some information here on bad data within the column - not quite on point.
There is obviously a simple answer to this, I just can't figure it out.
df3.to_csv('example.csv', sep=',', index=False)
[...] .append(), but based on the docs that doesn't seem like the right approach. index= has several variables. I now understand that the extra column is the index column from the original dataframe. So I could also .read_csv() with index_col=0, and that puts the Unnamed: column into the df's index. Right?
[...] Unnamed: as a column. The to_csv method will write a numerical index for each row unless you specify index=False.
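To make the index=False suggestion concrete, here is a minimal sketch of the whole round trip. It assumes a temporary scratch directory (so nothing overwrites a real example.csv), and the names path, new_rows, stored, and indexed_path are made up for illustration. The second half shows the index_col=0 alternative for a file that was already written with the index column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical scratch directory, used only for this illustration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example.csv")

# Write the empty, header-only frame without the index column.
pd.DataFrame(columns=["col1", "col2"]).to_csv(path, sep=",", index=False)

new_rows = pd.DataFrame({"col1": list("abc"), "col2": list("def")})

# Repeat the read / concat / write cycle. Passing index=False on every
# write keeps the file limited to col1 and col2.
for _ in range(2):
    stored = pd.read_csv(path, sep=",")
    stored = pd.concat([stored, new_rows], ignore_index=True)
    stored.to_csv(path, sep=",", index=False)

result = pd.read_csv(path, sep=",")
print(result.columns.tolist())
print(len(result))

# Alternative for a file that was written with the default index=True:
# tell read_csv that the first column is the index.
indexed_path = os.path.join(workdir, "indexed.csv")
new_rows.to_csv(indexed_path, sep=",")

with_unnamed = pd.read_csv(indexed_path, sep=",")            # extra column appears
with_index = pd.read_csv(indexed_path, sep=",", index_col=0)  # first column becomes the index
print(with_unnamed.columns.tolist())
print(with_index.columns.tolist())
```

Either approach works as long as reading and writing agree. If the file is written with index=False, read it with the defaults. If it is written with the index, read it with index_col=0.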