Pandas: Append existing CSV file, extra columns

Question

I'm trying to create a CSV file, save it, read it back later, and then add (concat) data to the bottom, repeating this process multiple times. As an example, my setup is:

import pandas as pd

df3 = pd.DataFrame(columns=('col1','col2'))
df3.to_csv('example.csv', sep=',')
print(df3)

which generates a blank CSV file containing only the column headers. Printing the dataframe gives this (this is what I want my data to look like):

Empty DataFrame
Columns: [col1, col2]
Index: []

Then I generate a new dataframe with row data, read the old CSV file back into df3, use pd.concat() to add the new rows, and save the result to the same file.

df1 = pd.DataFrame({'col1':list("abc"),'col2':list("def")})
df3 = pd.read_csv('example.csv', sep=',')
print(df3)
print(df1)
df3 = pd.concat([df3, df1], ignore_index=True)
print(df3)
df3.to_csv('example.csv', sep=',')

But when I read example.csv back into df3, I actually get a dataframe that looks like this:

Empty DataFrame
Columns: [Unnamed: 0, col1, col2]
Index: []

There is now an extra column.

My actual code constrains the .read_csv()/.to_csv() calls, and it throws an error because (I think) what I read back isn't what I wrote out.

I've tried adding ignore_index=True to the method but that doesn't do it. I've also tried reading back exactly what I put in, but it still generates the Unnamed column.

There is some information here on bad data within the column - not quite on point.

There is obviously a simple answer to this, I just can't figure it out.

Perhaps the better approach is .append(), but based on the docs that doesn't seem like the right approach.
– Bill Armstrong, Commented Apr 8, 2018 at 17:07
@roganjosh - that worked... I now see that the index= parameter accepts several values. I now understand that the extra column is the index column from the original dataframe. So I could also call .read_csv() with index_col=0, which moves the Unnamed: column into the df's index. Right?
– Bill Armstrong, Commented Apr 8, 2018 at 17:11
From what I can tell, you will no longer have Unnamed: as a column. The to_csv method writes a numerical index for each row unless you pass index=False.
– roganjosh, Commented Apr 8, 2018 at 17:16

aliciawyy · Accepted Answer · 2018-04-08 19:16:53Z

2

When you read the csv file into df3, you can use

df3 = pd.read_csv('example.csv', sep=',', index_col=0)

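Then you won't have the unnamed column, because the first column in the file (the index that to_csv wrote) is read back as the index rather than as data.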
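As a minimal sketch of the difference, the snippet below writes the empty dataframe from the question and reads it back both ways. The temporary path is an assumption made for illustration so the sketch does not overwrite a real example.csv.

```python
import os
import tempfile

import pandas as pd

# Hypothetical temp location (assumption), standing in for 'example.csv'.
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# to_csv writes the index as an unnamed first column by default.
pd.DataFrame(columns=["col1", "col2"]).to_csv(path, sep=",")

# Default read: the unnamed index column comes back as a data column.
plain = pd.read_csv(path, sep=",")

# index_col=0: the first column is used as the index instead.
fixed = pd.read_csv(path, sep=",", index_col=0)

print(list(plain.columns))
print(list(fixed.columns))
```

With the default read, the column list should include Unnamed: 0, matching the output shown in the question; with index_col=0 only col1 and col2 remain.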


1 Comment

Bill Armstrong Over a year ago

Based on my question, this is in fact the solution. You can also follow @roganjosh's comment and save the CSV without the index. Both worked. Thanks.
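The alternative mentioned in the comments, saving without the index, could look roughly like the sketch below. It repeats the read, concat, and write cycle from the question twice; the temporary path and the loop count are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical temp location (assumption), standing in for 'example.csv'.
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# index=False: only the column headers are written, with no index column.
pd.DataFrame(columns=["col1", "col2"]).to_csv(path, sep=",", index=False)

df1 = pd.DataFrame({"col1": list("abc"), "col2": list("def")})

# Repeat the cycle from the question: read, concat, write back.
for _ in range(2):
    df3 = pd.read_csv(path, sep=",")
    df3 = pd.concat([df3, df1], ignore_index=True)
    df3.to_csv(path, sep=",", index=False)

result = pd.read_csv(path, sep=",")
print(result)
```

Because the index is never written, no Unnamed: column appears on any later read, however many times the cycle runs.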
