I'm trying to create a CSV file, save it, read it back later, and then append (concat) data to the bottom, repeating this process multiple times. As an example, my setup is:

import pandas as pd

df3 = pd.DataFrame(columns=('col1','col2'))
df3.to_csv('example.csv', sep=',')
print(df3)

This generates a blank CSV file containing only the column headers. The printed DataFrame looks like this (this is what I want my data to look like):

Empty DataFrame
Columns: [col1, col2]
Index: []

Then I generate a new DataFrame with row data (df1), read the old CSV file back into df3, concatenate the two with pd.concat(), and write the result back to the file.

df1 = pd.DataFrame({'col1':list("abc"),'col2':list("def")})
df3 = pd.read_csv('example.csv', sep=',')
print(df3)
print(df1)
df3 = pd.concat([df3, df1], ignore_index=True)
print(df3)
df3.to_csv('example.csv', sep=',')

But when I read example.csv back into df3, I actually get a DataFrame that looks like this:

Empty DataFrame
Columns: [Unnamed: 0, col1, col2]
Index: []

There is now an extra column.

My actual code constrains what .read_csv()/.to_csv() accept, and it throws an error because (I think) what I'm reading back isn't what I wrote out.

I've tried adding ignore_index=True to the method but that doesn't do it. I've also tried reading back exactly what I put in, but it still generates the Unnamed column.

There is some information here on bad data within the column - not quite on point.

There is obviously a simple answer to this, but I just can't figure it out.

  • df3.to_csv('example.csv', sep=',', index=False) Commented Apr 8, 2018 at 17:07
  • Perhaps the better approach is .append(), but based on the docs that doesn't seem like the right approach. Commented Apr 8, 2018 at 17:07
  • @roganjosh - that worked... I now see that the index= parameter accepts several values. I now understand that the extra column is the index column from the original dataframe. So I could also call .read_csv() with index_col=0, and that turns the Unnamed: column into the df's index. Right? Commented Apr 8, 2018 at 17:11
  • From what I can tell, you will no longer have Unnamed: as a column. The to_csv method writes a numerical index for each row unless you pass index=False. Commented Apr 8, 2018 at 17:16
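
The index=False suggestion from the comments can be sketched as a full round trip. This is a minimal illustration, not the asker's actual code: the temporary directory and the names path, new_rows, combined and result are assumptions made for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.csv")

# Write the empty frame without the index, so only the headers are stored.
pd.DataFrame(columns=["col1", "col2"]).to_csv(path, index=False)

# Read the file back, add rows, and write it again without the index.
new_rows = pd.DataFrame({"col1": list("abc"), "col2": list("def")})
combined = pd.concat([pd.read_csv(path), new_rows], ignore_index=True)
combined.to_csv(path, index=False)

# Reading the file once more shows only the two original columns.
result = pd.read_csv(path)
print(list(result.columns))  # ['col1', 'col2']
```

Because the index is never written, the read/concat/write cycle can be repeated without the column set changing.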

1 Answer

When you read the csv file into df3, you can use

df3 = pd.read_csv('example.csv', sep=',', index_col=0)

Then you won't have the Unnamed: 0 column, because the first column of the file is read back as the index.
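
To see where the extra column comes from, here is a small sketch that writes the empty frame to an in-memory buffer instead of a file and reads it back both ways. Using io.StringIO and the names buffer, csv_text, default_read and index_read are assumptions for illustration only.

```python
import io

import pandas as pd

# Capture the CSV text in memory instead of writing a file.
buffer = io.StringIO()
pd.DataFrame(columns=["col1", "col2"]).to_csv(buffer)  # index is written by default
csv_text = buffer.getvalue()

# The header row begins with an empty field: that is the unnamed index column.
print(repr(csv_text))

# A default read treats the empty header field as a regular column.
default_read = pd.read_csv(io.StringIO(csv_text))
print(list(default_read.columns))  # ['Unnamed: 0', 'col1', 'col2']

# index_col=0 tells read_csv to use that first column as the index instead.
index_read = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(index_read.columns))  # ['col1', 'col2']
```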

1 Comment

Based on my question, this is in fact the solution. You can also follow @roganjosh's comment and save the CSV without the index. Both worked. Thanks.
