I have six CSV files with identical headers. I am trying to remove index 0 and merge them into a single DataFrame. The problem I keep running into is that I only seem to access the last file returned by glob.

import glob
import csv
import pandas as pd

for item in glob.glob("*.csv"):
    with open(item, 'r') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            print(row)

Any thoughts?

  • Did you try printing item before opening the file? Or just printing glob.glob("*.csv")? What happened? Commented Apr 30, 2018 at 18:31
  • The example code you posted works for me. Commented Apr 30, 2018 at 18:39

1 Answer

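import glob
import pandas as pd

# Read each CSV into its own DataFrame and collect them in a list.
dfs = []
for file in glob.glob("*.csv"):
    dfs.append(pd.read_csv(file))

# pd.concat returns a new DataFrame, so keep the result.
df = pd.concat(dfs)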

Or even in a single line:

pd.concat([pd.read_csv(file) for file in glob.glob("*.csv")])

pandas has a function for reading a single CSV file: pd.read_csv(filename). Call it inside your loop to create a DataFrame for each CSV file, and append each DataFrame to a list.

After the loop, combine them by passing that list to pd.concat, as in pd.concat([df1, df2, ...]).
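The question also mentions removing index 0, which the snippets above do not address. Here is a minimal sketch of one way to fold that in, assuming "index 0" means the first data row of each file (if it means the first column, df.iloc[:, 1:] would be the analogous slice). The function name merge_csvs and its drop_first_row parameter are made up for illustration.

```python
import glob

import pandas as pd


def merge_csvs(pattern="*.csv", drop_first_row=False):
    """Read every CSV matching `pattern` and stack them into one DataFrame."""
    # sorted() makes the file order deterministic; glob itself guarantees no order.
    paths = sorted(glob.glob(pattern))
    if not paths:
        # pd.concat raises ValueError on an empty list, so fail with a clearer message.
        raise FileNotFoundError(f"no files match {pattern!r}")

    frames = []
    for path in paths:
        df = pd.read_csv(path)
        if drop_first_row:
            # Assumption: "remove index 0" means dropping the first data row of each file.
            df = df.iloc[1:]
        frames.append(df)

    # ignore_index=True renumbers rows 0..n-1 instead of repeating each file's own index.
    return pd.concat(frames, ignore_index=True)
```

Calling merge_csvs("*.csv", drop_first_row=True) would then return a single DataFrame with a fresh row index. Without ignore_index=True, each file's rows keep their original labels, so the merged index would contain repeated values.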


1 Comment

Brilliant. I knew I was swirling around the right answer but just missing it.
