import csv

def StatsUnion(filename1, filename2, filename3):
    with open(filename1) as inputfile, open(filename3, 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for row in csv.reader(inputfile):
            if any(field.strip() for field in row):
                writer.writerow(row)

    with open(filename2) as inputfile, open(filename3, 'a', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for row in csv.reader(inputfile):
            if any(field.strip() for field in row):
                writer.writerow(row)

Here is my function, which works for merging 2 CSV files into a new one. Is there a way to make it handle more CSV files in an easy way? The columns would always be the same.

  • Why can't you do this the same way as you did this? Commented May 1, 2020 at 14:30
  • Can you clarify what exactly the issue is? Please see How to Ask, help center. As an aside, you should use the lower_case_with_underscores style for function and variable names. Commented May 1, 2020 at 14:41
  • Every file reports the stats of one year, so if I have to look at several years together I need to merge everything into one file Commented May 1, 2020 at 14:43

2 Answers


You can take advantage of var-args (*args) and run that code in a loop for any number of input files:

import csv

def stats_union(out_file, *args):
    with open(out_file, 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for in_file in args:
            with open(in_file) as inputfile:
                for row in csv.reader(inputfile):
                    if any(field.strip() for field in row):
                        writer.writerow(row)

Now you can call it with any number of input files; the only difference is that the output file should always come first. So your example would be:

stats_union(filename3, filename1, filename2)
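If the yearly files follow a naming pattern, you can also build the argument list with `glob` instead of listing each file by hand. A self-contained sketch (the `stats_*.csv` names and the sample rows are made up for illustration):

```python
import csv
import glob
import os
import tempfile

def stats_union(out_file, *args):
    """Merge any number of CSV files into out_file, skipping blank rows."""
    with open(out_file, 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for in_file in args:
            with open(in_file) as inputfile:
                for row in csv.reader(inputfile):
                    if any(field.strip() for field in row):
                        writer.writerow(row)

# Create two small yearly files to demonstrate (hypothetical names/data).
workdir = tempfile.mkdtemp()
for year, rows in [('2017', [['27.05.17', 'Tottenham-Arsenal', '1 - 2']]),
                   ('2018', [['26.05.18', 'Arsenal-Tottenham', '2 - 0']])]:
    with open(os.path.join(workdir, f'stats_{year}.csv'), 'w', newline='') as f:
        csv.writer(f).writerows(rows)

# Collect every matching file and merge them in one call.
yearly_files = sorted(glob.glob(os.path.join(workdir, 'stats_*.csv')))
out_path = os.path.join(workdir, 'all_years.csv')
stats_union(out_path, *yearly_files)

with open(out_path, newline='') as f:
    merged = list(csv.reader(f))
print(len(merged))  # one row per yearly file here, so 2
```

This way, adding a new year's file to the directory is enough; no call site needs to change.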

5 Comments

It works with 3 files (1 output and 2 inputs), but if I use e.g. 1 output and 3 inputs it writes the first ones several times and I get a lot of duplicated rows
Well, it's hard to debug like that. Make sure you are actually passing different files and that their contents are different. It works fine for me with 4 input files...
Well, it's a football file; every row looks like 27.05.17,Tottenham-Arsenal,1 - 2 (date, teams, result), but every year has 380 rows, so with 3 years I should get 1140 rows, yet I get 3420 (3 times more)
Maybe you didn't copy the code right; make sure it is not copied twice (or more) (it sounds like it...) and make sure you are calling it correctly. As I said, it runs fine for me
Perfect, thanks, I found a mistake of mine

You can do that simply using pandas; here's an example:

import pandas as pd

def StatsUnion(out_file, *args):
    ip = []
    for i in args:
        ip.append(pd.read_csv(i))  # read the csv at path i and store its dataframe in a list
    out_df = pd.concat(ip, axis=0)  # concatenate all dataframes along the rows (axis=1 would join along columns)
    out_df.to_csv(out_file, index=False)

Here, I am reading the csv files from the paths provided in args (args holds the path of each individual file) and then concatenating them.
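A self-contained example of calling it (the file names and sample data are made up; note that `pd.read_csv` treats the first row of each file as a header, so your input files should all have the same header row):

```python
import os
import tempfile
import pandas as pd

def StatsUnion(out_file, *args):
    # Read each input CSV into a dataframe, stack them row-wise, write out.
    frames = [pd.read_csv(path) for path in args]
    out_df = pd.concat(frames, axis=0)
    out_df.to_csv(out_file, index=False)

# Create two hypothetical yearly files with identical columns.
workdir = tempfile.mkdtemp()
for year, date in (('2017', '27.05.17'), ('2018', '26.05.18')):
    pd.DataFrame({'date': [date],
                  'match': ['Tottenham-Arsenal'],
                  'result': ['1 - 2']}).to_csv(
        os.path.join(workdir, f'stats_{year}.csv'), index=False)

out_path = os.path.join(workdir, 'all_years.csv')
StatsUnion(out_path,
           os.path.join(workdir, 'stats_2017.csv'),
           os.path.join(workdir, 'stats_2018.csv'))

merged = pd.read_csv(out_path)
print(len(merged))  # one data row per yearly file, so 2
```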

