import csv

def StatsUnion(filename1, filename2, filename3):
    with open(filename1) as inputfile, open(filename3, 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for row in csv.reader(inputfile):
            if any(field.strip() for field in row):
                writer.writerow(row)

    with open(filename2) as inputfile, open(filename3, 'a', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for row in csv.reader(inputfile):
            if any(field.strip() for field in row):
                writer.writerow(row)

Here is my function, which works for merging 2 CSV files into a new one. Is there a way to make it handle more CSV files in an easy way? The columns would always be the same.

  • Why can't you do this the same way as you did this? Commented May 1, 2020 at 14:30
  • Can you clarify what exactly the issue is? Please see How to Ask, help center. As an aside, you should use the lower_case_with_underscores style for function and variable names. Commented May 1, 2020 at 14:41
  • Every file reports the stats of one year, so if I have to look at several years together I need to merge everything into one file Commented May 1, 2020 at 14:43

2 Answers


You can take advantage of var-args (*args) and run that code in a loop for any number of input files:

import csv

def stats_union(out_file, *args):
    with open(out_file, 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for in_file in args:
            with open(in_file) as inputfile:
                for row in csv.reader(inputfile):
                    if any(field.strip() for field in row):
                        writer.writerow(row)

Now you can call it with any number of input files; the only difference is that the output file should always come first. So your example would be:

stats_union(filename3, filename1, filename2)
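If the yearly files follow a naming pattern, you can also build the argument list with `glob` instead of listing each file by hand. A self-contained sketch (the `stats_*.csv` names and the sample rows are made up for illustration):

```python
import csv
import glob
import os
import tempfile

def stats_union(out_file, *args):
    """Merge any number of CSV files into out_file, skipping blank rows."""
    with open(out_file, 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        for in_file in args:
            with open(in_file) as inputfile:
                for row in csv.reader(inputfile):
                    if any(field.strip() for field in row):
                        writer.writerow(row)

# Create two small yearly files to demonstrate (hypothetical names/data).
workdir = tempfile.mkdtemp()
for year, rows in [('2017', [['27.05.17', 'Tottenham-Arsenal', '1 - 2']]),
                   ('2018', [['26.05.18', 'Arsenal-Tottenham', '2 - 0']])]:
    with open(os.path.join(workdir, f'stats_{year}.csv'), 'w', newline='') as f:
        csv.writer(f).writerows(rows)

# Collect every matching file and merge them in one call.
yearly_files = sorted(glob.glob(os.path.join(workdir, 'stats_*.csv')))
out_path = os.path.join(workdir, 'all_years.csv')
stats_union(out_path, *yearly_files)

with open(out_path, newline='') as f:
    merged = list(csv.reader(f))
print(len(merged))  # one row per yearly file here, so 2
```

This way, adding a new year's file to the directory is enough; no call site needs to change.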

5 Comments

It works with 3 files (1 output and 2 inputs), but if I use e.g. 1 output and 3 inputs it writes the first ones several times and I get a lot of duplicated rows
Well, it's hard to debug like that. Make sure you are actually passing different files and that their contents are different. It works fine for me with 4 input files...
Well, it's a football file; every row looks like 27.05.17,Tottenham-Arsenal,1 - 2 (date, teams, result), but every year has 380 rows, so with 3 years I should get 1140 rows, yet I get 3420 (3 times more)
Maybe you didn't copy the code right; make sure it is not copied twice (or more) (it sounds like it...) and make sure you are calling it correctly. As I said, it runs fine for me
Perfect, thanks, I found a mistake of mine

You can do that simply using pandas; here's an example:

import pandas as pd

def StatsUnion(out_file, *args):
    ip = []
    for i in args:
        ip.append(pd.read_csv(i))  # read the csv at path i and store its dataframe in a list
    out_df = pd.concat(ip, axis=0)  # concatenate all dataframes along the rows (axis=1 would join along columns)
    out_df.to_csv(out_file, index=False)

Here, I am reading the csv files from the paths provided in args (args holds the path of each individual file) and then concatenating them.
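A self-contained example of calling it (the file names and sample data are made up; note that `pd.read_csv` treats the first row of each file as a header, so your input files should all have the same header row):

```python
import os
import tempfile
import pandas as pd

def StatsUnion(out_file, *args):
    # Read each input CSV into a dataframe, stack them row-wise, write out.
    frames = [pd.read_csv(path) for path in args]
    out_df = pd.concat(frames, axis=0)
    out_df.to_csv(out_file, index=False)

# Create two hypothetical yearly files with identical columns.
workdir = tempfile.mkdtemp()
for year, date in (('2017', '27.05.17'), ('2018', '26.05.18')):
    pd.DataFrame({'date': [date],
                  'match': ['Tottenham-Arsenal'],
                  'result': ['1 - 2']}).to_csv(
        os.path.join(workdir, f'stats_{year}.csv'), index=False)

out_path = os.path.join(workdir, 'all_years.csv')
StatsUnion(out_path,
           os.path.join(workdir, 'stats_2017.csv'),
           os.path.join(workdir, 'stats_2018.csv'))

merged = pd.read_csv(out_path)
print(len(merged))  # one data row per yearly file, so 2
```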

