34

I have a bunch of CSV files with the same columns but in a different order. We are trying to upload them with SQL*Plus, but we need the columns in a fixed order.

Example

required order: A B C D E F

csv file: A C D E B (sometimes a column is not in the csv because it is not available)

Is this achievable with Python? We are currently using Access and macros to do it, but it is too time-consuming.

PS. Sorry if my English bothers anyone.

1
  • Yes it is. Use a regex pattern and away you go. Commented Oct 7, 2015 at 20:08

5 Answers

44

You can use the csv module to read, reorder, and then write your file.

Sample File:

$ cat file.csv
A,B,C,D,E
a1,b1,c1,d1,e1
a2,b2,c2,d2,e2

Code

import csv

with open('file.csv', 'r') as infile, open('reordered.csv', 'w') as outfile:
    # output dict needs a list for new column ordering
    fieldnames = ['A', 'C', 'D', 'E', 'B']
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    # reorder the header first
    writer.writeheader()
    for row in csv.DictReader(infile):
        # writes the reordered rows to the new file
        writer.writerow(row)

output

$ cat reordered.csv
A,C,D,E,B
a1,c1,d1,e1,b1
a2,c2,d2,e2,b2
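The question also mentions that a column is sometimes absent from a file. A minimal sketch of one way to cover that case with the same approach, assuming Python 3, that the required order is A to F, and that a missing column should be written as an empty field. `restval` and `extrasaction` are existing `csv.DictWriter` parameters; the helper name `reorder_csv` and the file paths are hypothetical.

```python
import csv

# Assumed target order, taken from the question's example.
REQUIRED_ORDER = ['A', 'B', 'C', 'D', 'E', 'F']

def reorder_csv(in_path, out_path, fieldnames=REQUIRED_ORDER):
    """Copy in_path to out_path with columns in the given order (hypothetical helper)."""
    with open(in_path, 'r', newline='') as infile, \
         open(out_path, 'w', newline='') as outfile:
        writer = csv.DictWriter(
            outfile,
            fieldnames=fieldnames,
            restval='',             # written for columns missing from the input
            extrasaction='ignore',  # skip input columns that are not in fieldnames
        )
        writer.writeheader()
        for row in csv.DictReader(infile):
            writer.writerow(row)
```

Calling `reorder_csv('file.csv', 'reordered.csv')` on a file that lacks column F would then produce an empty last field in every row instead of raising an error.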

5 Comments

Really nice use of DictReader/DictWriter.
Beautiful solution. However, how can I make it work on input files that are delimited by a semicolon?
@user1192748 the docs for csv.reader (Python 2 and Python 3) define the delimiter= and quotechar= kwargs, which specify the delimiter and the quoting character.
Keep in mind that DictWriter in Python 2 doesn't support Unicode characters, as mentioned in the docs: docs.python.org/2/library/csv.html
As long as the Unicode characters are encoded with UTF-8 and there is no special stuff like left-to-right markers, the above code will just work fine in Python2 as the unreadable Unicode contents will simply be copied into the correct columns. That's thanks to UTF-8 not using byte values in the ASCII range to encode multi-byte characters.
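For the semicolon question raised in the comments, a small sketch of how the `delimiter=` keyword might be passed through, assuming Python 3 and that the output should also be semicolon-delimited. The helper name `reorder_semicolon_csv` is hypothetical; `DictReader` and `DictWriter` forward extra keyword arguments to the underlying reader and writer.

```python
import csv

def reorder_semicolon_csv(in_path, out_path, fieldnames):
    """Reorder the columns of a semicolon-delimited file (hypothetical helper)."""
    with open(in_path, 'r', newline='') as infile, \
         open(out_path, 'w', newline='') as outfile:
        # delimiter=';' is forwarded to csv.reader / csv.writer
        reader = csv.DictReader(infile, delimiter=';')
        writer = csv.DictWriter(outfile, fieldnames=fieldnames, delimiter=';')
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Dropping `delimiter=';'` from the writer would convert the file to comma-delimited output while reordering it.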
12

One way to tackle this problem is to use the pandas library, which can be easily installed with pip. Basically, you load the CSV file into a pandas DataFrame, reorder the columns, and save it back to a CSV file. For example, if your sample.csv looks like this:

A,C,B,E,D
a1,c1,b1,e1,d1
a2,c2,b2,e2,d2

Here is a snippet to solve the problem.

import pandas as pd
df = pd.read_csv('/path/to/sample.csv')
df_reorder = df[['A', 'B', 'C', 'D', 'E']] # rearrange column here
df_reorder.to_csv('/path/to/sample_reorder.csv', index=False)
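Selecting columns with `df[[...]]` raises a `KeyError` when one of the listed columns is not in the file. A sketch of one possible alternative for the missing-column case from the question, assuming the required order is A to F and that an absent column should come out as empty fields: `DataFrame.reindex(columns=...)` adds the missing columns filled with NaN, which `to_csv` writes as empty strings by default. The helper name `reorder_with_pandas` is hypothetical.

```python
import pandas as pd

# Assumed target order, taken from the question's example.
REQUIRED_ORDER = ['A', 'B', 'C', 'D', 'E', 'F']

def reorder_with_pandas(in_path, out_path, columns=REQUIRED_ORDER):
    """Write in_path to out_path with a fixed column order (hypothetical helper)."""
    df = pd.read_csv(in_path, dtype=str)  # read everything as text, no type guessing
    df = df.reindex(columns=columns)      # absent columns are added, filled with NaN
    df.to_csv(out_path, index=False)      # NaN is written as an empty field
```

Reading with `dtype=str` is a design choice here: it avoids pandas reformatting numbers (for example `1` becoming `1.0`) on the way through.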


3
csv_in  = open("<filename>.csv", "r")
csv_out = open("<filename>_reordered.csv", "w")  # must not be the input file

for line in csv_in:
    # drop the line ending, then split the line at commas
    field_list = line.rstrip('\r\n').split(',')
    output_line = ','.join([field_list[0],   # rejoin with commas, new order
                            field_list[2],
                            field_list[3],
                            field_list[4],
                            field_list[1]
                            ])
    csv_out.write(output_line + '\n')

csv_in.close()
csv_out.close()

3 Comments

What if there are quoted strings containing commas?
output_line = ','.join(field_list[3], # rejoin with commas, new order
raises "TypeError: str.join() takes exactly one argument (4 given)" when I try to use this with Python 3 (and a 4-column CSV file). Is this a Python 2 script?
Yes, it's Python 2.7. The switch to Python 3 was expensive in project terms, so most large organizations stayed with 2.7 as long as they could.
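On the question about quoted strings containing commas: splitting on ',' would break such fields apart. A sketch of the same index-based reordering done through `csv.reader` and `csv.writer`, which parse and re-quote fields, assuming Python 3 and the index order used in the answer above. The helper name `reorder_by_index` is hypothetical.

```python
import csv

# Same positional order as the answer above: columns 0, 2, 3, 4, then 1.
NEW_ORDER = [0, 2, 3, 4, 1]

def reorder_by_index(in_path, out_path, order=NEW_ORDER):
    """Reorder columns by position, keeping quoted fields intact (hypothetical helper)."""
    with open(in_path, 'r', newline='') as infile, \
         open(out_path, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        for fields in csv.reader(infile):
            # csv.reader has already handled quotes, so each item is one full field
            writer.writerow([fields[i] for i in order])
```

A field such as `"x, y"` stays a single column and is quoted again on output because it contains the delimiter.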
1

You can use something similar to this to change the order, replacing ';' with ',' in your case. Because you said you need to process multiple .csv files, you could use the glob module to get a list of your files:

import glob

for file_name in glob.glob('<Insert-your-file-filter-here>*.csv'):
    pass  # do the work here
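One way the loop body might be filled in, as a sketch only: it combines the glob loop with `csv.DictReader` and `csv.DictWriter`, assuming Python 3, that reordered copies go into a separate output directory, and that missing columns should be left empty. The helper name `reorder_all` and its parameters are hypothetical.

```python
import csv
import glob
import os

def reorder_all(pattern, out_dir, fieldnames):
    """Write a reordered copy of every file matching pattern into out_dir (hypothetical helper)."""
    os.makedirs(out_dir, exist_ok=True)
    for file_name in sorted(glob.glob(pattern)):
        # out_dir is assumed to differ from the input directory,
        # otherwise the input would be truncated while it is being read
        out_name = os.path.join(out_dir, os.path.basename(file_name))
        with open(file_name, 'r', newline='') as infile, \
             open(out_name, 'w', newline='') as outfile:
            writer = csv.DictWriter(outfile, fieldnames=fieldnames,
                                    restval='', extrasaction='ignore')
            writer.writeheader()
            for row in csv.DictReader(infile):
                writer.writerow(row)
```

Each output file keeps the name of its input file, so the results can be matched back to their sources.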


1

The csv module allows you to read CSV files with each value associated with its column name. This in turn lets you rearrange columns arbitrarily, without having to explicitly permute lists.

import csv

for row in csv.DictReader(open("foo.csv")):
    print(row["b"], row["a"])

2 1
22 21

Given the file foo.csv:

a,b,d,e,f
1,2,3,4,5
21,22,23,24,25
