change order of columns in csv (python)

Question

I made a script, which reads a given input-file (csv), manipulates the data somehow and writes an output-file (csv).

In my case, my given input-file looks like this:

| sku | article_name |
| 1   | MyArticle    |

For my output-file, I need to re-arrange these columns (there are plenty more, but I think i might be able to solve it, when someone shows me the way)

My output-file should look like this:

| article_name | another_column | sku |
| MyArticle    |                | 1   |

Note, that here is a new column, that isn't in the source csv-file, but it has to be printed anyway (the order is important as well)

This is what I have so far:

#!/usr/bin/env python
# -*- coding: latin_1 -*-

import csv
import argparse
import sys


header_mappings = {'attr_artikel_bezeichnung1': 'ARTICLE LABEL',
                   'sku': 'ARTICLE NUMBER',
                   'Article label locale': 'Article label locale',
                   'attr_purchaseprice': 'EK-Preis',
                   'attr_salesPrice': 'EuroNettoPreis',
                   'attr_salesunit': 'Einheit',
                   'attr_salesvatcode': 'MwSt.-Satz',
                   'attr_suppliercode': 'Lieferantennummer',
                   'attr_suppliersitemcode': 'Artikelnummer Lieferant',
                   'attr_isbatchitem': 'SNWarenausgang'}

row_mapping = {'Einheit': {'pc': 'St.'},
               'MwSt.-Satz': {'3': '19'}}


def remap_header(header):
    for h_map in header_mappings:
        if h_map in header:
            yield header_mappings.get(h_map), header.get(h_map)


def map_header(header):
    for elem in header:
        yield elem, header.index(elem)


def read_csv(filename):
    with open(filename, 'rb') as incsv:
        csv_reader = csv.reader(incsv, delimiter=';')
        for r in csv_reader:
            yield r


def add_header(header, fields=()):
    for f in fields:
        header.append(f)

    return header


def duplicate(csv_row, header_name, fields):
    csv_row[new_csv_header.index(fields)] = csv_row[new_csv_header.index(header_name)]
    return csv_row


def do_new_row(csv_row):
    for header_name in new_csv_header:
        for r_map in row_mapping:
            row_content = csv_row[mapped_header.get(r_map)]
            if row_content in row_mapping.get(r_map):
                csv_row[mapped_header.get(r_map)] = row_mapping.get(r_map).get(row_content)
        try:
            yield csv_row[mapped_header.get(header_name)]
        except TypeError:
            continue


if __name__ == '__main__':

    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--infile', metavar='CSV')
    parser.add_argument('-o', '--outfile', metavar='CSV')

    args = parser.parse_args()
    arguments = vars(args)
    if len(sys.argv[1:]) == 0:
        parser.print_usage()
        sys.exit(0)

    # print arguments
    # parse_csv(**arguments)
    """
    """
    csv_reader_iter = read_csv(arguments.get('infile'))

    # neuer csv header
    new_csv_header = list()
    csv_header = next(csv_reader_iter)
    for h in csv_header:
        if h in header_mappings:
            new_csv_header.append(header_mappings.get(h))

    # print new_csv_header
    new_csv_header = add_header(new_csv_header, ('Article label locale', 'Nummer'))
    mapped_header = dict(remap_header(dict(map_header(csv_header))))
    # print mapped_header

    with open(arguments.get('outfile'), 'wb') as outcsv:
        csv_writer = csv.writer(outcsv, delimiter=';')
        csv_writer.writerow(new_csv_header)
        for row in csv_reader_iter:
            row = list(do_new_row(row))
            delta = len(new_csv_header) - len(row)
            if delta > 0:
                row = row + (delta * [''])

            # duplicate(row, 'SNWarenausgang', 'SNWareneingang')
            # duplicate(row, 'SNWarenausgang', 'SNWareneingang')
            csv_writer.writerow(row)


    print "Done."
    """
    print new_csv_header
    for row in csv_reader_iter:
        row = list(do_new_row(row))
        delta = len(new_csv_header) - len(row)
        if delta > 0:
            row = row + (delta * [''])

        duplicate(row, 'Herstellernummer', 'Nummer')
        duplicate(row, 'SNWarenausgang', 'SNWareneingang')
        print row
    """

Right now, even though it says "ARTICLE LABEL" first, the sku is printed first. My guess: This is due the order of the csv-file, since sku is the first field there... right?

postelrich · Accepted Answer · 2015-10-07 15:29:14Z

2

If you use the DictWriter from the csv lib you can specify the order of the columns. Use DictReader to read in rows from your file as dicts. Then you just explicitly specify the order of the keys when you create your DictWriter.

https://docs.python.org/2/library/csv.html#csv.DictReader

answered Oct 7, 2015 at 15:29

postelrich

3,5166 gold badges44 silver badges67 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

Saffe Over a year ago

Would I have to adjust the "do_new_row"-function? I'm sorry, I'm just used to PHP =(

Saffe Over a year ago

Can you, if possible, show me an example on my code? Where should this DictWriter be?

postelrich Over a year ago

You want to replace csv.reader with csv.DictReader and csv.writer with csv.DictWriter. The link I provided has examples of using these. You would have to modify your code because you would now be working with dicts instead of lists.

Saffe Over a year ago

I tried to replace csv.reader with your solution, but that ended up in having errors (as you already told me, since I'm using dicts instead of lists. May i ask for further help?

poke · Accepted Answer · 2015-10-12 11:23:35Z

As riotburn already suggested, you can use a DictWriter and its fieldnames argument to adjust the order of columns in the new file.

Reordering a file could look like this:

def read_csv (filename):
    with open(filename) as incsv:
       reader = csv.DictReader(incsv, delimiter=';')
       for r in reader:
           yield r

columns = ['article_name', 'another_column', 'sku']

with open('newfile.csv', 'w+') as f:
    writer = csv.DictWriter(f, columns, delimiter=';')
    writer.writeheader()

    for row in read_csv('oldfile.csv'):
        # add a property
        row['another_column'] = 'foo'

        # write row (using the order specified in columns)
        writer.writerow(row)

Collectives™ on Stack Overflow

change order of columns in csv (python)

2 Answers 2

4 Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

4 Comments

Comments

Linked

Related