Python loops through CSV, but writes header row twice

Question

I have CSV files with an unwanted first character in every header field except the first column. The while loop strips the first character from the headers and writes the new header row to a new file (it exits via a counter). The else clause then writes the rest of the rows to the new file. The problem is that the else clause starts with the header row and writes it a second time. Is there a way to have the else start on the next line without breaking the for iterator? The actual files are 21 columns by 400,000+ rows. The unwanted character is a single space, but I used * in the example below to make it easier to see. Thanks for any help!

file.csv =

a,*b,*c,*d

1,2,3,4

import csv

reader = csv.reader(open('file.csv', 'rb'))

writer = csv.writer(open('file2.csv','wb'))

count = 0

for row in reader:
    while (count <= 0):
        row[1]=row[1][1:]
        row[2]=row[2][1:]
        row[3]=row[3][1:]
        writer.writerow([row[0], row[1], row[2], row[3]])
        count = count + 1
    else:
        writer.writerow([row[0], row[1], row[2], row[3]])

Removing these unwanted characters -- is this the only purpose of your code?
– djas, Commented Aug 5, 2013 at 3:13
Yes, however, this is just a small part of optimizing a very large dataset for import to a database @djas
– Streic, Commented Aug 5, 2013 at 3:25

Community · Accepted Answer · 2017-05-23 10:25:26Z

1

If you only want to change the header and copy the remaining lines without change:

with open('file.csv', 'r') as src, open('file2.csv', 'w') as dst:
    dst.write(next(src).replace(" ", ""))     # delete whitespaces from header
    dst.writelines(line for line in src)

If you want to do additional transformations you can do something like this or this question.

edited May 23, 2017 at 10:25


answered Aug 5, 2013 at 4:16

elyase


1 Comment

djas Over a year ago

This code will delete all white spaces in the header though -- something you may or may not want to do.
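If interior spaces in header names need to survive (for example a column called "unit price"), a line-based variant can strip only the leading spaces of each field. This is a minimal sketch in Python 3, not part of the answer above: the helper names `fix_header_line` and `copy_with_clean_header` are made up for illustration, the sample data is invented, and it assumes no header field contains a quoted comma (use the csv module if one might).

```python
import io


def fix_header_line(line):
    """Strip leading spaces from each comma-separated field of a header line.

    Assumption: no field contains a quoted comma.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # keep the original line ending as-is
    return ",".join(field.lstrip(" ") for field in body.split(",")) + ending


def copy_with_clean_header(src, dst):
    """Write a cleaned header, then copy every remaining line unchanged."""
    header = next(src, None)
    if header is None:  # empty input: nothing to write
        return
    dst.write(fix_header_line(header))
    dst.writelines(src)  # the iterator resumes at the second line


# In-memory stand-ins for the real files, just to show the behaviour.
src = io.StringIO("a, b, unit price, d\n1,2,3,4\n")
dst = io.StringIO()
copy_with_clean_header(src, dst)
result = dst.getvalue()
```

Because the data rows are copied as raw text and never parsed, this should stay cheap on a 400,000-row file.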

JSutton · Accepted Answer · 2013-08-05 03:27:53Z

0

If all you want to do is remove spaces, you can use:

string.replace(" ", "")

answered Aug 5, 2013 at 3:27

JSutton


jrs · Accepted Answer · 2013-08-05 03:39:56Z

Hmm... It seems like your logic might be a bit backward. A bit cleaner, I think, to check if you're on the first row first. Also, a slightly more idiomatic way to remove spaces is to use string's lstrip method with no arguments to remove leading whitespace.

Why not use enumerate and check if your row is the header?

import csv

reader = csv.reader(open('file.csv', 'rb'))

writer = csv.writer(open('file2.csv','wb'))

for i, row in enumerate(reader):
    if i == 0:            
        writer.writerow([row[0], 
                         row[1].lstrip(), 
                         row[2].lstrip(), 
                         row[3].lstrip()])
    else:
        writer.writerow([row[0], row[1], row[2], row[3]])

Nice, now I see why my code was duplicating the header row. @jrs
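For anyone else wondering where the duplicate came from: in Python, the else clause of a while loop runs whenever the loop condition becomes false without a break. So on the first row the while body runs once, the condition turns false, and the else runs for that same row. A small sketch that records which branch handles each row (the `rows` list and the `written` log are made up for illustration, and appending to a list stands in for `writer.writerow`):

```python
rows = [["a", "*b"], ["1", "2"]]  # header row, then one data row
written = []                      # log of (branch, first cell) per "write"
count = 0

for row in rows:
    while count <= 0:
        written.append(("while", row[0]))  # header handled here once
        count += 1
    else:
        # Runs every time the while condition is false and no break
        # occurred, including right after the header was handled.
        written.append(("else", row[0]))
```

After this runs, `written` is `[("while", "a"), ("else", "a"), ("else", "1")]`: the header shows up twice, once from each branch.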

djas · Accepted Answer · 2013-08-05 04:09:17Z

If you have 21 columns, you don't want to write out row[0], ..., row[20] by hand. You also want to close your files after opening them. .next() gets your header, and strip() lets you flexibly remove unwanted leading and trailing characters.

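import csv

file = 'file1.csv'
newfile = open('file2.csv','wb')
writer = csv.writer(newfile)

with open(file, 'rb') as f:
  reader = csv.reader(f)
  header = reader.next()  # Python 2; use next(reader) on Python 3

  # strip leading/trailing spaces from every header field
  newheader = []
  for c in header:
    newheader.append(c.strip(' '))
  writer.writerow(newheader)  # write the header once, after the loop

  for r in reader:
    writer.writerow(r)  # copy the remaining rows unchanged

newfile.close()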
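On Python 3 the same idea might look like the sketch below. It is not the original answer's code: the function name `copy_csv_with_clean_header` and the temporary sample files are assumptions for illustration. Files are opened in text mode with `newline=""`, as the csv documentation recommends, and both are closed by the with statement.

```python
import csv
import os
import tempfile


def copy_csv_with_clean_header(src_path, dst_path):
    """Copy a CSV file, stripping spaces around each header field."""
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:  # empty file: nothing to copy
            return
        writer.writerow([name.strip(" ") for name in header])
        writer.writerows(reader)  # remaining rows, unchanged


# Small throwaway files standing in for file.csv / file2.csv.
tmpdir = tempfile.mkdtemp()
src_path = os.path.join(tmpdir, "file.csv")
dst_path = os.path.join(tmpdir, "file2.csv")
with open(src_path, "w", newline="") as f:
    f.write("a, b, c, d\r\n1,2,3,4\r\n")

copy_csv_with_clean_header(src_path, dst_path)

with open(dst_path, newline="") as f:
    output = f.read()
```

If the stray space follows the delimiter on every row rather than only in the header, passing `skipinitialspace=True` to `csv.reader` may be the simpler option, since it ignores whitespace immediately after each delimiter.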
