How to skip the headers when processing a csv file using Python?

Question

I am using the code below to edit a CSV file with Python. The functions it calls are defined earlier in the script.

Problem: I want the code below to start editing the CSV from the 2nd row and leave the 1st row, which contains the headers, untouched. Right now it applies the functions to the 1st row too, so my header row gets changed.

in_file = open("tmob_notcleaned.csv", "rb")
reader = csv.reader(in_file)
out_file = open("tmob_cleaned.csv", "wb")
writer = csv.writer(out_file)
row = 1
for row in reader:
    row[13] = handle_color(row[10])[1].replace(" - ","").strip()
    row[10] = handle_color(row[10])[0].replace("-","").replace("(","").replace(")","").strip()
    row[14] = handle_gb(row[10])[1].replace("-","").replace(" ","").replace("GB","").strip()
    row[10] = handle_gb(row[10])[0].strip()
    row[9] = handle_oem(row[10])[1].replace("Blackberry","RIM").replace("TMobile","T-Mobile").strip()
    row[15] = handle_addon(row[10])[1].strip()
    row[10] = handle_addon(row[10])[0].replace(" by","").replace("FREE","").strip()
    writer.writerow(row)
in_file.close()    
out_file.close()

I tried to solve this problem by initializing the row variable to 1, but it didn't work.

Please help me in solving this issue.

Possible duplicate of "When processing CSV data, how do I ignore the first line of data?"
– Louis, Commented Aug 19, 2015 at 14:21

Flow · Accepted Answer · 2021-04-20 08:50:29Z

557

Your reader variable is an iterable; by looping over it you retrieve the rows.

To make it skip one item before your loop, simply call next(reader, None) and ignore the return value.

You can also simplify your code a little; use the opened files as context managers to have them closed automatically:

with open("tmob_notcleaned.csv", "rb") as infile, open("tmob_cleaned.csv", "wb") as outfile:
   reader = csv.reader(infile)
   next(reader, None)  # skip the headers
   writer = csv.writer(outfile)
   for row in reader:
       # process each row
       writer.writerow(row)

# no need to close, the files are closed automatically when you get to this point.

If you want to write the header to the output file unprocessed, that's easy too: pass the return value of next() to writer.writerow():

headers = next(reader, None)  # returns the headers or `None` if the input is empty
if headers:
    writer.writerow(headers)
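
The snippets above open the files in "rb"/"wb" mode, which is the Python 2 convention. On Python 3 the csv module works on text-mode files, and its documentation recommends opening them with newline="". A minimal sketch of the same approach under that assumption; the function name clean_csv and the process_row callback are hypothetical stand-ins for the per-row edits in the question:

```python
import csv


def clean_csv(src_path, dst_path, process_row):
    """Copy src_path to dst_path, leaving the header row untouched.

    process_row is a hypothetical callback: it receives one data row
    (a list of strings) and returns the edited row.
    """
    with open(src_path, newline="") as infile, \
            open(dst_path, "w", newline="") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)

        headers = next(reader, None)  # None if the input file is empty
        if headers is not None:
            writer.writerow(headers)  # write the header unprocessed

        for row in reader:
            writer.writerow(process_row(row))
```

Calling clean_csv("tmob_notcleaned.csv", "tmob_cleaned.csv", some_function) would then apply some_function to every row except the first.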

edited Apr 20, 2021 at 8:50

Flow


answered Jan 10, 2013 at 12:07

Martijn Pieters


4 Comments

Jon Clements Over a year ago

An alternative is also to use for row in islice(reader, 1, None) - although less explicit than next for most simple "skip one line" jobs, for skipping multiple header rows (or getting only certain chunks etc...) it's quite handy

Jon Clements Over a year ago

I'd consider using try: writer.writerow(next(reader))... except StopIteration: # handle empty reader

Martijn Pieters Over a year ago

@JonClements: Perhaps. This works well enough without having to teach about try: / except:.

ShadowRanger Over a year ago

@JonClements: Advantage to explicit next iteration is that it's "free"; islice would wrap the reader forever adding (an admittedly very small amount of) overhead to each iteration. The consume recipe from itertools can be used to skip many values quickly, without adding wrapping to subsequent usage, in the case where the islice would have a start but no end, so the overhead isn't gaining you anything.
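
To make the comparison in these comments concrete, here is a small sketch of both options for skipping several header rows. The skip_rows helper is a hypothetical name wrapping the "consume" recipe from the itertools documentation, and the two-line header in the sample data is made up for illustration:

```python
import csv
import io
from itertools import islice


def skip_rows(iterator, n):
    """Advance iterator by n items in place (the itertools 'consume' recipe)."""
    next(islice(iterator, n, n), None)


# Hypothetical input with two header rows before the data.
data = io.StringIO("report,2013\nname,color\nphone,red\ntablet,blue\n")
reader = csv.reader(data)

skip_rows(reader, 2)  # the reader itself is advanced; no wrapper remains
rows = list(reader)

# Same result with a wrapper that stays around the reader for the whole loop.
data.seek(0)
rows_via_islice = list(islice(csv.reader(data), 2, None))
```

Both variants yield the same data rows; the difference is only whether later iteration goes through the islice object or straight to the reader.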

Chad Zawistowski · Answer · 2015-03-19 23:37:41Z

189

Another way of solving this is to use the DictReader class, which "skips" the header row and uses it to allow named indexing.

Given "foo.csv" as follows:

FirstColumn,SecondColumn
asdf,1234
qwer,5678

Use DictReader like this:

import csv
with open('foo.csv') as f:
    reader = csv.DictReader(f, delimiter=',')
    for row in reader:
        print(row['FirstColumn'])  # Access by column header instead of column number
        print(row['SecondColumn'])
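
Since the question also writes the edited rows back out, DictReader can be paired with csv.DictWriter so the header is copied unchanged. A rough sketch, using in-memory text in place of foo.csv so it runs on its own; the upper-casing step is only a placeholder edit, not something from the question:

```python
import csv
import io

# Same sample data as foo.csv above, held in memory for a self-contained sketch.
source = io.StringIO("FirstColumn,SecondColumn\nasdf,1234\nqwer,5678\n")
target = io.StringIO()

reader = csv.DictReader(source)
writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
writer.writeheader()  # the header row is copied unchanged

for row in reader:
    row["FirstColumn"] = row["FirstColumn"].upper()  # placeholder edit
    writer.writerow(row)

result = target.getvalue()
```

With real files, source and target would be file objects opened with newline="" on Python 3.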

answered Mar 19, 2015 at 23:37

Chad Zawistowski


3 Comments

MariusSiuram Over a year ago

I feel like this is the real answer, as the question seems to be an example of XY problem.

Javier Arias Over a year ago

DictReader is definitely the way to go

BuvinJ Over a year ago

It is important to note that this only works if you omit the fieldnames parameter when constructing the DictReader. Per the documentation: If the fieldnames parameter is omitted, the values in the first row of the file f will be used as the fieldnames. See docs.python.org/2/library/csv.html
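
A short sketch of the behaviour described in that last comment, again on in-memory sample data; the field names "a" and "b" are arbitrary placeholders:

```python
import csv
import io

text = "FirstColumn,SecondColumn\nasdf,1234\n"

# No fieldnames given: the first row is consumed as the header.
implicit = list(csv.DictReader(io.StringIO(text)))

# Explicit fieldnames: the first row is treated as ordinary data.
explicit = list(csv.DictReader(io.StringIO(text), fieldnames=["a", "b"]))
```

Here implicit holds one row and explicit holds two, the first of which is the header line itself.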

Katriel · Answer · 2013-01-10 12:06:24Z

17

Doing row=1 won't change anything, because you'll just overwrite that with the results of the loop.

You want to do next(reader) to skip one row.

answered Jan 10, 2013 at 12:06

Katriel


2 Comments

user1915050 Over a year ago

I tried changing it to for row in next(reader): but it is giving me IndexError: string index out of range error

dlazesz Over a year ago

Use it before the for loop: next(reader); for row in reader: ....
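
The IndexError in the first comment comes from looping over the header row itself: next(reader) returns a single row (a list of strings), so each "row" in that loop is one header cell, and indexing a short string at position 13 fails. A small sketch on made-up sample data showing the difference:

```python
import csv
import io

sample = "name,color\nphone,red\n"

# Looping over next(reader) visits each cell of the header row,
# not each data row.
reader = csv.reader(io.StringIO(sample))
cells = [cell for cell in next(reader)]

# Intended pattern: advance once, then loop over the reader itself.
reader = csv.reader(io.StringIO(sample))
next(reader, None)
rows = [row for row in reader]
```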

bitbang · Answer · 2021-08-26 16:35:59Z

14

Simply advance the reader by one row with next():

import csv

empty_list = []
with open(filename) as file:
    csvreaded = csv.reader(file)
    header = next(csvreaded)  # consume the header row

    for row in csvreaded:
        empty_list.append(row)  # your csv list without header

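or convert the reader to a list and slice off the first row with [1:] (the reader object itself is not subscriptable, and no next() call is needed in this variant)

with open(filename) as file:
    csvreaded = csv.reader(file)

    for row in list(csvreaded)[1:]:
        empty_list.append(row)  # your csv list without header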

edited Aug 26, 2021 at 16:35

answered Aug 26, 2021 at 16:00

bitbang


2 Comments

Blue Clouds Over a year ago

in the second example, what is the need for 'header = next(csvreaded)'

Blue Clouds Over a year ago

also gets this error "TypeError: '_csv.reader' object is not subscriptable" at this line "for row in csvreaded[1:]:"

Darío López Padial · Answer · 2020-11-04 11:30:24Z

3

Inspired by Martijn Pieters' response.

If you only need to delete the header from the CSV file, you can work more efficiently by using Python's standard file I/O directly instead of writing through the csv module:

with open("tmob_notcleaned.csv", "rb") as infile, open("tmob_cleaned.csv", "wb") as outfile:
   next(infile)  # skip the headers
   outfile.write(infile.read())
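
infile.read() loads the rest of the file into memory at once. For large files, a streaming variant is possible with shutil.copyfileobj. This is only a sketch: drop_first_line is a hypothetical name, and it assumes the header occupies exactly one physical line (no quoted newlines inside header cells):

```python
import shutil


def drop_first_line(src_path, dst_path):
    """Copy src_path to dst_path without its first line, streaming in chunks."""
    with open(src_path, "rb") as infile, open(dst_path, "wb") as outfile:
        infile.readline()  # discard the header line (returns b"" if the file is empty)
        shutil.copyfileobj(infile, outfile)
```

readline() is used instead of next() here so that an empty input file does not raise StopIteration.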

edited Nov 4, 2020 at 11:30

answered Oct 30, 2020 at 18:18

Darío López Padial


3 Comments

Timus Over a year ago

You seem to overlook the # process each row part in Martijn's answer, which stands for all the stuff the op wants to do with the rows, as well as the fact that the op wants a csv-file as output? Of course you can avoid using the csv module altogether. But what's the point, it's from the standard library?

Darío López Padial Over a year ago

In my case, I only want to remove the header from the csv file, and I don't want to process anything. For this reason, I write using the standard library, because it is faster. I will edit my comment to be more clear.

Timus Over a year ago

I see. In that case you don't need the csv module at all: Just next(infile) without instantiating a csv.reader should do it (the output of open is also an iterator).

oaklodge · Answer · 2024-02-23 15:17:38Z

-2

import csv

with open(filename, 'r') as file:
    reader = csv.DictReader(file, fieldnames=None)
    # with fieldnames=None (the default), the first row is consumed as the field names
    for row in reader:
        print(row)  # so the first row printed is the second row in the file

answered Feb 23, 2024 at 15:17

oaklodge

