Column in CSV to list in python?

Question

I've seen a lot of posts about doing this the other way around, but I haven't been able to find a way to read the contents of one column of a CSV file into a list. Once I have that, I'll loop through it to add each unique value to a separate list and count the total number of unique values. This is what I have:

b=[]
c=[]
servers = []
fname=(r'file')
with open(fname, 'r') as f:
    reader = csv.reader(f)
    severities = Counter(row[3] for row in reader)
    servers = list(row[9] for row in reader)
    for row in reader:
        print (row[9])
        for servername in servers:
            if servername not in b:
                b.append(servername)

I'm open to better ways to do this. Any and all help is appreciated. Thanks in advance.

What is the expected output exactly? You cannot loop over reader twice, at least not without rewinding f to the start with f.seek(0). Not that that would be the efficient way of doing it.
– Martijn Pieters, Commented Mar 5, 2014 at 17:47
So you want to have a Counter() of the 4th column (row[3]) and a unique list of the 10th column (row[9])? You are not using c here, is that needed at all?
– Martijn Pieters, Commented Mar 5, 2014 at 17:48
^that's exactly what I want. The counter works fine, I want b to contain the unique values of column 10 at the end.
– RonTheBear, Commented Mar 5, 2014 at 17:51

Andrea Corbellini · Accepted Answer · 2014-03-05 17:57:47Z

You are iterating over the reader three times:

severities = Counter(row[3] for row in reader)  # First time
servers = list(row[9] for row in reader)  # Second time
for row in reader:  # Third time

The first iteration 'exhausts' the reader, so it won't yield any rows the second and third time.
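A minimal sketch of that behaviour, using an in-memory file (io.StringIO) and made-up two-column rows in place of the real CSV; the data and variable names are assumptions for illustration only:

```python
import csv
import io

# Hypothetical two-row CSV held in memory instead of a file on disk.
sample = io.StringIO("a,b\nc,d\n")
reader = csv.reader(sample)

first_pass = [row[0] for row in reader]   # consumes every row
second_pass = [row[0] for row in reader]  # nothing left to yield

# Rewinding the underlying file lets a fresh reader start from the top.
sample.seek(0)
third_pass = [row[0] for row in csv.reader(sample)]

print(first_pass, second_pass, third_pass)
```

The second pass comes back empty without raising any error, which is why the original code silently produced an empty servers list.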

You should do something like this:

severities = Counter()
servers = []
for row in reader:
    severities[row[3]] += 1
    servers.append(row[9])
    print(row[9])

This is enough to make the code work.

Here are some other tips. They aren't required to fix your code, but you may benefit from them:

I think you want to move the for servername in servers loop outside the for row in reader loop.
If you make b a set (or an ordered set, which the standard library does not provide) instead of a list, you can drop the for servername in servers loop entirely and replace it with a single line:
```
b.update(servers)
```
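As a rough illustration with made-up server names (the values below are hypothetical), the explicit loop and set.update end up with the same unique values:

```python
servers = ["web1", "db1", "web1", "cache1", "db1"]  # hypothetical column values

# Loop-based approach: membership test against a list.
b_list = []
for servername in servers:
    if servername not in b_list:
        b_list.append(servername)

# Set-based approach: duplicates are dropped automatically.
b = set()
b.update(servers)

print(len(b_list), len(b))
```

The list keeps first-seen order, while a plain set makes no ordering guarantee, so the two are only interchangeable when order doesn't matter.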

That was it. Thanks a ton, I've been looking at this for far too long.

Martijn Pieters · Accepted Answer · 2014-03-05 18:01:51Z

Your best bet is to loop over reader just once and collect the counts and unique names for the servers in your loop:

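import csv
from collections import Counter

severities = Counter()
servers = set()

# On Python 3, open the file in text mode with newline='' for the csv module
with open(fname, newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        severities[row[3]] += 1  # count each severity value
        servers.add(row[9])      # the set keeps only unique server names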

This assumes you don't care about the order in which servers are listed in the CSV file.

If order does need to be preserved, use a separate seen set:

severities = Counter()
servers = []

with open(fname, newline='') as f:
    reader = csv.reader(f)
    seen = set()
    for row in reader:
        severities[row[3]] += 1
        if row[9] not in seen:
            servers.append(row[9])
            seen.add(row[9])
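To try the order-preserving version without a real file, here is a sketch that feeds csv.reader an in-memory CSV. The sample rows and the helper name count_and_collect are invented for illustration; only columns 3 and 9 matter.

```python
import csv
import io
from collections import Counter


def count_and_collect(lines):
    """Return (severity counts, unique servers in first-seen order)."""
    severities = Counter()
    servers = []
    seen = set()
    for row in csv.reader(lines):
        severities[row[3]] += 1
        if row[9] not in seen:
            servers.append(row[9])
            seen.add(row[9])
    return severities, servers


# Hypothetical data: ten columns per row, severity in column 3, server in column 9.
sample = io.StringIO(
    "0,1,2,high,4,5,6,7,8,web1\n"
    "0,1,2,low,4,5,6,7,8,db1\n"
    "0,1,2,high,4,5,6,7,8,web1\n"
)
severities, servers = count_and_collect(sample)
print(severities)  # Counter({'high': 2, 'low': 1})
print(servers)     # ['web1', 'db1']
```

Any iterable of lines works as input here, so an open file object can be passed in the same way as the StringIO.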

If the file is relatively small, you could also get away with transposing the columns; that's overkill here, but would look like:

with open(fname, newline='') as f:
    reader = csv.reader(f)
    # transpose the rows to columns; list() makes the result indexable on Python 3
    cols = list(zip(*reader))
    severities = Counter(cols[3])
    servers = set(cols[9])
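A small sketch of what the transposition does, on made-up in-memory rows (the data is hypothetical). In Python 3, zip returns an iterator, so it is wrapped in list() here to allow indexing:

```python
import csv
import io

sample = io.StringIO("1,a,x\n2,b,y\n3,a,z\n")  # hypothetical 3-column data
rows = list(csv.reader(sample))

# zip(*rows) pairs up the first items, then the second items, and so on,
# so each resulting tuple is one column of the original data.
cols = list(zip(*rows))

second_column = cols[1]
unique_values = set(second_column)
print(second_column, unique_values)
```

Note that zip stops at the shortest row, so ragged CSV files would lose trailing columns with this approach.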
