Reading csv file in Python and create a dictionary

Question

I am trying to read a CSV file in Python 2.7 to create a dictionary. The CSV file looks like this:

SI1440269,SI1320943,SI1321085 SI1440270,SI1320943,SI1321085,SI1320739 SI1440271,SI1320943
SI1440273,SI1321058,SI1320943,SI1320943

The number of entries in each row is not fixed. The first-column entries should be my keys. My code:

import csv
reader = csv.reader(open('test.csv'))

result = {}
for column in reader:
    key = column[0]
    if key in result:
        pass
    result[key] = column[1:]
print result

Output:

{'SI1440273': ['SI1321058', 'SI1320943', 'SI1320943'], '': ['', '', ''], 'SI1440271': ['SI1320943', '', ''], 'SI1440270': ['SI1320943', 'SI1321085', 'SI1320739'], 'SI1440269': ['SI1320943', 'SI1321085', '']}

How can I get rid of the null values in the output? Also, how can I keep the keys in the output in the same order as in the CSV file?

Edit: I want a single row per 'key'.

Also for the record, I believe the variable you define as column is actually a row :)
– hobenkr, Commented Jul 5, 2015 at 18:57
I am not sure I understand what the expected output is here. Do you want to keep only a single row per "key"?
– zero323, Commented Jul 5, 2015 at 19:00
I just ran your program and I'm getting different results: {'SI1440270 SI1320943 SI1321085 SI1320739 SI1440271 SI1320943': [], 'SI1440273 SI1321058 SI1320943 SI1320943': [], 'SI1440269 SI1320943 SI1321085': []}. Can you explain a little more what you want here?
– hobenkr, Commented Jul 5, 2015 at 19:02
Your for loop iterates over each row in your csv file, not each column. You can see this if you put a print statement at the top of your loop: print(column). This will print a row of your file, not a column.
– hobenkr, Commented Jul 5, 2015 at 19:48

Martin Evans · Accepted Answer · 2015-07-05 19:00:41Z

You could use csv.DictReader as follows:

import csv

result = {}
with open('test.csv') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=" ", fieldnames=["id"], restkey="data")
    for row in reader:
        print row
        result[row["id"]] = row["data"]

print result

This gives you one dictionary per row, so you can process the file a line at a time. Each row is then also added to a single result dictionary.

From this you will get the following output:

{'data': ['SI1320943', 'SI1321085'], 'id': 'SI1440269'}
{'data': ['SI1320943', 'SI1321085', 'SI1320739', 'SI1440271', 'SI1320943'], 'id': 'SI1440270'}
{'data': ['SI1321058', 'SI1320943', 'SI1320943'], 'id': 'SI1440273'}
{'SI1440273': ['SI1321058', 'SI1320943', 'SI1320943'], 'SI1440270': ['SI1320943', 'SI1321085', 'SI1320739', 'SI1440271', 'SI1320943'], 'SI1440269': ['SI1320943', 'SI1321085']}
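
One caveat with restkey: DictReader only sets that key when a row has more fields than fieldnames, so a row that contains just the id has no "data" entry and row["data"] would raise KeyError. A minimal sketch of guarding against that, assuming comma-separated input held in memory (SAMPLE and read_rows are hypothetical names, not part of the answer above), written in Python 3 syntax:

```python
import csv
import io

# Assumed sample data: the last row has a key but no further fields.
SAMPLE = (
    "SI1440269,SI1320943,SI1321085\n"
    "SI1440270,SI1320943,SI1321085,SI1320739\n"
    "SI1440271\n"
)

def read_rows(stream):
    """Map each row's first field to the list of its remaining fields."""
    result = {}
    reader = csv.DictReader(stream, fieldnames=["id"], restkey="data")
    for row in reader:
        # restkey is absent when the row has no extra fields, so use .get()
        result[row["id"]] = row.get("data", [])
    return result

result = read_rows(io.StringIO(SAMPLE))
print(result)
```

With a real file, the same function could be called as read_rows(open('test.csv')) inside a with block.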

Mr. A · Accepted Answer · 2015-07-05 19:25:49Z

Try this:

import csv
reader = csv.reader(open('test.csv'))

result = {row[0]:row[1:] for row in reader if row and row[0]}
print result

If you also want to remove the empty values, do as below:

import csv
reader = csv.reader(open('test.csv'))

result = {row[0]:[i for i in row[1:] if i] for row in reader if row and row[0]}
print result

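To preserve the order of entry, use OrderedDict:

import csv
from collections import OrderedDict

reader = csv.reader(open('test.csv'))
result = OrderedDict()
for row in reader:
    # skip blank rows and rows whose key is empty
    if row and row[0]:
        # drop empty strings from the values
        result[row[0]] = [i for i in row[1:] if i]

# print result
for key in result:
    print key, ":", result[key]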
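
The three steps above (skip rows without a key, drop empty values, keep file order) can be combined in one function. This is a minimal sketch, assuming the file pads short rows with trailing commas, as the output in the question suggests. build_mapping and SAMPLE_LINES are hypothetical names, and the code uses Python 3 print syntax rather than the Python 2.7 statements used elsewhere in this thread:

```python
import csv
from collections import OrderedDict

# Assumed sample: rows padded with trailing commas, plus one all-empty row.
SAMPLE_LINES = [
    "SI1440269,SI1320943,SI1321085,",
    "SI1440270,SI1320943,SI1321085,SI1320739",
    "SI1440271,SI1320943,,",
    ",,,",
    "SI1440273,SI1321058,SI1320943,SI1320943",
]

def build_mapping(lines):
    """Map each first-column key to its non-empty remaining fields, in input order."""
    result = OrderedDict()
    # csv.reader accepts any iterable of strings, including an open file
    for row in csv.reader(lines):
        if row and row[0]:
            result[row[0]] = [value for value in row[1:] if value]
    return result

mapping = build_mapping(SAMPLE_LINES)
for key, values in mapping.items():
    print(key, ":", values)
```

On Python 3.7 and later a plain dict also keeps insertion order, so OrderedDict is only required on older versions such as 2.7.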

edited Jul 5, 2015 at 19:25

answered Jul 5, 2015 at 19:05 by Mr. A

5 Comments

Karvy1 Over a year ago

This solved my problem partially. I want my key values in the output to be in the same order as in the CSV file. It's not happening with your code.

Mr. A Over a year ago

from collections import OrderedDict and use it

Karvy1 Over a year ago

I used it, but it didn't work. Code:

import csv
from collections import OrderedDict
result = OrderedDict()
reader = csv.reader(open('test.csv'))
result = {row[0]:[i for i in row[1:] if i] for row in reader if row and row[0]}
print result

Mr. A Over a year ago

Just for your knowledge: a dictionary in Python is a hash table, which does not preserve any order. If you want to preserve the order in which keys were entered, use OrderedDict from collections. I have added a third code sample; make use of it.

Karvy1 Over a year ago

Is it compulsory that the word 'OrderedDict' comes at the beginning of the output?
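
On that last question: the OrderedDict(...) text is only how the object prints itself (its repr); the data inside is an ordinary mapping of keys to lists. A small sketch, assuming the goal is just cleaner output (format_mapping is a hypothetical helper, not something from the answer), in Python 3 syntax:

```python
from collections import OrderedDict

def format_mapping(mapping):
    """Return one 'key: v1, v2' line per entry, in the mapping's own order."""
    return ["{}: {}".format(key, ", ".join(values)) for key, values in mapping.items()]

result = OrderedDict()
result["SI1440269"] = ["SI1320943", "SI1321085"]
result["SI1440270"] = ["SI1320943", "SI1321085", "SI1320739"]

lines = format_mapping(result)
for line in lines:
    print(line)
```

Iterating over the entries this way prints only the keys and values, without the type name.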

hobenkr · Accepted Answer · 2015-07-05 20:21:32Z

As already noted, this is not CSV, so reading the file line by line and splitting each line would be more appropriate. Use OrderedDict to keep the input order:

from collections import OrderedDict

result = OrderedDict()
with open('test.csv') as f:
    for line in f:
        # split on any run of whitespace
        row = line.strip().split()
        if not row:
            continue  # skip blank lines, which would raise IndexError on row[0]
        key = row[0]
        result[key] = row[1:]
print result

edited Jul 5, 2015 at 20:21 by hobenkr

answered Jul 5, 2015 at 19:18 by Ivan

2 Comments

Karvy1 Over a year ago

Why is it not CSV? Care to explain?

Ivan Over a year ago

CSV = Comma-Separated Values, so fields are separated by commas. Here I see that they are separated by spaces, so split is easier. The csv reader gives the result noted in the comment by @hobenkr.
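
The difference is easy to see on a single line. A minimal sketch, assuming a space-separated line like the ones hobenkr reported (the variable names are illustrative), in Python 3 syntax:

```python
import csv

# csv.reader accepts any iterable of strings, so a one-element list works.
line = ["SI1440269 SI1320943 SI1321085"]

# The default delimiter is a comma, so the whole line becomes one field.
default_rows = list(csv.reader(line))

# Telling the reader to split on spaces yields three fields.
space_rows = list(csv.reader(line, delimiter=" "))

print(default_rows)  # [['SI1440269 SI1320943 SI1321085']]
print(space_rows)    # [['SI1440269', 'SI1320943', 'SI1321085']]
```

With the default delimiter, row[0] is the entire line and row[1:] is empty, which matches the output hobenkr saw.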
