0

I am trying to extract how many songs were released in each year from a CSV file. My data looks like this:

no,artist,name,year
"1","Bing Crosby","White Christmas","1942"
"2","Bill Haley & his Comets","Rock Around the Clock","1955"
"3","Sinead O'Connor","Nothing Compares 2 U","1990","35.554"
"4","Celine Dion","My Heart Will Go On","1998","35.405"
"5","Bryan Adams","(Everything I Do) I Do it For You","1991"
"6","The Beatles","Hey Jude","1968"
"7","Whitney Houston","I Will Always Love You","1992","34.560"
"8","Pink Floyd","Another Brick in the Wall (part 2)","1980"
"9","Irene Cara","Flashdance... What a Feeling","1983"
"10","Elton John","Candle in the Wind '97","1992"

My file consists of 3000 lines of data with additional fields, but I am only interested in how many songs were released in each year.

I tried to extract the year and the songs, and my code is below. I am new to Python, so I don't know how to solve my problem. My code is:

from itertools import islice
import csv


filename = '/home/rob/traintask/top3000songs.csv'
data = csv.reader(open(filename))
# Read the column names from the first line of the file
fields = data.next()[3]  # I tried to read the year column
print fields
count = 0
for row in data:
    # Zip together the field names and values
    items = zip(fields, row)
    item = {}  # here I am lost: I think I should make a dict with the year as key and the number of songs as value, but I don't know how to do it
    # Add the value to our dictionary
    for (name, value) in items:
        item[name] = value.strip()
        print 'item: ', item

I am doing it completely wrong. If somebody could give me some hints or help on how to count the number of songs released per year, I would be thankful.

2 Answers

2

2 very simple lines of code:

import pandas as pd
my_csv = pd.read_csv(filename)

and to get the number of songs per year:

songs_per_year = my_csv.groupby('year')['name'].count()

6 Comments

Thanks for the reply. But how can I calculate how many songs were released in each year? I know I have to write a loop, but I don't have any logic for it. Second, is it possible to use the plain csv module and not pandas?
my_csv.groupby('year')['name'].count()
@rob BTW your CSV is problematic, since some rows have 5 fields and not 4.
songs_per_year = my_csv.groupby('year').size() is cleaner, I think.
@andre I partially agree, though in my view it is a bit less readable, since it is less clear that we are counting songs.
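
For the plain-csv loop asked about in the comments, here is a minimal Python 3 sketch, not taken from either answer. It assumes the year is always the fourth field (index 3), as in the header shown in the question; the names `YEAR_COLUMN` and `count_songs_per_year` are made up for illustration. Because it indexes by position, rows with an extra fifth field are counted like any other row.

```python
import csv

YEAR_COLUMN = 3  # assumed position of "year" in the header: no,artist,name,year


def count_songs_per_year(lines):
    """Return a dict mapping year (as a string) to the number of songs."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    counts = {}
    for row in reader:
        if len(row) <= YEAR_COLUMN:
            continue  # skip blank or truncated rows
        year = row[YEAR_COLUMN].strip()
        # Start at 0 the first time a year is seen, then add one.
        counts[year] = counts.get(year, 0) + 1
    return counts


# A few rows from the question, including ones with an extra fifth field.
sample = [
    'no,artist,name,year',
    '"1","Bing Crosby","White Christmas","1942"',
    '"3","Sinead O\'Connor","Nothing Compares 2 U","1990","35.554"',
    '"7","Whitney Houston","I Will Always Love You","1992","34.560"',
    '"10","Elton John","Candle in the Wind \'97","1992"',
]
song_counts = count_songs_per_year(sample)
for year in sorted(song_counts):
    print(year, song_counts[year])
```

`csv.reader` accepts any iterable of lines, so for the real file the same function should work with `with open(filename, newline='') as f: count_songs_per_year(f)`.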
1

You can use a Counter object from the collections module.

>>> from collections import Counter
>>> from csv import reader
>>> 
>>> YEAR = 3
>>> with open('file.txt') as f:
...     next(f, None) # discard header
...     year2rel = Counter(int(line[YEAR]) for line in reader(f))
... 
>>> year2rel
Counter({1992: 2, 1942: 1, 1955: 1, 1990: 1, 1998: 1, 1991: 1, 1968: 1, 1980: 1, 1983: 1})
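
A variation on the same idea, as a hedged Python 3 sketch: `csv.DictReader` looks the column up by its header name instead of a fixed index, which may be easier to read when the real file has additional fields. The function name `songs_per_year` and the in-memory sample are assumptions for illustration; the only thing taken from the question is the `year` column name.

```python
import csv
import io
from collections import Counter


def songs_per_year(csv_file):
    """Count songs per year, looking up the column by its header name."""
    reader = csv.DictReader(csv_file)
    # Rows with extra fields are fine: DictReader stores the surplus
    # values under a separate key and "year" is still found by name.
    # Rows where "year" is missing or empty are skipped.
    return Counter(int(row["year"]) for row in reader if row.get("year"))


# In-memory stand-in for the real file, using rows from the question.
sample = io.StringIO(
    """no,artist,name,year
"1","Bing Crosby","White Christmas","1942"
"3","Sinead O'Connor","Nothing Compares 2 U","1990","35.554"
"7","Whitney Houston","I Will Always Love You","1992","34.560"
"10","Elton John","Candle in the Wind '97","1992"
"""
)
counts = songs_per_year(sample)

# Print the years in chronological order.
for year, number_of_songs in sorted(counts.items()):
    print(year, number_of_songs)
```

`counts.most_common(1)` then gives the year with the most releases, which a plain dict would need an extra step for.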

2 Comments

Thanks a lot. I will give it a try and come back soon.
@timgeb Thanks a lot. Your solution works fine, and so does beniev's. I will accept his answer because he replied first, but I am really thankful to you for your help.
