I'm new to Python. I'm trying to combine data from multiple CSV files into a single CSV. The input files look like this:
file1.csv
Country of Residence,2014-04,2015-04
NORTH AMERICA ,"5,514","6,160"
Canada ,"2,417","2,864"
U.S.A. ,"3,097","3,296"
LATIN AMERICA & THE CARIBBEAN ,281,293
WESTERN EUROPE ,"37,369","34,964"
Austria ,893,666
Belgium ,867,995
file2.csv
Country of Residence,2014-11,2015-11
LATIN AMERICA & THE CARIBBEAN ,373,418
Argentina ,47,50
Brazil ,68,122
Chille ,24,30
Colombia ,31,25
Others ,203,191
WESTERN EUROPE-OTHERS ,1330,1367
Croatia ,77,72
Greece ,408,452
Ireland ,428,343
Finland ,149,178
Portugal ,211,261
Others ,57,61
In the final CSV, I would like a single header row that contains every column exactly once, like this:
Country of Residence,2014-04,2015-04,2014-05,2015-05,...,2014-11,2015-11
NORTH AMERICA ,"5,514","6,160",NaN,NaN,...
Portugal,NaN,NaN,NaN,NaN,...,211,261
I would also like the country list to be unique, so that each country has one row and I can fill in its numbers while reading the CSV files.
With the following code I get the unique column headers, but I don't know how to make the Country column unique and fill in each number based on the country and the month.
Any help is greatly appreciated.
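import csv
import glob
import os

hdrList = []
for filename in glob.iglob(os.path.join('/Documents/stats/csv', '*.csv')):
    # Python 2: the csv module expects files opened in binary mode
    with open(filename, 'rb') as f:
        csvIn = csv.reader(f)
        hdr = next(csvIn)  # the first row is the header
        # strip the UTF-8 byte order mark, if the file starts with one
        hdr[0] = hdr[0].replace('\xef\xbb\xbf', '')
        hdrList.append((len(hdr), hdr))

hdrList.sort()
hdrs = []
template = []
# collect every column name once, in the sorted order
for t in hdrList:
    for f in t[1]:
        print(f)
        if f not in hdrs:
            hdrs.append(f)
            template.append('')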
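For reference, the result described above (one row per country, one column per month, NaN where a file has no value) could be sketched roughly as follows. This is only a minimal sketch under several assumptions: it targets Python 3 rather than the Python 2 style used above, the function names `merge_tables` and `write_merged` are made up for illustration, every file is assumed to use the same first column name, and the sample data is a shortened copy of the two files shown earlier. Country names are stripped of their trailing space so that the same country matches across files. A label that appears more than once in a file, such as Others, would collapse into a single row here, with later values overwriting earlier ones.

```python
import csv
import io


def merge_tables(sources):
    """Merge CSV tables keyed by their first column (hypothetical helper).

    sources: iterable of open text files (or any iterables of lines).
    Returns (headers, rows): headers in first-seen order, and a dict that
    maps each stripped key to a {column: value} dict.
    """
    headers = []
    rows = {}  # dicts preserve insertion order on Python 3.7+
    for src in sources:
        reader = csv.DictReader(src)
        if not reader.fieldnames:  # skip empty inputs
            continue
        key_col = reader.fieldnames[0]
        for name in reader.fieldnames:
            if name not in headers:
                headers.append(name)
        for record in reader:
            country = record[key_col].strip()  # drop the trailing space
            merged = rows.setdefault(country, {key_col: country})
            for col, value in record.items():
                if col != key_col:
                    merged[col] = value
    return headers, rows


def write_merged(headers, rows, out):
    """Write the merged rows, filling missing cells with the text NaN."""
    writer = csv.DictWriter(out, fieldnames=headers, restval='NaN',
                            extrasaction='ignore')
    writer.writeheader()
    for row in rows.values():
        writer.writerow(row)


# Shortened sample data standing in for file1.csv and file2.csv.
file1 = io.StringIO(
    'Country of Residence,2014-04,2015-04\n'
    'NORTH AMERICA ,"5,514","6,160"\n'
    'LATIN AMERICA & THE CARIBBEAN ,281,293\n'
)
file2 = io.StringIO(
    'Country of Residence,2014-11,2015-11\n'
    'LATIN AMERICA & THE CARIBBEAN ,373,418\n'
    'Portugal ,211,261\n'
)

headers, rows = merge_tables([file1, file2])
out = io.StringIO()
write_merged(headers, rows, out)
print(out.getvalue())
```

With real files, the `io.StringIO` objects would presumably be replaced by files opened with `open(filename, newline='', encoding='utf-8-sig')`, which also removes a leading byte order mark. The column order follows the order in which the files are read, so getting the month order shown above would depend on reading the files in that order.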